Descriptive statistics of retained variables
Table 2 reports the descriptive statistics of the retained self-report survey variables in the continuous-score dataset before binary recoding. The mean self-reported higher-order thinking score was 3.71 (SD = 0.68), which was above the fixed midpoint threshold of 3 on the 1–5 response scale. Trust in generative artificial intelligence also exceeded this threshold, with a mean of 3.54 (SD = 0.60), while academic performance was close to the threshold, with a mean of 3.07 (SD = 1.12). The remaining means were 3.48 (SD = 0.75) for generative artificial intelligence anxiety, 2.72 (SD = 0.74) for problematic smartphone use, 2.59 (SD = 0.64) for academic procrastination, 2.99 (SD = 0.43) for parental upbringing, 2.12 (SD = 0.75) for negative emotions, and 3.26 (SD = 0.57) for attitudes toward generative artificial intelligence.
These descriptive statistics indicate that the retained variables did not have identical distributions before binary recoding. The means for self-reported higher-order thinking, trust in generative artificial intelligence, generative artificial intelligence anxiety, academic performance, and attitudes toward generative artificial intelligence were above the midpoint threshold (3.07 to 3.71), whereas the means for problematic smartphone use, academic procrastination, parental upbringing, and negative emotions were below it (2.12 to 2.99). Academic performance (3.07) and parental upbringing (2.99) lay closest to the threshold. This distributional pattern is important because the same midpoint coding rule was later applied to all retained variables.
Binary coding and class distribution
Table 1 summarizes the binary coding rule and class counts used in the auxiliary classification workflow. Scores less than or equal to 3 were coded as 0, and scores greater than 3 were coded as 1. After this coding rule was applied to the self-reported higher-order thinking score, 118 students were classified as Low-HOT and 658 students were classified as High-HOT. Thus, the full dataset was imbalanced, with Low-HOT representing 15.21% of cases and High-HOT representing 84.79% of cases.
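The coding rule and the resulting class percentages can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the procedure run in the original software: the function names `binarize` and `class_distribution` and the four example scores are hypothetical, whereas the class counts of 118 and 658 are taken from Table 1.

```python
from collections import Counter

THRESHOLD = 3  # fixed midpoint of the 1-5 response scale


def binarize(score, threshold=THRESHOLD):
    """Code a composite score as 0 (score <= threshold) or 1 (score > threshold)."""
    return 1 if score > threshold else 0


def class_distribution(labels):
    """Return {label: (count, percentage)} for a sequence of binary labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {k: (n, round(100 * n / total, 2)) for k, n in sorted(counts.items())}


# Hypothetical composite scores, for illustration only.
example_scores = [2.4, 3.0, 3.2, 4.6]
example_labels = [binarize(s) for s in example_scores]

# Class counts reported in Table 1 for self-reported higher-order thinking.
reported_labels = [0] * 118 + [1] * 658
distribution = class_distribution(reported_labels)
print(example_labels)
print(distribution)
```

Note that a score of exactly 3 falls in class 0 under this rule, so neutral responses are grouped with low responses.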
This imbalance provides important context for interpreting the decision tree output. Because the majority class accounted for 84.79% of the full dataset, overall accuracy alone could overstate practical classification performance. The retained tree was therefore evaluated with class-specific recall and precision in addition to overall accuracy.
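The effect of the imbalance on accuracy can be made concrete with a hypothetical baseline that assigns every respondent to High-HOT. The sketch below uses the full-dataset class counts from Table 1; the baseline itself is an illustrative assumption and was not part of the reported workflow.

```python
# Class counts in the full dataset (Table 1).
n_low, n_high = 118, 658
total = n_low + n_high

# Hypothetical baseline: predict High-HOT for every respondent.
baseline_accuracy = n_high / total   # every High-HOT case is counted as correct
baseline_low_recall = 0 / n_low      # no Low-HOT case is ever identified

print(f"Baseline accuracy: {baseline_accuracy:.2%}")
print(f"Baseline Low-HOT recall: {baseline_low_recall:.2%}")
```

Such a baseline reaches high accuracy while identifying no Low-HOT cases, which is why class-specific metrics are needed alongside overall accuracy.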
Representative decision tree output
In the auxiliary C5.0 classification step, eight respondent-level variables were retained in the pruned tree used to classify binary self-reported higher-order thinking. In descending order of model-reported importance, these variables were generative artificial intelligence anxiety, trust in generative artificial intelligence, problematic smartphone use, academic procrastination, academic performance, parental upbringing, negative emotions, and attitudes toward generative artificial intelligence. Figure 1 presents the retained pruned tree generated from the training subset.
The tree should be interpreted as an auxiliary organization of learner profile patterns rather than as a causal model. The position of a variable in the tree indicates its role in the retained classification structure under the specified coding rule, partitioning procedure, and C5.0 settings. It does not show that the variable directly causes higher or lower higher-order thinking.
To improve interpretability, the complete terminal-node output is provided in Supplementary File 4. This file reports the rule path, node sample size, predicted class, Low-HOT count, High-HOT count, and classification confidence for each terminal node. The node-level output is especially important for interpreting branches that may appear counter-intuitive in the simplified tree diagram.
Classification performance of the retained model
Table 3 presents the confusion matrices for the training and testing subsets. In the training subset (n = 544), the retained model correctly classified 37 Low-HOT cases and 450 High-HOT cases. Forty-eight Low-HOT cases were classified as High-HOT, and nine High-HOT cases were classified as Low-HOT. In the testing subset (n = 232), the model correctly classified 10 Low-HOT cases and 190 High-HOT cases. Twenty-three Low-HOT cases were classified as High-HOT, and nine High-HOT cases were classified as Low-HOT.
As shown in Table 4, the training accuracy was 89.52% (487/544), and the testing accuracy was 86.21% (200/232). However, the no-information rate in the testing subset was 85.78% because 199 of the 232 testing cases belonged to the High-HOT class. The testing accuracy therefore exceeded the majority-class baseline by only 0.43 percentage points, which corresponds to one additional correctly classified case (200 versus 199).
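The accuracy and no-information rate values can be recomputed directly from the confusion matrices in Table 3. The following Python sketch is illustrative only; the dictionary layout and the function names `accuracy` and `no_information_rate` are assumptions introduced here, and only the cell counts come from the table.

```python
# Confusion matrices from Table 3: outer keys are actual classes,
# inner keys are predicted classes.
confusion = {
    "training": {
        "Low-HOT": {"Low-HOT": 37, "High-HOT": 48},
        "High-HOT": {"Low-HOT": 9, "High-HOT": 450},
    },
    "testing": {
        "Low-HOT": {"Low-HOT": 10, "High-HOT": 23},
        "High-HOT": {"Low-HOT": 9, "High-HOT": 190},
    },
}


def accuracy(matrix):
    """Proportion of cases on the diagonal (actual class == predicted class)."""
    correct = sum(matrix[cls][cls] for cls in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


def no_information_rate(matrix):
    """Proportion of cases that belong to the largest actual class."""
    class_totals = [sum(row.values()) for row in matrix.values()]
    return max(class_totals) / sum(class_totals)


for subset, matrix in confusion.items():
    acc = accuracy(matrix)
    nir = no_information_rate(matrix)
    margin = 100 * (acc - nir)  # difference in percentage points
    print(f"{subset}: accuracy={acc:.2%}, NIR={nir:.2%}, margin={margin:.2f} points")
```

Reporting the margin over the no-information rate, rather than accuracy alone, shows how much of the accuracy is attributable to the classifier instead of the class imbalance.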
Table 5 reports the corrected class-specific recall and precision values in the testing subset. Low-HOT recall was 30.30% (10/33), and Low-HOT precision was 52.63% (10/19). High-HOT recall was 95.48% (190/199), and High-HOT precision was 89.20% (190/213). This pattern shows that the retained classifier recovered the High-HOT class much more effectively than the Low-HOT class. The weak Low-HOT recall means that 23 of the 33 Low-HOT students in the testing subset were not identified by the retained tree. For this reason, the model should not be interpreted as a validated screening tool for detecting students with lower self-reported higher-order thinking. Its main value in this article is to demonstrate a reproducible and interpretable survey-based classification workflow.
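The class-specific values follow from the standard definitions, recall = TP/(TP + FN) and precision = TP/(TP + FP). A minimal Python sketch of this arithmetic is given below; the function name `recall_precision` and the data layout are hypothetical, and the counts are those reported in Table 5 for the testing subset.

```python
def recall_precision(tp, fn, fp):
    """Return (recall, precision) from true positives, false negatives, false positives."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return recall, precision


# Testing-subset counts from Table 5, given as (TP, FN, FP) for each class.
testing_counts = {
    "Low-HOT": (10, 23, 9),
    "High-HOT": (190, 9, 23),
}

metrics = {cls: recall_precision(*counts) for cls, counts in testing_counts.items()}

for cls, (recall, precision) in metrics.items():
    print(f"{cls}: recall={recall:.2%}, precision={precision:.2%}")
```

With two classes, the false negatives of one class are the false positives of the other, which is why the counts 23 and 9 swap positions between the two rows.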

Figure 1: Auxiliary C5.0 decision tree classification of self-reported higher-order thinking. Retained pruned C5.0 decision tree generated from the training subset to classify binary self-reported higher-order thinking. The model used generative artificial intelligence anxiety, trust in generative artificial intelligence, problematic smartphone use, academic procrastination, academic performance, parental upbringing, negative emotions, and attitudes toward generative artificial intelligence as input variables. The tree was trained in IBM SPSS Modeler 18.4 using the fixed 70:30 stratified split. Node-level class distribution and prediction information are shown in the figure or provided in Supplementary File 4.
| Variable | Coding | Count | Percentage |
| Self-reported higher-order thinking | 0 = Low-HOT | 118 | 15.21% |
| Self-reported higher-order thinking | 1 = High-HOT | 658 | 84.79% |
| Generative artificial intelligence anxiety | 0 = Low | 252 | 32.47% |
| Generative artificial intelligence anxiety | 1 = High | 524 | 67.53% |
| Trust in generative artificial intelligence | 0 = Low | 217 | 27.96% |
| Trust in generative artificial intelligence | 1 = High | 559 | 72.04% |
| Problematic smartphone use | 0 = Low | 591 | 76.16% |
| Problematic smartphone use | 1 = High | 185 | 23.84% |
| Academic procrastination | 0 = Low | 629 | 81.06% |
| Academic procrastination | 1 = High | 147 | 18.94% |
| Academic performance | 0 = Low | 478 | 61.60% |
| Academic performance | 1 = High | 298 | 38.40% |
| Parental upbringing | 0 = Low | 483 | 62.24% |
| Parental upbringing | 1 = High | 293 | 37.76% |
| Negative emotions | 0 = Low | 694 | 89.43% |
| Negative emotions | 1 = High | 82 | 10.57% |
| Attitudes toward generative artificial intelligence | 0 = Low | 341 | 43.94% |
| Attitudes toward generative artificial intelligence | 1 = High | 435 | 56.06% |
Table 1: Binary coding rule and class distribution. Binary coding scheme and class counts for the target and input variables used in the auxiliary classification workflow. Scores ≤3 on the 1–5 response scale were coded as 0, and scores >3 were coded as 1. For the target variable, 0 indicates Low-HOT and 1 indicates High-HOT. For input variables, 0 indicates a low or neutral-or-lower level, and 1 indicates a high or agreement-level response.
| Variable | Response range | Mean | SD | Binary coding threshold |
| Self-reported higher-order thinking | 1–5 | 3.71 | 0.68 | 3 |
| Generative artificial intelligence anxiety | 1–5 | 3.48 | 0.75 | 3 |
| Trust in generative artificial intelligence | 1–5 | 3.54 | 0.60 | 3 |
| Problematic smartphone use | 1–5 | 2.72 | 0.74 | 3 |
| Academic procrastination | 1–5 | 2.59 | 0.64 | 3 |
| Academic performance | 1–5 | 3.07 | 1.12 | 3 |
| Parental upbringing | 1–5 | 2.99 | 0.43 | 3 |
| Negative emotions | 1–5 | 2.12 | 0.75 | 3 |
| Attitudes toward generative artificial intelligence | 1–5 | 3.26 | 0.57 | 3 |
Table 2: Descriptive statistics before binary recoding. Descriptive statistics of the retained self-report survey variables in the continuous score dataset. Means, standard deviations, response ranges, and the fixed midpoint threshold are reported before binary recoding. The response range column indicates that retained variables were analyzed on a 1–5 scale before application of the midpoint coding threshold.
| Dataset | Actual class | Predicted Low-HOT | Predicted High-HOT | Total actual cases |
| Training subset | Low-HOT | 37 | 48 | 85 |
| Training subset | High-HOT | 9 | 450 | 459 |
| Testing subset | Low-HOT | 10 | 23 | 33 |
| Testing subset | High-HOT | 9 | 190 | 199 |
Table 3: Confusion matrices for the retained C5.0 classifier. Confusion matrices for the retained pruned C5.0 tree in the training and testing subsets. The table compares actual and predicted class membership for Low-HOT and High-HOT cases. The training subset contained 544 cases, and the testing subset contained 232 cases.
| Dataset | Result / metric | Count | Percentage |
| Training subset | Correct classifications | 487 | 89.52% |
| Training subset | Incorrect classifications | 57 | 10.48% |
| Training subset | Total cases | 544 | 100.00% |
| Training subset | No-information rate | 459 / 544 | 84.38% |
| Testing subset | Correct classifications | 200 | 86.21% |
| Testing subset | Incorrect classifications | 32 | 13.79% |
| Testing subset | Total cases | 232 | 100.00% |
| Testing subset | No-information rate | 199 / 232 | 85.78% |
Table 4: Overall accuracy and majority class baseline. Overall accuracy of the retained C5.0 tree in the training and testing subsets. Correct classifications, incorrect classifications, total cases, and no-information rate are reported to contextualize accuracy under class imbalance. The no-information rate is the majority-class proportion in each subset. It is reported because the target variable was imbalanced, with High-HOT representing the majority class.
| Class | True positives | False negatives | False positives | Recall | Precision |
| Low-HOT | 10 | 23 | 9 | 30.30% | 52.63% |
| High-HOT | 190 | 9 | 23 | 95.48% | 89.20% |
Table 5: Corrected class-specific recall and precision in the testing subset. Corrected recall and precision values for Low-HOT and High-HOT cases in the testing subset. Recall was calculated as TP/(TP + FN), and precision was calculated as TP/(TP + FP). The corrected values show that the retained classifier recovered the High-HOT class much more effectively than the Low-HOT class in the testing subset.
Supplementary Table 1: Scoring and measurement record. Scoring and measurement record for the retained self-report survey variables. The table includes item count, response range, response anchors, scoring direction, reverse-keyed items, composite-score rule, Cronbach’s alpha, item-total checking result, and final variable label. Composite scores were calculated at the respondent level before binary recoding. The DASS-21-derived negative-emotion items were not reverse-scored in this workflow because all retained items were scored in the same direction, with higher values indicating stronger negative-emotion experience. Cronbach’s alpha was reported for retained multi-item scales only.
Supplementary Table 2: C5.0 model settings and exported outputs. Model setting and output record for the auxiliary C5.0 decision-tree workflow. The table includes the target variable, input variables, coding threshold, split rule, random seed, software version, pruning settings, class-weighting setting, misclassification-cost setting, and exported model outputs. All settings were fixed before model evaluation and preserved unchanged across the auxiliary classification workflow.
Supplementary File 1: Respondent-facing questionnaire. Full questionnaire used for survey administration, including consent text, background items, generative artificial intelligence exposure items, scale item wording, response anchors, item order, scoring direction, and final variable labels.
Supplementary File 2: Sample-screening log. Screening log documenting the transition from invited students to the final analytic sample, including returned questionnaires, exclusion categories, and retained valid cases.
Supplementary File 3: Binary-coded analysis-file structure. Coding structure for the binary analysis file used in C5.0 classification. The file documents source variables, binary variable labels, coding rules, value labels, class counts, and class percentages.
Supplementary File 4: Terminal-node and variable-importance output. Terminal node and variable importance output from the retained pruned C5.0 tree, including rule paths, node sample sizes, predicted classes, class distributions, class probabilities, and model-reported importance values.