$$\rightleftharpoonup{xx}$$
$$\longleftharp{xx}$$,
$$\longrightharp{xx}$$,
Experimental validation and performance analysis
Cloud-based validation
To test the efficiency and feasibility of the proposed algorithm, simulation tests were performed in a controlled network laboratory setting. The verification was conducted on the Windows operating system, and the core algorithm was implemented in Visual C++ (VC).
For the experimental data, we chose the publicly available KDDCUP_10% dataset (http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html), which is widely used in intrusion detection and network behavior modeling. The experimental procedure closely follows the approach described previously¹⁰ to ensure that the outcomes are comparable and credible.
The major algorithm parameters were set to: Time interval T = 10 s; number of sampling rounds h = 20; data samples n = 1000.
The digital characteristics of the trust cloud model (Ex, En, and He) were calculated using these parameters. The cloud similarity algorithm was then used to identify the most similar trust cloud among the candidates, which made it possible to classify and evaluate the network states.
Table 2 shows the selected system sample values and the results of the network situation analysis. These results indicate that the proposed cloud-based trust evaluation system can efficiently represent the dynamics and uncertainty of complex network settings.
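The two steps above can be sketched as follows. This is a minimal illustration, not the authors' Visual C++ implementation: it assumes the standard certainty-free backward cloud generator for estimating (Ex, En, He) and, because the similarity measure is not specified in this section, it assumes cosine similarity between the characteristic vectors. The function names and the reference cloud values are hypothetical.

```python
import math
from statistics import fmean


def backward_cloud(samples):
    """Estimate (Ex, En, He) from samples (certainty-free backward cloud generator)."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    ex = fmean(samples)
    # Entropy estimate from the first-order absolute central moment.
    en = math.sqrt(math.pi / 2) * fmean([abs(x - ex) for x in samples])
    # Hyper-entropy from the gap between the sample variance and En^2.
    variance = sum((x - ex) ** 2 for x in samples) / (n - 1)
    he = math.sqrt(abs(variance - en ** 2))
    return ex, en, he


def cloud_similarity(a, b):
    """Cosine similarity of two (Ex, En, He) vectors (assumed measure)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def most_similar(observed, candidates):
    """Return the name of the candidate cloud closest to the observed cloud."""
    return max(candidates, key=lambda name: cloud_similarity(observed, candidates[name]))


# Hypothetical reference clouds, for illustration only.
REFERENCE_CLOUDS = {
    "high": (0.80, 0.10, 0.02),
    "medium": (0.60, 0.15, 0.03),
    "low": (0.30, 0.20, 0.05),
}

if __name__ == "__main__":
    print(most_similar((0.31, 0.20, 0.05), REFERENCE_CLOUDS))
```

Cosine similarity compares only the direction of the characteristic vectors, so a measure that also accounts for their magnitude may be preferable in practice.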
The experiment confirms the possibility of implementing cloud models in conjunction with real-time trust assessment and provides a framework for further application in the adaptive security management system.
Attack verification
To verify the performance of the proposed algorithm thoroughly, this experiment evaluates the attack detection capabilities of binary classification, multi-classification, and hierarchical multi-classification (HMC) within a cloud computing environment. The experimental assessment is separated into three primary phases: verifying the AI module with DDoS attack data, evaluating the performance of various ML algorithms, and analyzing the ability of the DL models to predict attacks.
Binary classification performance verification
In the first phase of the experiment, the DDoS attack dataset was used to verify the AI module, with the main purpose of testing the prediction accuracy of the model in a cloud computing environment. We used 5-fold cross-validation with a training-to-test ratio of 8:2; that is, in each fold, 80% of the data was used for training and 20% for testing. A different test set was used in each fold so that every sample appeared in a test set exactly once. The training process lasted for 5 epochs, and the average result was reported.
The dataset is divided into two classes: normal and abnormal. To compare the performance of different classifiers, the following eight common ML classifiers were selected: decision tree (DT), random forest (RF), naive Bayes (NB), K-nearest neighbor (KNN), support vector machine with an RBF kernel (SVM-RBF), linear support vector machine (L-SVM), and the Bagging and Boosting ensemble learning algorithms. The performance comparison results are shown in Figure 6. Comparing these classifiers allows their performance in DDoS attack detection to be evaluated comprehensively²⁰,²¹.
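The splitting scheme described above can be sketched as follows. This is a minimal illustration using only the standard library; the function names and the `fit_and_score` callback are hypothetical and stand in for whatever training routine is used.

```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; every index is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin so that fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def cross_validate(fit_and_score, n_samples, k=5):
    """Average the score returned by fit_and_score(train_idx, test_idx) over k folds."""
    scores = [fit_and_score(train, test) for train, test in k_fold_splits(n_samples, k)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for train, test in k_fold_splits(1000):
        print(len(train), len(test))
```

With k = 5, each fold holds out one fifth of the samples, which gives the 8:2 training-to-test ratio.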
Multi-classification performance verification
In the second phase of the experiment, the task was extended to multi-classification, involving normal data and different types of network attacks, including DDoS, U2R (user-to-root attack), and R2L (remote-to-local attack). Multi-classification tests the model's capability to identify and distinguish multiple attack types.
Five DL classifiers were used for validation: MLP, CNN, RNN, long short-term memory (LSTM) network, and GRU network. The specific parameter settings of each model are presented in Table 1, and the classification results are reported in Table 3 and Table 4. During multi-classification validation, the precision and recall of each model were evaluated in detail across the categories.
Verification of HMC's multi-classification performance
In the third stage, the HMC algorithm was compared with all the above ML and DL models on multiclass classification tasks. By decomposing a complex multiclass problem into multiple binary classification sub-problems, the HMC algorithm significantly improves the accuracy of detecting fine-grained attacks (such as U2R and R2L). The advantage of HMC was verified by comparing its attack detection accuracy with that of traditional classification methods.
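The decomposition into binary sub-problems can be sketched as a chain of binary decisions. This is a minimal illustration under stated assumptions and not the HMC design used in the experiments: the two-level hierarchy, the class name, the feature names, and the threshold rules are all hypothetical stand-ins for trained binary classifiers.

```python
class HierarchicalClassifier:
    """Route a sample through binary decisions, from coarse to fine-grained."""

    def __init__(self, is_attack, attack_stages, fallback="other_attack"):
        self.is_attack = is_attack                # binary: normal vs. attack
        self.attack_stages = list(attack_stages)  # ordered (label, binary classifier)
        self.fallback = fallback

    def predict(self, x):
        # Level 1: coarse-grained decision.
        if not self.is_attack(x):
            return "normal"
        # Level 2: one binary sub-problem per attack type, tried in order.
        for label, classifier in self.attack_stages:
            if classifier(x):
                return label
        return self.fallback


# Toy threshold rules standing in for trained binary models (hypothetical).
def looks_like_ddos(x):
    return x["pkt_rate"] > 1000


def looks_like_u2r(x):
    return x["root_shell"] == 1


def looks_like_r2l(x):
    return x["failed_logins"] > 3


def looks_like_attack(x):
    return looks_like_ddos(x) or looks_like_u2r(x) or looks_like_r2l(x)


hmc = HierarchicalClassifier(
    looks_like_attack,
    [("ddos", looks_like_ddos), ("u2r", looks_like_u2r), ("r2l", looks_like_r2l)],
)

if __name__ == "__main__":
    print(hmc.predict({"pkt_rate": 20, "root_shell": 1, "failed_logins": 0}))
```

Because each stage is a separate binary model, a minority class such as U2R can be given its own classifier and training emphasis instead of competing with the majority classes in one multiclass model.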
Experimental results and analysis
Through the experiments in the above three stages, we obtained the performance indicators of each classifier and DL model under different attack types. Table 3 shows performance indicators such as accuracy, recall, and F1 score for the different classification methods. In the experiment, HMC showed high accuracy and robustness in the detection of multiclass attacks, especially for U2R and R2L attacks. Compared with the traditional SVM and RF methods, HMC achieved a significant improvement.
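For reference, the per-class precision, recall, and F1 score follow directly from the counts of true positives, false positives, and false negatives. The sketch below is a minimal illustration; the function name and the toy labels are hypothetical and are not experimental data.

```python
def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} computed one-vs-rest per class."""
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[label] = (precision, recall, f1)
    return report


if __name__ == "__main__":
    # Toy labels for illustration only.
    truth = ["normal", "normal", "ddos", "ddos", "u2r"]
    guess = ["normal", "ddos", "ddos", "ddos", "normal"]
    for label, scores in per_class_metrics(truth, guess).items():
        print(label, scores)
```

In the toy example, the single U2R sample is missed, so its F1 score is 0 even though overall accuracy looks acceptable, which is why minority classes are reported separately.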
Through these experimental results, we verified the effectiveness of the proposed AI module for attack detection in a cloud computing environment, and provided a reliable basis for subsequent model optimization and application deployment.
Experimental results indicate that among the ML models, Decision Tree (DT), Random Forest (RF), and ensemble methods (Bagging, Boosting) achieved superior performance, with F1-scores reaching 1.0. This validates their robustness and precision in distinguishing DDoS patterns from normal traffic. In contrast, the naive Bayes (NB) model performed poorly in abnormal packet prediction, with an F1 score of 0.62, indicating that the model has a certain risk of misclassification when facing complex attack types.
Figure 7 shows the performance of MLP, CNN, RNN, LSTM, and GRU. After parameter optimization, the binary F1 scores of the DL models ranged from 0.93 to 0.98, indicating that the DL models effectively capture deep data features, especially when processing time series data and recognizing complex patterns, and that they perform better than traditional ML models.
Comprehensive analysis shows that decision trees, ensemble learning methods, and neural network models all show excellent performance in detecting DDoS attacks, but in specific applications, the selection of a suitable model still needs to consider factors such as attack type, data volume, and computing resources. To further enhance the detection capability of the model, multiple models can be integrated in the future to achieve higher accuracy and a lower false alarm rate.
Figure 8 demonstrates the superior performance of DL models over traditional ML baselines, with F1 values between 0.96 and 0.99, particularly on unbalanced datasets. However, prediction performance for the U2R class remains poor in the fine-grained categories, and the classification performance for the cyberattack class is only 0.49. The combined results of Figure 9 and Figure 10 indicate that recognition performance for the few-sample categories (including U2R, cyberattacks, BFA, and botnets) needs to be improved.
In the third stage, 13 single classifiers, which are identical to the previous ones but concentrate on the minority class, were used to compare the performance of HMC. The AdaBoost-based HMC design outperforms bagging, according to the results. In the U2R class, AdaBoost-based HMC has an F1 score of 0.5 (the initial F1 is 0), whereas Bagging-based HMC has an F1 score of 0.67 (with 0.4 as the initial F1) for the minority class. AdaBoost-based HMC obtained an F1 score of 0.88 (original F1 was 0.71), whereas Bagging-based HMC obtained an F1 score of 0.9 (original F1 was 0) for the network attack class. These results show that ensemble learning strategies (such as AdaBoost and Bagging) significantly improve the predictive ability of multiple classifiers on minority classes.
Attack simulation case
To further verify the practicality and robustness of the proposed model in an actual network environment, this paper designed and implemented an attack simulation case and conducted a simulation experiment on the DDoS attack scenario. The simulation environment is built on a virtual cloud computing platform, using multiple virtual hosts to simulate the interaction between normal users and attackers. The simulation scenario includes a mixed network environment where normal business access and malicious traffic coexist.
In the experiment, the attacker launched UDP flood and SYN flood attacks against the target server from multiple source IPs, attempting to exhaust the target system's resources and affect the availability of normal services. The system continuously collects network traffic information and extracts the key characteristic parameters: transmission rate, session duration, port access frequency, and the number of abnormal connections.
The proposed trust evaluation and attack detection model is deployed on the monitoring node to analyze and classify real-time traffic. Using the trust cloud model and the multi-classification discrimination mechanism, the system identifies attacks in their early phases, tags suspicious traffic as low trust, and activates a response mechanism.
The simulation findings indicate that when the simulated attack traffic constitutes over 30% of the total traffic, the proposed system achieves 96% detection accuracy, a false positive rate of 3%, and a response latency of less than 2 s. This outcome indicates that the model has promising application prospects for countering distributed attacks and strengthening the security defense capabilities of the system.
Moreover, the experiment was extended to multi-round and non-continuous attacks. The model retained high detection stability, which indicates good generalization capacity under complex, dynamic network conditions. In the future, the attack types will be extended to include data injection, phishing attacks, and others, to fully test the flexibility and scalability of the model against a variety of threats.
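Detection accuracy and false positive rate are both derived from the binary confusion counts. The sketch below is a minimal illustration; the function name and the counts are hypothetical and do not reproduce the simulation data.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (accuracy, false_positive_rate) from binary confusion counts."""
    total = tp + fp + tn + fn
    if total == 0 or fp + tn == 0:
        raise ValueError("counts must include at least one normal sample")
    accuracy = (tp + tn) / total
    # Share of normal traffic that was wrongly flagged as an attack.
    false_positive_rate = fp / (fp + tn)
    return accuracy, false_positive_rate


if __name__ == "__main__":
    # Hypothetical counts: 50 attack flows and 50 normal flows.
    print(detection_metrics(tp=45, fp=3, tn=47, fn=5))
```

The false positive rate is normalized by the amount of normal traffic only, so it should be read together with the share of attack traffic in the mix.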
Table 5 reports the statistical significance of the performance improvements. It displays the results of paired t-tests that compare the baseline models with the proposed Adaptive ML-HMC-Trust framework on the main performance metrics. The table lists the means and standard deviations, t-values, p-values, and significance levels for accuracy, F1-score, minority-class detection, false-positive rate, and detection latency.
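The paired t statistic is the mean of the per-run differences divided by its standard error. The sketch below is a minimal illustration; the function name and the toy inputs are hypothetical and are not the measurements behind Table 5.

```python
import math
from statistics import mean, stdev


def paired_t(baseline, proposed):
    """Return (t, degrees_of_freedom) for paired samples, using proposed - baseline."""
    n = len(baseline)
    if n != len(proposed) or n < 2:
        raise ValueError("need two equally sized samples with at least two pairs")
    diffs = [p - b for b, p in zip(baseline, proposed)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Toy paired scores for illustration only.
    print(paired_t([1, 2, 3, 4], [2, 4, 5, 8]))
```

A negative t-value simply means the proposed model's metric is lower than the baseline, which is the desired direction for false-positive rate and latency. Obtaining the p-value additionally requires the t-distribution with n - 1 degrees of freedom; a library routine such as `scipy.stats.ttest_rel` returns both values.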

Figure 1: Methodology flow representation. Flowchart illustrating the proposed SDN-cloud framework integrating adaptive ML, hierarchical classification, and trust evaluation for real-time attack detection.

Figure 2: Cloud service architecture. The figure shows the general cloud service model applied in the research, comprising the control layer, data forwarding layer, and service layer. The architecture consists of a Ryu OpenFlow controller, Open vSwitch nodes, and virtualized cloud hosts. The connections represent real-time data flows and link-status interactions.

Figure 3: Network topology model. The figure shows the three-layer virtual network topology built in the cloud environment. It comprises host nodes, switching layers, and simulated link delays and bandwidth limits. The topology enables traffic separation, multi-path routing, and real-time redirection of attack flows.

Figure 4: HMC-based security detection architecture. The figure shows the hierarchical multi-classification structure, which combines ensemble learning, trust assessment, and multi-level threat detection. The blocks represent the classification phases, showing the flow from coarse-grained to fine-grained attack detection.

Figure 5: Cloud model-based trust evaluation process. The figure shows the six steps of the trust assessment process: normal trust cloud generation, attribute extraction, attribute cloud formation, cloud similarity calculation, trust-level classification, and dynamic trust update.

Figure 6: Machine learning performance on DDoS dataset. The figure shows how eight classical ML models perform in the binary classification of normal vs. DDoS attack traffic. The metrics are recall, precision, F1-score, and overall accuracy. Error bars reflect variability across the 5-fold cross-validation.

Figure 7: Deep learning model performance on DDoS dataset. The figure shows the binary classification performance of the MLP, CNN, RNN, LSTM, and GRU models. Measurements indicate model performance over a series of training cycles.

Figure 8: HMC vs. single machine learning classifier performance. The figure compares hierarchical multi-classification with traditional single classifiers on minority attacks such as U2R and R2L. F1-scores are presented with error bars that indicate the variation between repeated experiments.

Figure 9: HMC vs. deep learning classifier performance. The figure shows the improvement in multiclass detection obtained by applying HMC to the DL models. Minority-class performance is highlighted and is significantly improved compared with single DL models.

Figure 10: DDoS attack simulation results. The figure shows the real-time monitoring output of the attack simulation experiment, including the traffic rate, the number of abnormal connections, the response time of the detection method, and the system classification output. The scale bars indicate the time (in seconds) and traffic volume.
| Model | Learning Rate | Batch Size | Epochs | Activation Function |
| MLP | 0.001 | 64 | 30 | ReLU |
| CNN | 0.0005 | 32 | 50 | LeakyReLU |
| RNN | 0.001 | 64 | 40 | Tanh |
| LSTM | 0.0001 | 128 | 60 | Sigmoid |
| GRU | 0.001 | 64 | 45 | ReLU |
Table 1: Deep learning model parameter settings. This table lists the hyperparameters of the deep learning experiments: the learning rate, the batch size, the number of epochs, and the activation function.
| Sample ID | Sampling Time (seconds) | Trust Degree (Ex) | Entropy (En) | Hyper-Entropy (He) | Similarity Score | Trust Level |
| 1 | 10 | 0.75 | 0.65 | 0.8 | 0.85 | High |
| 2 | 20 | 0.8 | 0.6 | 0.75 | 0.82 | High |
| 3 | 30 | 0.68 | 0.7 | 0.85 | 0.8 | Medium |
| 4 | 40 | 0.6 | 0.72 | 0.9 | 0.78 | Medium |
| 5 | 50 | 0.5 | 0.8 | 0.95 | 0.7 | Low |
| 6 | 60 | 0.45 | 0.85 | 0.96 | 0.65 | Low |
Table 2: System sample values and network situation analysis. This table lists sample values from the cloud environment: the sampling time, the cloud digital characteristics (Ex, En, He), the similarity score, and the resulting trust level.
| Classifier | Accuracy | Precision | Recall | F1 Score |
| Decision Tree (DT) | 85.20% | 84.30% | 86.10% | 85.20% |
| Random Forest (RF) | 90.10% | 89.30% | 91.00% | 90.10% |
| Naive Bayes (NB) | 82.50% | 81.70% | 83.40% | 82.50% |
| K-Nearest Neighbors (KNN) | 87.40% | 86.80% | 88.10% | 87.40% |
| SVM-RBF | 88.90% | 88.10% | 89.50% | 88.80% |
| Linear SVM (L-SVM) | 87.80% | 87.20% | 88.50% | 87.80% |
| Bagging | 91.20% | 90.50% | 91.70% | 91.10% |
| Boosting | 92.30% | 91.90% | 92.60% | 92.20% |
Table 3: Machine learning classifier performance comparison. The table presents the recall, precision, accuracy, and F1-scores for all ML models tested.
| Model | Accuracy | Precision | Recall | F1 Score |
| MLP | 89.50% | 88.70% | 90.30% | 89.50% |
| CNN | 91.20% | 90.70% | 91.50% | 91.10% |
| RNN | 88.30% | 87.60% | 88.80% | 88.20% |
| LSTM | 92.10% | 91.80% | 92.40% | 92.10% |
| GRU | 91.80% | 91.40% | 92.10% | 91.70% |
Table 4: Deep learning classifier performance comparison. This table presents the performance metrics of the MLP, CNN, RNN, LSTM, and GRU models for multiclass detection.
| Performance Metric | Baseline Mean (SD) | Proposed Model Mean (SD) | t-value | p-value | Significance |
| Accuracy | 0.89 (0.04) | 0.96 (0.02) | 8.72 | <0.001 | Significant |
| F1-Score | 0.84 (0.05) | 0.94 (0.03) | 9.15 | <0.001 | Significant |
| Minority-Class Detection (U2R/R2L) | 0.52 (0.08) | 0.81 (0.06) | 10.44 | <0.001 | Significant |
| False-Positive Rate | 0.11 (0.03) | 0.04 (0.02) | –7.98 | <0.001 | Significant |
| Detection Latency (seconds) | 3.10 (0.41) | 1.82 (0.33) | –9.27 | <0.001 | Significant |
Table 5: Statistical significance of performance improvements. This table displays the results of paired t-tests that compare the baseline models with the proposed Adaptive ML-HMC-Trust framework on the main performance metrics. The table lists the means and standard deviations, t-values, p-values, and significance levels for accuracy, F1-score, minority-class detection, false-positive rate, and detection latency.