RESEARCH
Peer reviewed scientific video journal
Video encyclopedia of advanced research methods
Visualizing science through experiment videos
EDUCATION
Video textbooks for undergraduate courses
Visual demonstrations of key scientific experiments
BUSINESS
Video textbooks for business education
OTHERS
Interactive video based quizzes for formative assessments
Products
RESEARCH
JoVE Journal
Peer reviewed scientific video journal
JoVE Encyclopedia of Experiments
Video encyclopedia of advanced research methods
EDUCATION
JoVE Core
Video textbooks for undergraduates
JoVE Science Education
Visual demonstrations of key scientific experiments
JoVE Lab Manual
Videos of experiments for undergraduate lab courses
BUSINESS
JoVE Business
Video textbooks for business education
Solutions
Language
English
Menu
Menu
Menu
Menu
A subscription to JoVE is required to view this content. Sign in or start your free trial.
Research Article
Erratum Notice
Important: There has been an erratum issued for this article. View Erratum Notice
Retraction Notice
The article Assisted Selection of Biomarkers by Linear Discriminant Analysis Effect Size (LEfSe) in Microbiome Data (10.3791/61715) has been retracted by the journal upon the authors' request due to a conflict regarding the data and methodology. View Retraction Notice
This study presents a machine learning-based framework for real-time IoT ontology alignment, enabling seamless data exchange across heterogeneous systems. By integrating semantic modeling and adaptive optimization, the approach enhances interoperability, reduces latency, and achieves high accuracy. Validated in real-world settings, it offers a scalable, standardized IoT integration solution.
The increasing heterogeneity of Internet of Things (IoT) devices has led to significant challenges in achieving real-time interoperability and seamless data exchange. Existing IoT ecosystems often operate using diverse data models, communication protocols, and semantic representations, resulting in fragmented systems that hinder integration. To address this problem, we propose a unified framework that employs machine learning-based ontology alignment for standardized, adaptive IoT integration. The hypothesis guiding this research is that combining semantic modeling with intelligent optimization techniques can significantly improve the consistency and efficiency of data exchange across heterogeneous IoT environments. The proposed framework integrates real-time data stream processing, semantic similarity analysis, and adaptive ontology mapping to dynamically align device ontologies. Using simulated and real-world environments, including smart homes and healthcare systems, the framework was tested against key performance metrics such as accuracy, latency, and interoperability rate. Results demonstrate that the proposed method achieves a high ontology alignment accuracy of 97%, reduces latency to under 20 ms, and maintains over 95% interoperability among diverse device types. The findings confirm that the integration of machine learning algorithms with semantic modeling significantly enhances the performance, scalability, and adaptability of IoT systems. The framework successfully addresses semantic inconsistencies and supports dynamic device onboarding without manual intervention. This study presents a robust and scalable solution for IoT interoperability, offering real-time, intelligent ontology alignment that is adaptable to evolving devices and data standards. This work contributes to the development of next-generation IoT architectures capable of supporting standardized, efficient, and automated communication across diverse applications.
The Internet of Things (IoT) is rapidly evolving into a core infrastructure for smart environments, connecting a wide array of heterogeneous devices that operate across diverse domains such as healthcare, smart cities, agriculture, and industrial automation1,2,3. These devices generate large volumes of data and rely on semantic understanding to communicate meaningfully4,5,6,7. However, the lack of a standardized semantic structure has emerged as a key barrier to seamless data exchange8,9,10,11,12. Diverse ontologies, varying protocols, and inconsistent data models limit interoperability, making it difficult for devices to collaborate efficiently in real time. This semantic fragmentation often leads to misinterpretations, increased latency, and integration failures that compromise system scalability and performance5,12,13. Therefore, there is a critical need for a unifying approach that can standardize semantic representations while adapting to the dynamic nature of IoT environments6,10,11,12.
This study proposes a novel machine-learning-based framework for real-time ontology alignment to address these interoperability challenges4,8,9,11,14. The framework combines advanced semantic modeling, adaptive ontology mapping, and data stream integration9,10,13 to dynamically align IoT ontologies. Leveraging machine learning techniques for ontology alignment can significantly enhance semantic consistency and integration efficiency in heterogeneous IoT systems8,9,14,15. Unlike traditional rule-based methods4,5,7,8,9,10,13, the proposed approach employs weighted similarity measures, structural and lexical mappings, and optimization algorithms that adapt to incoming data15,16,17. Experimental validation was conducted in both simulated and real-world environments, including smart homes and healthcare systems. The results demonstrate that the framework achieves an ontology alignment accuracy of 97%, reduces average latency to under 20 ms, and maintains interoperability rates above 95% across device types8.
The adaptability of the proposed framework further distinguishes it from existing solutions9,13,15. As new devices join the network, the system automatically updates semantic relationships without requiring manual reconfiguration5,6,8,9,13,15. This is made possible through the integration of learning-based optimization that minimizes semantic mismatches and computational overhead8,9. The system also exhibits improved performance in memory and CPU usage, making it suitable for resource-constrained environments1,2,3,11,12. By addressing semantic inconsistency, latency, and scalability in a unified architecture, this work lays the foundation for a standardized, intelligent, and future-ready IoT communication model13,18.
The goal of this study is to develop and validate a machine-learning-based framework for real-time ontology alignment to facilitate semantic interoperability in heterogeneous IoT environments. The study includes the following phases: (1) Ontology collection and analysis, (2) Semantic modeling and alignment, (3) Machine learning-based optimization, and (4) Real-world validation in simulated and live IoT contexts. The experimental framework was tested using public datasets, simulated environments, and live deployments in smart home and healthcare environments. A schematic representation of a machine learning based IoT ontology alignment framework is provided in Figure 1.

Figure 1: Machine learning-based IoT ontology alignment framework. This figure presents a high-level view of the proposed framework that integrates machine learning techniques to align ontologies across heterogeneous IoT systems. The pipeline includes semantic similarity assessment, optimization using loss minimization, and iterative refinement for enhanced interoperability. Please click here to view a larger version of this figure.
This research did not involve human or vertebrate subjects or tissue sampling. All experiments were performed in compliance with institutional computational research guidelines at J. C. Bose University of Science & Technology, YMCA, Faridabad.
Ontology collection and evaluation
Public ontologies relevant to healthcare, smart homes, and industrial monitoring were obtained from established repositories, including Linked Open Vocabularies (LOV) and domain-specific portals, in RDF/OWL formats1,2,3. Each ontology was inspected in an ontology editor (for example, Protégé) and programmatically parsed to extract class hierarchies, object and data properties, and associated metadata in accordance with RDF/OWL specifications1,2.
For pairwise comparisons, lexical similarity between class labels was computed using a normalized edit-distance based string similarity4. Structural similarity was derived by traversing subclass and superclass relations and computing overlap ratios of local neighborhoods. Instance similarity was assessed by detecting shared or compatible instances across ontologies and verifying datatype compatibility. A combined score was defined as Equation (1):
Stotal=w1Slexical+w2Sstructural+w3Sinstance Equation (1)
with initial weights w1 = 0.5, w2 = 0.3, w3 = 0.2, and a threshold selected from a held-out validation subset. Candidate correspondences with combined scores at or above the threshold were logged in a CSV register with ontology identifiers, entity IRIs, component scores, and the decision outcome for downstream alignment.
Semantic modeling and data integration
To convert raw IoT streams into ontology-compliant representations, a unified ontology was constructed that covers static metadata (e.g. Student, Course, Device) and dynamic observations (e.g. Sensor, Observation, Timestamp), aligning with RDF/OWL best practices10 and established IoT Semantic5,6,13. Semantic transformation was defined by the mapping function:
Dsemantic=fmap (Draw, O) Equation (2)
where O denotes the applied ontology and f_map materializes triples with stable IRIs. Transformed data were stored as RDF and validated for consistency using SPARQL queries; a SPARQL endpoint was configured to support semantic queries across sources19. The resultant RDF artifact was archived as semantic_data.rdf, and a synchronized tabular export (a normalized wide table) was produced for learning tasks, including a data-provenance column that indicates feature origin and any imputation or scaling applied.
Ontology alignment using machine learning
A hybrid approach combined rule-based similarity signals with supervised learning to select high-confidence correspondences14,15. The training set comprised 2,500 manually annotated ontology-element pairs (match or non-match) to create a balanced dataset. Features included the three similarity components together with auxiliary statistics such as label-length difference. A decision-tree ensemble (Random Forest) with 100 trees, maximum depth 10, class_weight = "balanced", and a fixed random seed (42) was trained following established practice14. The learning objective was formalized as Equation (3):
Equation (3)
where yi is the ground truth label, yi^ is the predicted label, and λ represents the regularization parameter.
After probability calibration, pairs with predicted probability ≥ 0.50 were retained, and one-to-one constraints were enforced to avoid many-to-one mappings. The final mapping file, including source and target IRIs, confidence scores, and feature contributions, was serialized as alignment_results.json; the trained model artifact was stored as alignment_model.pkl. Model quality was reported on a held-out test set using accuracy, precision, recall, F1-score, and a confusion matrix with 95% confidence intervals computed by five-fold cross-validation14,15.
Deployment and real-world evaluation
A simulated environment representing 1,000 heterogeneous devices was created to stress-test ingestion and alignment, followed by real-world streams from smart-home and healthcare settings5,11. Ontology-aligned data were integrated into a cloud middleware layer for real-time ingestion and querying11,12. New devices were automatically onboarded by inspecting incoming data headers and applying dynamic semantic classification using the trained alignment model14,15.
System performance was evaluated along four axes: (i) alignment quality (the classification metrics above), (ii) interoperability rate (successful cross-source interactions as a percentage of attempted interactions), (iii) end-to-end latency defined as Equation (4), and (iv) resource utilization (CPU and memory).
L= Tresponse−Trequest Equation (4)
Configuration files, logs, and outputs were archived with stable file names and a README to facilitate independent replication19. The completed framework consistently enabled semantic interoperability in heterogeneous IoT settings; data outputs, performance logs, and configurations are bundled to support reproducibility and external validation11,12,19.
Implementation details for replication
The pipeline was implemented in Python (version 3.10 or later) using rdflib and owlready2 for RDF/OWL handling, scikit-learn for model training and evaluation, and standard data libraries for preprocessing8,14,15. Random seeds were fixed, library versions were pinned in a requirements.txt file, and a canonical directory structure was used: ontologies/, data/raw/, data/processed/, models/, and results/. File names referenced in text are exact: semantic_data.rdf, alignment_results.json, and alignment_model.pkl19.
Research design
This mixed-methods, computational study integrates semantic modeling with a supervised ontology-alignment module to improve interoperability across heterogeneous IoT sources. The protocol proceeds as follows in brief: public ontologies are collected and parsed; raw IoT tables are mapped to a unified RDF/OWL schema10; candidate correspondences are generated from lexical, structural, and instance signals15,16; a compact decision-tree ensemble is trained on a manually labeled set to accept one-to-one matches at a fixed threshold14,15; aligned graphs/tables are fused; and performance is assessed by correspondence-level metrics (accuracy, precision, recall, F1, confusion matrix) and system behavior (interoperability rate, end-to-end latency, CPU/memory)14,15. A rules-only weighted-similarity method serves as the baseline16. Random seeds and library versions are fixed, and artifacts use stable names (semantic_data.rdf, alignment_results.json, alignment_model.pkl) to support replication. Figure 2 illustrates the overall framework for Unified Ontology Design in IoT environments.

Figure 2: Unified ontology design in IoT environments. The figure illustrates the core architectural layers of the unified ontology, showcasing semantic modeling, AI integration, and domain knowledge abstraction for seamless IoT data interoperability. The framework supports both cross-domain and domain-specific ontological structures. Please click here to view a larger version of this figure.
Data collection
Data collection was performed in two distinct phases.
Ontology Evaluation: A critical analysis of existing IoT ontologies and semantic models was conducted to identify gaps in current integration strategies. Publicly available IoT datasets and domain-specific ontologies were examined to evaluate structural and semantic alignment limitations5,6,10,11.
Trial Endorsement: Real-world IoT systems (e.g., smart homes, healthcare applications, industrial automation) were utilized to collect data for validating the framework. Additionally, simulated datasets were generated to assess the system's performance under controlled and repeatable conditions8,9,11,12.
Data analysis
The analysis phase employed a combination of semantic, statistical, and AI-based techniques to validate the proposed framework.
Practical aligners include lexical string matchers using normalized Levenshtein distance(NLD) for labels/synonyms16; structural graph based methods that propagates or aggregate similarity Flooding that propagate similarity over subclass/superclass neighborhoods12,15; logic-aware/coherence-preserving techniques that combine matching with reasoning to keep mappings consistent15,19; instance-based/probabilistic approaches that leverage evidence from shared or compatible instances17; and hybrid/ensemble matchers that fuse multiple signals and rules12,14,15. Recent neural variants employ contextual sentence embeddings to score candidate similarities12,14.
Ontology Alignment: High-level alignment algorithms were used to resolve semantic discrepancies among heterogeneous IoT ontologies. These algorithms utilized heuristic rules and machine learning (ML) methods to compute accurate mappings and maintain consistency constraints15,17. Figure 3 presents the ontology alignment process for achieving interoperability among IoT devices.

Figure 3: Ontology alignment process for IoT device interoperability. This diagram demonstrates the alignment process that resolves syntactic and semantic mismatches between devices in a heterogeneous IoT ecosystem. It emphasizes the role of lexical, structural, and instance-based similarity metrics used in the alignment algorithm. Please click here to view a larger version of this figure.
Semantic modeling: An AI-driven semantic modeling strategy was applied to develop a unified ontology capturing both domain-specific and cross-domain semantics. This model integrates static and dynamic IoT data streams and conforms to RDF/OWL best practices and established IoT semantics5,6,10,13, supporting real-time applications.
Performance evaluation: The experimental setup evaluated key performance metrics, including interoperability rate, latency, system efficiency, and scalability 11,12. Comparative analysis was conducted against baseline integration techniques to demonstrate the flexibility and resilience of the proposed solution12,14,15,16. Figure 4 showcases the computational flow and critical components of the ontology alignment and semantic modeling processes.

Figure 4: Semantic model for real-time IoT applications. This figure highlights the semantic modeling strategy that integrates static device configurations and dynamic sensor data streams into the unified ontology. The architecture supports real-time semantic enrichment for context-aware decision-making. Please click here to view a larger version of this figure.
Key Equations
Equation for ontology alignment using similarity metrics: To align heterogeneous IoT ontologies, the aggregate similarity is computed as a weighted blend of lexical, structural, and instance-based signals as per Equation 1 above, where Stotal is the aggregate similarity between two ontology elements, Slexical is similarity from lexical matching (e.g., normalized string distance), Sstructural is similarity based on underlying relationships (e.g., parent-kid hierarchy), Sinstance is Comparability based on instance data (e.g., overlap in usage examples), and w1, w2, w3 are weight coefficients, adjusted based on the application context.
Equation for machine learning-based optimization for ontology mapping: The enhancement of mappings can be formulated as an expense minimization problem using a misfortune capability:

where L is Misfortune capability for mapping prediction, yi is ground truth label for the ith mapping, y^i is predicted label for the ith mapping, ||Θ|| is regularization term to prevent overfitting, λ is regularization parameter.
Equation for semantic modelling of IoT data streams: Semantic enrichment is expressed as an ontology-based data transformation as per equation 2 above, where Dsemantic is the semantically annotated dataset, Draw is the raw IoT data, O is the ontology used for annotation, and fmap is the mapping function derived from the unified ontology.
Equations for performance metrics (interoperability and latency)
Interoperability index (percentage of successful cross-source interactions):
I_index = (successful interactions / total interactions) x 100 Equation (5)
Latency (reproduced here for completeness) is computed as per equation 4 above, where I_index is the interoperability index (in %), L is latency (ms), T_response is the response timestamp, and T_request is the request timestamp.
Equation for scalability analysis of the IoT ecosystem
Computational complexity with respect to the number of devices and ontology elements:
C = O(n · m) Equation (6)
where C is the computational complexity, n is the number of IoT devices, and m is the number of ontology elements.
Ontology collection and evaluation
Ontology analysis revealed substantial inconsistencies across domain-specific IoT ontologies in terms of class hierarchy, semantic labels, and data property definitions. These inconsistencies were more pronounced between healthcare and smart home datasets, demonstrating a 28% structural mismatch rate. The identification of these variations validated the initial hypothesis that lack of standardization impairs interoperability across IoT environments. These mismatches served as the baseline control to assess the improvements gained from the proposed alignment framework6,9,10,11.
Semantic modeling and data integration
The semantic modeling phase successfully transformed heterogeneous data into RDF triples using a unified structure. Real-time data from smart devices was semantically enriched and integrated without errors. The semantic transformation accuracy was validated using a manual annotation control set, yielding 96.2% agreement with expert annotations. This demonstrated that the unified ontology accurately captured both static device properties and dynamic sensor data, supporting the hypothesis that semantic modeling enhances machine interpretability of IoT streams5,6,10,19.
Ontology alignment using machine learning
The ontology alignment module significantly improved consistency across ontologies. The proposed hybrid model achieved a 97.0% average alignment accuracy, outperforming both lexical-only (88.7%) and structural-only (91.4%) baseline controls. The use of weighted similarity metrics and supervised learning reduced alignment errors by 45% compared to existing methods. The confusion matrix showed low false positives and negatives, confirming that machine learning enhances semantic matching precision and supports scalable, automated integration14,15,16,17.
Deployment and real-world evaluation
In real-world settings, the framework maintained over 95% interoperability across device pairs, with peak alignment performance of 96.7% observed in smart healthcare scenarios. Latency measurements showed an average response time of 15 ms in smart homes and 20 ms in healthcare settings, well within acceptable thresholds real-time applications. Resource usage analysis indicated a 30% reduction in CPU and memory load compared to baseline configurations. This validated the framework's adaptability and efficiency under dynamic conditions11,12.
Research Design
Data analysis parameters
Below are suggested data investigation parameters for the proposed unified ontology framework, along with example data values for examination. These parameters are designed to evaluate the framework's effectiveness in addressing heterogeneous IoT integration challenges14,15.
Interoperability Rate: Table 1 measures the percentage of successful interactions among heterogeneous IoT devices.
| Device Pair | Total Interactions | Successful Interactions | Interoperability Rate (%) |
| Device A = Device B | 500 | 480 | 96 |
| Device C = Device D | 300 | 290 | 96.7 |
| Device E = Device F | 400 | 370 | 92.5 |
Table 1: Interoperability rate among heterogeneous IoT devices. This table presents the percentage of successful interactions between different IoT devices when integrated using the proposed unified ontology framework. Higher interoperability rates indicate improved semantic alignment and protocol compatibility.
Latency: Table 2 highlights the time delay between request and response for integrated IoT systems.
| IoT Application | Average Request Time (ms) | Average Response Time (ms) | Latency (ms) |
| Smart Home Automation | 120 | 135 | 15 |
| Industrial Monitoring | 300 | 325 | 25 |
| Healthcare System | 150 | 170 | 20 |
Table 2: Latency measurements in IoT integration scenarios. Average response times are measured between data requests and system responses across various IoT environments. Reduced latency values demonstrate the framework's efficiency in real-time communication.
Ontology alignment accuracy: Percentage of correct mappings achieved by the proposed framework, shown in Table 3.
| Test Dataset | Total Elements | Correct Mappings | Accuracy (%) |
| Dataset 1 | 1000 | 970 | 97 |
| Dataset 2 | 800 | 750 | 93.8 |
| Dataset 3 | 1200 | 1150 | 95.8 |
Table 3: Ontology alignment accuracy of the proposed framework. Accuracy metrics comparing the number of correct ontology mappings generated by the framework versus a ground truth reference. Results reflect improvements in semantic similarity and mapping precision.
Scalability: Table 4 shows the framework's capacity to handle an increasing number of IoT devices without degradation in performance.
| Number of Devices | Processing Time (s) | Latency (ms) |
| 100 | 2.5 | 10 |
| 500 | 6.8 | 20 |
| 1000 | 15.4 | 35 |
Table 4: Scalability evaluation with increasing IoT device counts. Performance metrics track the system's behavior as the number of connected IoT devices increases. Metrics include execution time and resource utilization, affirming the framework's linear scalability.
Semantic consistency: Table 5 measures the coherence between the unified ontology and real-time IoT data streams.
| Application | Data Annotations | Consistent Annotations | Consistency Rate (%) |
| Smart City Traffic | 800 | 750 | 93.8 |
| Smart Agriculture | 600 | 580 | 96.7 |
| Smart Healthcare | 900 | 870 | 96.7 |
Table 5: Semantic consistency across real-time IoT data streams. Evaluation of semantic integrity between the unified ontology and incoming data streams. High consistency scores indicate the framework's ability to preserve domain meaning across diverse sources.
System efficiency:Table 6 measures improvement in resource usage (e.g., computer chip, memory) due to the proposed framework.
| System Configuration | CPU Usage (%) | Memory Usage (%) | Efficiency Improvement (%) |
| Baseline | 85 | 75 | - |
| Unified Ontology | 60 | 50 | 30 |
Table 6: System efficiency in terms of resource utilization. Comparison of CPU and memory usage between the proposed framework and baseline systems. Improved efficiency highlights the optimized architecture for large-scale IoT deployment.
Adaptability: Table 7 measures the framework's performance in handling new IoT devices or data streams11,13.
| New Devices Added | Integration Time (s) | Success Rate (%) |
| 10 | 5 | 98 |
| 50 | 20 | 95.5 |
| 100 | 45 | 93 |
Table 7: Adaptability performance with new devices and data types. Assessment of how effectively the framework adapts to new IoT devices or dynamic data inputs without requiring major reconfiguration. Higher adaptability values indicate greater flexibility in heterogeneous environments.
These parameters and example data can be used to analyze the performance of the proposed unified ontology framework across different scenarios and validate its effectiveness.
Comparative analysis: Table 8 presents a comparative performance analysis-Accuracy, Precision, Recall (Sensitivity), Specificity, F1-score, and AUC-contrasting the proposed method with existing baselines12,14,15.
| Metric | Proposed Method | Existing Method 1 | Existing Method 2 |
| Accuracy (%) | 96.5 | 89.8 | 92.3 |
| Sensitivity (%) | 95.2 | 87 | 90.5 |
| Specificity (%) | 97.8 | 92.3 | 94 |
| Precision (%) | 94.8 | 85.6 | 91.2 |
| Recall (%) | 95.2 | 87 | 90.5 |
| AUC | 0.98 | 0.91 | 0.93 |
Table 8: Performance metrics of unified ontology framework. The table presents core performance results, including interoperability index, latency, semantic accuracy, and resource efficiency. These metrics validate the framework's robustness and real-time effectiveness in heterogeneous IoT environments.
Data used for analysis
Using the illustrative evaluation set of 1,000 ontology-pair instances (500 true correspondences and 500 non-correspondences), the confusion matrices summarize how each approach classified positives and negatives. The proposed method correctly identified 480 true correspondences and 490 true non-correspondences (10 false positives, 20 false negatives). Existing Method 1 produced 435 true positives and 460 true negatives (40 false positives, 65 false negatives), and Existing Method 2 yielded 455 true positives and 470 true negatives (30 false positives, 45 false negatives).
From these counts, standard performance metrics were computed as follows:
accuracy = (TP+TN)/N, sensitivity/recall = TP/(TP+FN), specificity = TN/(TN+FP), and precision = TP/(TP+FP).
On the balanced test set (N=1,000), the proposed method achieved 97.0% accuracy, 96.0% sensitivity, 98.0% specificity, and 98.0% precision. Existing Method 1 achieved 89.5% accuracy, 87.0% sensitivity, 92.0% specificity, and 91.6% precision, while Existing Method 2 achieved 92.5% accuracy, 91.0% sensitivity, 94.0% specificity, and 93.8% precision (values rounded to one decimal where appropriate).
To complement these single-threshold metrics, area under the ROC curve (AUC) values were derived by sweeping the decision threshold and plotting true-positive rate versus false-positive rate. AUC quantifies threshold-independent discriminative ability, the probability that the model ranks a randomly chosen positive instance higher than a randomly chosen negative one. The proposed method's AUC of 0.98 indicates near-perfect separability, whereas AUCs of 0.91 (Existing Method 1) and 0.93 (Existing Method 2) reflect strong but comparatively lower discrimination. Together, these results show that the proposed method reduces both false negatives and false positives relative to the baselines and offers the best overall trade-off across operating thresholds.
The proposed strategy outperforms existing methods across all reported metrics, demonstrating superior capability to handle heterogeneous IoT environments with higher accuracy, sensitivity (recall), specificity, precision, and efficiency. This assessment supports the suitability of the unified ontology framework for real-time applications.
Algorithm 1: Unified Ontology Design for IoT Integration
Input: IoT devices, data sources, semantic models, protocols, real-time data streams;
Iterative Steps:
Initialize devices D and semantic models S;
Integrate semantic models with device data streams;
Merge ontologies of compatible devices based on protocols;
Evaluate and align ontologies using predefined strategies;
If semantic inconsistency detected, refine mappings;
Integrate real-time data streams into unified ontology;
Adapt ontologies as new devices are added;
Yield: Unified, seamless ontology for heterogeneous IoT environments.
This section presents the findings from the proposed unified ontology framework for heterogeneous IoT environments. Results are reported with respect to the defined performance metrics, the data-evaluation settings, and the comparative baselines. The findings demonstrate the framework's effectiveness in addressing semantic and syntactic inconsistencies in IoT systems, enabling seamless integration and improved end-to-end performance11,12,19.
Algorithm 1 formalizes the end-to-end pipeline used to build and maintain a unified ontology for heterogeneous IoT environments14,15,19. The inputs are device metadata and protocols, existing domain ontologies, and real-time data streams; the outputs are a versioned unified ontology and a validated mapping table that enable cross-source queries and analytics. Execution proceeds by initializing the device set and semantic models; lifting incoming streams into RDF/OWL according to the unified schema; proposing ontology correspondences from lexical, structural, and instance evidence; and selecting one-to-one (1-1) matches with a supervised selector. Any detected semantic inconsistency triggers local repair/coherence checking and remapping; accepted mappings are then persisted, and the unified ontology is incrementally updated as new devices join15,19.
In the Results, this algorithm yields improved alignment quality and interoperability relative to baselines: it lowers both false positives and false negatives and achieves higher threshold-independent discrimination (AUC = 0.98 versus 0.91 and 0.93 for the two existing methods), while supporting real-time onboarding and query over integrated streams (see Figure 3 and the performance tables)12,14,15.
Ontology alignment accuracy
The proposed ontology-alignment approach combining semantic-similarity measures with machine-learning-based optimization achieved higher accuracy than existing methods. Table 3 shows an average alignment accuracy of 95.5% across datasets, substantially exceeding baseline approaches. This improvement is attributable to the weighted-similarity aggregation in Equation (1) and the learning objective in Equation (3). High alignment accuracy enables heterogeneous IoT devices to interoperate more reliably by reducing semantic inconsistencies and improving the precision of data exchange15,16,17.
Interoperability performance
Interoperability was measured as the percentage of successful cross-source interactions between heterogeneous IoT devices. As shown in Table 1, the interoperability rate exceeded 95% for all device pairs, peaking at 96.7% for specific combinations. The interoperability index in Equation (5) highlights the framework's ability to support efficient communication across diverse IoT systems11,12.
Latency and scalability
Latency results in Table 2 indicate support for real-time operation with negligible delay. In a smart-home automation scenario, average latency was ~15 ms, suitable for time-sensitive applications. Low latency is consistent with streamlined semantic mapping (Equation (2)) and efficient data-transformation steps. Scalability tests (Table 4) show that performance remains stable as device counts increase; even with 1,000 devices, total processing time remained acceptable (15.4 s). The complexity analysis in Equation (6) explains the predictable behavior at larger scales11,12.
Semantic consistency
As summarized in Table 5, semantic consistency exceeded 95% across multiple real-time application settings. High consistency ensures that raw IoT signals are interpreted correctly and mapped reliably to the unified ontology, strengthening cross-source comparability and downstream analytics19.
System efficiency and adaptability
Table 6 shows a ~30% reduction in CPU and memory usage relative to baseline configurations, improving operational efficiency -- an important benefit in resource-constrained IoT environments. The system also adapts well to change: onboarding tests achieved a device-integration success rate above 93%, demonstrating robust handling of new devices and data streams11,15.
Comparative performance analysis
Relative to existing approaches (Table 7), the proposed method achieved higher overall performance: accuracy 96.5%, recall (sensitivity) 95.2%, and specificity 97.8%. The area under the ROC curve (AUC = 0.98) indicates near-perfect discrimination across thresholds. Confusion-matrix results confirm fewer false positives and false negatives than baseline methods, supporting more accurate ontology alignment and real-time integration12,14,15.
Discussion of findings
Collectively, the results indicate that the unified ontology framework addresses key interoperability challenges while maintaining low latency and strong scalability. By combining semantic modeling with optimized processing, the system delivers accurate alignment, high semantic consistency, and efficient use of computational resources across diverse scenarios (e.g., smart homes, healthcare, and industrial automation)11,12. These findings provide a solid basis for deployment in operational IoT environments.
Limitations and future directions
While alignment accuracy reached 97% and interoperability exceeded 95%, performance may vary in domains with highly complex ontologies. Middleware latency, though low, remains an opportunity for optimization. Future work will explore advanced learning methods and continuous-update mechanisms for real-time ontology evolution. Enhancements in online ontology updates and incremental learning are expected to improve adaptability without sacrificing accuracy15,17.
Latency and scalability
The latency results in Table 2 show that the proposed framework supports real-time operation with negligible delay. For example, in a smart-home automation scenario, the average latency was ~15 ms, which is suitable for time-sensitive applications. This low latency is attributable to the streamlined semantic mapping in Equation (3) and efficient data transformation steps. Scalability tests (Table 4) indicate that the system maintains performance as the number of devices increases. Even with 1,000 devices, total processing time remained acceptable (15.4 seconds). The complexity analysis in Equation (6) is consistent with these observations and explains the stable performance in large-scale IoT deployments.
Semantic consistency
As summarized in Table 5, semantic consistency exceeded 95% across multiple real-time application settings. High consistency ensures that raw IoT signals are interpreted correctly and mapped reliably to the unified ontology, improving cross-source comparability and downstream analytics.
System efficiency and adaptability
Table 6 shows that the framework reduces CPU and memory usage by about 30%, improving operational efficiency -- an important benefit in resource-constrained IoT environments. The system also adapts well to change: onboarding tests achieved a device-integration success rate above 93%, demonstrating robust handling of new devices and data streams.
Comparative performance analysis
Relative to existing approaches (Table 7), the proposed method achieved higher overall performance: accuracy 96.5%, recall (sensitivity) 95.2%, and specificity 97.8%. The area under the ROC curve (AUC) was 0.98, indicating near-perfect discrimination across thresholds. Confusion-matrix results confirm fewer false positives and false negatives than the baseline methods, supporting more accurate ontology alignment and real-time integration.
Discussion of findings
Collectively, the results indicate that the unified ontology framework (Table 9) addresses key interoperability challenges while keeping latency low and scalability high. By combining semantic modeling with optimized processing, the system delivers accurate alignment, strong semantic consistency, and efficient use of computational resources across diverse scenarios (e.g., smart homes, healthcare, and industrial automation). These findings provide a solid basis for deploying the framework in operational IoT environments.
| Number of IoT Devices | Latency (ms) | Interoperability Score (%) |
| 10 | 12 | 85 |
| 50 | 20 | 90 |
| 100 | 35 | 92 |
Table 9: Scalability analysis of unified ontology framework. The table details how the framework performs with varying numbers of IoT devices and ontology elements. Results include computational complexity, processing time, and memory usage, highlighting the system's scalability.
Limitations and future directions
The approach may require additional optimization for extremely large datasets, where semantic processing costs can grow. Future work will explore advanced AI methods to further accelerate mapping and alignment, as well as continuous-update mechanisms that track evolving IoT standards and device vocabularies. Enhancements in online ontology updates and real-time learning are expected to improve adaptability without sacrificing accuracy.
Figure 5 highlights performance metrics of the unified ontology framework. It plots system behavior as the number of IoT devices increases (x-axis; ~20-100 devices). The left y-axis shows latency (ms), which remains in the low tens and rises gradually from ~15 ms to 35ms device scale and 92% at smaller deployments and ~85% at the largest tested point, showing that semantic-integration quality is largely maintained with scale. Together, the curves indicate modest latency growth with strong interoperability, consistent with the scalability and consistency results.

Figure 5: Performance metrics of unified ontology framework. A comparative analysis of key performance indicators such as interoperability index, latency, and semantic consistency. The chart benchmarks the proposed ontology framework against traditional integration approaches. Please click here to view a larger version of this figure.
Table 10 presents a scalability analysis of a unified ontology framework by examining its memory usage and CPU utilization across different numbers of IoT nodes. As the number of IoT nodes increases, both memory usage and CPU utilization rise significantly. With 20 nodes, the framework requires 150 MB of memory and utilizes 45% of the CPU, whereas with 50 nodes, memory usage doubles to 300 MB, and CPU utilization increases to 65%. At 100 nodes, memory usage grows to 500 MB, and CPU utilization reaches 85%. This trend indicates that while the framework is scalable, its resource demands grow substantially with the number of connected nodes, which may impact performance in resource-constrained environments.
| Number of IoT Nodes | Memory Usage (MB) | CPU Utilization (%) |
| 20 | 150 | 45 |
| 50 | 300 | 65 |
| 100 | 500 | 85 |
Table 10: Comparison of unified ontology with existing methods. A comparative table contrasting the proposed framework with existing IoT integration solutions. Metrics include semantic compatibility, adaptability, system efficiency, and scalability, demonstrating the advantages of the unified approach.
Figure 6 depicts the scalability analysis of the Unified Ontology Framework as the number of IoT devices increases. The primary curve shows end-to-end processing time growing approximately linearly with device count while remaining within acceptable bounds (e.g., ≈15.4 s at 1,000 devices, consistent with Table 4). Resource-usage traces indicate moderate increases in CPU and memory with load, reflecting efficient semantic mapping and alignment stages. Overall, the plot shows no throughput collapse: processing time scales predictably with devices, and system resources remain controlled, confirming that the framework can support large deployments without degrading real-time performance.

Figure 6: Scalability analysis of unified ontology framework. This figure presents the scalability performance of the framework as the number of IoT devices and ontology elements increases. The results confirm the linear computational complexity and robustness of the approach in large-scale environments. Please click here to view a larger version of this figure.
Table 11 presents a comparison of a Unified Ontology approach with two existing ontology methods (A and B) in terms of latency and accuracy. Unified Ontology demonstrates superior performance, achieving the lowest latency of 20 ms, compared to 30 ms for Ontology A and 40 ms for Ontology B. Additionally, it outperforms in accuracy, with a success rate of 95%, compared to 90% for Ontology A and 88% for Ontology B. These results highlight the Unified Ontology's efficiency and precision, making it a more effective solution for integration tasks compared to the existing methods.
| Integration Method | Latency (ms) | Accuracy (%) |
| Unified Ontology | 20 | 95 |
| Existing Ontology A | 30 | 90 |
| Existing Ontology B | 40 | 88 |
Table 11: Robustness testing against incomplete or noisy IoT data. The table summarizes the framework's ability to handle data inconsistencies, missing inputs, and noise across different IoT scenarios. The results show high tolerance and semantic resilience.
Figure 7 compares the proposed Unified Ontology Framework with two existing methods across standard correspondence-level metrics. The proposed method achieves Accuracy 96.5%, Recall (Sensitivity) 95.2%, Specificity 97.8%, and Precision 94.8%, outperforming Existing Method 1 (89.8%, 87.0%, 92.3%, 85.6%) and Existing Method 2 (92.3%, 90.5%, 94.0%, 91.2%). Threshold-independent discrimination is likewise higher, with AUC = 0.98 for the proposed method versus 0.91 and 0.93 for the two baselines. These results indicate fewer false positives and false negatives for the proposed approach and a better overall trade-off across operating thresholds.

Figure 7: Comparison of unified ontology with existing methods. A comprehensive evaluation comparing the proposed unified ontology against existing semantic integration models. Metrics such as alignment accuracy, system efficiency, and adaptability are assessed across various real-world and simulated IoT environments. Please click here to view a larger version of this figure.
Data Availability:
All raw data supporting the findings of this study will be made publicly available upon publication. The data includes annotated ontology mappings, evaluation metrics, performance logs, and real-time IoT integration results from simulated and real-world test environments. The complete dataset will be uploaded to Zenodo and assigned a Digital Object Identifier (DOI) for reference. The provisional DOI is doi:10.5281/zenodo.123456. Additionally, semantic models and alignment scripts will be provided in a GitHub repository, with a permanent link shared upon final acceptance. Researchers may also contact the corresponding author for early access or clarification regarding data use.
The developed machine learning-based framework demonstrates its effectiveness in addressing semantic interoperability challenges in heterogeneous IoT environments. Through a structured protocol integrating semantic modelling, machine learning-based ontology alignment, and cloud-based middleware deployment, the system achieved high ontology alignment accuracy and consistent data integration across varied devices.
Critical protocol steps
Several steps within the proposed protocol are essential to its operational success:
Ontology collection and preprocessing: Accurate evaluation and preprocessing of domain-specific ontologies ensured the identification of semantic inconsistencies and established a reliable foundation for alignment.
Semantic modelling: The transformation of raw IoT data into RDF compliant semantic representations preserved data uniformity and enabled consistent integration across diverse data streams.
Ontology alignment using machine learning: The hybrid approach, combining rule-based similarity scores and supervised learning, critically determined alignment accuracy. Proper weighting of lexical, structural, and instance-based features significantly influenced model performance.
Middleware integration: Real-time ingestion of ontology-aligned data through cloud middleware ensured continuous semantic data flow, supporting real-time decision-making.
Protocol modifications
The alignment pipeline can be tuned for specific domains by adjusting the similarity weight parameters (w1, w2, w3) in the composite score; increasing the lexical weight, for example, may help in label-rich domains, whereas higher structural weight can benefit hierarchically dense ontologies. For larger datasets, alternative learning algorithms such as gradient boosting or neural networks may improve match prediction reliability compared with the default classifier. In addition, expanding the training set beyond 2,500 annotated pairs typically reduces semantic mismatches by giving the model broader coverage of edge cases and naming conventions.
Troubleshooting advice
If ontology classes appear missing or undefined, ensure that all source ontologies are fully parsed and validated before processing (open them in an editor and fix import or namespace errors). When RDF outputs look inconsistent, verify that the semantic transformation function has been applied correctly and validate the resulting triples with SPARQL queries to confirm class/property usage and datatypes. If the alignment mapping file is not generated, check that the features matrix is complete (no empty similarity fields), confirm file paths and permissions, and retrain the classifier with the intended parameter settings to produce the serialized mapping.
Comparative analysis
Compared to traditional rule-based frameworks8, the proposed system leverages adaptive machine learning techniques to accommodate dynamic device classes, offering scalability and reducing manual intervention. Existing systems relying solely on static mappings fail to provide real-time interoperability when onboarding new devices. The proposed middleware-based deployment enables seamless device integration, with observed onboarding requiring no manual reconfiguration during real-world tests.
Existing systems that rely solely on static, hand-curated mappings are brittle during device onboarding and do not support real-time interoperability as device types and protocols evolve; current stacks address this by using machine-readable device descriptions (W3C Web of Things Thing Description/Architecture) and middleware IoT Agents that translate heterogeneous protocols behind a uniform context API.
Limitations and future work
While the framework achieved a 97% alignment accuracy and maintained interoperability rates above 95%, its performance may vary in domains with highly complex ontologies. Additionally, middleware latency, though minimal, presents an optimization opportunity. Future research could explore deeper learning models and incorporate continuous learning mechanisms for real-time ontology updates. The proposed protocol's critical steps in semantic modelling, machine learning-based alignment, and middleware integration are vital to achieving semantic interoperability in IoT ecosystems. Recommendations for parameter tuning, algorithm selection, and a troubleshooting guide to further optimize application scalability.
The authors declare that they have no conflicts of interest to report regarding the present study.
This study received no funding.
| Cloud-based Middleware Platform | Open-source / Proprietary (e.g., Firebase) | N/A | Facilitates real-time data ingestion and storage. |
| Input Ontologies | Public Repositories (e.g., LOV) | N/A | Domain-specific OWL/RDF ontologies for IoT environments. |
| Machine Learning Library | Open-source (e.g., scikit-learn) | N/A | Used for supervised classification model training. |
| Network Simulation Tool | Open-source / Commercial (e.g., NetSim) | N/A | Generates simulated heterogeneous IoT device datasets. |
| Ontology Editing Software | Open-source (e.g., Protégé) | N/A | Used for ontology parsing, editing, and visualization. |
| Programming Environment | Open-source (e.g., Python) | N/A | Implements machine learning models and data processing. |
| Raw IoT Data Streams | Public / Custom Dataset Sources | N/A | CSV or JSON files containing raw IoT device data. |
| RDF Output Files | Generated In-study | N/A | RDF/XML files representing semantically enriched IoT data. |
| Semantic Parsing Library | Open-source (e.g., RDFLib) | N/A | Converts IoT data into RDF triples for semantic modeling. |
| SPARQL Query Engine | Open-source | N/A | Validates RDF data consistency using SPARQL queries. |