Comparative Evaluation of Ensemble Machine Learning Approaches for Heart Disease Prediction

Sumati Baral; Suneeta Satpathy; Rabi Narayan Satpathy; Ajit Kumar Baral

doi:10.3791/70124

Research Article

April 10th, 2026

Sumati Baral¹ , Suneeta Satpathy² , Rabi Narayan Satpathy³ , Ajit Kumar Baral⁴

¹Department of Computer Science and Engineering, Trident Academy of Technology, ²Center for Cyber Security, SOA University, ³Department of Computer Science and Engineering, Jagadguru Kripalu University, ⁴Technical Information Security Officer (TISO), Invesco Inc.

Summary

This protocol outlines a computational process to create and assess ensemble machine learning models for heart disease prediction using publicly accessible benchmark data within a reproducible preprocessing and evaluation structure.

Abstract

This paper presents a computational benchmarking assessment of ensemble learning for heart disease prediction, in which several machine learning classifiers are combined through hard voting, soft voting, and stacking within a single framework. The evaluation was conducted using a publicly available cardiovascular dataset obtained from the Kaggle repository (https://www.kaggle.com/datasets/sid321axn/heart-statlog-cleveland-hungary-final) comprising 1,190 instances and 11 clinical features. The process involves data preprocessing, which includes handling missing values, removing outliers, scaling variables, and class balancing to ensure uniform input. Feature selection based on Random Forest (RF) is then used to eliminate unnecessary features. Among the evaluated models, the stacking ensemble classifier achieved the highest overall accuracy of 91.88% on the test dataset. Although additional metrics such as precision, recall, and F1-score were computed for comparative analysis, the emphasis of this study remains on methodological benchmarking rather than clinical validation.
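
The scaling and Random Forest feature selection steps mentioned above might look roughly like the following sketch. It is not the authors' code: the data are a synthetic stand-in generated with scikit-learn's `make_classification` (same shape as the benchmark, 1,190 x 11), and the median-importance threshold, the random seeds, and all variable names are assumptions chosen for illustration. Missing-value handling, outlier removal, and class balancing are omitted because the available text does not specify how they were done.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with the same shape as the benchmark (1,190 x 11).
# Assumption: the real study loads the Kaggle CSV instead.
X, y = make_classification(n_samples=1190, n_features=11, n_informative=6,
                           n_redundant=2, random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Fit the scaler on the training portion only, so that no statistics
# from the test portion leak into preprocessing.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Rank features by Random Forest impurity importance and keep those at or
# above the median importance. The threshold is an assumption.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=42),
    threshold="median",
).fit(X_train_s, y_train)

X_train_sel = selector.transform(X_train_s)
X_test_sel = selector.transform(X_test_s)
kept = np.flatnonzero(selector.get_support())
print("kept feature indices:", kept.tolist())
```

Fitting both the scaler and the selector on the training portion alone is one common way to keep the held-out evaluation uncontaminated; whether the original workflow did the same is not stated in the accessible text.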

Several base classifiers, including Decision Tree, Random Forest, AdaBoost, and XGBoost, are applied and tested independently. These models are then combined using hard voting, soft voting, and stacking. In stacking, Logistic Regression is used as the meta-model and is trained on out-of-fold (cross-validated) predictions of the base classifiers to reduce overfitting.
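
As a rough illustration of the three combination strategies, the sketch below builds hard voting, soft voting, and stacking ensembles with scikit-learn. Several details are assumptions rather than the authors' configuration: the data are synthetic, the hyperparameters are arbitrary, and `GradientBoostingClassifier` is swapped in for XGBoost so that the example depends on scikit-learn only. `StackingClassifier` with `cv=5` trains its final estimator on cross-validated predictions of the base learners, which corresponds to the out-of-fold scheme described above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier,
                              VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the benchmark data (assumption).
X, y = make_classification(n_samples=1190, n_features=11, n_informative=6,
                           n_redundant=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)


def base_learners():
    """Return a fresh list of (name, estimator) pairs for one ensemble."""
    return [
        ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
        # Stand-in for XGBoost so the sketch needs only scikit-learn.
        ("gb", GradientBoostingClassifier(random_state=0)),
    ]


ensembles = {
    # Hard voting: majority vote over predicted class labels.
    "hard_voting": VotingClassifier(base_learners(), voting="hard"),
    # Soft voting: argmax of the averaged predicted probabilities.
    "soft_voting": VotingClassifier(base_learners(), voting="soft"),
    # Stacking: Logistic Regression fitted on out-of-fold base predictions.
    "stacking": StackingClassifier(
        base_learners(),
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    ),
}

scores = {}
for name, model in ensembles.items():
    model.fit(X_tr, y_tr)
    scores[name] = accuracy_score(y_te, model.predict(X_te))
    print(f"{name}: {scores[name]:.3f}")
```

The accuracies printed by this sketch come from synthetic data and say nothing about the figures reported in the study.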

Accuracy is the primary criterion for comparison, so that individual classifiers and their combination strategies can be compared uniformly under the same preprocessing and validation conditions. Performance metrics are reported for comparison only; the emphasis of the approach lies in developing and evaluating ensemble strategies, not in clinical assessment.
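
For reference, accuracy and the secondary metrics named in this abstract (precision, recall, F1-score) can all be derived from the four confusion-matrix counts of a binary classifier. The sketch below uses plain Python; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only: 4 TP, 4 TN, 1 FP, 1 FN.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels all four metrics equal 0.8, because the single false positive and the single false negative balance each other.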

This protocol supports systematic comparison of ensemble machine learning algorithms, data preprocessing choices, and ensemble configurations on publicly available cardiovascular datasets.

Introduction

Publicly available cardiovascular disease datasets are widely used as benchmark problems in machine learning research for evaluating classification algorithms and predictive modelling techniques¹,²,³. Such datasets, which contain clinical and demographic attributes, provide a standardized and reproducible basis for comparing preprocessing strategies, feature selection methods, and ensemble learning architectures under controlled experimental conditions. Consequently, they are commonly employed to assess algorithmic behaviour rather than to support clinical inference or real-w....

Protocol

This protocol describes a reproducible computational workflow for benchmarking ensemble machine learning models using a publicly available cardiovascular disease dataset.

Selection of the dataset
The study employed a publicly available cardiovascular disease dataset obtained from the Kaggle repository. The dataset comprised 1190 instances and included 11 features. For model training, 80% of the data was utilized, while the remaining 20% was allocated for performance evaluation. Table 1 presented a detailed description of the dataset features. The dataset was loaded into Python (version 3.9) using the pa....
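
The 80/20 partition mentioned in this step can be sketched in plain Python as follows. The helper name `train_test_indices`, the seed, and the use of a simple shuffled (non-stratified) split are assumptions; the accessible text does not say whether the original split was stratified or which random state was used.

```python
import random


def train_test_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle row indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# 1,190 instances, as stated for the dataset.
train_idx, test_idx = train_test_indices(1190)
print(len(train_idx), len(test_idx))  # 952 238
```

For 1,190 instances this yields 952 training rows and 238 evaluation rows before any later step, such as outlier removal or class balancing, changes the counts.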

Results

This section presents the effects of data preprocessing and feature selection, compares the performance of individual and ensemble classifiers and summarizes benchmarking outcomes across standard evaluation metrics. Data preprocessing, including missing value removal, class balancing and feature scaling, produced consistent input distributions across all classifiers. Models trained on preprocessed data exhibited reduced variability in performance across repeated 5-fold cross-validation runs and train–test splits compared.......
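
One way to quantify variability across repeated 5-fold cross-validation runs is to report the mean and standard deviation of the fold accuracies, as in the sketch below. The data are synthetic, the number of repeats (3) is an assumption, and the scaled Logistic Regression pipeline is only a placeholder model, not one of the study's classifiers.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the benchmark data (assumption).
X, y = make_classification(n_samples=1190, n_features=11, n_informative=6,
                           n_redundant=2, random_state=0)

# 5 folds repeated 3 times gives 15 accuracy estimates.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)

# Placing the scaler inside the pipeline refits it within every fold,
# so each validation fold stays unseen during preprocessing.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

fold_scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print(f"mean={fold_scores.mean():.3f} "
      f"std={fold_scores.std():.3f} n={len(fold_scores)}")
```

A smaller standard deviation across the fold scores indicates more stable performance under resampling, which is the sense in which variability is compared here.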

Discussion

This study demonstrates that stacking-based ensemble learning consistently achieved superior and more stable classification performance compared with individual classifiers and voting-based ensembles under standardized preprocessing and benchmarking conditions.

The consistency of ensemble performance observed in this study reinforces the importance of methodological rigor in comparative machine learning research¹⁹,²⁰,

Disclosures

The authors declare that they have no conflicts of interest related to this research work. No financial, personal, or professional relationships have influenced the results, analysis, or conclusions presented in this manuscript.

Acknowledgements

The authors acknowledge the use of publicly available datasets and open-source software resources that supported this study. The authors also thank their respective institutions, including Sri Sri University, Bhubaneswar, and SOA University, Bhubaneswar, for providing the academic environment and research facilities necessary to conduct this work.

Materials

List of materials used in this article
Name	Company	Catalog Number	Comments
AdaBoostClassifier	scikit-learn Developers	N/A	Ensemble boosting classifier used for benchmarking
Jupyter Notebook	Project Jupyter	N/A	Computational notebook environment
Kaggle Heart Statlog (Cleveland–Hungary) Dataset	Kaggle	N/A	Public cardiovascular dataset (1190 instances, 11 features). URL: https://www.kaggle.com/datasets/sid321axn/heart-statlog-cleveland-hungary-final
LogisticRegression	scikit-learn Developers	N/A	Meta-classifier used in stacking ensemble
Matplotlib	Matplotlib Development Team	N/A	Data visualization library
NumPy	NumPy Developers	N/A	Numerical computation library
pandas (Version 1.5.3)	pandas Development Team	N/A	Data preprocessing and handling
Python (Version 3.9)	Python Software Foundation	N/A	Programming environment used for implementation
RandomForestClassifier	scikit-learn Developers	N/A	Base classifier and feature importance computation
Seaborn	Seaborn Development Team	N/A	Heatmap visualization of correlation matrix
StandardScaler	scikit-learn Developers	N/A	Feature scaling function
VotingClassifier	scikit-learn Developers	N/A	Hard and soft voting ensemble implementation
XGBClassifier (XGBoost)	DMLC	N/A	Gradient boosting classifier used as base learner

References

Libby, P., Bonow, R. O., et al. Braunwald’s Heart Disease: A Textbook of Cardiovascular Medicine. Elsevier Health Sci. 9, (2011).
Nayak, O., Pallapothala, T., Gupta, G. P. Heart disease prediction framework using soft voting-based ensemble learning techniques. Convergence of Big Data Technologies and Computational Intellig....
