
Cardiovascular risk prediction via ensemble machine learning and oversampling methods

Research output: Contribution to journal, Article (Contribution to Journal), peer-review

Abstract

Cardiovascular diseases are a leading cause of global mortality, with hypertension, obesity, and other factors contributing significantly to risk. Artificial Intelligence has emerged as a valuable tool for early detection, offering predictive models that outperform traditional methods. This study analyzed a dataset of 709 individuals from Ecuador, including demographic and clinical variables, to estimate cardiovascular risk. During preprocessing, records with missing values and duplicate records were removed, and highly correlated variables were excluded to reduce multicollinearity and prevent overfitting. Several machine learning algorithms (Decision Trees, Random Forest, Gradient Boosting, Extreme Gradient Boosting, LightGBM, Extra Trees, AdaBoost, and Bagging) were compared, with class imbalance addressed using SMOTE and a hybrid ROS–SMOTE approach. Gradient Boosting with the hybrid technique performed best, with an accuracy of 0.87, a precision of 0.81, a recall of 0.74, and an F1-score of 0.75. This result is attributed to the algorithm's sequential error-correction mechanism and integrated regularization strategies, which reduce overfitting and improve generalization on noisy or imbalanced datasets. These findings demonstrate the potential of AI-based models to improve early detection and management of cardiovascular disease, and they highlight the importance of anthropometric, clinical, and blood pressure variables in predicting cardiovascular risk.
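The abstract does not say how random oversampling (ROS) and SMOTE are combined in the hybrid approach. As a minimal sketch of one plausible combination, the Python below fills part of the class gap with random duplicates of minority records and the remainder with SMOTE-style interpolations between a minority record and one of its nearest minority neighbours. The function names `smote_sample` and `hybrid_ros_smote`, the `ros_fraction` parameter, and the toy data are assumptions made for illustration; they are not taken from the study.

```python
import math
import random


def smote_sample(minority, k, rng):
    """Create one synthetic point by interpolating between a random
    minority point and one of its k nearest minority neighbours."""
    base = rng.choice(minority)
    # Rank the other minority points by Euclidean distance to `base`.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: math.dist(base, p))
    neighbour = rng.choice(others[:k])
    t = rng.random()  # position along the segment base -> neighbour
    return tuple(b + t * (n - b) for b, n in zip(base, neighbour))


def hybrid_ros_smote(X, y, minority_label, ros_fraction=0.5, k=5, seed=0):
    """Balance a binary dataset (assumed scheme, not the paper's):
    fill `ros_fraction` of the class gap with random duplicates (ROS)
    and the rest with SMOTE-style interpolations."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    gap = n_majority - len(minority)
    if gap <= 0 or len(minority) < 2:
        return list(X), list(y)  # nothing to balance, or too few points
    n_ros = int(gap * ros_fraction)
    duplicates = [rng.choice(minority) for _ in range(n_ros)]
    # Neighbours are searched among the original minority points only,
    # so duplicates do not produce zero-length interpolation segments.
    synthetic = [smote_sample(minority, k, rng) for _ in range(gap - n_ros)]
    new_points = duplicates + synthetic
    return list(X) + new_points, list(y) + [minority_label] * len(new_points)


# Toy data invented for illustration: 3 minority and 7 majority records.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9),
     (0.9, 1.2), (1.2, 1.1), (1.0, 0.8), (0.8, 1.0), (1.3, 1.0)]
y = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

X_bal, y_bal = hybrid_ros_smote(X, y, minority_label=1, k=2)
print(y_bal.count(0), y_bal.count(1))  # prints "7 7"
```

In practice, oversampling of this kind is normally applied to the training split only, so that duplicated or synthetic records do not leak into the data used to compute accuracy, precision, recall, and F1-score.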

Original language: English
Article number: 43576
Journal: Scientific Reports
Volume: 15
Issue number: 1
State: Published - Dec 2025
