COMPREHENSIVE ANALYSIS OF HEART DISEASE PREDICTION USING SCIKIT-LEARN









Abstract

Heart disease, a cardiovascular disorder, is the leading cause of death for both men and women and a primary source of morbidity and mortality today. Researchers therefore continue to develop data mining methods that support healthcare experts in assessing this complex condition. Although the healthcare sector holds a wealth of data in its databases, this data is not mined effectively to uncover hidden patterns and draw conclusions from them. The main objective of this study is to identify such hidden patterns by applying multiple Scikit-learn methods to predict the presence of cardiovascular disease in individuals. Numerous classification methods have been used to detect such patterns in the medical field. A dataset comprising 14 features is examined for the prediction task. The dataset, taken from the UCI repository, contains widely used medical attributes, including blood pressure, cholesterol level, and chest pain, along with 11 other features used to predict heart disease. However, the dataset contains features and anomalies that degrade the results, so data preprocessing and feature engineering are applied to handle them. The classification methods employed in this paper are Decision Tree, k-Nearest Neighbor, Extra Trees Classifier, Random Forest, Support Vector Machine, Naïve Bayes, Logistic Regression, AdaBoost Classifier, Voting Classifier, and Ridge CV. We evaluate these scikit-learn models using Accuracy, Precision, Recall, and F1-score. In our experiments, the Extra Trees Classifier achieves an accuracy of 93.44%, the highest among the models evaluated; all other models score lower. Based on these experiments, the Extra Trees Classifier, with the best accuracy, is considered the most suitable method for heart disease prediction.

Keywords: Heart Disease, Scikit-learn, Ensemble Learning, Machine Learning, Extra Trees Classifier, Feature Engineering.
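
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the code used in this study: the data is a synthetic stand-in generated with make_classification (13 predictors plus a binary target, mirroring the 14-attribute layout), and the 80/20 split, the default hyperparameters, the members of the voting ensemble, and the use of RidgeClassifierCV for "Ridge CV" are all assumptions made for illustration. The scores it prints will therefore not match the 93.44% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression, RidgeClassifierCV
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in data, not the UCI heart disease records.
X, y = make_classification(n_samples=300, n_features=13, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)


def scaled(model):
    """Standardize features before models that are sensitive to scale."""
    return make_pipeline(StandardScaler(), model)


# ASSUMPTION: default hyperparameters; the study's settings are not given.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "k-Nearest Neighbor": scaled(KNeighborsClassifier()),
    "Extra Trees": ExtraTreesClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Support Vector Machine": scaled(SVC()),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": scaled(LogisticRegression(max_iter=1000)),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    # ASSUMPTION: the ensemble members below are chosen only for illustration.
    "Voting Classifier": VotingClassifier(
        estimators=[
            ("lr", scaled(LogisticRegression(max_iter=1000))),
            ("rf", RandomForestClassifier(random_state=0)),
            ("nb", GaussianNB()),
        ],
        voting="hard",
    ),
    "Ridge CV": scaled(RidgeClassifierCV()),
}


def evaluate(models, X_train, y_train, X_test, y_test):
    """Fit each model and return its four test-set metrics."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision_score(y_test, y_pred, zero_division=0),
            "recall": recall_score(y_test, y_pred, zero_division=0),
            "f1": f1_score(y_test, y_pred, zero_division=0),
        }
    return results


results = evaluate(models, X_train, y_train, X_test, y_test)

# Print the models from highest to lowest accuracy.
for name, scores in sorted(results.items(),
                           key=lambda item: item[1]["accuracy"],
                           reverse=True):
    line = "  ".join(f"{metric}={value:.3f}" for metric, value in scores.items())
    print(f"{name:24s}{line}")
```

To try this on the real data, X and y would be replaced with the 13 predictor columns and the target column of the UCI dataset after preprocessing. A single train/test split on a few hundred rows gives noisy estimates, so cross-validation may be a more reliable basis for ranking the models.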


Modules


Algorithms


Software And Hardware

• Hardware:
  Processor: i3, i5
  RAM: 4 GB
  Hard disk: 16 GB
• Software:
  Operating system: Windows 2000/XP/7/8/10
  Tools: Anaconda, Jupyter, Spyder, Flask
  Frontend: Python
  Backend: MySQL