Table of Contents
-
About the Project
- Overview
- Built With
- Dataset
- Results
- License
-
Overview
Cardiovascular disease is among the most serious diseases. The silent heart attack in particular strikes so abruptly that there is often no time for treatment, and it is very difficult to diagnose. Medical data mining and machine learning techniques are being applied to extract useful information for heart disease prediction, yet the accuracy of the results is still not satisfactory. This project proposes a heart disease prediction system based on machine learning techniques. The health care field produces vast amounts of data, and data mining is one of the techniques often used to process it. Heart disease is the leading cause of death worldwide, and this system predicts the likelihood of heart disease for a patient. The datasets are described in terms of medical parameters, which the system evaluates using data mining classification techniques. The data is processed in Python using five machine learning algorithms, namely Decision Tree, Naive Bayes, Logistic Regression, K-Nearest Neighbour (KNN), and an Artificial Neural Network, and the results show which of them achieves the best accuracy in predicting heart disease.
According to the World Health Organisation, 12 million deaths occur worldwide every year due to heart disease. Heart disease is one of the biggest causes of morbidity and mortality in the world's population, and its burden has been increasing rapidly over the past few years. Predicting cardiovascular disease is therefore regarded as one of the most important subjects in data analysis. Many studies have attempted to pinpoint the most influential risk factors and to predict overall risk accurately. Heart disease is often described as a silent killer because it can lead to death without obvious symptoms. Early diagnosis plays a vital role in deciding on lifestyle changes for high-risk patients, which in turn reduces complications. Machine learning has proved effective in supporting decisions and predictions from the large quantity of data produced by the healthcare industry. This project aims to predict heart disease by analysing patient data and classifying, with a machine learning algorithm, whether each patient has heart disease or not. Although heart disease can occur in different forms, a common set of core risk factors influences whether someone will ultimately be at risk. By collecting data from various sources, classifying it under suitable headings, and analysing it to extract the relevant information, this technique can be adapted well to heart disease prediction.
The main motivation for this work is to present a model that predicts the occurrence of heart disease. It also aims to identify the best classification algorithm for estimating the possibility of heart disease in a patient. This is done through a comparative study of five classification algorithms, namely Logistic Regression, Multinomial Naive Bayes, Decision Tree, K-Nearest Neighbour, and an Artificial Neural Network. Although these are commonly used machine learning algorithms, heart disease prediction is a vital task that demands the highest possible accuracy, so the algorithms are evaluated at several levels and with several evaluation strategies. This will help researchers and medical practitioners to establish a better
The major challenge with heart disease is detecting it. Instruments that can predict heart disease exist, but they are either expensive or not efficient at estimating the chance of heart disease in a person. Early detection of cardiac disease can reduce the mortality rate and overall complications. However, it is not possible to monitor every patient accurately every day, and round-the-clock consultation with a doctor is not available because it demands more patience, time, and expertise. Since a large amount of data is available today, various machine learning algorithms can be used to analyse it for hidden patterns, and these patterns can support diagnosis from medical data.
The system starts by collecting data and selecting the important attributes. The data is then preprocessed into the required format and divided into two parts: training data and testing data. The algorithms are applied and the model is trained on the training data, and the accuracy of the system is obtained by evaluating it on the testing data. The system is implemented using the following modules:
1. Collection of Dataset
2. Selection of Attributes
3. Data Preprocessing
4. Balancing of Data
5. Disease Prediction
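The workflow above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual code: the function names, the 70/30 split, min-max scaling, random oversampling as the balancing step, and the small k-nearest-neighbour classifier are all assumptions chosen to keep the example self-contained, and the two-feature data is synthetic.

```python
import random
from collections import Counter

def train_test_split(rows, labels, test_ratio=0.3, seed=0):
    """Shuffle indices reproducibly and split into training and testing parts."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = len(idx) - round(len(idx) * test_ratio)
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

def balance_by_oversampling(rows, labels, seed=0):
    """Assumed balancing step: duplicate random minority rows until classes match."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels

def min_max_scale(reference_rows, rows):
    """Rescale each feature using the min and max computed on reference_rows."""
    cols = list(zip(*reference_rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(r, lows, spans)] for r in rows]

def knn_predict(train_rows, train_labels, row, k=3):
    """Majority vote among the k training rows closest to `row`."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(t, row)), y)
        for t, y in zip(train_rows, train_labels)
    )[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

def run_pipeline(rows, labels, k=3, seed=0):
    """Split, balance and scale the training data, then score on the test data."""
    x_tr, y_tr, x_te, y_te = train_test_split(rows, labels, seed=seed)
    x_tr, y_tr = balance_by_oversampling(x_tr, y_tr, seed=seed)
    x_tr_scaled = min_max_scale(x_tr, x_tr)
    x_te_scaled = min_max_scale(x_tr, x_te)  # reuse training ranges
    preds = [knn_predict(x_tr_scaled, y_tr, r, k) for r in x_te_scaled]
    return sum(p == y for p, y in zip(preds, y_te)) / len(y_te)

# Synthetic two-feature data (roughly "age" and "resting blood pressure").
rng = random.Random(1)
rows = ([[rng.gauss(45, 3), rng.gauss(120, 5)] for _ in range(50)]
        + [[rng.gauss(65, 3), rng.gauss(160, 5)] for _ in range(50)])
labels = [0] * 50 + [1] * 50
accuracy = run_pipeline(rows, labels)
print(f"test accuracy: {accuracy:.2f}")
```

Scaling and balancing are fitted on the training part only, so no information from the testing data leaks into the model before evaluation.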
-
Dataset
- Source: https://www.kaggle.com/datasets/johnsmith88/heart-disease-dataset
- Each record consists of 14 attributes: the 13 features listed below plus the target label. All 14 are used in our analysis, and we later find that only 6 attributes have a significant effect. Heart disease prediction is carried out using a Logistic Regression Model, Multinomial Naive Bayes Model, Decision Tree Model, K-Nearest Neighbour Model, and Artificial Neural Network Model.
- Attribute Information:
- age
- sex
- chest pain type (4 values)
- resting blood pressure
- serum cholesterol in mg/dl
- fasting blood sugar > 120 mg/dl
- resting electrocardiographic results (values 0,1,2)
- maximum heart rate achieved
- exercise induced angina
- oldpeak = ST depression induced by exercise relative to rest
- the slope of the peak exercise ST segment
- number of major vessels (0-3) colored by fluoroscopy
- thal: 0 = normal; 1 = fixed defect; 2 = reversible defect
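As a rough illustration of how these attributes could be read into Python, the sketch below parses a CSV with the standard `csv` module. The short column names (`cp`, `trestbps`, `thalach`, and so on) and the `target` label column are assumptions about the Kaggle file, not something stated in this README, and the two rows are synthetic values made up for the example.

```python
import csv
import io

# Assumed column names of the Kaggle CSV, in the order of the attribute list.
FEATURES = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
            "thalach", "exang", "oldpeak", "slope", "ca", "thal"]
TARGET = "target"  # assumed name of the label column

def load_records(handle):
    """Read an open CSV handle into a list of feature rows and a list of labels."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in FEATURES + [TARGET] if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    features, labels = [], []
    for row in reader:
        features.append([float(row[c]) for c in FEATURES])
        labels.append(int(row[TARGET]))
    return features, labels

# Two synthetic rows for illustration only (not taken from the dataset).
sample = io.StringIO(
    "age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target\n"
    "50,1,2,130,240,0,1,150,0,1.5,1,0,2,1\n"
    "61,0,0,140,260,1,0,120,1,2.0,2,1,1,0\n"
)
X, y = load_records(sample)
print(len(X), "records with", len(X[0]), "features each")
```

With the downloaded file, the same function could be called as `with open("heart.csv", newline="") as f: X, y = load_records(f)`, where the file name is again an assumption.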
-
Results
-
Logistic Regression Model
- Accuracy(%) : 76.92
- Sensitivity(%) : 68.51
- Specificity(%) : 89.18
-
Multinomial Naive Bayes Model
- Accuracy(%) : 73.77
- Sensitivity(%) : 77.41
- Specificity(%) : 70.0
-
Decision Tree Model
- Accuracy(%) : 78.02
- Sensitivity(%) : 69.09
- Specificity(%) : 91.66
-
K-Nearest Neighbour Model
- Accuracy(%) : 82.41
- Sensitivity(%) : 75.51
- Specificity(%) : 90.47
-
Artificial Neural Network Model
- Accuracy(%) : 83.51
- Sensitivity(%) : 90.0
- Specificity(%) : 91.46
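For reference, the three measures reported for each model can be computed from a confusion matrix as in the sketch below. The function names and the toy labels are illustrative assumptions. The sketch uses the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), accuracy = (TP + TN) / total), which may differ in detail from how the figures above were produced.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity as percentages."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": 100 * (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": 100 * tp / (tp + fn),   # share of diseased cases detected
        "specificity": 100 * tn / (tn + fp),   # share of healthy cases cleared
    }

# Toy labels: 4 patients with heart disease (1) and 6 without (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always lies between the two on the same test set, which makes a quick sanity check for reported figures.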
-
License
Distributed under the MIT License. See LICENSE for more information.