Abstract
Background: Diagnosing disease is an intricate task in the medical field. Applied to health care, machine learning can detect disease early, enabling earlier medical intervention, and it has played a significant role in heart disease prediction. Disease analysis has become vital in the health care sector: the massive volumes of data it collects are preprocessed and analyzed to uncover underlying information that supports effective decision making and appropriate medical intervention. Machine learning succeeds in the medical industry because it can analyze the huge amounts of data gathered by the health sector and support effective decisions. Because the medical field involves many manual processes, automating them has become necessary, and remarkable advances in electronic medical records have made this possible.
Objective: The objective of this research is to design a robust machine learning algorithm to predict heart disease. Prediction is performed using an ensemble of machine learning algorithms, with the aim of improving on the accuracy achieved by the individual algorithms.
Methods: A Heart Disease Prediction System is developed in which the user enters patient details and the trained model produces a prediction for that patient, classifying the case as either normal or risky. Linear Discriminant Analysis (LDA), Classification and Regression Trees (CART), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and the Naïve Bayes classifier are used as base learners. Their predictions are combined using a random forest as the meta-classifier.
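The ensemble described in the Methods can be illustrated with a minimal sketch, assuming scikit-learn and a stacking arrangement in which the five base learners feed a random forest meta-classifier. The synthetic data, feature count, hyperparameters, feature scaling, and the mapping of class 1 to "risky" are all assumptions made for illustration; they are not taken from the study and will not reproduce its reported accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in data, not the patient records used in the study.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The five base learners named in the Methods. Scaling before SVM and KNN
# is an assumption of this sketch, not a step stated in the abstract.
base_learners = [
    ("lda", LinearDiscriminantAnalysis()),
    ("cart", DecisionTreeClassifier(random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ("nb", GaussianNB()),
]

# A random forest acts as the meta-classifier: it is trained on the
# cross-validated outputs of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)

predictions = stack.predict(X_test)
# ASSUMPTION: class 1 is read as "risky" and class 0 as "normal".
labels = ["risky" if p == 1 else "normal" for p in predictions]
accuracy = stack.score(X_test, y_test)
print(f"Held-out accuracy on synthetic data: {accuracy:.4f}")
```

Training the meta-classifier on cross-validated base-learner outputs (the `cv=5` argument) keeps it from fitting to predictions the base learners made on their own training data. Whether the study used this exact scheme is not stated in the abstract.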
Results: The predictions of the base classifiers are combined using the random forest algorithm. Accuracy rises from 85.53% to 87.64%, an improvement of 2.11 percentage points.
Conclusion: Various techniques were adopted to preprocess the data to suit the requirements of the analysis. Feature selection was performed to optimize the performance of the machine learning algorithms. Ensemble prediction gave better accuracy when the random forest algorithm was used as the combiner. Better feature selection techniques could be applied to improve the accuracy further.
Keywords: Machine learning, heart disease prediction, graphical user interface, ensemble of machine learning algorithms, Naïve Bayes, classification and regression trees, k-nearest neighbors.
Graphical Abstract