Abstract
Background: Machine Learning (ML) is now widely applied across diverse fields of science and technology, including image processing, automobiles, banking, finance, and health care. The easy availability of data and rapid improvements in ML techniques have made it more feasible to understand and work with various channels of real-time health analytics.
Methods: This paper proposes a health status prediction system that detects cardiovascular diseases from patients’ tweets. The analytics is carried out on a distributed Apache Spark (AS) framework to reduce both training and testing time compared with regular standalone machines. Social media streaming data is one of the major data sources for the proposed system. The model analyzes attributes of incoming user tweets, predicts cardiovascular risk accordingly, and tweets the latest health status back as a reply to the respective user, with a copy to family and caretakers.
Results: The proposed framework with an Extreme Learning Machine (ELM)-Tree classifier is evaluated on two different corpora. It outperforms other classifiers, such as Decision Trees, Naïve Bayes, Linear SVC, and DNN, in both accuracy and time.
Conclusion: This study proposes a model for an alert-based heart status prediction system that incorporates additional features affecting accuracy and reduces response time by using the Apache Spark distributed Big Data framework.
Keywords: Machine learning, social media, streaming data, health status, prediction system, Apache Spark.