Abstract
Background: Election prediction has a long history. Early forecasts relied on traditional methods such as the judgment of pundits and hereditary factors. In recent times, however, new methods such as data mining, data science, big data, and numerous machine learning techniques have come into use for election forecasting. These computational techniques have changed the whole process of political forecasting, and poll predictions are now carried out through them.
Objective: The main objective of this research work is to propose an election prediction model for developing areas, especially the state of Jammu and Kashmir (India).
Methods: The election prediction model is developed in the Jupyter Notebook web application using different supervised machine learning techniques. To obtain optimal results, we perform hyperparameter tuning of all the proposed classifiers. To measure the performance of the poll prediction system, we use the confusion matrix along with the AUROC curve, which indicate that these methods are well suited to political forecasting. An important contribution of this article is the design of a prediction system that can also be used for making predictions in other fields, such as cardiovascular disease prediction and weather forecasting.
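As a minimal illustration of the two evaluation measures named in the Methods, the sketch below computes a confusion matrix and the AUROC in plain Python. The labels, scores, and the 0.5 decision threshold are made-up toy values, and the function names are hypothetical; this is not the evaluation code used in this work.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    # Each positive/negative pair counts 1 for a win and 0.5 for a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only): 1 = party wins the seat, 0 = it loses.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

# Turn scores into hard predictions with an assumed threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
auc = auroc(y_true, scores)

print("TP, FP, FN, TN:", tp, fp, fn, tn)
print("accuracy:", accuracy)
print("AUROC:", auc)
```

The confusion matrix depends on the chosen threshold, whereas the AUROC summarizes ranking quality across all thresholds, which is why the two measures are usually reported together.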
Results: The model is trained and tested on a real-time dataset of the state of Jammu and Kashmir (India). We applied feature selection techniques based on Random Forest, Decision Tree Classifier, Gradient Boosting Classifier, and Extra Gradient Boosting and obtained the eight most important parameters for poll prediction (Central Influence, Religion Followers, Party Wave, Party Abbreviations, Sensitive Areas, Vote Bank, Incumbent Party, and Caste Factor), together with their mean weightages. Averaging the weightage that the different classifiers assign to each parameter shows that Party Wave received the highest mean weightage, 0.82, compared with the other parameters. After obtaining the vital parameters for political forecasting, we applied various machine learning algorithms, namely Decision Tree, Random Forest, K-Nearest Neighbor, and Support Vector Machine, for the early prediction of elections. Experimental results show that the Support Vector Machine outperformed the other classifiers, with the highest accuracy of 0.84.
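The idea of a mean weightage can be sketched as follows: each model assigns an importance score to every parameter, the scores are averaged per parameter across models, and the parameters are ranked by that average. The model names, feature names, and numbers below are hypothetical placeholders chosen for illustration; they are not the values obtained in this work.

```python
from statistics import mean

# Hypothetical per-model importance scores (placeholders, NOT results from this work).
importances = {
    "random_forest":     {"feature_a": 0.50, "feature_b": 0.30, "feature_c": 0.20},
    "decision_tree":     {"feature_a": 0.60, "feature_b": 0.15, "feature_c": 0.25},
    "gradient_boosting": {"feature_a": 0.40, "feature_b": 0.35, "feature_c": 0.25},
}


def mean_weightage(per_model):
    """Average each feature's importance over all models."""
    features = next(iter(per_model.values())).keys()
    return {f: mean(scores[f] for scores in per_model.values()) for f in features}


def top_k(weights, k):
    """Return the k feature names with the highest mean weightage."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


weights = mean_weightage(importances)
ranked = top_k(weights, 2)

for name in sorted(weights, key=weights.get, reverse=True):
    print(f"{name}: {weights[name]:.3f}")
print("selected:", ranked)
```

Averaging over several models reduces the chance that a parameter is kept or dropped because of the quirks of a single classifier.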
Conclusion: This paper outlines election prediction models, their potential, techniques, and parameters, as well as their limitations. We conclude that election results can indeed be forecast using significant parameters, but with caution, owing to the limitations found in developing nations, such as sensitive areas, social unrest, and religion. This research work may be considered the first attempt to use multiple classifiers for forecasting the Assembly election results of the state of Jammu and Kashmir (India).
Keywords: Election predictions, forecasting, data mining, social media, machine learning, predictions.