Abstract
Background: Email is an efficient and widely used communication method, and it serves both legitimate and illegitimate purposes. Many features that could be effective in detecting email fraud attacks are still under investigation.
Methods: This paper proposes an approach to improve the classification accuracy for fraudulent emails, implemented through feature extraction and a hybrid Machine Learning (ML) classifier that combines AdaBoost and Majority Voting. Eleven ML classifiers are evaluated experimentally within the hybrid classifier, and the performance of the email fraud filtering is evaluated using WEKA and R on a data set of 9298 email messages.
Results: The performance evaluation shows that the hybrid model of Voting using AdaBoost outperforms all other classifiers, with the lowest error rate of 0.6991%, the highest F1-measure of 99.30%, and the highest Area Under the Curve (AUC) of 99.9%.
Conclusion: The proposed email features, combined with the AdaBoost and Voting algorithms, prove effective for fraudulent email detection.
Keywords: Knowledge discovery, machine learning, hybrid classification, fraud email, feature extraction, ensemble methods.
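To illustrate the idea of combining AdaBoost with Majority Voting, the following is a minimal Python sketch and not the WEKA/R pipeline used in the paper. It boosts decision stumps with AdaBoost and lets three boosted members vote. Everything specific here is an assumption made for illustration: the toy data, the three hypothetical features (link count, urgent-word count, attachment flag), the use of decision stumps as the base learner, and the choice to give each voting member a different feature subset.

```python
import math
from collections import Counter


def stump_predict(row, j, t, pol):
    """Decision stump: predict pol if feature j >= threshold t, else -pol."""
    return pol if row[j] >= t else -pol


def train_stump(X, y, w, features):
    """Return the (error, feature, threshold, polarity) with lowest weighted error."""
    best = None
    for j in features:
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(row, j, t, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best


def adaboost(X, y, features, rounds=5):
    """Train AdaBoost over decision stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        err, j, t, pol = train_stump(X, y, w, features)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this stump
        model.append((alpha, j, t, pol))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, pol))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Sign of the alpha-weighted sum of stump outputs."""
    score = sum(a * stump_predict(row, j, t, pol) for a, j, t, pol in model)
    return 1 if score >= 0 else -1


def majority_vote(models, row):
    """Return the label predicted by most boosted members."""
    votes = Counter(predict(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: [link_count, urgent_word_count, has_attachment].
X = [[5, 3, 0], [4, 2, 1], [6, 4, 0], [3, 3, 1],
     [0, 0, 1], [1, 0, 0], [0, 1, 1], [1, 1, 0]]
y = [1, 1, 1, 1, -1, -1, -1, -1]  # +1 = fraudulent, -1 = legitimate

# Three boosted members, each restricted to one feature (an assumption).
members = [adaboost(X, y, features=[f]) for f in (0, 1, 2)]

predictions = [majority_vote(members, row) for row in X]
error_rate = sum(p != yi for p, yi in zip(predictions, y)) / len(y)
print("training error rate:", error_rate)
```

An odd number of members avoids ties in the vote. In this toy set the third feature carries no signal, so the example also shows how the vote can tolerate one uninformative member when the others agree.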