Journal of Intelligent Systems in Current Computer Engineering

ISSN (Print): 3050-5070
ISSN (Online): 3050-5089

Research Article

Breast Cancer Prediction Using Oversampling Technique, Particle Swarm Optimization, and Fuzzy Logic

In Press, (this is not the final "Version of Record"). Available online 28 May, 2024
Author(s): Kapil Dev Mahato*, Chitra Saini, S S Gourab Kumar Das, Abhilash Kumar, Chandrashekhar Azad and Uday Kumar
Published on: 28 May, 2024

Article ID: e080724231705

DOI: 10.2174/0126662949298774240507114739

Abstract

Background: Breast cancer is one of the leading causes of death among women worldwide. Tools for early detection that combine the large amount of available clinical data with machine learning/artificial intelligence models could therefore be a cost-effective, game-changing way to save women's lives. Researchers have been working hard in recent years to put such a tool into practice.

Introduction: Breast cancer is a serious health issue, and detecting it early is crucial to improving patient outcomes. It is the second most prevalent chronic illness in women, after lung cancer, and can also affect men, though on a significantly smaller scale.

Objective: For early detection of breast cancer, a wide range of methods, including oversampling, feature selection, fuzzy logic, and machine learning algorithms, have been explored. This research article aims to predict the type and risk factors of breast cancer.

Methods: In this study, the support vector machine-synthetic minority oversampling technique (SVM-SMOTE) was used to balance the Wisconsin Breast Cancer Original (WBCO) dataset from the UCI repository. To enhance the models’ performance and reduce the number of features, feature selection was performed using particle swarm optimization (PSO). Thirteen types of models were then considered: Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), Naïve Bayes (NB), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), Adaptive Boosting (AB), Categorical Boosting (CB), Light Gradient Boosting Machine (LGBM), Multi-Layer Perceptron (MLP), and Extra Trees (ET). Finally, the PSO-selected features were used as input fuzzy variables, and a fuzzy logic system was trained on them.
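
The PSO-based feature selection step can be illustrated with a minimal binary PSO sketch. This is not the authors' implementation: the function names, the hyperparameters (w, c1, c2, swarm size), and the toy objective are all assumptions made for illustration. In practice, the fitness function would typically be a cross-validated classifier score on the selected columns, which is not shown here.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize fitness(mask) over 0/1 feature masks with binary PSO.

    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Random initial masks and small random velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                vel[i][d] = max(-6.0, min(6.0, vel[i][d]))
                # Sigmoid transfer: velocity -> probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Hypothetical toy objective: reward three arbitrarily chosen "useful"
# feature indices and penalize every selected feature slightly.
INFORMATIVE = {1, 4, 7}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask, best_score = binary_pso(toy_fitness, n_features=9)
print(best_mask, best_score)
```

The small per-feature penalty in the toy objective mimics the goal stated above: keep performance high while reducing the number of features.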

Results: With the proposed techniques, the highest accuracies obtained were 92.64%, 98.31%, and 97.19% using SVM-SMOTE, SVM-SMOTE+PSO, and PSO+fuzzy logic, respectively.

Conclusion: In our investigation, the fuzzy logic system used only three attributes (clump thickness, marginal adhesion, and bland chromatin) and still obtained satisfactory results.
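
To make the idea of a three-input fuzzy system concrete, the following is a minimal sketch of a zero-order Sugeno-style inference over clump thickness, marginal adhesion, and bland chromatin. The membership functions, the two rules, and the 0.5 decision threshold are hypothetical and are not taken from the article, which does not specify them here. The sketch assumes each attribute is scored on the 1 to 10 scale used in the WBCO dataset.

```python
def high(x, lo=1.0, hi=10.0):
    """Membership in 'high': 0 at lo, 1 at hi, linear in between."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def low(x, lo=1.0, hi=10.0):
    """Membership in 'low', defined as the complement of 'high'."""
    return 1.0 - high(x, lo, hi)


def fuzzy_risk(clump_thickness, marginal_adhesion, bland_chromatin):
    """Return a risk score in [0, 1] from two hypothetical rules.

    Rule 1: IF all inputs are low  THEN risk = 0  (AND -> min)
    Rule 2: IF any input is high   THEN risk = 1  (OR  -> max)
    The crisp output is the firing-strength-weighted average of the
    rule outputs (zero-order Sugeno defuzzification).
    """
    inputs = (clump_thickness, marginal_adhesion, bland_chromatin)
    w_benign = min(low(x) for x in inputs)
    w_malignant = max(high(x) for x in inputs)
    total = w_benign + w_malignant
    if total == 0.0:
        return 0.5  # no rule fired; fall back to an uninformative score
    return (w_benign * 0.0 + w_malignant * 1.0) / total


def classify(score, threshold=0.5):
    """Map the risk score to a label; the threshold is an assumption."""
    return "malignant" if score >= threshold else "benign"


print(classify(fuzzy_risk(2, 1, 3)))
print(classify(fuzzy_risk(8, 7, 9)))
```

A real system would tune the membership shapes and rule base against labeled data; the point here is only that three inputs are enough to define a complete, inspectable rule set.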


© 2024 Bentham Science Publishers