Abstract
Machine learning is applied in medical diagnosis to predict diseases early, which
increases the chances of recovery for patients around the globe. Cancer is a disease
that spreads quickly and is difficult to control in its advanced stages.
The idea is to diagnose the disease at an early stage, so as to increase the chances of
a fast recovery. Breast cancer is common in women, and is a cause of death among
women aged fifty years or older. The purpose is to apply machine
learning concepts to detect the disease early. The system is fed with images from
patients at all stages of cancer, and classification tools are used to train the system on
these cases. This helps to predict the stage of cancer. After the stage is predicted, the
doctor prescribes medication or other appropriate treatment for the patient.
Timely diagnosis helps to improve the prognosis and increase the
chances of survival. The type of the tumour, its size and its recurrence need to be
monitored from time to time to keep it under control. Data mining algorithms,
combined with deep learning or machine learning concepts, can be used to design a
system for early prediction. The proposal is to use machine learning concepts to do a
performance comparison using different classifiers, such as Support Vector Machine
(SVM), Decision Tree and K-Nearest Neighbour (KNN) on the Wisconsin Diagnostic
Breast Cancer (WDBC) dataset [1]. The main aim of cancer detection is to classify
tumours as malignant or benign; we therefore use machine learning techniques to
improve the accuracy of diagnosis.
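The classifier comparison described above can be sketched as follows. This is a minimal illustration, not the experimental setup of this work: it assumes scikit-learn is installed and uses its bundled copy of the WDBC data (`load_breast_cancer`). The 80/20 stratified split, the fixed random seed, the feature scaling for SVM and KNN, and the default hyperparameters are all assumptions made for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# scikit-learn's bundled copy of the WDBC data: 569 samples, 30 features.
# In this copy the labels are 0 = malignant, 1 = benign.
X, y = load_breast_cancer(return_X_y=True)

# Assumed split: 80% training, 20% test, stratified by class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumed settings: library defaults. SVM and KNN are sensitive to feature
# scale, so they are wrapped in a pipeline that standardises the features.
classifiers = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Fit each classifier on the training split and record its test accuracy.
test_accuracy = {}
for name, model in classifiers.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {test_accuracy[name]:.3f}")
```

The printed accuracies depend on the split and the hyperparameters chosen, so they should not be read as the results reported here.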
The main objective is to assess the efficiency, effectiveness and correctness of the
algorithms using performance metrics such as Accuracy, Precision, Recall and F1 score.
Experimentation is done using Jupyter Notebook.
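To make the four metrics concrete, the sketch below computes them from the counts of true and false positives and negatives. The function name `classification_metrics` and the toy labels are hypothetical and serve only as illustration. Malignant is treated as the positive class (label 1) here, which is an assumption of this example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels: 1 = malignant (positive), 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

For these toy labels there are 3 true positives, 1 false positive, 1 false negative and 5 true negatives, giving an accuracy of 0.80 and a precision, recall and F1 score of 0.75 each. In a diagnostic setting recall on the malignant class is often watched most closely, since a false negative corresponds to a missed cancer.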