Abstract
Early detection and treatment of breast cancer are essential, and effective classification of breast tissue aids diagnosis. To this end, a classification method named FT_GA_GBDT is proposed. First, the correlations between the features and the classification labels of breast tissue samples were determined, and the features with higher correlation were analyzed statistically and combined by weight, thereby realizing feature transformation (FT). The datasets were then augmented by calculating the mean and the root mean square of the feature attributes of each pair of adjacent odd- and even-row samples belonging to the same class. Finally, a genetic algorithm (GA) was used to search for the optimal hyper-parameters of the gradient boosting decision tree (GBDT) model, and the GBDT configured with these parameters was used to classify the breast tissue. For comparison, the K-nearest-neighbor (KNN), support-vector-machine (SVM), and GBDT methods were also tested on the same task. Results of 6-fold cross-validation on three breast tissue datasets showed that the average Precision, Recall, and F1 score obtained by FT_GA_GBDT were better than those obtained by KNN, SVM, and GBDT. The results further show that the FT algorithm and the GA-based hyper-parameter search helped improve the performance of the breast tissue classification model, and that the improvement is more pronounced when the correlations between features and classification labels are generally low.
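The dataset augmentation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `augment_pairs` is hypothetical, and the sketch assumes that rows are paired without overlap (rows 1 and 2, rows 3 and 4, and so on), that a pair is skipped when its two class labels differ, and that both synthetic samples (the element-wise mean and the element-wise root mean square) are appended to the original data.

```python
import math


def augment_pairs(rows, labels):
    """Append mean and RMS samples for adjacent same-class row pairs.

    Assumption: pairs are non-overlapping (indices 0-1, 2-3, ...),
    which is one reading of "adjacent odd- and even-row samples".
    """
    new_rows, new_labels = [], []
    for i in range(0, len(rows) - 1, 2):
        if labels[i] != labels[i + 1]:
            continue  # only combine samples of the same class
        a, b = rows[i], rows[i + 1]
        # Element-wise mean of the two feature vectors.
        mean = [(x + y) / 2 for x, y in zip(a, b)]
        # Element-wise root mean square of the two feature vectors.
        rms = [math.sqrt((x * x + y * y) / 2) for x, y in zip(a, b)]
        new_rows += [mean, rms]
        new_labels += [labels[i], labels[i]]
    # Return the original samples followed by the synthetic ones.
    return rows + new_rows, labels + new_labels


# Toy data: the first pair shares a class, the second pair does not.
X = [[1.0, 7.0], [7.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
y = ["car", "car", "fad", "mas"]
X_aug, y_aug = augment_pairs(X, y)
```

In this toy run only the first pair is combined, so two synthetic samples are added to the four original ones. Whether the published method pairs rows with or without overlap is not stated in the abstract, so the pairing rule here should be treated as a placeholder.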