Elastic Net Regularized Softmax Regression Methods for Multi-subtype Classification in Cancer

Lin Zhang; Yanling He; Haiting Song; Xuesong Wang; Nannan Lu; Lei Sun; Hui Liu

doi:10.2174/1574893613666181112141724

Abstract

Background: Various regularization methods have been proposed to improve prediction accuracy in cancer diagnosis. Elastic net regularized logistic regression has been widely adopted for cancer classification and gene selection in genetics and molecular biology, but it is commonly applied only to binary classification and regression. In practice, however, a cancer usually has more than two subtypes, and these subtypes most likely cannot be determined precisely.

Objective: Beyond the multi-class issue, feature selection is also a critical problem for cancer subtype classification.

Methods: An Elastic Net Regularized Softmax Regression (ENRSR) is put forward to tackle the multi-class classification issue. As an extension of elastic net regularized logistic regression, ENRSR enforces structured sparsity and a ‘grouping effect’ for gene selection based on gene expression data, which may exhibit high correlation. The sparsity structure and ‘grouping effect’ help to select more appropriate discriminative features for multi-class classification.
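
As a minimal sketch of the idea described in the Methods, the Python code below fits a softmax regression with an elastic net penalty by proximal gradient descent. The solver, the function names (`fit_enet_softmax`, `soft_threshold`, `predict`) and the parameters `lam`, `alpha`, `lr` and `n_iter` are illustrative assumptions rather than details taken from the paper, and the objective assumed here is the mean cross-entropy plus `lam * (alpha * ||W||_1 + (1 - alpha) / 2 * ||W||_F^2)`.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum so the exponentials cannot overflow.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def soft_threshold(w, t):
    # Proximal operator of the L1 norm: shrink towards zero, clip at zero.
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_enet_softmax(X, y, n_classes, lam=0.1, alpha=0.5, lr=0.1, n_iter=500):
    """Hypothetical elastic net softmax fit (assumed solver, not the paper's).

    X: (n_samples, n_genes) expression matrix, y: integer labels 0..n_classes-1.
    alpha mixes the penalties: alpha=1 is pure L1, alpha=0 is pure L2.
    """
    n, p = X.shape
    W = np.zeros((p, n_classes))   # one weight column per subtype
    b = np.zeros(n_classes)        # intercepts are left unpenalized
    Y = np.eye(n_classes)[y]       # one-hot encoding of the labels
    for _ in range(n_iter):
        P = softmax(X @ W + b)
        # Gradient of the smooth part: cross-entropy plus the ridge term.
        grad_W = X.T @ (P - Y) / n + lam * (1.0 - alpha) * W
        grad_b = (P - Y).mean(axis=0)
        # Gradient step, then soft-thresholding for the L1 term.
        W = soft_threshold(W - lr * grad_W, lr * lam * alpha)
        b = b - lr * grad_b
    return W, b

def predict(X, W, b):
    # Assign each sample to the subtype with the largest linear score.
    return np.argmax(X @ W + b, axis=1)
```

In this parametrization, genes whose rows of `W` are entirely zero are dropped from the model, which is how the L1 term performs gene selection, while the L2 term tends to give correlated genes similar weights (the ‘grouping effect’).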

Results: In terms of F measure, ENRSR achieves more accurate and robust performance than six competing algorithms (K-means, Hierarchical Clustering, Expectation Maximization, Nonnegative Matrix Factorization, Support Vector Machine and Random Forest) in predicting cancer subtypes on both simulation data and real cancer gene expression data.
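
The abstract does not state which variant of the F measure is used. As a hedged illustration only, the sketch below assumes a macro-averaged F measure, the unweighted mean of the per-subtype harmonic means of precision and recall; the function name `macro_f_measure` is hypothetical.

```python
import numpy as np

def macro_f_measure(y_true, y_pred, n_classes):
    """Macro-averaged F measure (assumed variant, not confirmed by the source)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for k in range(n_classes):
        tp = np.sum((y_pred == k) & (y_true == k))  # correctly assigned to k
        fp = np.sum((y_pred == k) & (y_true != k))  # wrongly assigned to k
        fn = np.sum((y_pred != k) & (y_true == k))  # missed members of k
        denom = 2 * tp + fp + fn
        # F = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))
```

A macro average weights every subtype equally, so a rare subtype that is classified poorly lowers the score as much as a common one would.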

Conclusion: The proposed ENRSR method is a reliable regularized softmax regression approach for multi-subtype classification.

Keywords: regularization, softmax regression, elastic net, multiple classification, gene selection, cancer.

Journal: Current Bioinformatics
Publisher Name: Bentham Science Publisher
Print ISSN: 1574-8936
Online ISSN: 2212-392X
DOI: https://dx.doi.org/10.2174/1574893613666181112141724
