Abstract
Background: Over the past decades, a number of methods have been proposed for classifying molecular data. Supervised machine learning methods such as linear discriminant analysis and decision trees have been used to predict structural properties of molecules, while logistic regression, Bayesian networks, and artificial neural networks have been used to distinguish drugs from nondrugs. However, most of these methods cannot extract deep features hierarchically.
Objective: To show that the features extracted by a model based on stacked auto-encoders (SAEs) are useful for classifying molecules.
Method: The model in this study combines a deep learning architecture with a softmax classifier. First, the molecular data were preprocessed using feature selection strategies. Second, the applicability of stacked auto-encoders was verified on information-based molecular classification. Then, a second classification method based on multi-dimensional features was proposed. Finally, we proposed a new deep learning model with which higher classification accuracy can be obtained.
Results: The auto-encoder (AE) was applied to the classification of molecular data, with SAEs as the corresponding deep architecture. We then combined the SAEs with a softmax classifier by taking the output of the last SAE layer as the input of the softmax layer. In this way, drugs and nondrugs are classified using the discriminative features learned by the SAEs.
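The way the output of the last SAE layer feeds the softmax classifier can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the layer sizes, the random placeholder weights, the sigmoid activation, and the names `init_layer`, `affine`, `encode`, and `predict_proba` are all assumptions made for illustration, and no pretraining or fine-tuning is shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    # Subtract the maximum score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    # Placeholder random weights (assumption): a real SAE layer would be
    # pretrained as an auto-encoder, keeping only its encoder half.
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(x, layer):
    # Compute W x + b for one layer.
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def encode(x, encoders):
    # Pass the input through each stacked encoder in turn.
    h = x
    for layer in encoders:
        h = [sigmoid(z) for z in affine(h, layer)]
    return h


def predict_proba(x, encoders, softmax_layer):
    # The output of the last encoder is the input of the softmax classifier.
    return softmax(affine(encode(x, encoders), softmax_layer))


rng = random.Random(0)
layer_sizes = [32, 16, 8]  # hypothetical: 32 input features, two encoders
encoders = [init_layer(rng, n_in, n_out)
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
softmax_layer = init_layer(rng, layer_sizes[-1], 2)  # drug vs. nondrug

molecule = [rng.random() for _ in range(layer_sizes[0])]  # synthetic input
probs = predict_proba(molecule, encoders, softmax_layer)
print(probs)
```

With untrained weights the two class probabilities carry no meaning; the sketch only shows how the stacked encoders and the softmax layer are connected.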
Conclusion: Experimental results show that the classifiers in this deep learning-based model achieve competitive performance. In addition, the proposed joint multi-dimensional deep neural network is a breakthrough for future research. It also demonstrates the potential of deep learning-based methods for accurate drug and nondrug classification.
Keywords: Auto-encoder (AE), deep learning, feature extraction, molecular data classification, softmax classifier, stacked auto-encoder (SAE).
Graphical Abstract