
Current Bioinformatics


ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Comparison of Kernel and Decision Tree-based Algorithms for the Prediction of microRNAs Associated with Cancer

Author(s): Ram Kothandan and Sumit Biswas

Volume 11, Issue 1, 2016

Page: [143 - 151] Pages: 9

DOI: 10.2174/1574893611666151120102307


Abstract

The discovery of microRNAs (miRs) in the 1990s spawned a field of research that has shed light on the involvement of these small non-coding RNAs in several developmental pathways and diseases, including cancer. While algorithms that predict the binding of miRNAs to their targets are abundant, the same is not true for algorithms that associate miRNAs with targets implicated in cancer. Machine learning approaches, which have been implemented in target prediction, need to be extended with proper feature selection to reach an acceptable level of accuracy in predicting associations of miRNAs with cancer. In this study, we present a comparison of three learning algorithms, viz. the kernel-based Support Vector Machine (SVM) and the decision tree-based Random Forest (RF) and C4.5, to predict miRNAs associated with cancer. Sixty informative features were extracted from a dataset of experimentally validated miRNAs based on sequence, thermodynamics of miRNA-mRNA binding, and their hybridization. Features were first ranked by F-score, and a two-stage Recursive Feature Elimination (RFE) process was then employed to select the optimal subset of features for each classifier. Class imbalance in the training set was addressed with a cost-sensitive approach. The performance of each learning algorithm was evaluated in terms of precision, recall, F-measure, and AUC. The best-performing learning algorithm will then be used to construct a two-step binary classifier, viz. miRSEQ and miRINT, that identifies whether a miRNA is associated with the cancer pathway. Our comparative analysis showed that the decision tree-based RF model performed well for miRSEQ, with better precision and AUC, but only moderately for miRINT.
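
The feature ranking and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the F-score formula, so the code assumes the commonly used two-class F-score (squared distance of the class means from the overall mean, divided by the sum of the within-class sample variances). The function names `f_score`, `rank_features`, `precision_recall_f`, and `auc`, and the toy data, are hypothetical and chosen only for illustration.

```python
from statistics import mean


def f_score(values, labels):
    """Two-class F-score of one feature (assumed formula; 1 = positive, 0 = negative)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall, mean_pos, mean_neg = mean(values), mean(pos), mean(neg)
    # Between-class separation: squared distance of each class mean from the overall mean.
    between = (mean_pos - overall) ** 2 + (mean_neg - overall) ** 2
    # Within-class spread: sum of the two sample variances (n - 1 denominators).
    within = (sum((v - mean_pos) ** 2 for v in pos) / (len(pos) - 1)
              + sum((v - mean_neg) ** 2 for v in neg) / (len(neg) - 1))
    return between / within if within > 0 else float("inf")


def rank_features(rows, labels):
    """Return feature indices ordered from highest to lowest F-score."""
    n_features = len(rows[0])
    scores = [f_score([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def precision_recall_f(y_true, y_pred):
    """Precision, recall, and F-measure for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def auc(y_true, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented): feature 0 separates the classes, feature 1 does not.
toy_rows = [[0.9, 0.2], [0.8, 0.7], [0.7, 0.4], [0.2, 0.3], [0.1, 0.6], [0.3, 0.5]]
toy_labels = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    print(rank_features(toy_rows, toy_labels))
    print(precision_recall_f([1, 1, 0, 0], [1, 0, 1, 0]))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

In a pipeline like the one the abstract outlines, a ranking of this kind would feed the recursive elimination step, and the metric functions would score each classifier on held-out data. Precision, recall, and AUC are more informative than plain accuracy when the classes are imbalanced, which is consistent with the abstract's choice of measures.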

Keywords: AUC, RFE, Cost-sensitive, Class imbalance, Thermodynamics of miRNA-mRNA binding.



© 2024 Bentham Science Publishers