Abstract
Background: Molecular biomarkers offer new ways to understand many disease processes. Noncoding RNAs (ncRNAs) play a crucial role in several cellular activities that are highly correlated with many human diseases, especially cancer. Because ncRNAs can serve as biomarkers for these diseases, their classification and identification have become a critical issue.
Objective: Most existing computational tools for ncRNA classification handle only one type of ncRNA. They rely on structural information or on specific known features, and they suffer from a lack of significant, validated features. As a result, their performance is not always satisfactory.
Methods: We propose a novel approach, named imCnC, for ncRNA classification based on multisource deep learning, which integrates several data sources, such as genomic and epigenomic data, to identify several ncRNA types. We also propose an optimization technique that visualizes the feature patterns extracted by the multisource convolutional neural network (CNN) model in order to measure the epigenomic features of each ncRNA type.
Results: Computational results on a dataset of 16 human ncRNA classes downloaded from RFAM show that imCnC outperforms the existing tools, achieving an accuracy of 94.18%. In addition, our method enables the discovery of new ncRNA features by using an optimization technique to measure and visualize the feature patterns of the imCnC classifier.
Keywords: Multisource deep-learning, ncRNA classification, epigenetics, biomarkers, features pattern extraction, optimization.