Abstract
Background: Accumulating experimental evidence demonstrates that long non-coding RNAs (LncRNAs) play crucial roles in the occurrence and development of various complex human diseases. Nonetheless, only a small portion of LncRNA–disease associations have been experimentally verified to date. Automatically predicting LncRNA–disease associations with computational models can substantially reduce the cost of wet-lab experiments.
Methods and Results: To integrate heterogeneous biological data for the identification of potential disease-related LncRNAs, we propose HEBLDA, a hierarchical extension model based on the Boolean matrix for LncRNA–disease association prediction. HEBLDA first discovers the intrinsic hierarchical correlations in various relational sources based on the properties of the Boolean matrix. It then integrates the resulting hierarchical associated matrices using fusion weights. Finally, it uses the hierarchical associated matrix to reconstruct the LncRNA–disease association matrix by hierarchical extension. HEBLDA also works for diseases or LncRNAs that have no known association data. In 5-fold cross-validation experiments, HEBLDA obtained an area under the receiver operating characteristic curve (AUC) of 0.8913, outperforming previous classical methods. In addition, case studies show that HEBLDA can accurately predict candidate diseases for several LncRNAs.
Conclusion: Given its ability to discover richer correlation structure across various data sources, we anticipate that HEBLDA is a promising method that can yield more comprehensive association predictions in a broad range of fields.
Keywords: LncRNA, disease, association prediction, Boolean matrix, hierarchical extension, associated matrix.
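The abstract names three building blocks: Boolean matrices, fusion by weights, and evaluation by AUC. The sketch below illustrates generic versions of each and is not the HEBLDA algorithm itself, whose details are not given here. The function names (`boolean_product`, `fuse`, `auc`), the toy matrices, and the idea of chaining two relational sources through a Boolean product are all assumptions made for illustration.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def boolean_product(a: Matrix, b: Matrix) -> Matrix:
    """Boolean matrix product: out[i][j] = OR over k of (a[i][k] AND b[k][j])."""
    inner, cols = len(b), len(b[0])
    return [
        [int(any(row[k] and b[k][j] for k in range(inner))) for j in range(cols)]
        for row in a
    ]


def fuse(matrices: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Weighted sum of same-shaped matrices; weights are normalized to sum to 1."""
    total = sum(weights)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(w / total * m[i][j] for m, w in zip(matrices, weights)) for j in range(cols)]
        for i in range(rows)
    ]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 LncRNAs x 2 intermediates, 2 intermediates x 2 diseases.
lnc_to_mid = [[1, 0], [0, 1]]
mid_to_dis = [[0, 1], [1, 0]]
inferred = boolean_product(lnc_to_mid, mid_to_dis)

# Fuse the inferred matrix with a second (hypothetical) source, weighted 3:1.
other_source = [[0, 1], [0, 0]]
fused = fuse([inferred, other_source], [3.0, 1.0])

# Score four held-out pairs against their true labels.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In a cross-validation setting, a function like `auc` would be applied to the predicted scores of the held-out associations in each fold and the fold results averaged. How HEBLDA chooses its fusion weights is not stated in the abstract, so the 3:1 ratio above is arbitrary.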