Abstract
Background: MicroRNAs (miRNAs) have attracted growing attention in biological research. A growing body of experimental evidence shows that miRNAs are involved in many key biological processes underlying complex human diseases, so identifying miRNA-disease associations can aid diagnosis. Although many associations have already been found, many more remain to be identified. Experimental methods for uncovering miRNA-disease associations are time-consuming and expensive, so effective computational methods are needed to predict new associations.
Methodology: In this work, we propose an integrated method for predicting potential associations between miRNAs and diseases (IMPMD). The enhanced similarity for miRNAs is obtained by combining functional similarity, Gaussian similarity and Jaccard similarity; for diseases, it is obtained by combining semantic similarity, Gaussian similarity and Jaccard similarity. We then use these two enhanced similarities to construct features and calculate a cumulative score to select robust features. Finally, general linear regression is applied to assign weights to the Support Vector Machine, K-Nearest Neighbor and Logistic Regression algorithms.
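To make the similarity step concrete, the following is a minimal sketch, not the IMPMD implementation. It computes a Gaussian interaction profile kernel similarity and a Jaccard similarity from the rows of a toy binary miRNA-disease association matrix and then combines them. The function names, the toy matrix, the bandwidth normalisation and the use of an element-wise mean to combine matrices are all assumptions made for illustration; a functional (or, for diseases, semantic) similarity matrix would be passed to `combine` as a further argument.

```python
import math


def gaussian_similarity(profiles):
    """Gaussian kernel similarity between rows of a binary association matrix.

    Assumption: the bandwidth is 1 / (mean squared norm of the rows), a
    common choice for Gaussian interaction profile kernels.
    """
    n = len(profiles)
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    gamma = 1.0 / mean_sq if mean_sq > 0 else 1.0
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * d2)
    return sim


def jaccard_similarity(profiles):
    """Jaccard similarity between the sets of associated columns of each row."""
    n = len(profiles)
    sets = [{k for k, v in enumerate(row) if v} for row in profiles]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            union = sets[i] | sets[j]
            sim[i][j] = len(sets[i] & sets[j]) / len(union) if union else 0.0
    return sim


def combine(*matrices):
    """Element-wise mean of equally sized similarity matrices (assumed rule)."""
    n = len(matrices[0])
    k = len(matrices)
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: rows are miRNAs, columns are diseases.
associations = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]

mirna_gauss = gaussian_similarity(associations)
mirna_jacc = jaccard_similarity(associations)
enhanced = combine(mirna_gauss, mirna_jacc)
```

The same functions applied to the transposed matrix would give the corresponding disease-side similarities under the same assumptions.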
Results: IMPMD obtains an AUC of 0.9386 in 10-fold cross-validation, which is better than most previous models. To further evaluate our model, we apply IMPMD in two types of case studies for lung cancer and breast cancer. Of the top 50 predicted miRNAs, 49 (lung cancer) and 50 (breast cancer) are validated by experimental discoveries.
Conclusion: We built a software package named IMPMD, which can be freely downloaded from https://github.com/Sunmile/IMPMD.
Keywords: miRNA, disease, miRNA-disease associations, integrated algorithm, IMPMD, computational methods.