Identification of Drug-Disease Associations by Using Multiple Drug and
Disease Networks

Ying Yang; Lei Chen

doi:10.2174/1574893616666210825115406

Abstract

Background: Drug repositioning is a new research area in drug development that aims to discover novel therapeutic uses of existing drugs. It could accelerate the process of designing novel drugs for some diseases and considerably decrease the cost. The traditional way of determining novel therapeutic uses of an existing drug is quite laborious, and computational methods offer an alternative that can overcome this limitation.

Objective: This study aims to propose a novel model for the identification of drug–disease associations.

Methods: Twelve drug networks and three disease networks were built and fed into Mashup, a powerful network-embedding algorithm, to produce informative drug and disease features. These features were combined to represent each drug–disease association. A classic classification algorithm, random forest, was used to build the model.
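As an illustration of how drug and disease features might be combined to represent an association, the following minimal sketch concatenates one embedding vector per drug with one per disease. The abstract says only that the features were "combined", so concatenation is an assumption here. The identifiers, feature values, and helper names (`pair_vector`, `build_dataset`) are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple

# Hypothetical toy embeddings. In the study, such vectors would be produced
# by Mashup from the drug and disease networks; these values are made up.
drug_features: Dict[str, List[float]] = {
    "drug_A": [0.12, -0.40],
    "drug_B": [0.55, 0.08],
}
disease_features: Dict[str, List[float]] = {
    "disease_X": [0.30, 0.10, -0.25],
    "disease_Y": [-0.05, 0.70, 0.15],
}


def pair_vector(drug: str, disease: str) -> List[float]:
    """Represent a drug-disease pair by concatenating the two feature vectors.

    Concatenation is an assumption; the abstract does not specify the operator.
    """
    return drug_features[drug] + disease_features[disease]


def build_dataset(
    pairs: List[Tuple[str, str, int]]
) -> Tuple[List[List[float]], List[int]]:
    """Turn (drug, disease, label) triples into a feature matrix and label list."""
    X = [pair_vector(drug, disease) for drug, disease, _ in pairs]
    y = [label for _, _, label in pairs]
    return X, y


# Label 1 marks a known association, label 0 a non-association (toy labels).
X, y = build_dataset([
    ("drug_A", "disease_X", 1),
    ("drug_B", "disease_Y", 0),
])
```

The resulting `X` and `y` could then be passed to any random forest implementation. Note that the drug and disease vectors may have different lengths, which is consistent with the finding that different feature dimensions suited drugs and diseases.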

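Results: Tenfold cross-validation yielded a Matthews correlation coefficient (MCC) of 0.7156, an area under the receiver operating characteristic curve (AUROC) of 0.9280, and an area under the precision-recall curve (AUPR) of 0.9191.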
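For readers unfamiliar with the MCC, the following minimal sketch computes it from the four confusion-matrix counts of a binary classifier, using the standard definition. The counts in the example are invented for illustration and are not the study's results. Returning 0.0 when the denominator is zero is a common convention that is assumed here.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an undefined MCC is reported as 0.0.
        return 0.0
    return numerator / denominator


# Invented counts for illustration only.
example_mcc = mcc(tp=90, tn=80, fp=20, fn=10)
perfect_mcc = mcc(tp=50, tn=50, fp=0, fn=0)
```

The MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it remains informative when the positive and negative classes differ in size.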

Conclusion: The proposed model showed good performance. Additional tests indicated that a small dimension of drug features and a large dimension of disease features were beneficial for constructing the model. Moreover, the model remained robust even when some drug or disease properties were unavailable.

Keywords: Drug repositioning, drug–disease association, network embedding method, random forest, Mashup, classic classification algorithm.




Journal: Current Bioinformatics
Publisher Name: Bentham Science Publisher
Print ISSN: 1574-8936
Online ISSN: 2212-392X
DOI: https://dx.doi.org/10.2174/1574893616666210825115406
