
Current Bioinformatics


ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

Identification of Drug-Disease Associations by Using Multiple Drug and Disease Networks

Author(s): Ying Yang and Lei Chen*

Volume 17, Issue 1, 2022

Published on: 08 December, 2021

Page: [48 - 59] Pages: 12

DOI: 10.2174/1574893616666210825115406

Abstract

Background: Drug repositioning is a new research area in drug development that aims to discover novel therapeutic uses of existing drugs. It could accelerate the process of designing novel drugs for some diseases and considerably decrease the cost. The traditional way of determining novel therapeutic uses of an existing drug is quite laborious. An alternative is to design computational methods that overcome this drawback.

Objective: This study aims to propose a novel model for the identification of drug–disease associations.

Methods: Twelve drug networks and three disease networks were built and fed into a powerful network-embedding algorithm called Mashup to produce informative drug and disease features. These features were combined to represent each drug–disease association. A classic classification algorithm, random forest, was used to build the model.
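The feature-combination step can be illustrated with a minimal sketch. It assumes that "combined" means concatenating a drug's embedding with a disease's embedding, which the abstract does not specify. The function names, identifiers, and toy vectors below are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def pair_features(drug_vec: Sequence[float], disease_vec: Sequence[float]) -> List[float]:
    """Represent one drug-disease pair by concatenating the two embeddings.

    Concatenation is an assumption; other ways of combining are possible.
    """
    return list(drug_vec) + list(disease_vec)


def build_dataset(
    drug_emb: Dict[str, Sequence[float]],
    disease_emb: Dict[str, Sequence[float]],
    labeled_pairs: Sequence[Tuple[str, str, int]],
) -> Tuple[List[List[float]], List[int]]:
    """Turn (drug, disease, label) triples into a feature matrix and label list."""
    features, labels = [], []
    for drug, disease, label in labeled_pairs:
        features.append(pair_features(drug_emb[drug], disease_emb[disease]))
        labels.append(label)
    return features, labels


# Toy embeddings (hypothetical values): 2-dimensional drug features and
# 3-dimensional disease features, so each pair has 5 features.
drug_emb = {"drugA": [0.1, 0.2], "drugB": [0.3, 0.4]}
disease_emb = {"diseaseX": [0.5, 0.6, 0.7]}
labeled_pairs = [("drugA", "diseaseX", 1), ("drugB", "diseaseX", 0)]

X, y = build_dataset(drug_emb, disease_emb, labeled_pairs)
```

The resulting `X` and `y` are in the row-per-sample form that most random forest implementations accept. The drug and disease dimensions can differ, as in the toy data above.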

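Results: Tenfold cross-validation results indicated that the Matthews correlation coefficient (MCC), area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPR) were 0.7156, 0.9280, and 0.9191, respectively.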
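For readers unfamiliar with the MCC, the sketch below computes it from the four cells of a binary confusion matrix using the standard definition. The counts in the example are illustrative only and are not the paper's results.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Illustrative counts (hypothetical): 90 true positives, 80 true negatives,
# 20 false positives, 10 false negatives.
mcc = matthews_corrcoef(tp=90, tn=80, fp=20, fn=10)
```

The MCC ranges from -1 to 1, with 1 indicating perfect prediction and 0 indicating performance no better than chance. Unlike accuracy, it accounts for all four confusion-matrix cells, which makes it informative when classes are imbalanced.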

Conclusion: The proposed model showed good performance. Further tests indicated that low-dimensional drug features and high-dimensional disease features were beneficial for constructing the model. Moreover, the model remained quite robust even when some drug or disease properties were unavailable.

Keywords: Drug repositioning, drug-disease association, network embedding method, random forest, Mashup, classic classification algorithm.


