Current Pharmaceutical Biotechnology

ISSN (Print): 1389-2010
ISSN (Online): 1873-4316

Research Article

Linking Phenotypes and Genotypes with Matrix Factorizations

Author(s): Jianqiang Li, Yu Guan, Xi Xu, Zerui Ma and Yan Pei*

Volume 24, Issue 12, 2023

Published on: 15 March, 2023

Pages: 1576-1588 (13 pages)

DOI: 10.2174/1389201024666230207153738

Abstract

Aims: We link phenotypes and genotypes using PheGe-Net, a unified framework.

Background: Genotype is the collective term for all gene combinations of an individual; it reflects the genetic composition of an organism. Phenotype refers to the observable macroscopic characteristics of an organism.

Objective: Identifying phenotype-genotype associations helps explain pathogenesis and advances genomic medicine.

Methods: PheGe-Net exploits the similarity networks of phenotypes and genotypes, together with known phenotype-genotype relationships, to discover hidden interactions between them.
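
The idea of combining two similarity networks with known associations can be illustrated by a joint nonnegative matrix factorization. The following is only a minimal sketch under stated assumptions and is not the PheGe-Net formulation, which the abstract does not specify. The objective ||R - F Gᵀ||² + α||A_p - F Fᵀ||² + β||A_g - G Gᵀ||², the damped multiplicative updates (a heuristic with no convergence guarantee claimed here), the function name `joint_factorize`, and the toy matrices are all assumptions made for illustration.

```python
import numpy as np

def joint_factorize(R, A_p, A_g, k=2, alpha=0.5, beta=0.5, n_iter=300, seed=0):
    """Hypothetical joint NMF: R ~ F G^T, A_p ~ F F^T, A_g ~ G G^T, with F, G >= 0."""
    rng = np.random.default_rng(seed)
    n_p, n_g = R.shape
    F = rng.random((n_p, k)) + 0.1   # phenotype factors (strictly positive start)
    G = rng.random((n_g, k)) + 0.1   # genotype factors
    eps = 1e-9                       # avoids division by zero
    for _ in range(n_iter):
        # Ratio of the negative to the positive gradient part, damped by a square root.
        num_F = R @ G + 2 * alpha * (A_p @ F)
        den_F = F @ (G.T @ G) + 2 * alpha * (F @ (F.T @ F)) + eps
        F *= np.sqrt(num_F / den_F)
        num_G = R.T @ F + 2 * beta * (A_g @ G)
        den_G = G @ (F.T @ F) + 2 * beta * (G @ (G.T @ G)) + eps
        G *= np.sqrt(num_G / den_G)
    return F, G

# Toy data (invented): 4 phenotypes, 5 genes, two groups in each network.
A_p = np.array([[1.0, 0.9, 0.1, 0.1],
                [0.9, 1.0, 0.1, 0.1],
                [0.1, 0.1, 1.0, 0.8],
                [0.1, 0.1, 0.8, 1.0]])
A_g = np.array([[1.0, 0.8, 0.7, 0.1, 0.1],
                [0.8, 1.0, 0.9, 0.1, 0.1],
                [0.7, 0.9, 1.0, 0.1, 0.1],
                [0.1, 0.1, 0.1, 1.0, 0.9],
                [0.1, 0.1, 0.1, 0.9, 1.0]])
# Known associations; zeros are "unknown", not confirmed negatives.
R = np.array([[1.0, 1.0, 0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0, 1.0]])

F, G = joint_factorize(R, A_p, A_g)
scores = F @ G.T                        # predicted association score per (phenotype, gene)
phenotype_clusters = F.argmax(axis=1)   # cluster = dominant latent factor
genotype_clusters = G.argmax(axis=1)
print(np.round(scores, 2))
```

In this sketch the rows of `F` and `G` serve two purposes at once: their dominant factor gives a cluster assignment, and their inner products give candidate association scores for pairs that are absent from `R`.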

Results: Experiments on a real-world dataset verified the validity of PheGe-Net. Our method outperformed the second-best method by around 3% in accuracy and normalized mutual information (NMI) when clustering phenotypes and genotypes. It also successfully detected phenotype-genotype associations. For example, in the analysis of obesity (OMIM ID: 601665), two known genes among the ten top-scoring genes received scores above 0.75, and the other eight predicted genes are also explainable.
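
For readers unfamiliar with NMI, the sketch below computes it for two label assignments over the same items. It assumes the geometric-mean normalization I(a; b) / sqrt(H(a) H(b)), which is one common convention; the abstract does not state which normalization was used, and the function names and the handling of single-cluster inputs are illustrative choices.

```python
from collections import Counter
from math import log, sqrt

def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two label assignments of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum((n_ij / n) * log(n * n_ij / (count_a[i] * count_b[j]))
               for (i, j), n_ij in joint.items())

def nmi(a, b):
    """NMI with geometric-mean normalization (an assumed convention)."""
    if len(a) != len(b):
        raise ValueError("label lists must have equal length")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 or h_b == 0:
        # Degenerate case: at least one assignment puts everything in one cluster.
        return 1.0 if h_a == h_b else 0.0
    return mutual_information(a, b) / sqrt(h_a * h_b)

true_labels = [0, 0, 1, 1]
print(nmi(true_labels, [1, 1, 0, 0]))  # same partition, labels renamed
print(nmi(true_labels, [0, 1, 0, 1]))  # partition independent of the true one
```

NMI depends only on how items are grouped, not on the label names, which is why it suits the comparison of clusterings whose cluster indices are arbitrary.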

Conclusion: PheGe-Net can not only discover latent phenotype or genotype clusters but also uncover the hidden relationships among them, provided that similarity networks of phenotypes and genotypes and known phenotype-genotype relationships are available.


© 2024 Bentham Science Publishers