Generic placeholder image

Current Bioinformatics

Editor-in-Chief

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

MSSD: An Efficient Method for Constructing Accurate and Stable Phylogenetic Networks by Merging Subtrees of Equal Depth

Author(s): Jiajie Xing, Xu Song, Meiju Yu, Juan Wang* and Jing Yu*

Volume 19, Issue 9, 2024

Published on: 04 October, 2023

Page: [879 - 889] Pages: 11

DOI: 10.2174/0115748936256923230927081102

Price: $65

Abstract

Background: Systematic phylogenetic networks are essential for studying the evolutionary relationships and diversity among species. These networks are particularly important for capturing non-tree-like processes resulting from reticulate evolutionary events. However, existing methods for constructing phylogenetic networks are influenced by the order of inputs. The different orders can lead to inconsistent experimental results. Moreover, constructing a network for large datasets is time-consuming and the network often does not include all of the input tree nodes. Aims: This paper aims to propose a novel method, called as MSSD, which can construct a phylogenetic network from gene trees by Merging Subtrees with the Same Depth in a bottom-up way.

Methods: The MSSD first decomposes trees into subtrees based on depth. Then it merges subtrees with the same depth from 0 to the maximum depth. For all subtrees of one depth, it inserts each subtree into the current networks by means of identical subtrees.

Results: We test the MSSD on the simulated data and real data. The experimental results show that the networks constructed by the MSSD can represent all input trees and the MSSD is more stable than other methods. The MSSD can construct networks faster and the constructed networks have more similar information with the input trees than other methods.

Conclusion: MSSD is a powerful tool for studying the evolutionary relationships among species in biologyand is free available at https://github.com/xingjiajie2023/MSSD.

« Previous
[1]
Leventhal GE, Kouyos R, Stadler T, et al. Inferring epidemic contact structure from phylogenetic trees. PLOS Comput Biol 2012; 8(3): e1002413.
[http://dx.doi.org/10.1371/journal.pcbi.1002413] [PMID: 22412361]
[2]
Forster P, Forster L, Renfrew C, Forster M. Phylogenetic network analysis of SARS-CoV-2 genomes. Proc Natl Acad Sci 2020; 117(17): 9241-3.
[http://dx.doi.org/10.1073/pnas.2004999117] [PMID: 32269081]
[3]
Sapoval N, Aghazadeh A, Nute MG, et al. Current progress and open challenges for applying deep learning across the biosciences. Nat Commun 2022; 13(1): 1728.
[http://dx.doi.org/10.1038/s41467-022-29268-7] [PMID: 35365602]
[4]
Linder CR. Network (reticulate) evolution: Biology, models, and algorithms Ninth pacific symposium on biocomputing, Hawaii. USA. 2004.
[5]
Maddison WP. Gene trees in species trees. Syst Biol 1997; 46(3): 523-36.
[http://dx.doi.org/10.1093/sysbio/46.3.523]
[6]
Nakhleh L. Evolutionary phylogenetic networks: Models and issues. In: Heath L, Ramakrishnan N, Eds. Problem Solving Handbook in Computational Biology and Bioinformatics. Boston, MA: Springer 2010; pp. 125-58.
[http://dx.doi.org/10.1007/978-0-387-09760-2_7]
[7]
Gusfield D, Bansal V. A fundamental decomposition theory for phylogenetic networks and incompatible characters. In: Miyano S, Mesirov J, Kasif S, Istrail S, Pevzner PA, Waterman M, Eds. Research in Computational Molecular BiologyRECOMB 2005 Lecture Notes in Computer Science. Berlin, Heidelberg: Springer 2005; p. 3500.
[http://dx.doi.org/10.1007/11415770_17]
[8]
Song YS, Hein J. Constructing minimal ancestral recombination graphs. J Comput Biol 2005; 12(2): 147-69.
[http://dx.doi.org/10.1089/cmb.2005.12.147] [PMID: 15767774]
[9]
Huson DH, Scornavacca C. A survey of combinatorial methods for phylogenetic networks. Genome Biol Evol 2011; 3: 23-35.
[http://dx.doi.org/10.1093/gbe/evq077] [PMID: 21081312]
[10]
Huson DH, Bryant D. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol 2006; 23(2): 254-67.
[http://dx.doi.org/10.1093/molbev/msj030] [PMID: 16221896]
[11]
Jin G, Nakhleh L, Snir S, Tuller T. Maximum likelihood of phylogenetic networks. Bioinformatics 2006; 22(21): 2604-11.
[http://dx.doi.org/10.1093/bioinformatics/btl452] [PMID: 16928736]
[12]
Zhu J, Liu X, Ogilvie HA, Nakhleh LK. A divide-and-conquer method for scalable phylogenetic network inference from multilocus data. Bioinformatics 2019; 35(14): i370-8.
[http://dx.doi.org/10.1093/bioinformatics/btz359] [PMID: 31510688]
[13]
Yu Y, Nakhleh L. A maximum pseudo-likelihood approach for phylo genetic networks. BMC Genomics 2015; 16(10): 1-10.
[14]
Zhu J, Nakhleh L. Inference of species phylogenies from bi-allelic markers using pseudo-likelihood. Bioinformatics 2018; 34(13): i376-85.
[http://dx.doi.org/10.1093/bioinformatics/bty295] [PMID: 29950004]
[15]
Wheeler WC. Phylogenetic network analysis as a parsimony optimization problem. BMC Bioinformatics 2015; 16(1): 296.
[http://dx.doi.org/10.1186/s12859-015-0675-0] [PMID: 26382078]
[16]
Van Iersel L, Jones M, Scornavacca C. Improved maximum parsimony models for phylogenetic networks. Syst Biol 2018; 67(3): 518-42.
[http://dx.doi.org/10.1093/sysbio/syx094] [PMID: 29272537]
[17]
Bryant C, Fischer M, Linz S, Semple C. On the quirks of maximum parsimony and likelihood on phylogenetic networks. J Theor Biol 2017; 417: 100-8.
[http://dx.doi.org/10.1016/j.jtbi.2017.01.013] [PMID: 28087420]
[18]
Nakhleh L. Reconstructing phylogenetic networks using maximum parsimony. IEEE Computational Systems Bioinformatics Conference (CSB’05). Stanford, CA, USA. 2017; pp. 93-102.
[19]
Kannan L, Wheeler WC. Maximum parsimony on phylogenetic networks. Algorithms Mol Biol 2012; 7(1): 9.
[http://dx.doi.org/10.1186/1748-7188-7-9] [PMID: 22551229]
[20]
Yan Z, et al. Maximum parsimony inference of phylogenetic networks in the presence of polyploid complexes. Syst Biol 2022; 71(3): 706-20.
[PMID: 34605924]
[21]
Hejase HA. Fast and accurate statistical inference of phylogenetic networks using large-scale genomic sequence data. RECOMB International conference on Comparative Genomics Cham: Springer 2018; 242-59.
[http://dx.doi.org/10.1007/978-3-030-00834-5_14]
[22]
Lutteropp S. NetRAX: Accurate and fast maximum likelihood phylogenetic network inference. Bioinformatics 2021; 38(15): 3725-3.
[PMID: 35713506]
[23]
Zhang C, Ogilvie HA, Drummond AJ, Stadler T. Bayesian inference of species networks from multilocus sequence data. Mol Biol Evol 2018; 35(2): 504-17.
[http://dx.doi.org/10.1093/molbev/msx307] [PMID: 29220490]
[24]
Poormohammadi H. Constructing rooted phylogenetic networks from triplets based on height function. JETAE 2014; 2: 389-93.
[25]
Reyhani MH, Poormohammadi H. RPNCH: A method for constructing rooted phylogenetic networks from rooted triplets based on height function. Archives Adv Biosci 2017; 8(4): 14-20.
[26]
Tusserkani R, Hadi P, Azin A, Changiz E. Inferring phylogenies from minimal information. Third International Conference on Advances in Electrical. Chennai, India. 2017; pp. 202-6.
[27]
Poormohammadi H, Zarchi MS, Ghaneai H. NCHB: A method for constructing rooted phylogenetic networks from rooted triplets based on height function and binarization. J Theor Biol 2020; 489: 110144.
[http://dx.doi.org/10.1016/j.jtbi.2019.110144] [PMID: 31911141]
[28]
Poormohammadi H, Mohsen SZ. Netcombin: An algorithm for constructing optimal phylogenetic network from rooted triplets. PLoS One 2020; 15(9): e0227842.
[29]
Huson DH, Rupp R. Summarizing multiple gene trees using cluster networks. In: Algorithms in Bioinformatics (WABI) Berlin: Springer . 2008; 5251: pp. 296-305.
[30]
Huson DH, Rupp R, Berry V, Gambette P, Paul C. Computing galled networks from real data. Bioinformatics 2009; 25(12): i85-93.
[http://dx.doi.org/10.1093/bioinformatics/btp217] [PMID: 19478021]
[31]
van Iersel L, Kelk S, Rupp R, Huson D. Phylogenetic networks do not need to be complex: Using fewer reticulations to represent conflicting clusters. Bioinformatics 2010; 26(12): i124-31.
[http://dx.doi.org/10.1093/bioinformatics/btq202] [PMID: 20529896]
[32]
Wang J, Guo M, Xing L, Che K, Liu X, Wang C. BIMLR: A method for constructing rooted phylogenetic networks from rooted phylogenetic trees. Gene 2013; 527(1): 344-51.
[http://dx.doi.org/10.1016/j.gene.2013.06.036] [PMID: 23816409]
[33]
Wang J, Guo M, Liu X, et al. Lnetwork: An efficient and effective method for constructing phylogenetic networks. Bioinformatics 2013; 29(18): 2269-76.
[http://dx.doi.org/10.1093/bioinformatics/btt378] [PMID: 23811095]
[34]
Wang J, Zhibin Z. yanjuan L. Constructing phylogenetic networks based on the isomorphism of datasets. Biomed Res Int 2016; 2016: 4236858.
[35]
van Iersel L, Janssen R, Jones M, et al. A practical fixed-parameter algorithm for constructing tree-child networks from multiple binary trees. Algorithmica 2019; 84: 917-60.
[36]
Solís-Lemus C, Ané C. Inferring phylogenetic networks with maximum pseudolikelihood under incomplete lineage sorting. PLoS Genet 2016; 12(3): e1005896.
[37]
Allman ES, Baños H, Rhodes JA. NANUQ: A method for inferring species networks from gene trees under the coalescent mode. Algorithms Mol Biol 2019; 14: 24.
[38]
Markin A. Robinson-Foulds reticulation networks BCB ’19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. Niagara Falls, NY, USA. 2019; pp. 77-86.
[39]
Alexey M. RF-Net 2: Fast inference of virus reassortment and hybridization networks. Bioinformatics 2022; 38(8): 2144-52.
[40]
Tutsoy O. Graph theory based large-scale machine learning with multi-dimensional constrained optimization approaches for exact epidemiological modelling of pandemic diseases. IEEE Trans Pattern Anal Mach Intell 2023; 1-10.
[http://dx.doi.org/10.1109/TPAMI.2023.3256421]
[41]
Jukes TH, Cantor CR. Evolution of protein molecules. In: Mammalian Protein Metabolism Amsterdam: Elsevier. 2019; 3: pp. 21-132.
[42]
Wang J, Guo MZ, Xing LL. Methodology FastJoin, an improved neighbor-joining algorithm. Genet Mol Res 2012; 11(3): 1909-22.
[http://dx.doi.org/10.4238/2012.July.19.10] [PMID: 22869546]
[43]
Wang J, Guo M. A metric on the space of rooted phylogenetic trees. Curr Bioinform 2018; 13(5): 487-91.
[http://dx.doi.org/10.2174/1574893612666171122145801]
[44]
Wang J, Qi X, Cui B, Guo M. A survey of metrics measuring difference for rooted phylogenetic trees. Curr Bioinform 2020; 15(7): 697-702.
[http://dx.doi.org/10.2174/1574893614666191017130217]
[45]
Rambaut A, Grass NC. Seq-Gen: An application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Bioinformatics 1997; 13(3): 235-8.
[http://dx.doi.org/10.1093/bioinformatics/13.3.235] [PMID: 9183526]
[46]
Wang J, Guo M. A review of metrics measuring dissimilarity for rooted phylogenetic networks. Brief Bioinform 2019; 20(6): 1972-80.
[http://dx.doi.org/10.1093/bib/bby062] [PMID: 30020404]
[47]
Larkin MA, Blackshields G, Brown NP, et al. Clustal W and Clustal X version 2.0. Bioinformatics 2007; 23(21): 2947-8.
[http://dx.doi.org/10.1093/bioinformatics/btm404] [PMID: 17846036]
[48]
Tamura K, Stecher G, Kumar S. MEGA11: Molecular evolutionary genetics analysis version 11. Mol Biol Evol 2021; 38(7): 3022-7.
[http://dx.doi.org/10.1093/molbev/msab120] [PMID: 33892491]

Rights & Permissions Print Cite
© 2025 Bentham Science Publishers | Privacy Policy