Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

An Iterative Model for Identifying Essential Proteins Based on the Whole Process Network of Protein Evolution

Author(s): Zhen Zhang, Yaocan Zhu*, Hongjing Pei, Xiangyi Wang and Lei Wang*

Volume 18, Issue 4, 2023

Published on: 06 April, 2023

Page: [359 - 373] Pages: 15

DOI: 10.2174/1574893618666230315154807

Abstract

Introduction: Essential proteins play important roles in cell growth and regulation. Traditional biological experiments for identifying essential proteins are costly and inefficient. In recent years, with the development of high-throughput technologies and bioinformatics, a growing number of computational models have therefore been proposed to infer key proteins from Protein-Protein Interaction (PPI) networks.

Methods: In this manuscript, a novel prediction model named MWPNPE (Model based on the Whole Process Network of Protein Evolution) was proposed. First, a whole process network of protein evolution was constructed from known PPI data and gene expression data downloaded from benchmark databases. Then, considering that the interaction between proteins is a dynamic process, a new measure was designed to estimate the relationships between proteins. Based on this measure, an improved iterative algorithm was put forward to evaluate the importance of proteins.
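The abstract does not specify the relationship measure or the update rule, so the following is only a minimal sketch of the general idea (weight PPI edges with gene expression data, then score proteins iteratively) and not the MWPNPE algorithm itself. It assumes a Pearson-correlation edge weight and a random-walk-with-restart style update; the function names, the `alpha` parameter, and the toy proteins A to D are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def build_weighted_network(edges, expression):
    """Weight each PPI edge by co-expression, rescaled from [-1, 1] to [0, 1].

    Assumption: co-expression stands in for the paper's (unspecified) measure.
    """
    weights = {}
    for u, v in edges:
        w = (pearson(expression[u], expression[v]) + 1.0) / 2.0
        weights.setdefault(u, {})[v] = w
        weights.setdefault(v, {})[u] = w
    return weights


def iterative_scores(weights, alpha=0.85, tol=1e-9, max_iter=1000):
    """Iterate a restart-damped propagation of scores until they stabilise."""
    nodes = sorted(weights)
    n = len(nodes)
    score = {p: 1.0 / n for p in nodes}
    strength = {p: sum(weights[p].values()) for p in nodes}
    for _ in range(max_iter):
        new = {}
        for p in nodes:
            # Each neighbour q passes on a share of its score proportional
            # to the weight of the edge (q, p).
            incoming = sum(
                score[q] * w / strength[q]
                for q, w in weights[p].items()
                if strength[q] > 0
            )
            new[p] = (1.0 - alpha) / n + alpha * incoming
        delta = sum(abs(new[p] - score[p]) for p in nodes)
        score = new
        if delta < tol:
            break
    return score


# Toy data (hypothetical): four proteins, four expression time points.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [1, 2, 3, 5],
    "D": [1, 3, 2, 4],
}
weights = build_weighted_network(edges, expression)
scores = iterative_scores(weights)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy network the hub protein A ends up with the highest score and the peripheral protein D with the lowest, which is the qualitative behaviour an iterative importance score is meant to capture.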

Results: To verify the predictive performance of MWPNPE, we compared it with representative state-of-the-art computational methods. Experimental results demonstrated that the number of key proteins correctly recognized by MWPNPE among its top 100, 200, and 300 candidates reached 89, 166, and 233, respectively, which is significantly better than the results achieved by the competing methods.
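The top-k comparison reported in the Results can be illustrated with a short sketch. It assumes that the reported figures are counts of known essential proteins among the top-ranked candidates; the function name `top_k_hits` and the toy protein identifiers are hypothetical and do not come from the paper.

```python
def top_k_hits(scores, essential, ks=(100, 200, 300)):
    """Count known essential proteins among the k highest-scoring candidates."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {k: sum(1 for p in ranked[:k] if p in essential) for k in ks}


# Toy data (hypothetical): four scored proteins, three known to be essential.
toy_scores = {"P1": 0.9, "P2": 0.8, "P3": 0.7, "P4": 0.1}
toy_essential = {"P1", "P3", "P4"}
hits = top_k_hits(toy_scores, toy_essential, ks=(1, 2, 3))
print(hits)  # {1: 1, 2: 1, 3: 2}

# Under the count reading, the reported figures give these top-k precisions.
reported = {100: 89, 200: 166, 300: 233}
precision = {k: hit / k for k, hit in reported.items()}
print(precision)
```

Under that reading, the reported counts correspond to precisions of 0.89, 0.83, and roughly 0.78 at the three cut-offs.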

Conclusion: MWPNPE may therefore be a useful tool for key protein recognition in the future.


© 2024 Bentham Science Publishers