An Iterative Model for Identifying Essential Proteins Based on the Whole Process Network of Protein Evolution

Zhen Zhang; Yaocan Zhu; Hongjing Pei; Xiangyi Wang; Lei Wang

doi:10.2174/1574893618666230315154807

Abstract

Introduction: Essential proteins play important roles in cell growth and regulation. Because traditional biological experiments for identifying essential proteins are costly and inefficient, and with the development of high-throughput technologies and bioinformatics, a growing number of computational models have been proposed in recent years to infer key proteins from Protein-Protein Interaction (PPI) networks.

Methods: In this manuscript, a novel prediction model named MWPNPE (Model based on the Whole Process Network of Protein Evolution) is proposed. First, a whole process network of protein evolution is constructed from known PPI data and gene expression data downloaded from benchmark databases. Then, considering that the interaction between proteins is a dynamic process, a new measure is designed to estimate the relationships between proteins. Based on this measure, an improved iterative algorithm is put forward to evaluate the importance of proteins.
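The abstract does not give the MWPNPE relationship measure or its update rule, so the following is only a minimal sketch of the general idea of iteratively scoring proteins on a weighted PPI network. It assumes, purely for illustration, that edges are weighted by the absolute Pearson correlation of gene expression profiles and that scores are updated by a damped, random-walk-style iteration. The function names, the damping parameter `alpha`, and the stopping tolerance are all hypothetical and are not taken from the paper.

```python
import numpy as np

def pearson_edge_weights(edges, expression):
    """Weight each PPI edge by |Pearson correlation| of the two expression profiles.

    Assumption for illustration only; this is not the measure defined by MWPNPE.
    """
    weights = {}
    for u, v in edges:
        r = np.corrcoef(expression[u], expression[v])[0, 1]
        weights[(u, v)] = abs(float(r))
    return weights

def iterative_scores(proteins, weights, alpha=0.85, tol=1e-9, max_iter=1000):
    """Damped iterative scoring on an undirected weighted network (hypothetical rule)."""
    index = {p: i for i, p in enumerate(proteins)}
    n = len(proteins)
    W = np.zeros((n, n))
    for (u, v), w in weights.items():
        W[index[u], index[v]] = w
        W[index[v], index[u]] = w
    # Column-normalize so each protein distributes its score over its neighbours.
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # isolated proteins keep only the restart term
    M = W / col_sums
    initial = np.full(n, 1.0 / n)
    scores = initial.copy()
    for _ in range(max_iter):
        updated = alpha * (M @ scores) + (1 - alpha) * initial
        converged = np.abs(updated - scores).sum() < tol
        scores = updated
        if converged:
            break
    return dict(zip(proteins, scores))
```

Under these assumptions, a protein with many strongly co-expressed neighbours accumulates a higher score, and ranking proteins by score yields the candidate list that is later evaluated.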

Results: To verify the predictive performance of MWPNPE, we compared it with state-of-the-art representative computational methods. Experimental results demonstrated that MWPNPE correctly recognized 89, 166, and 233 of its top 100, 200, and 300 candidate key proteins, respectively, which is significantly better than the predictive accuracies achieved by the competing methods.
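If the reported figures are read as counts of true essential proteins among the top-ranked candidates, they correspond to fractions of 89/100 = 0.89, 166/200 = 0.83, and 233/300 ≈ 0.78. The sketch below shows how such a top-k count could be computed from a score table and a set of known essential proteins. The function names and the toy protein identifiers are hypothetical and serve only to illustrate the evaluation.

```python
def count_true_essential(scores, known_essential, k):
    """Count known essential proteins among the k highest-scoring candidates."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sum(1 for protein in ranked[:k] if protein in known_essential)

def precision_at_k(scores, known_essential, k):
    """Fraction of the top-k candidates that are known essential proteins."""
    return count_true_essential(scores, known_essential, k) / k

# Toy data with made-up identifiers, not results from the paper.
toy_scores = {"P1": 0.9, "P2": 0.7, "P3": 0.5, "P4": 0.1}
toy_essential = {"P1", "P3", "P4"}
top2_hits = count_true_essential(toy_scores, toy_essential, 2)
```

Reporting the raw count at several cutoffs, as the abstract does, makes methods directly comparable as long as every method is evaluated against the same set of known essential proteins.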

Conclusion: These results suggest that MWPNPE may be a useful tool for key protein recognition in the future.




Journal: Current Bioinformatics
Publisher Name: Bentham Science Publisher
Print ISSN: 1574-8936
Online ISSN: 2212-392X
DOI: https://dx.doi.org/10.2174/1574893618666230315154807
