Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Review Article

A Review of Protein Inter-residue Distance Prediction

Author(s): He Huang and Xinqi Gong*

Volume 15, Issue 8, 2020

Pages: 821-830 (10 pages)

DOI: 10.2174/1574893615999200425230056

Abstract

Proteins are large molecules consisting of a linear sequence of amino acids, and they perform their biological functions through specific 3D structures. The main factors that drive proteins to form these structures are constraints between residues. These constraints usually lead to important inter-residue relationships, including short-range inter-residue contacts and long-range inter-residue distances. Thus, highly accurate prediction of inter-residue contact and distance information is of great significance for protein tertiary structure computation. Several methods have been proposed for inter-residue contact prediction, most of which focus on contact map prediction, and some reviews have summarized the progress. In recent years, however, inter-residue distance prediction has been found to provide better guidance for protein structure prediction than contact map prediction. Methods for inter-residue distance prediction can be roughly divided into two types according to how they treat the distance value: one is based on multi-class classification with discrete values, and the other is based on regression with continuous values. Here, we summarize these algorithms and show that they have obtained good results. Compared to contact map prediction, distance map prediction is in its infancy. Much remains to be done, including improving the precision of distance map prediction and incorporating predicted distance maps into residue-residue distance-guided ab initio protein folding.
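
To make the distinction between the prediction targets concrete, the following minimal Python sketch derives all three from the same toy coordinates: a real-valued distance map (the regression target), a binary contact map, and a discretized distance map (the multi-class target). It is only an illustration and not a method from the reviewed papers. The function names, the toy coordinates, the 8 Å contact cutoff, and the bin edges are all assumptions chosen for the example.

```python
import math

def distance_map(coords):
    """Pairwise Euclidean distances between residue coordinates (angstroms)."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def contact_map(dmap, threshold=8.0):
    """Binary map: 1 where two residues are closer than `threshold` (assumed cutoff)."""
    return [[1 if d < threshold else 0 for d in row] for row in dmap]

def discretize(dmap, bin_edges):
    """Class index per residue pair: the number of bin edges the distance reaches.

    The last class collects every distance at or beyond the final edge.
    """
    return [[sum(d >= e for e in bin_edges) for d in row] for row in dmap]

# Hypothetical coordinates for four residues placed on a straight line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (15.0, 0.0, 0.0)]

dmap = distance_map(coords)                              # continuous values (regression target)
cmap = contact_map(dmap)                                 # two classes (contact map)
classes = discretize(dmap, bin_edges=[4.0, 8.0, 12.0])   # four classes (multi-class target)
```

The contact map is the two-class special case of the discretized map, which is one way to see why a distance map carries more information for structure modeling than a contact map does.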

Keywords: Machine learning, deep learning, protein structure prediction, contact map, inter-residue distance, distance map.


© 2024 Bentham Science Publishers