Abstract
Peptides and proteins participate in numerous biological processes at the molecular level. Characterizing and determining their three-dimensional (3D) structures has helped researchers unravel the chemical and biological roles of these macromolecules. For over 50 years, peptide and protein structures have been determined by experimental methods, including nuclear magnetic resonance (NMR), X-ray crystallography, and cryo-electron microscopy (cryo-EM). Consequently, a growing number of atomic coordinates for peptides and proteins have been deposited in public databases, supporting the development of computational tools for predicting unknown 3D structures. In the last decade, a race for innovative methods has arisen in the computational sciences, yielding more complex algorithms for predicting biological activity and structure. As a result, theoretical peptide/protein models have reached a new level of accuracy when compared with experimentally determined structures. Machine learning and deep learning approaches, for instance, incorporate fundamental aspects of peptide/protein geometry, together with physical and biological knowledge derived from experimentally determined structures, to build more precise computational models. More broadly, computational strategies have supported structural biology, from comparative, threading, and ab initio modeling to, more recently, prediction tools based on machine learning and deep learning. With this in mind, we provide a retrospective of protein and peptide structure prediction tools, highlighting their advances and limitations and how they have helped researchers answer crucial biological questions.