Abstract
Background: Machine learning has reached new horizons with the development of deep learning architectures. Deep learning is widely used for complex problems and has achieved significant results in applications such as protein structure prediction, speech recognition, traffic management, and health diagnostic systems. In particular, convolutional neural networks (CNNs) have revolutionized visual data processing tasks.
Objective: Protein structure is an important research topic across many domains, from medical science and healthcare to drug design. Fourier Transform Infrared Spectroscopy (FTIR) is a leading tool for protein structure determination. This review surveys the deep learning approaches proposed in the literature for predicting protein secondary structure and develops a conceptual relation between FTIR spectra images and deep learning models for predicting protein structure.
Methods: Various pre-trained CNN models are identified and interpreted to correlate FTIR images of proteins, which contain Amide-I and Amide-II absorbance values, with the proteins' secondary structure.
Results: Transfer learning is incorporated efficiently using models such as Visual Geometry Group (VGG), Inception, ResNet, and EfficientNet. A dataset of protein spectra images is supplied as input, and these models predict the secondary structure of proteins effectively.
Conclusion: Although deep learning has only recently been explored in this field of research, it has worked remarkably well in this application, and it needs continued improvement as new models are developed.