Abstract
Background: Intrinsically disordered proteins lack a well-defined three-dimensional structure under physiological conditions, yet they still perform essential biological functions. They take part in various physiological processes, such as signal transduction, transcription, and post-translational modification. The disordered regions are the main functional sites of intrinsically disordered proteins, so the study of these regions has become an active research topic.
Objective: This paper aims to analyze the features of disordered regions with different molecular functions and to predict these regions using effective features.
Methods: We first divided the intrinsically disordered proteins in the DisProt database into six classes according to their molecular functions. We then extracted four kinds of features using bioinformatics methods, namely the Amino Acid Index (AAIndex), codon frequency (Codon), the compositions of three kinds of protein secondary structure (3PSS), and Chemical Shifts (CSs), and used these features to predict the disordered regions of each functional class with a Support Vector Machine (SVM).
Results: The best overall accuracy was 99.29%, obtained using chemical shifts (CSs) as the feature. With feature fusion, the overall accuracy reached 88.70% using CSs+AAIndex as features and 86.09% using CSs+AAIndex+Codon+3PSS as features.
Conclusion: We predicted and analyzed disordered regions on the basis of their molecular functions. The results showed that adding chemical shifts and AAIndex as features improves prediction performance, and that chemical shifts were the most effective feature. We hope that our results will be useful for the study of intrinsically disordered proteins.
Keywords: Intrinsically disordered proteins, disordered regions, amino acid index, codon, protein secondary structure, chemical shifts, support vector machine.
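As an illustration of the kind of feature extraction described in the Methods, the sketch below computes a 64-dimensional codon frequency vector from a coding sequence and fuses feature vectors by concatenation. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define how codon frequency is computed or how features are fused, so reading frame 0, relative-frequency normalization, and plain concatenation are assumptions, and the names `codon_frequency` and `fuse` are hypothetical.

```python
from itertools import product

# All 64 codons in a fixed order, so every sequence maps to the same vector layout.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequency(dna):
    """Relative frequency of each of the 64 codons, read in frame 0 (assumed)."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    counts = dict.fromkeys(CODONS, 0)
    total = 0
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in counts:  # skip codons containing ambiguous bases such as N
            counts[codon] += 1
            total += 1
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


def fuse(*feature_vectors):
    """Feature fusion by simple concatenation (an assumed fusion scheme)."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused
```

A fused vector such as `fuse(codon_frequency(seq), other_features)` could then be passed to any SVM implementation; the classifier itself is omitted here because the abstract gives no kernel or parameter settings.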