Abstract
Background: Amino acid residues in proteins undergo post-translational modification (PTM) during or after protein synthesis, a chemical and physical change to an amino acid that in turn alters the behavioral properties of the protein. Tyrosine sulfation is a ubiquitous post-translational modification known to be associated with the regulation of various biological functions and pathological processes, so identifying sulfation sites is necessary to understand its mechanism. Experimental determination through site-directed mutagenesis and high-throughput mass spectrometry is costly and time-consuming; a reliable computational model is therefore required for the identification of sulfotyrosine sites.
Methodology: In this paper, we present a computational model for the prediction of sulfotyrosine sites, named iSulfoTyr-PseAAC, in which feature vectors are constructed using statistical moments of protein amino acid sequences and various position- and composition-relative features. These features are incorporated into PseAAC. The model is validated by the jackknife test, cross-validation, self-consistency testing, and independent testing.
Results: The accuracy determined through validation was 93.93% for the jackknife test, 95.16% for cross-validation, 94.3% for self-consistency, and 94.3% for independent testing.
Conclusion: The proposed model performs better than the existing predictors; however, its accuracy can be improved further in the future as the number of known sulfotyrosine sites in proteins increases.
Keywords: Sulfation, sulfotyrosine, statistical moments, PseAAC, 5-step rule, pseudo components.
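As an illustration of two ideas named in the Methodology, the following minimal Python sketch computes simple statistical moments of an encoded peptide and estimates accuracy with a jackknife (leave-one-out) loop. It is not the iSulfoTyr-PseAAC implementation: the integer encoding, the function names (`encode`, `moment_features`, `jackknife_accuracy`), and the choice of raw and central moments up to third order are assumptions made only for this example.

```python
# Illustrative sketch only; the encoding and feature choices are assumptions,
# not the feature set used by iSulfoTyr-PseAAC.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter alphabet


def encode(peptide):
    """Map each residue to an integer 1..20 (raises ValueError on unknown letters)."""
    return [AMINO_ACIDS.index(res) + 1 for res in peptide]


def raw_moment(values, order):
    """Raw moment of the given order: the mean of v**order."""
    return sum(v ** order for v in values) / len(values)


def central_moment(values, order):
    """Central moment of the given order: the mean of (v - mean)**order."""
    mean = raw_moment(values, 1)
    return sum((v - mean) ** order for v in values) / len(values)


def moment_features(peptide, max_order=3):
    """Concatenate raw moments (orders 1..max_order) and central moments (2..max_order)."""
    values = encode(peptide)
    feats = [raw_moment(values, k) for k in range(1, max_order + 1)]
    feats += [central_moment(values, k) for k in range(2, max_order + 1)]
    return feats


def jackknife_accuracy(samples, labels, fit, predict):
    """Leave-one-out accuracy: train on all samples but one, test on the held-out one."""
    hits = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        hits += predict(model, samples[i]) == labels[i]
    return hits / len(samples)
```

Here `fit` and `predict` are placeholders for any classifier; the jackknife loop calls `fit` once per sample, so each prediction is made by a model that never saw the held-out sample.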