Abstract
Background: Recognizing protein function is essential to biological research, and machine learning methods have been widely used for protein prediction. Here we study the proteins with binding functions in Elymus nutans.
Methods: We used BLAST to annotate the functions of Elymus nutans proteins, and then applied machine learning methods to recognize the proteins that BLAST could not annotate, focusing on proteins with binding functions. Features were extracted by four algorithms and then selected with a mutual information estimator. Three classifiers were constructed based on the K-nearest neighbour algorithm and the gradient boosting algorithm.
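As a rough illustration of the feature extraction, mutual information selection and K-nearest neighbour steps, the following minimal sketch uses only the Python standard library. It is not the implementation used in this study: the amino acid composition feature, the median binarization in the mutual information estimate, the toy sequences and labels, and all function names are assumptions made for illustration, and the gradient boosting classifier is not shown.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(seq):
    """Assumed feature: fraction of each standard residue in the sequence."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]

def mutual_information(column, labels):
    """Mutual information (nats) between a feature, binarized at its median, and the labels."""
    threshold = sorted(column)[len(column) // 2]
    bins = [int(v >= threshold) for v in column]
    n = len(labels)
    joint = Counter(zip(bins, labels))
    n_bin, n_lab = Counter(bins), Counter(labels)
    return sum((c / n) * math.log(c * n / (n_bin[b] * n_lab[y]))
               for (b, y), c in joint.items())

def select_features(X, y, k):
    """Indices of the k features with the highest mutual information."""
    scores = [mutual_information([row[j] for row in X], y)
              for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(range(len(X_train)),
                     key=lambda i: math.dist(X_train[i], x))[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy data: label 1 = "binding", label 0 = "other".
train = [("GKGKSGKTGG", 1), ("GGKSGKGKTA", 1), ("KGGKTGKSGG", 1),
         ("LLAALLVALA", 0), ("ALLVAALLLA", 0), ("LAVLLAALAL", 0)]
X = [composition(s) for s, _ in train]
y = [label for _, label in train]

selected = select_features(X, y, k=4)
X_sel = [[row[j] for j in selected] for row in X]

queries = ["GKTGGKSGKG", "ALALLVALLA"]
predictions = [
    knn_predict(X_sel, y, [composition(q)[j] for j in selected])
    for q in queries
]
print(predictions)
```

On real data the selected feature count and the number of neighbours would be tuned by cross-validation rather than fixed as here.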
Results and Conclusion: Experimental results show 848 proteins with ATP binding function, 113 with heme binding function, 315 with zinc-ion binding function, 135 with GTP binding function and 21 with ADP binding function. Furthermore, we successfully predicted the functions of 10 special protein sequences whose function annotations could not be obtained by sequence alignment against seven well-known protein databases. Of these, seven sequences have ATP binding function, one has heme binding function, one has zinc-ion binding function and one has GTP binding function.
Keywords: Protein, binding function, machine learning, feature, ATP, GTP.