Abstract
Background: RNA methylation is a reversible post-transcriptional modification involved in numerous biological processes. Ribose 2'-O-methylation is one type of RNA methylation. It has been shown that ribose 2'-O-methylation plays an important role in immune recognition and other pathogenic processes.
Objective: We aim to design a computational method to identify 2'-O-methylation sites.
Methods: As an alternative to experimental methods, we propose a computational workflow that identifies methylation sites based on a multi-feature extraction algorithm.
Results: With a voting procedure based on the 7 best feature-classifier combinations, we achieved an accuracy of 76.5% in 10-fold cross-validation. Furthermore, we optimized the features and used the optimized features as input to an SVM. As a result, the AUC reached 0.813.
Conclusion: The RNA samples used in this study, especially the negative samples, are more objective and strict, so we obtained more representative results than state-of-the-art studies.
Keywords: 2'-O-methylation, feature extraction, classification algorithm, vote strategy, cross-validation, feature selection.
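The voting procedure mentioned in the Results can be illustrated with a minimal sketch. It assumes simple majority voting over binary labels (1 = methylation site, 0 = non-site) from an odd number of models; the abstract does not specify the exact voting rule, so the functions `majority_vote` and `accuracy` and the toy predictions below are hypothetical and are not the authors' implementation.

```python
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary predictions (0/1) from several models by majority vote.

    predictions[m][i] is the label that model m assigns to sample i.
    Assumption: plain unweighted majority voting (hypothetical rule).
    """
    n_models = len(predictions)
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    for i in range(n_samples):
        positives = sum(p[i] for p in predictions)
        # With an odd number of models (e.g. 7) a tie cannot occur.
        voted.append(1 if 2 * positives > n_models else 0)
    return voted


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy example: 7 hypothetical feature-classifier combinations, 4 samples.
model_predictions = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
]
true_labels = [1, 0, 0, 0]

ensemble = majority_vote(model_predictions)
ensemble_accuracy = accuracy(true_labels, ensemble)
```

In a 10-fold cross-validation setting, this vote would be applied to the held-out predictions of each fold, and the accuracy would be averaged over the folds.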