Prediction and Motif Analysis of 2’-O-methylation Using a Hybrid Deep
Learning Model from RNA Primary Sequence and Nanopore Signals

Shiyang Pan; Yuxin Zhang; Zhen Wei; Jia Meng; Daiyun Huang

doi:10.2174/1574893617666220815153653

Abstract

Background: 2’-O-Methylation (2’-O-Me) is a post-transcriptional RNA modification that occurs on the ribose moiety of all four nucleotides and is abundant in both coding and non-coding RNAs. Accurate prediction of each 2’-O-Me subtype (Am, Cm, Gm, Um) helps elucidate their roles in RNA metabolism and function.

Objective: This study aims to build models that can predict each subtype of 2’-O-Me from RNA sequence and nanopore signals and exploit the model interpretability for sequence motif mining.

Methods: We first propose DeepNm, a novel deep learning model that uses a multi-scale framework to better capture the sequence features of each subtype. Building on DeepNm, we then propose HybridNm, which combines sequences and nanopore signals through a dual-path framework. The nanopore signal-derived features are first passed through a convolutional layer and then merged with sequence features extracted at different scales for final classification.
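The dual-path idea described in the Methods can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the published HybridNm architecture: the kernel widths (3, 5, 7), the number of filters, the 41-nt window, the three per-position signal features, the global max pooling, and the random weights are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot(seq):
    """Encode an RNA sequence as a (length, 4) one-hot matrix over A, C, G, U."""
    idx = {"A": 0, "C": 1, "G": 2, "U": 3}
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        x[i, idx[base]] = 1.0
    return x

def conv1d_relu(x, w):
    """Valid 1D convolution followed by ReLU.

    x: (length, channels_in); w: (kernel, channels_in, channels_out).
    Returns an array of shape (length - kernel + 1, channels_out).
    """
    k = w.shape[0]
    windows = np.stack([x[i:i + k] for i in range(x.shape[0] - k + 1)])
    return np.maximum(np.tensordot(windows, w, axes=([1, 2], [0, 1])), 0.0)

def hybrid_forward(seq, signal, params):
    """Return a modification probability from sequence and signal features."""
    x = one_hot(seq)
    # Sequence path: kernels of several widths (multi-scale), each max-pooled.
    seq_feats = [conv1d_relu(x, w).max(axis=0) for w in params["seq_kernels"]]
    # Signal path: a single convolutional layer over signal-derived features.
    sig_feats = conv1d_relu(signal, params["sig_kernel"]).max(axis=0)
    # Merge both paths and classify with a logistic output unit.
    merged = np.concatenate(seq_feats + [sig_feats])
    logit = merged @ params["w_out"] + params["b_out"]
    return 1.0 / (1.0 + np.exp(-logit))

# Hypothetical, untrained parameters: 8 filters per kernel, 3 signal features.
params = {
    "seq_kernels": [0.1 * rng.standard_normal((k, 4, 8)) for k in (3, 5, 7)],
    "sig_kernel": 0.1 * rng.standard_normal((3, 3, 8)),
    "w_out": 0.1 * rng.standard_normal(32),
    "b_out": 0.0,
}

seq = "".join(rng.choice(list("ACGU"), size=41))  # toy 41-nt window
signal = rng.standard_normal((41, 3))             # toy signal-derived features
p = hybrid_forward(seq, signal, params)
print(f"predicted probability: {p:.3f}")
```

Concatenating pooled features from both paths before the output layer is one simple way to merge them; a trained model would learn the weights from labeled sites instead of drawing them at random.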

Results: Five-fold cross-validation on Nm-seq data shows that DeepNm outperforms two state-of-the-art 2’-O-Me predictors. After incorporating nanopore signal-derived features, HybridNm achieved further significant improvements. Through model interpretation, we identified not only subtype-specific motifs but also motifs shared between subtypes. In addition, Cm, Gm, and Um shared motifs with the well-studied m6A RNA methylation, suggesting a potential interplay among different RNA modifications and the complex nature of epitranscriptome regulation.
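For readers unfamiliar with the evaluation protocol, the following sketch shows how a 5-fold split can be generated. It assumes stratification by class label, which the abstract does not specify, and uses toy labels in place of the Nm-seq data; the function name `stratified_kfold` is hypothetical.

```python
import random

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with similar class balance per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        # Shuffle the indices of one class, then deal them out round-robin.
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

labels = [1] * 20 + [0] * 30  # toy labels: 20 modified, 30 unmodified sites
splits = list(stratified_kfold(labels, k=5))
for fold, (train, test) in enumerate(splits, start=1):
    positives = sum(labels[i] for i in test)
    print(f"fold {fold}: {len(train)} train, {len(test)} test, {positives} positive")
```

Each site appears in exactly one test fold, so every prediction used for evaluation comes from a model that did not see that site during training.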

Conclusion: The proposed frameworks can be useful tools to predict 2’-O-Me subtypes accurately and reveal specific sequence patterns.

Keywords: 2’-O-Methylation, site prediction, nanopore RNA sequencing, RNA methylation, epitranscriptome, deep learning.




Journal: Current Bioinformatics
DOI: https://dx.doi.org/10.2174/1574893617666220815153653
Print ISSN: 1574-8936
Online ISSN: 2212-392X
Publisher Name: Bentham Science Publisher
