Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

Prediction and Motif Analysis of 2’-O-methylation Using a Hybrid Deep Learning Model from RNA Primary Sequence and Nanopore Signals

Author(s): Shiyang Pan, Yuxin Zhang, Zhen Wei, Jia Meng and Daiyun Huang*

Volume 17, Issue 9, 2022

Published on: 09 September, 2022

Page: [873 - 882] Pages: 10

DOI: 10.2174/1574893617666220815153653

Abstract

Background: 2’-O-Methylation (2’-O-Me) is a post-transcriptional RNA modification that occurs in the ribose sugar moiety of all four nucleotides and is abundant in both coding and non-coding RNAs. Accurate prediction of each subtype of 2’-O-Me (Am, Cm, Gm, Um) helps understand their role in RNA metabolism and function.

Objective: This study aims to build models that can predict each subtype of 2’-O-Me from RNA sequence and nanopore signals and exploit the model interpretability for sequence motif mining.

Methods: We first propose DeepNm, a novel deep learning model that uses a multi-scale framework to better capture the sequence features of each subtype. Building on DeepNm, we then propose HybridNm, which combines sequences and nanopore signals through a dual-path framework. The nanopore signal-derived features are first passed through a convolutional layer and then merged with sequence features extracted at different scales for final classification.
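As a rough illustration of the dual-path idea described in the Methods, the following minimal sketch runs a forward pass in plain Python. It is not the authors' implementation: the kernel widths (3, 5, 7), the three per-position signal features, the single filter per scale, the random weights, and all function names (`one_hot`, `conv1d_maxpool`, `hybrid_forward`, `init_params`) are assumptions made only for this example.

```python
import math
import random

BASES = "ACGU"

def one_hot(seq):
    """Encode an RNA string as a list of 4-dimensional one-hot rows."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]

def conv1d_maxpool(rows, kernel):
    """One filter: valid 1D convolution, ReLU, then global max pooling.

    `rows` is (length, channels); `kernel` is (width, channels).
    """
    width = len(kernel)
    best = 0.0  # starting at 0 implements the ReLU floor
    for start in range(len(rows) - width + 1):
        s = sum(kernel[k][c] * rows[start + k][c]
                for k in range(width) for c in range(len(kernel[k])))
        best = max(best, s)
    return best

def random_kernel(rng, width, channels):
    """Hypothetical untrained filter weights drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(channels)] for _ in range(width)]

def init_params(seed=0, scales=(3, 5, 7), n_signal_channels=3):
    """Build one sequence filter per scale plus one signal filter (all assumed)."""
    rng = random.Random(seed)
    seq_kernels = [random_kernel(rng, w, len(BASES)) for w in scales]
    sig_kernels = [random_kernel(rng, 3, n_signal_channels)]
    n_feats = len(seq_kernels) + len(sig_kernels)
    return {
        "seq_kernels": seq_kernels,
        "sig_kernels": sig_kernels,
        "out_w": [rng.uniform(-1, 1) for _ in range(n_feats)],
        "out_b": 0.0,
    }

def hybrid_forward(seq, signal_feats, params):
    """Sequence path (multi-scale) + signal path (one conv), merged, then sigmoid."""
    x = one_hot(seq)
    seq_feats = [conv1d_maxpool(x, k) for k in params["seq_kernels"]]
    sig_feats = [conv1d_maxpool(signal_feats, k) for k in params["sig_kernels"]]
    merged = seq_feats + sig_feats  # concatenate the two paths
    z = sum(w * f for w, f in zip(params["out_w"], merged)) + params["out_b"]
    return 1.0 / (1.0 + math.exp(-z))

# Toy input: a 15-nt window and made-up per-position signal features.
params = init_params()
window = "GGACUAGCUAGGACU"
signal = [[0.1, 0.2, 0.3] for _ in window]
probability = hybrid_forward(window, signal, params)
```

The weights here are untrained, so `probability` carries no biological meaning; the sketch only shows how features from several sequence scales and a convolved signal path can be concatenated before a final classifier.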

Results: Five-fold cross-validation on Nm-seq data shows that DeepNm outperforms two state-of-the-art 2’-O-Me predictors. After incorporating nanopore signal-derived features, HybridNm achieved further significant improvements. Through model interpretation, we not only identified subtype-specific motifs but also revealed motifs shared between subtypes. In addition, Cm, Gm, and Um shared motifs with the well-studied m6A RNA methylation, suggesting a potential interplay among different RNA modifications and the complex nature of epitranscriptome regulation.
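For readers unfamiliar with the evaluation protocol, a 5-fold cross-validation split can be sketched as below. The abstract does not say how the folds were constructed (for example, whether they were stratified by class), so this is a plain shuffled split; the function name `k_fold_indices` and the seed are hypothetical.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test

# Each sample appears in exactly one test fold across the 5 rounds.
splits = list(k_fold_indices(23, k=5))
```

A model would be trained on each `train` list and scored on the matching `test` list, with the five scores averaged.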

Conclusion: The proposed frameworks can be useful tools to predict 2’-O-Me subtypes accurately and reveal specific sequence patterns.

Keywords: 2’-O-methylation, site prediction, nanopore RNA sequencing, RNA methylation, epitranscriptome, deep learning.

© 2024 Bentham Science Publishers