Combinatorial Chemistry & High Throughput Screening

ISSN (Print): 1386-2073
ISSN (Online): 1875-5402

General Research Article

m1A-pred: Prediction of Modified 1-methyladenosine Sites in RNA Sequences through Artificial Intelligence

Author(s): Muhammad Taseer Suleman* and Yaser Daanial Khan

Volume 25, Issue 14, 2022

Published on: 10 August, 2022

Page: [2473 - 2484] Pages: 12

DOI: 10.2174/1386207325666220617152743

Abstract

Background: Post-transcriptional modification (PTM) is the chemical modification of nucleotides in RNA, for example by the addition of methyl groups. 1-methyladenosine (m1A) is a PTM formed by adding a methyl group to the nitrogen at position 1 of the adenine base. m1A is widely found in ribosomal RNA and transfer RNA, and many human disorders are associated with it.

Objective: Conventional methods such as mass spectrometry and site-directed mutagenesis are laborious and burdensome. Systematic identification of modified sites from RNA sequences is therefore gaining attention. In this study, an extreme gradient boosting (XGBoost) predictor, m1A-pred, is developed for the prediction of m1A sites.
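To give a feel for the boosting idea behind such a predictor, here is a minimal sketch of plain gradient boosting with one-split decision trees (stumps) on the logistic loss. It is a simplified stand-in, not XGBoost itself and not the authors' implementation: XGBoost additionally uses second-order gradient information and regularization, which are omitted here. All function names, the toy data, and the hyperparameters (`n_rounds`, `learning_rate`) are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Least-squares fit of a single-feature threshold split to the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            if not left or not right:
                continue  # guard against degenerate splits
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, left_mean, right_mean)
    if best is None:
        raise ValueError("no valid split found (all feature columns constant)")
    return best[1:]

def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value

def fit_boosted_stumps(X, y, n_rounds=20, learning_rate=0.5):
    """Each round fits a stump to the negative gradient of the log loss."""
    scores = [0.0] * len(X)  # current log-odds for every training sample
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return {"stumps": stumps, "learning_rate": learning_rate}

def predict_proba(model, row):
    """Probability that the sample belongs to the positive (modified) class."""
    score = sum(model["learning_rate"] * stump_predict(s, row)
                for s in model["stumps"])
    return sigmoid(score)

# Toy data (hypothetical): one feature, label 1 = modified site.
X_toy = [[0.1], [0.2], [0.8], [0.9]]
y_toy = [0, 0, 1, 1]
model = fit_boosted_stumps(X_toy, y_toy)
```

In practice one would use an established library rather than a hand-written booster; the sketch only shows how successive weak learners correct the errors of the ensemble built so far.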

Methods: Position- and composition-based properties are extracted from the nucleotide sequences and assembled into a feature vector. Statistical moments are then employed to reduce the dimensionality of the extracted features.
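The feature-extraction idea described above could be sketched as follows. The abstract does not state which properties or which moments are used, so everything here is an assumption for illustration: nucleotide frequencies stand in for composition-based features, scaled mean positions stand in for position-based features, and the mean plus second and third central moments of a simple integer encoding (A=1, C=2, G=3, U=4) stand in for the statistical moments. The point of the moments is that a sequence of any length is summarized by a fixed, small number of values.

```python
from collections import Counter

NUCLEOTIDES = "ACGU"

def composition_features(seq):
    """Relative frequency of each nucleotide (composition-based)."""
    counts = Counter(seq)
    return [counts[n] / len(seq) for n in NUCLEOTIDES]

def position_features(seq):
    """Mean 1-based position of each nucleotide, scaled by sequence length."""
    feats = []
    for n in NUCLEOTIDES:
        positions = [i + 1 for i, ch in enumerate(seq) if ch == n]
        if positions:
            feats.append(sum(positions) / (len(positions) * len(seq)))
        else:
            feats.append(0.0)  # nucleotide absent from the sequence
    return feats

def moments(values, orders=(1, 2, 3)):
    """Mean for order 1, central moments for higher orders."""
    n = len(values)
    mean = sum(values) / n
    out = []
    for k in orders:
        if k == 1:
            out.append(mean)
        else:
            out.append(sum((v - mean) ** k for v in values) / n)
    return out

def feature_vector(seq):
    """Fixed-length feature vector for an RNA (or DNA-style) sequence."""
    seq = seq.upper().replace("T", "U")
    if not seq:
        raise ValueError("sequence must not be empty")
    # Hypothetical integer encoding; raises ValueError on unknown symbols.
    encoded = [NUCLEOTIDES.index(ch) + 1 for ch in seq]
    return composition_features(seq) + position_features(seq) + moments(encoded)

vec = feature_vector("ACGU")
```

For any input length, `feature_vector` returns 11 values: four composition features, four position features, and three moments.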

Results: Across a series of experiments with different computational models and evaluation methods, the proposed predictor, m1A-pred, proved to be the most robust and accurate model for the identification of modified sites.
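The abstract does not name the evaluation metrics. As a hedged illustration, the sketch below computes four measures that are commonly reported for binary site predictors: accuracy, sensitivity, specificity, and the Matthews correlation coefficient (MCC). The function names and the toy labels are assumptions and do not come from the paper.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def evaluate(y_true, y_pred):
    """Accuracy, sensitivity, specificity and MCC for a binary predictor."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no samples to evaluate")
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # MCC is defined as 0 here when any marginal count is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Toy labels (hypothetical): 1 = modified site, 0 = unmodified site.
scores = evaluate([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

MCC is useful alongside accuracy because it stays informative when positive and negative sites are imbalanced.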

Availability and Implementation: To support further research on m1A sites, a user-friendly server was also developed as the final phase of this research.

Keywords: 1-methyladenosine, PTMs, statistical moments, RMBase, XGB, tRNA.


© 2024 Bentham Science Publishers