Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Perspective

A Machine Learning Perspective on DNA and RNA G-quadruplexes

Author(s): Fabiana Rossi and Alessandro Paiardini*

Volume 17, Issue 4, 2022

Published on: 06 April, 2022

Page: [305 - 309] Pages: 5

DOI: 10.2174/1574893617666220224105702

Abstract

G-quadruplexes (G4s) are particular structures found in guanine-rich DNA and RNA sequences that exhibit a wide diversity of three-dimensional conformations and exert key functions in the control of gene expression. G4s are able to interact with numerous small molecules and endogenous proteins, and their dysregulation can lead to a variety of disorders and diseases. Characterization and prediction of G4-forming sequences could elucidate their mechanism of action and could thus represent an important step in the discovery of potential therapeutic drugs. In this perspective, we provide an overview of G4s, discussing the state of the art of methodologies and tools developed to characterize and predict the presence of these structures in genomic sequences. In particular, we report on machine learning (ML) approaches and artificial neural networks (ANNs) that could open new avenues for the accurate analysis of quadruplexes, given their potential to derive informative features by learning from large, high-density datasets.
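
For orientation, sequence-based prediction of G4-forming regions is often introduced through a simple rule-based baseline: a scan for four runs of at least three guanines separated by short loops of 1 to 7 nucleotides. The Python sketch below illustrates that baseline only; it is not the method of any tool discussed in this perspective, and the names `G4_MOTIF` and `find_putative_g4s` are hypothetical. It scans a single strand, so a C-rich strand would have to be reverse-complemented first.

```python
import re

# Assumed baseline motif: G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}.
# The lookahead wrapper reports hits starting at every position,
# so overlapping candidates are not skipped.
G4_MOTIF = re.compile(
    r"(?=(G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}))"
)

def find_putative_g4s(sequence):
    """Return (start, matched_text) pairs for motif hits on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in G4_MOTIF.finditer(seq)]

# Human telomeric repeat used here purely as an illustrative input.
hits = find_putative_g4s("TTAGGGTTAGGGTTAGGGTTAGGGAA")
for start, match in hits:
    print(start, match)
```

A pattern match of this kind only flags candidate sequences; it says nothing about whether a structure actually forms, which is the gap that learned models trained on experimental data aim to close.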



© 2025 Bentham Science Publishers