Abstract
G-quadruplexes (G4s) are non-canonical secondary structures formed by guanine-rich DNA and RNA sequences. They adopt a wide diversity of three-dimensional conformations and play key roles in the control of gene expression. G4s interact with numerous small molecules and endogenous proteins, and their dysregulation can lead to a variety of disorders and diseases. Characterizing and predicting G4-forming sequences could elucidate their mechanisms of action and thus represent an important step toward the discovery of potential therapeutic drugs. In this perspective, we provide an overview of G4s and discuss the state of the art in methodologies and tools developed to characterize these structures and predict their presence in genomic sequences. In particular, we report on machine learning (ML) approaches and artificial neural networks (ANNs), which could open new avenues for the accurate analysis of quadruplexes given their capacity to derive informative features from large, high-density datasets.