Abstract
Enhancer-promoter interactions (EPIs) in the human genome play a central role in transcriptional regulation, which tightly controls gene expression. Identifying EPIs can help us decipher gene regulation and understand disease mechanisms. However, experimental methods for identifying EPIs are constrained by cost, time, and labor, and computational methods that use DNA sequences and genomic features are viable alternatives. Deep learning methods have shown promise in classification tasks and have been applied to EPI identification. In this survey, we focus specifically on sequence-based deep learning methods and conduct a comprehensive review of the literature. First, we briefly introduce existing sequence-based frameworks for EPI prediction and their technical details. Next, we elaborate on the dataset, pre-processing methods, and evaluation strategies. Finally, we conclude with the challenges these methods face and suggest several future opportunities. We hope this review will provide a useful reference for further studies on enhancer-promoter interactions.
Keywords: Enhancer-promoter interactions, sequence features, prediction, deep learning, attention mechanism, word embedding, convolutional neural network, recurrent neural network, interpretable model.