Abstract
Background: Gene regulation is a complex and dynamic process that depends not only on the DNA sequence of genes but also on epigenetic mechanisms. These mechanisms, together with other factors, change how DNA behaves: they do not alter the DNA sequence itself, but they can turn genes "on" or "off," which determines which proteins are produced.
Objectives: This paper focuses on the histone modification mechanism. Histones are the group of proteins that package DNA into structural units called nucleosomes (coils). The way these histone proteins wrap DNA determines whether a gene can be accessed for expression: when histones are tightly bound to DNA, the gene cannot be expressed, and when they are loosely bound, it can. It is therefore important to understand the combinatorial patterns of histone modifications and how these patterns work together to control gene expression.
Methods: This paper proposes ConvChrome, a deep learning methodology that predicts gene expression behavior from histone modification data. It uses more than one convolutional network model to recognize patterns in histone signals and to interpret their spatial relationships on the chromatin structure, giving insight into the regulatory signatures of histone modifications.
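To make the idea of a convolutional filter scanning histone signals concrete, the following is a minimal sketch in plain Python, not the ConvChrome architecture itself. It assumes, purely for illustration, that a gene is represented as a matrix with one row per histone modification and one column per genomic bin; the function names, the toy read counts, and the hand-set kernel weights are all hypothetical, and in a trained network the weights would be learned rather than written by hand.

```python
from typing import List

Matrix = List[List[float]]


def conv1d_multichannel(signal: Matrix, kernel: Matrix, bias: float = 0.0) -> List[float]:
    """Slide one filter along the bin axis of a (marks x bins) matrix.

    The filter spans all marks (channels) at once, so each output value
    combines several histone modifications within a local window.
    A ReLU is applied to every window sum.
    """
    n_channels = len(signal)
    n_bins = len(signal[0])
    width = len(kernel[0])
    feature_map = []
    for start in range(n_bins - width + 1):
        total = bias
        for c in range(n_channels):
            for k in range(width):
                total += signal[c][start + k] * kernel[c][k]
        feature_map.append(max(0.0, total))  # ReLU activation
    return feature_map


def global_max_pool(feature_map: List[float]) -> float:
    """Keep only the strongest response of the filter along the gene."""
    return max(feature_map)


# Hypothetical input: 5 histone marks x 6 bins of made-up read counts.
toy_signal = [
    [0, 1, 4, 5, 1, 0],  # mark 1
    [0, 0, 2, 3, 0, 0],  # mark 2
    [1, 1, 0, 0, 1, 1],  # mark 3
    [0, 0, 1, 1, 0, 0],  # mark 4
    [2, 2, 0, 0, 2, 2],  # mark 5
]

# Hypothetical filter of width 2: responds when marks 1 and 2 are high
# while marks 3 and 5 are low; mark 4 is ignored.
toy_kernel = [
    [1.0, 1.0],
    [1.0, 1.0],
    [-1.0, -1.0],
    [0.0, 0.0],
    [-1.0, -1.0],
]

feature_map = conv1d_multichannel(toy_signal, toy_kernel)
strongest = global_max_pool(feature_map)
print(feature_map)  # [0.0, 4.0, 14.0, 6.0, 0.0]
print(strongest)    # 14.0
```

The filter responds most strongly in the window where the first two marks peak together and the third and fifth are absent, which is the kind of combinatorial, position-dependent pattern a convolutional layer can pick up.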
Results and Conclusion: ConvChrome achieved an Area Under the Curve (AUC) score of 88.741%, a substantial improvement over the baseline on the gene expression classification task, which predicts expression from combinatorial interactions among five histone modifications across 56 different cell types.
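For readers unfamiliar with the metric, the AUC of a binary classifier can be read as the probability that a randomly chosen positive example (here, a highly expressed gene) receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison on made-up labels and scores; it illustrates the metric only and is not the evaluation code used in the paper, and the function name `roc_auc` is a hypothetical helper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = gene expressed, 0 = not expressed.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes. The pairwise form is quadratic in the number of examples, so a rank-based implementation is preferable on large datasets.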
Keywords: Epigenetics, gene expression regulation, histone modifications, deep learning, DNA, convolutional neural networks.