Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

DeepEpi: Deep Learning Model for Predicting Gene Expression Regulation Based on Epigenetic Histone Modifications

Author(s): Rania Hamdy*, Yasser Omar and Fahima Maghraby

Volume 19, Issue 7, 2024

Published on: 17 November, 2023

Page: [624 - 640] Pages: 17

DOI: 10.2174/1574893618666230818121046

Abstract

Background: Histone modifications are a vital element of gene expression regulation. The way in which histone proteins bind to DNA affects whether a gene can be expressed. Although these factors do not alter the DNA sequence itself, they influence how it is transcribed.

Objective: Each location along the DNA has its own function, so the spatial arrangement of chromatin modifications affects how a gene is expressed. Gene regulation also depends on which combinations of histone modifications are present on the gene, on the spatial distribution of these modifications, and on how far their reads extend across a gene region. This study therefore aims to model long-range spatial genome data and the complex dependencies among histone reads.

Methods: A convolutional neural network (CNN) is used to model all data features in this paper. It can detect patterns in histone signals and preserve the spatial information of these patterns. The study also uses the concept of memory in long short-term memory (LSTM) networks, in the form of a vanilla LSTM, a bidirectional LSTM, or a stacked LSTM, to preserve long-range histone signals. Additionally, it combines these methods either through ConvLSTM or by using them together with a self-attention mechanism.
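
To make the combination of convolution, LSTM memory, and self-attention more concrete, the following is a minimal forward-pass sketch in plain Python. It is not the DeepEpi implementation (see the linked repository for that). The input layout (bins along a gene region by histone marks), all layer sizes, the mean pooling step, the sigmoid output, and the random untrained weights are assumptions made purely for illustration.

```python
import math
import random

random.seed(0)  # reproducible random (untrained) weights

def rand_matrix(rows, cols, scale=0.1):
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv1d_relu(signal, kernels, width):
    """Valid 1D convolution along bins; each kernel sees `width` bins x all marks."""
    out = []
    for start in range(len(signal) - width + 1):
        window = [x for row in signal[start:start + width] for x in row]  # flatten
        out.append([max(0.0, a) for a in matvec(kernels, window)])
    return out

def lstm(sequence, weights, hidden):
    """Vanilla LSTM; `weights` maps [x_t, h_prev] to 4 * hidden gate pre-activations."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    states = []
    for x in sequence:
        z = matvec(weights, x + h)
        i = [sigmoid(v) for v in z[0:hidden]]               # input gate
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]      # forget gate
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]  # output gate
        g = [math.tanh(v) for v in z[3 * hidden:4 * hidden]]  # candidate cell
        c = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
        h = [oi * math.tanh(ci) for oi, ci in zip(o, c)]
        states.append(h)
    return states

def self_attention(states):
    """Scaled dot-product self-attention with queries = keys = values = states."""
    d = len(states[0])
    attended = []
    for q in states:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in states]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        attn = [e / total for e in exps]
        attended.append([sum(w * v[j] for w, v in zip(attn, states)) for j in range(d)])
    return attended

def predict(signal, params):
    feats = conv1d_relu(signal, params["conv"], params["width"])
    states = lstm(feats, params["lstm"], params["hidden"])
    context = self_attention(states)
    pooled = [sum(col) / len(context) for col in zip(*context)]  # mean over positions
    logit = sum(w * x for w, x in zip(params["out"], pooled))
    return sigmoid(logit)  # assumed binary output: probability of high expression

# Hypothetical sizes, chosen only to keep the example small.
N_BINS, N_MARKS, WIDTH, N_FILTERS, HIDDEN = 20, 5, 3, 4, 6
params = {
    "width": WIDTH,
    "hidden": HIDDEN,
    "conv": rand_matrix(N_FILTERS, WIDTH * N_MARKS),
    "lstm": rand_matrix(4 * HIDDEN, N_FILTERS + HIDDEN),
    "out": [random.uniform(-0.1, 0.1) for _ in range(HIDDEN)],
}
signal = [[random.random() for _ in range(N_MARKS)] for _ in range(N_BINS)]
probability = predict(signal, params)
```

In this sketch the convolution extracts local patterns while keeping their order along the region, the LSTM carries information across distant bins, and self-attention lets every position weigh every other position before pooling. A bidirectional or stacked variant would run `lstm` in both directions or feed one LSTM's states into another.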

Results: The combination of CNN and LSTM with the self-attention mechanism obtained an area under the curve (AUC) score of 88.87% across 56 cell types.
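
As a brief illustration of the metric, the sketch below computes AUC as the fraction of (positive, negative) example pairs that a model ranks correctly, then averages it over cell types. The abstract does not state how the single score over 56 cell types was aggregated, so the averaging step is an assumption, and the cell-type names, labels, and scores are made-up toy values.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: (true labels, predicted scores) per cell type.
per_cell_type = {
    "cell_A": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "cell_B": ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7]),
}
aucs = {name: roc_auc(y, s) for name, (y, s) in per_cell_type.items()}
mean_auc = sum(aucs.values()) / len(aucs)  # assumed aggregation: simple mean
```

The pairwise form is quadratic in the number of examples, which is fine for a toy example; a rank-based formulation would be preferable for full gene sets.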

Conclusion: This result outperforms the current state-of-the-art model and provides insight into how combinatorial interactions between histone modification marks can control gene expression. The source code is available at https://github.com/RaniaHamdy/DeepEpi.


© 2025 Bentham Science Publishers