Abstract
Background: Enhancers are short DNA regions that improve transcription efficiency by recruiting transcription factors. Identifying enhancer regions is important for understanding the process of gene expression. Because enhancers act independently of their distance from, and orientation relative to, their target genes, they are difficult to locate accurately. Recently, with the development of high-throughput ChIP-seq (Chromatin Immunoprecipitation sequencing) technologies, several computational methods have been developed to predict enhancers. However, most of these methods rely on p300 binding sites and/or DNase I hypersensitive sites (DHSs) to select positive training samples, a selection strategy that is imprecise and leads to unsatisfactory prediction performance. Moreover, no existing work in the literature predicts enhancers in tissues across different developmental stages.
Methods: In this paper, we propose a method based on support vector machines (SVMs) to predict enhancers in cell lines and tissues from EnhancerAtlas. Specifically, we focus on predicting enhancers at different developmental stages of heart and lung tissues.
Results and Conclusion: Our results show that 1) the proposed method achieves good performance on most cell lines and tissues and, in particular, outperforms several state-of-the-art methods on heart and lung; and 2) enhancers are easier to predict in adult-stage tissues than in fetal-stage tissues, a pattern observed in both heart and lung.
Keywords: Enhancer prediction, ChIP-seq, support vector machines, heart, lung, developmental stages.
Graphical Abstract