Abstract
Transcriptional regulation is a key step in controlling the level of mRNA produced. The view of transcriptional regulation has recently evolved from a one-dimensional model, in which RNA polymerase II assembles with general transcription factors and cis-regulatory elements (CREs) interact with transcription factors, to a much more complex multidimensional model involving combinatorial interactions between transcription factors and regulatory sequences, chromatin structure, histone modifications, and DNA methylation. High-throughput experimental technologies, such as array-based ChIP-chip and sequencing-based ChIP-seq, have been developed to survey in vivo transcription factor binding sites and histone modifications. Although many efforts have been made to analyze and interpret the resulting data, challenges remain in both experimental protocols and computational analyses. For example, how should the optimal number of PCR cycles be determined? How should multiple datasets from multiple experiments be normalized? How can the large number of unmapped and multiply mapped tags in a ChIP-seq experiment be used? This review focuses on issues that have emerged in high-throughput data processing and discusses the advantages and disadvantages of various strategies.
Keywords: Array-based, ChIP, ChIP-seq, cis-regulatory elements (CREs), omics, sequence-based, Aberrant Genomes, Peak Detection, DNA methylation, MBD-seq data