Genome Wide Approaches to Identify Protein-DNA Interactions

doi:10.2174/0929867325666180530115711

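Abstract

Background: Transcription factors are DNA-binding proteins that play key roles in many fundamental biological processes. Characterizing their interactions with DNA is essential for identifying their target genes and understanding regulatory networks. Recent advances in experimental and computational methods have made genome-wide identification of their binding sites feasible. ChIP-chip, ChIP-seq, and ChIP-exo are three techniques used to map transcription factor binding sites across the genome.

Objective: This article aims to provide an overview of these three techniques, including their experimental procedures, computational methods, and popular analysis tools.

Conclusion: ChIP-chip, ChIP-seq, and ChIP-exo are the major techniques for studying genome-wide in vivo protein-DNA interactions. Owing to the rapid development of next-generation sequencing, array-based ChIP-chip is now deprecated, and ChIP-seq has become the most widely used technique for identifying transcription factor binding sites across the genome. The more recently developed ChIP-exo further improves spatial resolution to the single-nucleotide level. Many tools have been developed to analyze ChIP-chip, ChIP-seq, and ChIP-exo data. However, different programs may rely on different mechanisms or underlying algorithms, so each inherently carries its own set of statistical assumptions and biases. Choosing the most appropriate analysis program for a given experiment therefore requires careful consideration. Moreover, most programs offer only a command-line interface, so installing and using them requires basic computational expertise in Unix/Linux.

Keywords: Transcription factor, transcription factor binding site, chromatin immunoprecipitation, microarray, next-generation sequencing, ChIP-chip, ChIP-seq, ChIP-exo, data analysis.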



Journal: Current Medicinal Chemistry
DOI: https://dx.doi.org/10.2174/0929867325666180530115711
Print ISSN: 0929-8673
Online ISSN: 1875-533X
Publisher: Bentham Science Publisher
