Chromatin immunoprecipitation followed by sequencing (ChIP–seq) can be used to map DNA-binding proteins and histone modifications in a genome-wide manner at base-pair resolution.
ChIP–seq offers superior data quality to chromatin immunoprecipitation followed by microarray (ChIP–chip), and its advantages include higher resolution, less noise, higher genome coverage and wider dynamic range.
To eliminate bias in fragmentation and sequencing, a control sample (generally input DNA) should also be sequenced. Other issues to consider in experimental design include the quality of the antibodies and the depth of sequencing.
Genome alignment and the identification of enriched regions present challenges for data analysis, for which several strategies are available.
Owing to increased genome coverage, a substantial fraction of the repetitive regions in the genome can now be examined.
Increased sensitivity and specificity in the mapping of transcription factor binding sites have facilitated motif discovery and target identification.
Detailed profiling of histone modifications and nucleosome positions enables greater understanding of epigenetic mechanisms in development and differentiation.
As the cost of sequencing continues to decrease, ChIP–seq will be the method of choice over array-based approaches in nearly all cases.
Chromatin immunoprecipitation followed by sequencing (ChIP–seq) is a technique for genome-wide profiling of DNA-binding proteins, histone modifications or nucleosomes. Owing to the tremendous progress in next-generation sequencing technology, ChIP–seq offers higher resolution, less noise and greater coverage than its array-based predecessor ChIP–chip. With the decreasing cost of sequencing, ChIP–seq has become an indispensable tool for studying gene regulation and epigenetic mechanisms. In this Review, I describe the benefits and challenges in harnessing this technique with an emphasis on issues related to experimental design and data analysis. ChIP–seq experiments generate large quantities of data, and effective computational analysis will be crucial for uncovering biological mechanisms.
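The role of the input control and of sequencing depth noted in the key points can be illustrated with a minimal sketch. It assumes reads have already been aligned and counted in fixed genomic windows; the function name `fold_enrichment`, the pseudocount and all numbers are hypothetical choices for illustration, not a method described in this Review.

```python
def fold_enrichment(chip_counts, input_counts, chip_total, input_total, pseudocount=1.0):
    """Per-window ChIP/input ratio after scaling the input to the ChIP library size.

    The pseudocount (an assumption of this sketch) avoids division by zero
    in windows where the input has no reads.
    """
    scale = chip_total / input_total  # corrects for unequal sequencing depth
    return [
        (c + pseudocount) / (i * scale + pseudocount)
        for c, i in zip(chip_counts, input_counts)
    ]


# Hypothetical read counts in four genomic windows (illustrative numbers only).
chip = [12, 95, 10, 40]
ctrl = [20, 22, 18, 80]
ratios = fold_enrichment(chip, ctrl, chip_total=1_000_000, input_total=2_000_000)

for window, ratio in enumerate(ratios):
    print(f"window {window}: fold enrichment {ratio:.2f}")
```

In this toy example the last window has a high ChIP count but an equally elevated input count, so its ratio stays near 1, whereas the second window stands out. Without the control, both windows would look enriched.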
I thank P. Kharchenko, M. Tolstorukov, A. Alekseyenko and other members of the Park and the Kuroda laboratories for their insights. I gratefully acknowledge support from the National Institutes of Health grants R01GM082798, U01HG004258 and RL1DE019021.
- Nucleosome
The basic structural subunit of chromatin. A nucleosome consists of approximately 147 base pairs of DNA and an octamer of histone proteins.
- Epigenome
The chromatin states that are found along the genome, defined for a given time point and cell type. Thus, for a given genome there may be hundreds or thousands of epigenomes, depending on the stability of the chromatin states.
- DNase I hypersensitive site
A chromosomal region that is highly accessible to cleavage by DNase I. Such sites are associated with open chromatin conformations and transcriptional activity.
- Bivalent domain
A region of chromatin marked by a histone modification associated with active transcription (histone H3 lysine 4 trimethylation) and a modification associated with repression (histone H3 lysine 27 trimethylation). It is postulated to mark genes that are silent but poised for transcription.
- Genomic imprinting
The differential expression of genes depending on whether they were inherited maternally or paternally.
- Heterochromatin
A region of highly compact chromatin. Constitutive heterochromatin is largely composed of repetitive DNA.
- Microsatellite
A class of repetitive DNA that is made up of repeats that are 2–8 nucleotides in length.
- RNA interference
The process by which the introduction or expression within cells of single- or double-stranded RNA leads to the degradation of mRNA and therefore to gene suppression.
- Poisson model
A probability distribution that is often used to model the number of random events in a fixed interval. Given an average number of events in the interval, the probability of a given number of occurrences can be calculated.
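The calculation described in the Poisson model entry can be sketched in a few lines. The example uses the standard Poisson formula P(X = k) = e^(−λ) λ^k / k!, where λ is the average number of events in the interval. The function names and the example values are illustrative assumptions, as is the reading of "events" as reads falling in a genomic window.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the average is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_upper_tail(k, lam):
    """Probability of observing k or more events when the average is lam."""
    return 1.0 - sum(poisson_pmf(j, lam) for j in range(k))


# Illustrative values: an average of 2 events per interval, 10 events observed.
lam = 2.0
observed = 10
print(f"P(X = {observed}) = {poisson_pmf(observed, lam):.3g}")
print(f"P(X >= {observed}) = {poisson_upper_tail(observed, lam):.3g}")
```

A small upper-tail probability indicates that the observed count would be unlikely if events occurred at the average rate, which is the sense in which such a model can flag an interval as unusual.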
Park, P. ChIP–seq: advantages and challenges of a maturing technology. Nat Rev Genet 10, 669–680 (2009). https://doi.org/10.1038/nrg2641