Formation of long-range chromatin loops is a crucial step in transcriptional activation of target genes by distal enhancers1,2. Mapping such structural features can help define target genes for enhancers and annotate non-coding sequence variants linked to human diseases1,2,3. Study of higher-order chromatin organization has been facilitated by the development of chromosome conformation capture (3C)-based technologies4,5. Among the commonly used high-throughput 3C approaches are Hi-C6 and chromatin interaction analysis by paired-end tag sequencing (ChIA-PET)7. Global analysis of long-range chromatin interactions using Hi-C has been achieved at kilobase resolution but requires billions of sequencing reads8. High-resolution analysis of long-range chromatin interactions at selected genomic regions can be attained cost-effectively through either ChIA-PET7,9 or targeted capture and sequencing of Hi-C libraries10. ChIA-PET has been used to identify long-range interactions at promoters and enhancers at high resolution in various cell types and species11. However, this procedure requires hundreds of millions of cells as starting material, likely because chromatin immunoprecipitation and proximity ligation are performed after chromatin shearing, which potentially leads to substantial disruption of protein/DNA complexes. To reduce the amount of input material and improve the sensitivity and robustness of the assay, we developed Proximity Ligation-Assisted ChIP-seq (PLAC-seq), in which proximity ligation is conducted in nuclei prior to chromatin shearing and immunoprecipitation (Figure 1A, Supplementary information, Figure S1A and Data S1). We demonstrated that by switching the order of the proximity ligation and chromatin shearing steps, PLAC-seq greatly improves efficiency and accuracy over ChIA-PET7,9 in the detection of long-range chromatin interactions in mammalian cells.
We performed PLAC-seq in mouse embryonic stem (ES) cells using antibodies against RNA Polymerase II (Pol II), H3K4me3 and H3K27ac to determine long-range chromatin interactions at promoters and enhancers in the genome (Supplementary information, Table S1). As shown in Figure 1B, PLAC-seq yielded libraries with a higher number of unique read pairs than ChIA-PET. As expected, the sequencing reads were strongly enriched at the factor-binding sites detected by ChIP-seq analysis in the mouse ES cells12 (Supplementary information, Figure S1B-S1D and S1F-S1H). Additionally, the PLAC-seq experiments generated long-range chromatin contacts that were highly reproducible between biological replicates (Pearson correlation > 0.90; Supplementary information, Figure S1E). To identify long-range chromatin interactions, we used 'FitHiC'13 to analyze the combined datasets from two biological replicates (Supplementary information, Data S1). A total of 72 074, 273 145, and 155 545 chromatin loops (FDR < 0.01) were identified from the Pol II, H3K4me3, and H3K27ac PLAC-seq experiments, respectively. We found that PLAC-seq could be performed with far fewer cells than ChIA-PET. Even with 0.5 million (M) cells, a majority of strong long-range interactions could be detected (Figure 1C and Supplementary information, Figure S1I).
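The reproducibility check between biological replicates can be illustrated with a minimal sketch. It assumes that contacts are summarized as read-pair counts per pair of genomic bins and that bin pairs missing from one replicate count as zero; the text does not specify either choice, and the function names and toy counts are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

def replicate_correlation(rep1, rep2):
    """Correlate contact counts over the union of bin pairs seen in either replicate.

    Assumption: a bin pair absent from one replicate is treated as a count of 0.
    """
    keys = sorted(set(rep1) | set(rep2))
    return pearson([rep1.get(k, 0) for k in keys],
                   [rep2.get(k, 0) for k in keys])

# Toy counts keyed by (bin_a, bin_b); the values are made up for illustration.
rep1 = {(0, 5): 12, (0, 9): 3, (2, 7): 20, (4, 8): 1}
rep2 = {(0, 5): 10, (0, 9): 4, (2, 7): 23, (3, 6): 2}
r = replicate_correlation(rep1, rep2)
print(f"Pearson correlation between replicates: {r:.3f}")
```

On real data the counts would come from the binned contact matrices of each replicate, and the bin size and distance range would affect the value obtained.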
Several lines of evidence support the superior performance of PLAC-seq over ChIA-PET. First, PLAC-seq was nearly 100 times more cost-effective than ChIA-PET in generating long-range intra-chromosomal read pairs, which are typically used to infer chromatin loops. Using 20-fold fewer cells (5 M vs 100 M), Pol II PLAC-seq produced 10 times more reads (175 M vs 16 M) with a lower PCR duplication rate (30% vs 44%) than a previously published Pol II ChIA-PET experiment14. In addition, PLAC-seq generated more long-range intra-chromosomal pairs (67% vs 9%) and fewer inter-chromosomal pairs (11% vs 48%) (Figure 1B). Second, PLAC-seq uncovered chromatin loops in the mouse ES cells with much higher sensitivity and specificity than ChIA-PET. PLAC-seq chromatin interactions were typically supported by 24 unique read pairs (median), compared with 3 PETs supporting ChIA-PET interactions14 (Figure 1D). Pol II PLAC-seq analysis identified 57% of Pol II ChIA-PET interactions (FDR < 0.05 and PET count >= 3, 10 kb to 3 Mb) as well as many additional interactions (Figure 1E). PLAC-seq covered more regulatory elements, such as promoters and distal DNase I hypersensitive sites (DHSs), than ChIA-PET (Supplementary information, Figure S1J). As a reference, we performed in situ Hi-C with the mouse ES cell line and collected nearly 1.2 billion paired-end sequencing reads, from which we identified 68 781 long-range chromatin interactions (FDR < 0.01) using FitHiC13. When benchmarked against the chromatin interactions identified by in situ Hi-C, PLAC-seq was 8 times more sensitive than ChIA-PET and also more accurate (Figure 1F). Third, we performed 4C-seq analysis of four randomly selected genomic regions (Supplementary information, Table S1). Although ChIA-PET and PLAC-seq identified many chromatin interactions in common (Figure 1G and Supplementary information, Figure S2B-S2C), PLAC-seq uncovered seven additional strong interactions (marked 2, 4 and 5 in Figure 1G, and 1-4 in Supplementary information, Figure S2A-S2C) that were also detected by 4C-seq. Taken together, the results above support the superior sensitivity and specificity of PLAC-seq over ChIA-PET.
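The read statistics quoted for the first line of evidence can be combined into a rough yield comparison. The sketch below assumes that the long-range intra-chromosomal percentages apply to the read pairs remaining after PCR duplicates are removed, which the text does not state; it is an illustration of the arithmetic and not necessarily how the cost-effectiveness figure was derived. The function name is hypothetical.

```python
def usable_pairs(total_reads_m, duplication_rate, long_range_fraction):
    """Estimated long-range intra-chromosomal pairs, in millions.

    Assumption: long_range_fraction is a fraction of de-duplicated read pairs.
    """
    return total_reads_m * (1.0 - duplication_rate) * long_range_fraction

# Figures quoted in the text for the Pol II experiments.
plac = usable_pairs(total_reads_m=175, duplication_rate=0.30, long_range_fraction=0.67)
chia = usable_pairs(total_reads_m=16, duplication_rate=0.44, long_range_fraction=0.09)

# Ratio of total usable pairs obtained from each library.
yield_ratio = plac / chia
# Ratio of usable pairs per sequenced read, i.e. after factoring out sequencing depth.
per_read_ratio = (plac / 175) / (chia / 16)

print(f"PLAC-seq usable pairs: {plac:.1f} M, ChIA-PET usable pairs: {chia:.2f} M")
print(f"yield ratio: {yield_ratio:.0f}x, per-read ratio: {per_read_ratio:.1f}x")
```

Under these assumptions the total yield of usable pairs differs by roughly 100-fold, while the gain per sequenced read is closer to 9-fold; the 20-fold difference in input cell number is not included in either ratio.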
We also developed a new computational algorithm to identify chromatin interactions at high resolution from PLAC-seq data. We used the binomial test (Supplementary information, Data S1) to determine the enrichment of read pairs for an interaction due to chromatin immunoprecipitation, using the in situ Hi-C results as an estimate of the background interaction frequency (Figure 1H). We termed interactions of this type 'PLACE' (PLAC-Enriched) interactions. A total of 28 822 and 19 429 significant H3K4me3 and H3K27ac PLACE interactions (FDR < 0.05) in the mouse ES cells were identified, respectively. These corresponded to different sets of chromatin interactions, with 26% of H3K27ac PLACE interactions overlapping with 19% of H3K4me3 PLACE interactions (Figure 1I). A majority of H3K27ac PLACE interactions were enhancer-associated (74%) while H3K4me3 PLACE interactions were generally promoter-associated (78%) (Figure 1J). Genes involved in H3K27ac PLACE interactions had significantly higher expression levels than genes associated with H3K4me3 PLACE interactions (P < 2.2e-16, Figure 1K), suggesting that H3K27ac PLAC-seq could be used to discover chromatin interactions at active enhancers and H3K4me3 PLAC-seq at active or poised promoters.
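The idea of testing PLAC-seq read pairs against a Hi-C background can be sketched as follows. The exact formulation is given in Data S1, which is not reproduced here, so several choices below are assumptions: the number of trials is taken as the total PLAC-seq pair count, the background probability as the fraction of Hi-C pairs falling on the same bin pair, bin pairs without Hi-C support are skipped, and the FDR is controlled with the Benjamini-Hochberg procedure. All function names and counts are hypothetical.

```python
import math

def binom_log_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, for 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), by direct summation (fine for small n)."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return min(1.0, sum(math.exp(binom_log_pmf(i, n, p)) for i in range(k, n + 1)))

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

def place_interactions(plac_counts, hic_counts, fdr=0.05):
    """Return {bin_pair: q-value} for pairs enriched over the Hi-C background."""
    n_plac = sum(plac_counts.values())
    n_hic = sum(hic_counts.values())
    # Assumption: pairs with no Hi-C reads have no background estimate and are skipped.
    pairs = [pair for pair in plac_counts if hic_counts.get(pair, 0) > 0]
    pvals = [binom_upper_tail(plac_counts[pair], n_plac, hic_counts[pair] / n_hic)
             for pair in pairs]
    qvals = benjamini_hochberg(pvals)
    return {pair: q for pair, q in zip(pairs, qvals) if q < fdr}

# Toy counts per bin pair; the values are made up for illustration.
plac = {("A", "B"): 60, ("A", "C"): 20, ("B", "D"): 20}
hic = {("A", "B"): 100, ("A", "C"): 400, ("B", "D"): 500}
hits = place_interactions(plac, hic)
print(sorted(hits))
```

In this toy example only the A-B pair carries far more PLAC-seq read pairs than its share of the Hi-C background would predict. For genome-scale counts, direct summation of the tail would be too slow, and a library routine for the binomial survival function would be the practical choice.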
In summary, we developed a fast, sensitive and cost-effective method to map long-range chromatin interactions in mammalian cells. Using PLAC-seq, we obtained high-resolution maps of chromatin interactions at enhancers and promoters in the mouse ES cells. The ease of the experimental procedure and the small amount of input material required will allow the mapping of long-range chromatin interactions in a broad set of species, cell types, and experimental settings. A similar method called HiChIP was recently reported by Mumbach et al.15 while our manuscript was under review.
Raw and processed data have been deposited in the NCBI Gene Expression Omnibus under accession number GSE86150.
This work was supported by funding from the Ludwig Institute for Cancer Research and NIH (1U54DK107977-01, 2P50 GM085764 and U54 HG006997).
Summary of PLAC-seq libraries and 4C primer sequence
(Supplementary information is linked to the online version of the paper on the Cell Research website.)