Brief Communication

Single-cell chromatin immunocleavage sequencing (scChIC-seq) to profile histone modification

Nature Methods volume 16, pages 323–325 (2019)

Abstract

Our method for analyzing histone modifications, scChIC-seq (single-cell chromatin immunocleavage sequencing), involves targeting of the micrococcal nuclease (MNase) to a histone mark of choice by tethering to a specific antibody. Cleaved target sites are then selectively PCR amplified. We show that scChIC-seq reliably detects H3K4me3 and H3K27me3 target sites in single human white blood cells. The resulting data are used for clustering of blood cell types.

Data availability

The scChIC-seq data were deposited in the Gene Expression Omnibus database with accession number GSE105012.

Code availability

Code for the analyses shown in the figures is available at https://github.com/wailimku/scChIC-seq.git.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

We thank the NHLBI DNA Sequencing Core facility, the NHLBI Flow Cytometry Core facility and the NIH Biowulf High Performance Computing Systems for assistance with this work. This work was supported by the Division of Intramural Research of NHLBI, NIH. B.N. was funded by the National Key Research and Development Project (no. 2016YFA0502203).

Author information

Author notes

  1. These authors contributed equally: Wai Lim Ku, Kosuke Nakamura, Weiwu Gao.

Affiliations

  1. Laboratory of Epigenome Biology, Systems Biology Center, NHLBI, NIH, Bethesda, MD, USA

    • Wai Lim Ku
    • Kosuke Nakamura
    • Weiwu Gao
    • Kairong Cui
    • Gangqing Hu
    • Qingsong Tang
    • Keji Zhao
  2. National Institute of Health Sciences, Kawasaki, Japan

    • Kosuke Nakamura
  3. Department of Pathophysiology and High Altitude Pathology, Third Military Medical University, Chongqing, China

    • Weiwu Gao
    • Bing Ni

Contributions

K.Z. conceived the project. K.Z. and B.N. directed the study. K.N., W.G., K.C., K.Z. and W.L.K. performed the experiments. W.L.K. performed data analysis. G.H., Q.T. and B.N. contributed to experimental design and data. W.L.K., K.N. and K.Z. wrote the paper.

Competing interests

The authors declare no competing interests.

Corresponding authors

Correspondence to Bing Ni or Keji Zhao.

Integrated supplementary information

  1. Supplementary Figure 1 Measurement of H3K4me3 profiles by the scChIC-seq using 100, 300, 1,000 and 3,000 NIH3T3 cells.

    a, H3K4me3 Ab-MNase conjugates are added to the pre-treated cells, such that the conjugates could be bound to the locations with the H3K4me3 histone modification mark. b, A genome browser snapshot showing the H3K4me3 profiles around the locus of Dcaf8 for 100, 300, 1,000 and 3,000 cells. c, A genome browser snapshot showing the H3K4me3 profiles around the locus of Mbtd1 for 100, 300, 1,000 and 3,000 cells. (The experiment for 3,000 cells was repeated three times independently with similar results, while there were no repeated experiments for 1,000, 300, or 100 cells.)

  2. Supplementary Figure 2 scChIC-seq detects H3K4me3 profiles in a small number of cells.

    a, A TSS profile plot for H3K4me3 measured by ChIP-seq using bulk cells (black) and by scChIC-seq using 100 (green), 300 (magenta), 1,000 (blue) and 3,000 (red) cells. b, A scatter plot of the H3K4me3 read density of ChIP-seq (bulk cells) versus that of scChIC-seq (3,000 cells rep1) at the enriched regions identified using the H3K4me3 ChIP-seq (bulk cells). c, Scatter plots for the H3K4me3 scChIC-seq data (3,000 cells rep2) versus H3K4me3 ChIP-seq in NIH-3T3 cells. d, Scatter plots for the H3K4me3 scChIC-seq data (1,000 cells) versus H3K4me3 ChIP-seq in NIH-3T3 cells. e, Scatter plots for the H3K4me3 scChIC-seq data (300 cells) versus H3K4me3 ChIP-seq in NIH-3T3 cells. f, Scatter plots for the H3K4me3 scChIC-seq data (100 cells) versus H3K4me3 ChIP-seq in NIH-3T3 cells. (The correlation shown in b–g is calculated using the Pearson correlation coefficient.) g, A Venn diagram showing the overlap of the enriched regions of H3K4me3 profiles measured by ChIP-seq using bulk cells (red) and by scChIC-seq using 3,000 cells rep1 (blue). h, A histogram showing the fractions of enriched regions identified by scChIC-seq that overlap with those identified by bulk cell H3K4me3 ChIP-seq. For each scChIC-seq library (100, 300, 1,000 and 3,000 cells), we computed the precision, which is the fraction of its enriched regions that overlap with those identified by bulk cell H3K4me3 ChIP-seq. i, A histogram showing the number of enriched regions identified by scChIC-seq libraries using 100, 300, 1,000 and 3,000 NIH-3T3 cells.

  3. Supplementary Figure 3 TSS profile plots for the H3K4me3 profiles detected by scChIC-seq using H3K4me3 Ab-MNase conjugate (red) and IgG Ab-MNase (blue) with different washing conditions.

    a–d, (a) 200 mM NaCl, (b) 300 mM NaCl, (c) 400 mM NaCl, and (d) RIPA buffer + 150 mM NaCl. (Three independent experiments were performed for the RIPA + 150 mM NaCl condition, and no repeated experiments for the other conditions, because the other conditions did not always work. The reproducibility of the results was also taken into account when searching for the optimal conditions.)

  4. Supplementary Figure 4 TSS profile plots for the H3K4me3 profiles detected by scChIC-seq using H3K4me3 Ab + PA-MNase conjugate (red) and IgG Ab + PA-MNase (blue) with different washing conditions.

    a–d, (a) 200 mM NaCl, (b) 300 mM NaCl, (c) 400 mM NaCl, and (d) RIPA buffer + 150 mM NaCl. (Two independent experiments were performed for the 400 mM NaCl condition, and no repeated experiments for the other conditions, because the other conditions did not always work. The reproducibility of the results was also taken into account when searching for the optimal conditions.)

  5. Supplementary Figure 5 Comparison between the H3K4me3 profiles obtained by scChIC-seq assays using H3K4me3-MNase or H3K4me3 Ab + PA-MNase.

    a, A genome browser snapshot showing the H3K4me3 profiles identified by ChIP-seq using bulk cell data (black), scChIC-seq using H3K4me3 Ab-MNase (blue) and scChIC-seq using H3K4me3 Ab + PA-MNase (red). (The H3K4me3 Ab-MNase case was repeated in three independent experiments. The H3K4me3 Ab + PA-MNase case was repeated in two independent experiments.) b, A scatter plot of the H3K4me3 read density of ChIP-seq versus that of scChIC-seq using H3K4me3 Ab + PA-MNase. The correlation is computed using Pearson’s correlation coefficient. c, A scatter plot of the H3K4me3 read density of scChIC-seq using H3K4me3 Ab + PA-MNase versus that of scChIC-seq using H3K4me3 Ab-MNase. The correlation is computed using Pearson’s correlation coefficient. d, A Venn diagram showing the overlap of the enriched regions of H3K4me3 profiles measured by ChIP-seq using bulk cells (black) and by scChIC-seq using H3K4me3 Ab + PA-MNase and 3,000 cells (blue). e, A Venn diagram showing the overlap of the enriched regions of H3K4me3 profiles measured by scChIC-seq using H3K4me3 Ab + PA-MNase and 3,000 cells (blue) and by scChIC-seq using H3K4me3 Ab-MNase and 3,000 cells (red).

  6. Supplementary Figure 6 TSS profile plots of the H3K4me3 profiles around TSS.

    a–c, (a) 3T3 cells, (b) mouse ESC cells and (c) mouse naïve CD4 T cells. In each cell type, the H3K4me3 TSS profiles (blue) are compared to the control IgG (red). d, Two heat maps showing the clusters of the H3K4me3 enriched regions measured by ChIP-seq using bulk cells (right panel) and scChIC using 3,000 cells (left panel).

  7. Supplementary Figure 7 Application of scChIC-seq to detect the profiles of H3K27me3.

    a, A Venn diagram showing the overlap of the enriched regions of H3K27me3 profiles measured by ChIP-seq using bulk cells (red) and by scChIC-seq using 3,000 cells (blue). b, A genome browser snapshot showing profiles of H3K27me3 and Brd4 detected by scChIC and ChIP-seq. Genome browser snapshots showing the H3K27me3 profiles detected by ChIP-seq using bulk cells (top panel in red) and by scChIC-seq using 3,000 cells (second panel in blue). Three independent experiments (H3K27me3 Ab-PA-MNase, 3,000 cells) were performed with similar results.

  8. Supplementary Figure 8 Application of scChIC-seq to profiling of H3K4me3 in single human white blood cells.

    a, A TSS profile plot showing the H3K4me3 profile around TSS for a single cell (red) and for the aggregation of 281 pooled single cells. b, A Venn diagram for comparison between the identified enriched regions from the data by bulk cell ChIP-seq and the pooled 281 single cells by scChIC-seq. c, A scatter plot showing the correlation between the ChIP-seq and pooled top 40 single-cell ChIC-seq data on the 52,798 combined H3K4me3 peaks for human white blood cells. The top 40 single cells are selected based on precision. The correlation between the ChIP-seq and pooled 281 single cell ChIC-seq data is 0.66. The correlation is computed using Pearson’s correlation coefficient. d, A boxplot showing the precision from the top 10% single cells (about 48%) and all 242 single cells (48%). They are also compared to the random case of 242 simulated single cells, in which reads are randomly positioned in each cell. Precision is defined as the fraction of reads in single cells that are within the enriched regions identified using bulk cell ChIP-seq data. On each box, the central mark indicates the median. The bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. A more detailed explanation of the boxplot can be found in the Methods. e, A boxplot showing the sensitivity from the top 10% single cells (about 18%) and all 242 single cells (10%). They are also compared to the random case of 242 simulated single cells, in which reads are randomly positioned in each cell. Sensitivity is defined as the fraction of enriched regions identified using bulk cell ChIP-seq data that have single cell reads. On each box, the central mark indicates the median. The bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. A more detailed explanation of the boxplot can be found in the Methods.

  9. Supplementary Figure 9 Application of scChIC-seq to profiling of H3K27me3 in single cells.

    a, Genome browser snapshots showing the H3K27me3 profiles from bulk cell H3K27me3 ChIP-seq data, from the pooled 106 single-cell ChIC-seq data and from 50 individual cells (106 cells from three independent experiments). b, A Venn diagram showing the overlap between the identified enriched regions from the bulk cell H3K27me3 ChIP-seq data and the pooled 106 single cell scChIC-seq data. The correlation is computed using Pearson’s correlation coefficient. c, A scatter plot showing the correlation between the bulk cell H3K27me3 ChIP-seq and pooled 84 single cell ChIC-seq data. d, A boxplot showing the precision from all 84 single cells (47%). They are also compared to the random case of 84 simulated single cells, in which reads are randomly positioned in each cell. Precision is defined as the fraction of reads in single cells that are within the enriched regions identified using bulk cell ChIP-seq data. On each box, the central mark indicates the median. The bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. A more detailed explanation of the boxplot can be found in the Methods. e, A boxplot showing the sensitivity for all 84 single cells (9.5%). They are also compared to the random case of 84 simulated single cells, in which reads are randomly positioned in each cell. Sensitivity is defined as the fraction of enriched regions identified using bulk cell ChIP-seq data that have single cell reads. On each box, the central mark indicates the median. The bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. A more detailed explanation of the boxplot can be found in the Methods.

  10. Supplementary Figure 10 Correlation between H3K4me3 scChIC-seq data and scRNA-seq data.

    a, A violin plot showing the relationship between variability in H3K4me3 and heterogeneity in gene expression for T cells. The P value is computed using a two-sided Wilcoxon rank-sum test. The central mark indicates the median. The sample size in each group is n = 2,342. b, A violin plot showing the relationship between co-methylation and co-expression for T cells. The P value is computed using a two-sided Wilcoxon rank-sum test. The central mark indicates the median. The sample size in each group is n = 2,459,408. c, A violin plot showing the relationship between variability in H3K4me3 and heterogeneity in gene expression for monocytes. The P value is computed using a two-sided Wilcoxon rank-sum test. The central mark indicates the median. The sample size in each group is n = 2,267. d, A violin plot showing the relationship between co-methylation and co-expression for monocytes. The P value is computed using a two-sided Wilcoxon rank-sum test. The central mark indicates the median. The sample size in each group is n = 2,280,322. e, Two groups of annotated genes are selected from the highly co-methylated peaks (blue) and the highly variable peaks (red) using the T cells identified from the scChIC-seq data. Four cdf plots are plotted for the two groups of genes using the gene expression in B cells (top left), monocytes (top right), T cells (bottom left), and NK cells (bottom right). The P values for the difference between the gene expression of the two groups are computed using a two-sided Wilcoxon rank-sum test. In each subplot, the number of highly co-methylated peaks in T cells = 40 while the number of highly variable peaks in T cells = 62. f, Two groups of annotated genes are selected from the highly co-methylated peaks (blue) and the highly variable peaks (red) using the monocytes identified from the scChIC-seq data. Four cdf plots are plotted for the two groups of genes using the gene expression in B cells (top left), monocytes (top right), T cells (bottom left), and NK cells (bottom right). The P values for the difference between the gene expression of the two groups are computed using a two-sided Wilcoxon rank-sum test. In each subplot, the number of highly co-methylated peaks in monocytes = 61 while the number of highly variable peaks in monocytes = 72.

  11. Supplementary Figure 11 TFs enriched in the highly co-methylated and highly variable peaks are associated with cell-specific expression.

    a, A volcano plot of the comparison between the enriched TFs that are specific to the highly co-methylated peaks and highly variable peaks in the T cells identified from the H3K4me3 scChIC-seq data. The y-axis is the negative log of the P value in the differential TF analysis. The P value is computed using a one-sided Student’s t-test. The x-axis is the difference between the mean values of two TF-bias corrected deviation vectors obtained from chromVAR (Nat. Methods 14, 975–978; 2017). The sample size is n = 1,520. b, The enriched TFs, which are specific to highly co-methylated peaks in T cells (a), are preferentially expressed in Th1 cells. The bar plot shows the gene expression levels (RPKM) of the enriched TFs in Th1 cells and naïve T cells. c, The enriched TFs, which are specific to highly variable peaks in T cells (a), are preferentially expressed in naïve T cells. The bar plot shows the gene expression levels (RPKM) of the enriched TFs in Th1 cells and naïve T cells. d, A volcano plot of the comparison between the enriched TFs that are specific to the highly co-methylated peaks and highly variable peaks in monocytes identified from the H3K4me3 scChIC-seq data. The y-axis is the negative log of the P value in the differential TF analysis. The P value is computed using a one-sided Student’s t-test. The x-axis is the difference between the mean values of two TF-bias corrected deviation vectors obtained from chromVAR (Nat. Methods 14, 975–978; 2017). The sample size is n = 1,520. e, The enriched TFs, which are specific to highly co-methylated peaks in monocytes (d), are preferentially expressed in monocytes. The bar plot shows the gene expression levels (RPKM) of the enriched TFs in monocytes and macrophages. f, The enriched TFs, which are specific to highly variable peaks in monocytes (d), are preferentially expressed in macrophages. The bar plot shows the gene expression levels (RPKM) of the enriched TFs in monocytes and macrophages.
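
The captions for Supplementary Figures 8 and 9 define precision as the fraction of a single cell's reads that fall within the enriched regions identified from bulk cell ChIP-seq, and sensitivity as the fraction of those enriched regions that contain at least one single-cell read. A minimal sketch of these two definitions follows. It is not the authors' analysis code: the function name, the restriction to a single chromosome, the use of one coordinate per read, and the half-open, non-overlapping intervals are all assumptions made for illustration.

```python
from bisect import bisect_right


def precision_sensitivity(read_positions, regions):
    """Per-cell precision and sensitivity against bulk enriched regions.

    Hypothetical helper (not from the paper's repository).
    read_positions: one coordinate per read, all on the same chromosome.
    regions: non-overlapping half-open (start, end) intervals, same chromosome.
    """
    regions = sorted(regions)
    starts = [start for start, _ in regions]
    ends = [end for _, end in regions]

    reads_in_regions = 0
    regions_with_reads = set()
    for pos in read_positions:
        # Index of the last region whose start is <= pos, or -1 if none.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < ends[i]:
            reads_in_regions += 1
            regions_with_reads.add(i)

    # Precision: fraction of reads that lie inside any enriched region.
    precision = reads_in_regions / len(read_positions) if read_positions else 0.0
    # Sensitivity: fraction of enriched regions covered by at least one read.
    sensitivity = len(regions_with_reads) / len(regions) if regions else 0.0
    return precision, sensitivity


if __name__ == "__main__":
    bulk_regions = [(100, 200), (300, 400), (500, 600), (700, 800)]
    cell_reads = [150, 160, 350, 450, 900]
    print(precision_sensitivity(cell_reads, bulk_regions))
```

In this toy input, three of the five reads fall inside a region and two of the four regions are covered. The "random case" described in the captions could be approximated by passing uniformly sampled positions as `read_positions`, although the captions do not specify how the simulated cells were generated.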

Supplementary information

  1. Supplementary Information

    Supplementary Figures 1–11 and Supplementary Results

  2. Reporting Summary

  3. Supplementary Protocol

    scChIC-Seq Supplementary Protocol (version from January 2019).

  4. Supplementary Data 1

    Mapping statistics of the scChIC-seq libraries.

  5. Supplementary Data 2

    Information on the datasets used in this study.

  6. Supplementary Data 3

    Summary of materials and reagents.

  7. Supplementary Data 4

    Information on the primer sequences.

About this article

DOI

https://doi.org/10.1038/s41592-019-0361-7