Unsupervised pattern discovery in human chromatin structure through genomic segmentation

Hoffman, Michael M; Buske, Orion J; Wang, Jie; Weng, Zhiping; Bilmes, Jeff A; Noble, William Stafford

doi:10.1038/nmeth.1937

Brief Communication
Published: 18 March 2012

Nature Methods volume 9, pages 473–476 (2012)

Abstract

We trained Segway, a dynamic Bayesian network method, simultaneously on chromatin data from multiple experiments, including positions of histone modifications, transcription-factor binding and open chromatin, all derived from a human chronic myeloid leukemia cell line. In an unsupervised fashion, we identified patterns associated with transcription start sites, gene ends, enhancers, transcriptional regulator CTCF-binding regions and repressed regions. Software and genome browser tracks are at http://noble.gs.washington.edu/proj/segway/.
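
To make the idea of genomic segmentation concrete, the following is a minimal sketch and not Segway's dynamic Bayesian network implementation. It assumes a much simpler model: a single signal track, one Gaussian per label, a uniform start distribution and a shared probability of keeping the same label, decoded with the Viterbi algorithm. The function names (`viterbi_segment`, `to_segments`), the parameter `stay_prob` and the toy signal are hypothetical, and the Gaussian parameters are fixed by hand rather than learned from data.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    var = std * std
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def viterbi_segment(signal, means, stds, stay_prob=0.9):
    """Return the most likely label at each position of one signal track.

    Illustrative assumptions: at least two labels, one Gaussian per label,
    uniform start probabilities, and the same self-transition probability
    (stay_prob) for every label.
    """
    n_labels = len(means)
    if n_labels < 2:
        raise ValueError("need at least two labels")
    if not signal:
        return []

    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_labels - 1))
    log_start = -math.log(n_labels)

    def transition(prev, cur):
        return log_stay if prev == cur else log_switch

    # scores[k]: best log probability of any path ending in label k.
    scores = [
        log_start + gaussian_logpdf(signal[0], means[k], stds[k])
        for k in range(n_labels)
    ]
    backpointers = []
    for x in signal[1:]:
        new_scores, pointers = [], []
        for k in range(n_labels):
            best_prev = max(
                range(n_labels), key=lambda j: scores[j] + transition(j, k)
            )
            new_scores.append(
                scores[best_prev]
                + transition(best_prev, k)
                + gaussian_logpdf(x, means[k], stds[k])
            )
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final label.
    label = max(range(n_labels), key=lambda k: scores[k])
    path = [label]
    for pointers in reversed(backpointers):
        label = pointers[label]
        path.append(label)
    path.reverse()
    return path


def to_segments(path):
    """Collapse per-position labels into half-open (start, end, label) runs."""
    segments = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            segments.append((start, i, path[start]))
            start = i
    return segments


# Toy signal: low, then high, then low again.
toy_signal = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2, 0.1, 0.0]
toy_path = viterbi_segment(toy_signal, means=[0.0, 5.0], stds=[1.0, 1.0])
print(to_segments(toy_path))  # [(0, 3, 0), (3, 6, 1), (6, 8, 0)]
```

In this toy setting the segmentation partitions the positions into non-overlapping runs, each assigned one label. An unsupervised method would also have to estimate the per-label Gaussian parameters from the data, which this sketch does not attempt.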

Figure 1: Heat map of discovered Gaussian parameters in an unsupervised 25-label segmentation trained on 31 tracks of histone modification, transcription-factor binding and open chromatin signal data in 1% of the human genome.

Figure 2: Gene structure in Segway labels.

Acknowledgements

We thank P.J. Collins for assistance with transient transfection assays, S. Djebali for processing data, C.E. Grant for motif analysis, A. Kundaje for helpful suggestions, and members of the ENCODE Project Consortium, the ENCODE Data Coordination Center and the US National Human Genome Research Institute for providing early public access to the unpublished data used in this work. This work used data produced in the laboratories of B.E. Bernstein (Broad Institute of the Massachusetts Institute of Technology and Harvard University), M.P. Snyder (Stanford University), R.M. Myers (HudsonAlpha Institute for Biotechnology), P.J. Farnham (University of Southern California), V.R. Iyer (University of Texas at Austin), G.E. Crawford (Duke University), J.D. Lieb and T.S. Furey (University of North Carolina at Chapel Hill), J.A. Stamatoyannopoulos (University of Washington), P. Carninci (RIKEN), T.R. Gingeras (Cold Spring Harbor Laboratory), and A. Sidow (Stanford University). This publication was made possible by grants 004695, 004561 and 006259 from the National Human Genome Research Institute.

Author information

Orion J Buske
Present address: Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.

Authors and Affiliations

Department of Genome Sciences, University of Washington, Seattle, Washington, USA
Michael M Hoffman, Orion J Buske & William Stafford Noble
Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, USA
Jie Wang & Zhiping Weng
Department of Electrical Engineering, University of Washington, Seattle, Washington, USA
Jeff A Bilmes
Department of Computer Science and Engineering, University of Washington, Seattle, Washington, USA
William Stafford Noble

Contributions

M.M.H., W.S.N. and J.A.B. conceived of the project; M.M.H., W.S.N. and Z.W. designed computational and biological experiments. M.M.H., J.A.B., O.J.B. and J.W. developed software used in this work; M.M.H., O.J.B. and J.W. conducted computational experiments and analyzed data; and M.M.H., W.S.N., Z.W., J.A.B., O.J.B. and J.W. wrote the manuscript.

Corresponding author

Correspondence to William Stafford Noble.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–11, Supplementary Tables 1–4, Supplementary Results, Supplementary Discussion (PDF 1780 kb)

About this article

Cite this article

Hoffman, M., Buske, O., Wang, J. et al. Unsupervised pattern discovery in human chromatin structure through genomic segmentation. Nat Methods 9, 473–476 (2012). https://doi.org/10.1038/nmeth.1937

Received: 01 July 2011
Accepted: 14 February 2012
Published: 18 March 2012
Issue Date: May 2012
DOI: https://doi.org/10.1038/nmeth.1937
