The genome-wide architecture of chromatin-associated proteins that maintains chromosome integrity and gene regulation is not well defined. Here we use chromatin immunoprecipitation, exonuclease digestion and DNA sequencing (ChIP–exo/seq)1,2 to define this architecture in Saccharomyces cerevisiae. We identify 21 meta-assemblages consisting of roughly 400 different proteins that are related to DNA replication, centromeres, subtelomeres, transposons and transcription by RNA polymerase (Pol) I, II and III. Replication proteins engulf a nucleosome, centromeres lack a nucleosome, and repressive proteins encompass three nucleosomes at subtelomeric X-elements. We find that most promoters associated with Pol II evolved to lack a regulatory region, having only a core promoter. These constitutive promoters comprise a short nucleosome-free region (NFR) adjacent to a +1 nucleosome, which together bind the transcription-initiation factor TFIID to form a preinitiation complex. Positioned insulators protect core promoters from upstream events. A small fraction of promoters evolved an architecture for inducibility, whereby sequence-specific transcription factors (ssTFs) create a nucleosome-depleted region (NDR) that is distinct from an NFR. We describe structural interactions among ssTFs, their cognate cofactors and the genome. These interactions include the nucleosomal and transcriptional regulators RPD3-L, SAGA, NuA4, Tup1, Mediator and SWI–SNF. Surprisingly, we do not detect interactions between ssTFs and TFIID, suggesting that such interactions do not stably occur. Our model for gene induction involves ssTFs, cofactors and general factors such as TBP and TFIIB, but not TFIID. By contrast, constitutive transcription involves TFIID but not ssTFs engaged with their cofactors. From this, we define a highly integrated network of gene regulation by ssTFs.
See Supplementary Data 4 for a list of where to find available data and code online. In essence, all raw sequencing data and peak files from this study are available at the NCBI GEO (https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE147927. Processed data are available at https://doi.org/10.26208/rykf-6050. Additional analyses and data are at yeastepigenome.org. We warn that single-replicate data files are not likely to have meaningful data and should not be used without further replication. All underlying data used to generate composite plots, coordinate files and script parameters for Figs. 2–5, Extended Data Figs. 4, 5, 7, 8b and Supplementary Fig. 1 can be downloaded from https://github.com/CEGRcode/2021-Rossi_Nature. Final composite plot values can be found in Supplementary Data 5.
Code is available at https://github.com/CEGRcode/scriptmanager.
Rossi, M. J., Lai, W. K. M. & Pugh, B. F. Simplified ChIP-exo assays. Nat. Commun. 9, 2842 (2018).
Rhee, H. S. & Pugh, B. F. Comprehensive genome-wide protein-DNA interactions detected at single-nucleotide resolution. Cell 147, 1408–1419 (2011).
Hahn, S. & Young, E. T. Transcriptional regulation in Saccharomyces cerevisiae: transcription factor regulation and function, mechanisms of initiation, and roles of activators and coactivators. Genetics 189, 705–736 (2011).
Levine, M., Cattoglio, C. & Tjian, R. Looping back to leap forward: transcription enters a new era. Cell 157, 13–25 (2014).
Cramer, P. Organization and regulation of gene transcription. Nature 573, 45–54 (2019).
Eaton, M. L., Galani, K., Kang, S., Bell, S. P. & MacAlpine, D. M. Conserved nucleosome positioning defines replication origins. Genes Dev. 24, 748–753 (2010).
Li, N. et al. Structure of the origin recognition complex bound to DNA replication origin. Nature 559, 217–222 (2018).
Wellinger, R. J. & Zakian, V. A. Everything you ever wanted to know about Saccharomyces cerevisiae telomeres: beginning to end. Genetics 191, 1073–1105 (2012).
Biggins, S. The composition, functions, and regulation of the budding yeast kinetochore. Genetics 194, 817–846 (2013).
Camahort, R. et al. Cse4 is part of an octameric nucleosome in budding yeast. Mol. Cell 35, 794–805 (2009).
Henikoff, S. et al. The budding yeast centromere DNA element II wraps a stable Cse4 hemisome in either orientation in vivo. eLife 3, e01861 (2014).
Rhee, H. S., Bataille, A. R., Zhang, L. & Pugh, B. F. Subnucleosomal structures and nucleosome asymmetry across a genome. Cell 159, 1377–1388 (2014).
Furuyama, S. & Biggins, S. Centromere identity is specified by a single centromeric nucleosome in budding yeast. Proc. Natl Acad. Sci. USA 104, 14706–14711 (2007).
Yan, K. et al. Structure of the inner kinetochore CCAN complex assembled onto a centromeric nucleosome. Nature 574, 278–282 (2019).
Han, Y., Yan, C., Fishbain, S., Ivanov, I. & He, Y. Structural visualization of RNA polymerase III transcription machineries. Cell Discov. 4, 40 (2018).
Mayer, A. et al. Uniform transitions of the general RNA polymerase II transcription complex. Nat. Struct. Mol. Biol. 17, 1272–1278 (2010).
Petrenko, N., Jin, Y., Wong, K. H. & Struhl, K. Evidence that Mediator is essential for Pol II transcription, but is not a required component of the preinitiation complex in vivo. eLife 6, e28447 (2017).
Jeronimo, C. et al. Tail and kinase modules differently regulate core Mediator recruitment and function in vivo. Mol. Cell 64, 455–466 (2016).
Andrau, J. C. et al. Genome-wide location of the coactivator mediator: binding without activation and transient Cdk8 interaction on DNA. Mol. Cell 22, 179–192 (2006).
Paul, E., Zhu, Z. I., Landsman, D. & Morse, R. H. Genome-wide association of mediator and RNA polymerase II in wild-type and mediator mutant yeast. Mol. Cell. Biol. 35, 331–342 (2015).
Zhu, X. et al. Genome-wide occupancy profile of mediator and the Srb8-11 module reveals interactions with coding regions. Mol. Cell 22, 169–178 (2006).
Krastanova, O., Hadzhitodorov, M. & Pesheva, M. Ty elements of the yeast Saccharomyces cerevisiae. Biotechnol. Biotechnol. Equip. 19, 19–26 (2005).
Reja, R., Vinayachandran, V., Ghosh, S. & Pugh, B. F. Molecular mechanisms of ribosomal protein gene coregulation. Genes Dev. 29, 1942–1954 (2015).
Krietenstein, N. et al. Genomic nucleosome organization reconstituted with pure proteins. Cell 167, 709–721 (2016).
Chereji, R. V., Ocampo, J. & Clark, D. J. MNase-sensitive complexes in yeast: nucleosomes and non-histone barriers. Mol. Cell 65, 565–577 (2017).
Candelli, T. et al. High-resolution transcription maps reveal the widespread impact of roadblock termination in yeast. EMBO J. 37, e97490 (2018).
Brzovic, P. S. et al. The acidic transcription activator Gcn4 binds the mediator subunit Gal11/Med15 using a simple protein interface forming a fuzzy complex. Mol. Cell 44, 942–953 (2011).
Huisinga, K. L. & Pugh, B. F. A genome-wide housekeeping role for TFIID and a highly regulated stress-related role for SAGA in Saccharomyces cerevisiae. Mol. Cell 13, 573–585 (2004).
Dudley, A. M., Rougeulle, C. & Winston, F. The Spt components of SAGA facilitate TBP binding to a promoter at a post-activator-binding step in vivo. Genes Dev. 13, 2940–2945 (1999).
Moqtaderi, Z., Bai, Y., Poon, D., Weil, P. A. & Struhl, K. TBP-associated factors are not generally required for transcriptional activation in yeast. Nature 383, 188–191 (1996).
Baptista, T. et al. SAGA is a general cofactor for RNA polymerase II transcription. Mol. Cell 68, 130–143 (2017).
Mittal, C., Rossi, M. J. & Pugh, B. F. High similarity among ChEC-seq datasets. Preprint at https://www.biorxiv.org/content/10.1101/2021.02.04.429774v1 (2021).
Harbison, C. T. et al. Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99–104 (2004).
Boija, A. et al. Transcription factors activate genes through the phase-separation capacity of their activation domains. Cell 175, 1842–1855 (2018).
Badjatia, N. et al. Acute stress drives global repression through two independent RNA polymerase II stalling events in Saccharomyces. Cell Rep. 34, 108640 (2021).
Vinayachandran, V. et al. Widespread and precise reprogramming of yeast protein-genome interactions in response to heat shock. Genome Res. 28, 357–366 (2018).
Wal, M. & Pugh, B. F. Genome-wide mapping of nucleosome positions in yeast using high-resolution MNase ChIP-Seq. Methods Enzymol. 513, 233–250 (2012).
Shao, D., Kellogg, G. D., Lai, W. K. M., Mahony, S. & Pugh, B. F. in Practice and Experience in Advanced Research Computing 285–292 (Association for Computing Machinery, Portland, OR, 2020).
Picard Toolkit. http://broadinstitute.github.io/picard/ (2019).
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Yamada, N., Lai, W. K. M., Farrell, N., Pugh, B. F. & Mahony, S. Characterizing protein–DNA binding event subtypes in ChIP–exo data. Bioinformatics 35, 903–913 (2019).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. 57, 289–300 (1995).
de Hoon, M. J., Imoto, S., Nolan, J. & Miyano, S. Open source clustering software. Bioinformatics 20, 1453–1454 (2004).
Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. 37, 38–44 (2019).
Xu, Z. et al. Bidirectional promoters generate pervasive transcription in yeast. Nature 457, 1033–1037 (2009).
Rhee, H. S. & Pugh, B. F. Genome-wide structure and organization of eukaryotic pre-initiation complexes. Nature 483, 295–301 (2012).
van Dijk, E. L. et al. XUTs are a class of Xrn1-sensitive antisense regulatory non-coding RNA in yeast. Nature 475, 114–117 (2011).
Albert, I., Wachi, S., Jiang, C. & Pugh, B. F. GeneTrack—a genomic data processing and visualization framework. Bioinformatics 24, 1305–1306 (2008).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Jiang, C. & Pugh, B. F. A compiled and systematic reference map of nucleosome positions across the Saccharomyces cerevisiae genome. Genome Biol. 10, R109 (2009).
Yen, K., Vinayachandran, V., Batta, K., Koerber, R. T. & Pugh, B. F. Genome-wide nucleosome specificity and directionality of chromatin remodelers. Cell 149, 1461–1473 (2012).
Badis, G. et al. A library of yeast transcription factor motifs reveals a widespread function for Rsc3 in targeting nucleosome exclusion at promoters. Mol. Cell 32, 878–887 (2008).
MacIsaac, K. D. et al. An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinformatics 7, 113 (2006).
Liang, K. & Keleş, S. Normalization of ChIP-seq data with control. BMC Bioinformatics 13, 199 (2012).
Bailey, T. L. & Elkan, C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. 2, 28–36 (1994).
This work was supported by National Institutes of Health (NIH) grants ES013768, GM059055 and HG004160 to B.F.P.; National Science Foundation (NSF) ABI INNOVATION grant 1564466 to S.M.; grants from the Pennsylvania State University Institute for Computational and Data Sciences to B.F.P. and W.K.M.L.; and computation from Advanced CyberInfrastructure (ROAR) at the Pennsylvania State University. We thank D. Shao for her role as lead software engineer for the PEGR platform and for support through the Penn State Institute for Computational and Data Sciences (ICDS) Research Innovations with Scientists and Engineers (RISE) team. We thank O. Lang for operating EpitopeID.
B.F.P. has a financial interest in Peconic, LLC, which offers the ChIP–exo technology (US Patent 20100323361A1) implemented herein as a commercial service and could potentially benefit from the outcomes of this research. The remaining authors declare no competing interests.
Peer review information Nature thanks Vishwanath Iyer and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
a, Simplified view of transcriptional regulation. A ssTF (TF; for example, Gal4) binds to its cognate motif (a UAS) within promoters in competition with chromatin/nucleosomes (red bar). The ssTF recruits (pink/green arrow) cofactors (for example, SAGA and Mediator) that assist in the assembly of a PIC (comprising TBP, TFIIB, and so on) and of Pol II at the transcript start site (TSS) of genes. Pol II then traverses the gene to the transcription end site (TES). b, Diagram showing the ChIP–exo assay. Protein targets are crosslinked to DNA, which is then fragmented. Specific proteins are captured through an engineered TAP tag that binds the common Fc region of any immobilized IgG. Near-base-pair resolution is achieved using a strand-specific λ exonuclease that digests each strand of DNA in the 5′–3′ direction up to the point of crosslinking. c, Pie chart showing assayed targets separated by broad GO-based classifications (inner ring), or by UMAP-based clustering of genome-wide binding locations (outer ring). Listed are the common names of ChIP–exo targets that generated significantly enriched locations (with ‘significance’ defined in the Methods section ‘ChExMix locations’), grouped by their UMAP/K-means-derived meta-assemblage abbreviations (along with membership count), which are further grouped by the simplified GO-related categories. See also Supplementary Data 2 (2H).
Shown is an example web browser view at yeastepigenome.org of ChIP–exo occupancy patterns for all targets (for example, Reb1) around predefined genomic features. Rows are sorted by gene or promoter (NFR/NDR) length, or by distance from the indicated reference feature (where x = 0). Promoter classes include (from top to bottom) RP, STM, TFO, UNB and others. See Supplementary Data 1 (1G, 1J, 1C) for the identification numbers and coordinates of respective row features, and for the sort order of features that are constant in all target display windows. The lower right box (when present) provides strand-separated tag 5′ ends distributed around the target’s cognate DNA motif, with the motif’s opposite strand (red) inverted in the composite plot. Corresponding colour-coded nucleotide sequences are shown. All images, underlying data values and datasets can be downloaded through embedded ‘META DATA’ target-specific links at yeastepigenome.org. Each dataset download includes a ReadMe file describing the contents of the download. We warn that targets with only a single replicate did not pass our significance threshold. See Supplementary Data 1 (1C) for sort orders that are not provided in the download.
Left, plots showing the ChIP–exo patterns for Orc6 and Mcm5. Bold lines represent means; dotted lines represent the 5–95% confidence interval (CI). The CI was calculated for each base pair in the 1-kb window across all ACSs (n = 253). Right, heatmaps showing ACS occupancy by Orc6 and Mcm5. Blue represents ChIP–exo data on the ACS motif strand; red represents data on the opposite strand.
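The per-base-pair summary described in this legend can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the "5–95%" band is the 5th to 95th percentile of tag counts across sites at each position, and the function names and the toy matrix are hypothetical.

```python
from statistics import mean


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("percentile() needs at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def composite_with_band(matrix, low=5, high=95):
    """Summarise a sites-by-positions matrix column by column.

    matrix: one row per aligned site (for example, one ACS), one column
    per base pair in the window. Returns three lists of equal length:
    the mean, the lower percentile and the upper percentile per position.
    """
    n_cols = len(matrix[0])
    if any(len(row) != n_cols for row in matrix):
        raise ValueError("all rows must cover the same window")
    means, lows, highs = [], [], []
    for column in zip(*matrix):  # iterate over positions, not sites
        ordered = sorted(column)
        means.append(mean(ordered))
        lows.append(percentile(ordered, low))
        highs.append(percentile(ordered, high))
    return means, lows, highs


# Toy input: 3 sites x 3 positions (made-up numbers, not data from the study).
toy = [[0, 2, 4], [1, 2, 3], [2, 2, 2]]
means, lows, highs = composite_with_band(toy)
```

In the legend's setting the matrix would have 253 rows (one per ACS) and 1,000 columns (the 1-kb window), with one such matrix per strand.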
Shown is gene-body occupancy (strands combined) of selected Pol II elongation-associated targets (a and b have different datasets, as annotated). In each panel, data were aligned and orientated by TSS (left) and TES (right). Shown are the top 200 coding genes (middle) and the top 200 noncoding genes (bottom) (based on Sua7 occupancy). See also Fig. 3c and note that the RP panels are identical to Fig. 3c. The y-axis values represent arbitrary linear units (a.u.), and are not comparable across different datasets, but are scaled equivalently across each of the six subpanels for the same dataset (shown with the same fill colour). Individual plotted values can be found in Supplementary Data 5.
Heatmaps show the occupancy of the TFIIIB (Bdp1 and TBP/Spt15) components of the Pol III PIC and the TFIIB (Sua7) components of the Pol II PIC at the five Ty LTR classes, along with the nucleotide composition (±100 bp from the LTR start; from yeastepigenome.org; Gs, As, Ts and Cs are in yellow, red, green and blue, respectively). All rows are linked and sorted by LTR class, then length.
Extended Data Fig. 7 Properties of inducible (STM), insulated (TFO) and constitutive (UNB) Pol II promoters.
a, NDRs are nucleosomal in vitro, while NFRs are mostly nucleosome-free. Heatmap of in vitro reconstituted MNase H3–ChIP nucleosomes (right), aligned by in vivo +1 nucleosome dyads and sorted by distance between the in vivo +1 nucleosome dyad and the first upstream stable nucleosome (USN) dyad (in vivo, left). b, Insulator ssTFs uncouple divergent PIC assembly. Shown are correlation coefficients between PIC (TFIIB/Sua7) occupancy (100-bp window centred on the TSS) at divergent TSSs that share the same promoter region. Data are presented as means ± s.d., from N = 6 biologically independent experiments, using two-tailed t-tests with no multiple comparisons. RP and STM promoters are merged here. c, Insulation at tandem genes. Shown are composite plots of PIC occupancy (green, TFIIB/Sua7) for promoter regions that share an upstream termination region (that is, tandem genes). Pcf11 (a representative termination factor) is in light brown, along with all 78 ssTFs (cyan), either collectively (‘TF’, top two panels) or individually, as indicated. STM, TFO and UNB composites are shown. The top ten insulator-associated ssTFs are based on the number of TFO promoters bound.
a, Shown are strand-separated composite plots of 78 ssTFs bound at their cognate sites, and grouped by their meta-assemblage label (coloured borders, as defined in Extended Data Fig. 1c). Plots are oriented and centred by motif, and extend from −100 bp to +100 bp. Patterns were highly penetrant across individual sites for each ssTF (for example, see the lower right panel in Extended Data Fig. 2 for Reb1 and yeastepigenome.org for other ssTFs). b, ChIP–exo composite profiles of individual subunits of the Mediator complex at motifs that are bound by the ssTF Yrr1 (from −500 bp to +500 bp), showing consistency of patterning across Mediator subunits.
a, Venn diagram showing promoters that have overlapping locations of STM cofactors (greater than 0 ChExMix calls in the dataset ‘SampleIDs’ in Supplementary Data 2 (10A)), as well as Z scores for pair-wise overlaps. b, Representative architecture of STM cofactors or PIC components at a consolidated set of ssTF motifs at 984 STM promoters (not strand-separated; see Methods and Supplementary Data 1 (1AI)), oriented by TSS. c, Frequency distribution of promoters having the indicated PIC/TFIID (GTF/Taf2) occupancy ratios. The GTF that was used to measure occupancy is indicated at the top of each panel. These six GTFs were averaged in Fig. 5c. ‘MA’, three-bin moving average.
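The 'MA' trace in panel c can be illustrated with a short sketch. It bins occupancy ratios into a frequency distribution and smooths the counts with a centred three-bin moving average. The bin edges, the handling of the first and last bins (the window is shortened there) and the toy ratios are assumptions made for illustration; the source does not specify them.

```python
def histogram(values, edges):
    """Count values per bin; bins are [edges[i], edges[i+1]), last bin closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break  # each value falls into at most one bin
    return counts


def moving_average(counts, window=3):
    """Centred moving average; the window shrinks at both ends (an assumption)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        lo = max(0, i - half)
        hi = min(len(counts), i + half + 1)
        smoothed.append(sum(counts[lo:hi]) / (hi - lo))
    return smoothed


# Toy GTF/Taf2 ratios (made-up numbers, not data from the study).
ratios = [0.1, 0.4, 0.6, 0.7, 1.2, 1.9]
edges = [0.0, 0.5, 1.0, 1.5, 2.0]
counts = histogram(ratios, edges)
ma = moving_average(counts, window=3)
```

A shrinking window keeps the smoothed trace the same length as the histogram; padding the ends or dropping them would be equally defensible choices.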
Genes encoding ssTFs are capitalized and are connected to their encoded proteins (ovals). Arrows connect ssTFs to other ssTF-encoding genes, to which they are bound via their cognate motif. ssTFs that bind to their own genes or create a loop are indicated with blue arrows (light blue where a motif was not detected for ssTFs binding their own gene). ssTFs are colour-coded on the basis of their meta-assemblage membership (see key). Promoter-bound ssTFs that are also particularly enriched with bound cofactors have coloured halos. Short diagonal black arrows point to the number of genes, out of all 5,378 coding genes, that are bound by that ssTF at its cognate motif (first number given) or where no motif was detected (second number). The average relative PIC (TFIIB/Sua7) occupancy level for each of those sets of genes is indicated by the number of green dots.
This file contains Supplementary Notes (Evidence that most UNB genes are unlikely to be bound by TFs or STM cofactors), Supplementary Fig. 1, and additional references.
List of features and genome coordinates used in this study. The information found in each column is described in the tab labeled “Key.”
Target list, including those that passed and failed QC, and progressed into UMAP analysis. UMAP/Kmeans cluster ID, labels, related biochemical complexes, and derived meta-assemblages for each target are listed. Binary (bound vs not bound) calls are made for target datasets against every genomic feature, along with averages for feature classes. Penetrance of each target into each promoter class is reported. The overlap between peaks from this study and peaks from MacIsaac et al.53 is presented in the tab labeled “MacIsaac Comparison.” -log10 p-values are derived from the ChiTest function.
Values report the number of overlapping ChExMix peaks between any two of the 371 targets examined in this study, as illustrated in Fig. 1b.
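A pairwise overlap table of this kind could be built along the following lines. This is a sketch under stated assumptions, not the procedure used in the study: two peaks are treated as overlapping when they lie on the same chromosome within a hypothetical `max_dist` of 40 bp, the count is taken from the first target's point of view (so the table need not be symmetric), and the target names and coordinates are invented.

```python
from itertools import permutations


def count_overlaps(peaks_a, peaks_b, max_dist=40):
    """Count peaks in A with at least one peak in B within max_dist bp.

    Peaks are (chromosome, position) tuples. The distance cutoff is an
    illustrative assumption, not a value taken from the study.
    """
    by_chrom = {}
    for chrom, pos in peaks_b:
        by_chrom.setdefault(chrom, []).append(pos)
    return sum(
        1
        for chrom, pos in peaks_a
        if any(abs(pos - other) <= max_dist for other in by_chrom.get(chrom, []))
    )


def overlap_table(peaks_by_target, max_dist=40):
    """Return {(target_a, target_b): overlap count} for every ordered pair."""
    return {
        (a, b): count_overlaps(peaks_by_target[a], peaks_by_target[b], max_dist)
        for a, b in permutations(peaks_by_target, 2)
    }


# Toy peak calls for three hypothetical targets (made-up coordinates).
peaks = {
    "targetA": [("chrI", 100), ("chrI", 500), ("chrII", 300)],
    "targetB": [("chrI", 120), ("chrII", 900)],
    "targetC": [("chrII", 310), ("chrI", 510)],
}
table = overlap_table(peaks)
```

The nested scan is quadratic in the number of peaks per chromosome; for genome-scale peak lists, sorting positions and using binary search would be the natural refinement.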
General description of data resources and code availability.
Values for individual composite traces, as plotted in main and extended data figures.
About this article
Cite this article
Rossi, M.J., Kuntala, P.K., Lai, W.K.M. et al. A high-resolution protein architecture of the budding yeast genome. Nature 592, 309–314 (2021). https://doi.org/10.1038/s41586-021-03314-8