Introduction

Eukaryotic gene expression is extensively regulated, mostly at the transcription initiation step. Nucleosomes bind to promoters and form a physical barrier that blocks transcription, indicating that destabilizing the interaction between DNA and histones is important for active transcription. Because acetylation of lysine residues in the histone tail region neutralizes the positive charge of histones and diminishes the electrostatic interaction between histones and the negatively charged phosphodiester backbone of DNA, histone acetylation has long been considered to be generally required for active transcription1 and other biological processes that demand DNA access2.

SAGA (Spt-Ada-Gcn5 acetyltransferase) is a eukaryotic transcription coactivator complex that controls transcription by modifying histones. The story of SAGA started with the discovery that the transcription coactivator Gcn5 was a nuclear histone acetyltransferase (HAT)3. Although Gcn5p showed HAT activity with free histones, it did not acetylate nucleosomal histones; this finding led to the discovery of functional HAT complexes containing Gcn5p4. Two HAT complexes were found in yeast: the 1.8-MDa SAGA complex and the 0.8-MDa ADA complex.

HAT complexes containing the Gcn5p homolog were also found in humans and were independently named the SPT3-TAF9-GCN5 acetyltransferase (STAGA) complex and TATA-binding protein-free TAF-containing complex (TFTC)5,6. STAGA and TFTC were initially regarded as distinct complexes, but both have since been increasingly recognized as corresponding to the human SAGA (hSAGA) complex. SAGA has retained its transcriptional coactivator function throughout evolution, although its specific role in transcription has been modified in some species. For example, yeast SAGA participates in the transcription of stress-inducible genes, such as heat-shock genes7,8,9,10, whereas hSAGA activity has been observed at ER stress-induced promoters but does not appear to be involved in the p38 MAPK pathway-mediated acetylation of histones at sodium arsenite-induced promoters11. In addition to the classical coactivator function of its HAT activity, SAGA regulates transcription via Ubp8p, which is a ubiquitin-specific protease (UBP) that catalyzes H2Bub1 deubiquitylation12,13,14,15. Multiple lines of evidence developed over the past decade have shown that the SAGA subcomplex critical for this deubiquitylase (DUB) activity also regulates other aspects of gene expression, including the nucleocytoplasmic export of mRNAs16,17. Collectively, these studies suggest that the SAGA complex may comprehensively coordinate the entire gene expression process and that, conversely, malfunction of SAGA may deregulate gene expression in a manner that may be linked to various diseases.

Throughout this review, proteins are designated by the standard nomenclature for the corresponding organism. For example, the Gcn5 protein in Saccharomyces cerevisiae is designated “Gcn5p”; in Schizosaccharomyces pombe, it is “Gcn5”; and in Homo sapiens, it is “GCN5.” The nomenclature of other SAGA subunits in four representative model organisms is summarized in Table 1 (refs. 18,19).

Table 1 Subunits of SAGA in four representative model organisms.

Functional modules of SAGA

Yeast SAGA is composed of 19 subunits that are organized into 4 functionally distinct modules: the HAT module (Gcn5p, Ada2p, Ada3p, and Sgf29p), the core structural module (Taf5p, Taf6p, Taf9p, Taf10p, Taf12p, Ada1p, Spt7p, Spt20p, Spt3p, and Spt8p), the DUB module (Ubp8p, Sgf73p, Sus1p, and Sgf11p), and the transcription factor-binding (TF-binding) module (Tra1p)18,19 (Fig. 1 and Table 1). The modular organization of SAGA was first revealed by genetic studies in yeast20 and was subsequently supported by biochemical experiments and electron microscopy21. In higher eukaryotes, the splicing module (SF3B3 and SF3B5) was also found as a subcomplex of the SAGA complex22,23,24. The splicing module facilitates the activation and proper splicing of some SAGA-regulated transcripts, but the specific role of the SF3B subcomplex within SAGA warrants further investigation24. The compartmentalization of these functional modules allows the SAGA complex to intricately and dynamically regulate gene expression, as discussed in detail below.
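The module composition listed above can be written out as a small lookup table, which also makes it easy to confirm that the four yeast modules add up to the stated 19 subunits. The following Python sketch is illustrative only: the names SAGA_MODULES, subunit_count, and module_of are hypothetical and do not come from any published tool, and the membership lists simply transcribe the yeast subunits named in the text.

```python
# Yeast (S. cerevisiae) SAGA modules, transcribed from the text above.
# The variable and function names are hypothetical, chosen for illustration.
SAGA_MODULES = {
    "HAT": ["Gcn5p", "Ada2p", "Ada3p", "Sgf29p"],
    "core": [
        "Taf5p", "Taf6p", "Taf9p", "Taf10p", "Taf12p",
        "Ada1p", "Spt7p", "Spt20p", "Spt3p", "Spt8p",
    ],
    "DUB": ["Ubp8p", "Sgf73p", "Sus1p", "Sgf11p"],
    "TF-binding": ["Tra1p"],
}


def subunit_count(modules):
    """Return the total number of subunits across all modules."""
    return sum(len(members) for members in modules.values())


def module_of(subunit, modules=SAGA_MODULES):
    """Return the name of the module containing `subunit`, or None."""
    for name, members in modules.items():
        if subunit in members:
            return name
    return None


if __name__ == "__main__":
    print(subunit_count(SAGA_MODULES))  # 4 + 10 + 4 + 1 subunits
    print(module_of("Ubp8p"))
```

A table of this kind covers only the yeast complex; the metazoan splicing module (SF3B3 and SF3B5) and organism-specific paralogs would need additional entries.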

Fig. 1: Schematic diagram showing the modular organization of the SAGA complex.

For the sake of simplicity, each subunit is labeled with the name used in S. cerevisiae. The schematic diagram shows the major functions of each module and presents recent structural data obtained from yeast49,53,64. Subunits belonging to each module are colored similarly: red, HAT module; blue, core module; green, TF-binding module; and yellow, DUB module. Subunits having a histone octamer-like fold in the core module are depicted as half circles that form circles with their corresponding partners: Taf6-Taf9; Ada1-Taf12; Spt7-Taf10; and Spt3, which has two octamer-like folds. The dotted circle near Spt3 and Spt8 indicates the TBP-binding site, where TBP is recruited at the transcription initiation step.

HAT module

Gcn5p, which is one of the best-studied proteins of the GCN5-related N-acetyltransferase (GNAT) superfamily, serves as the catalytic subunit of the HAT module (Fig. 2a). Recombinant Gcn5p acetylates free histones but is not sufficient on its own to acetylate nucleosomal histones25. Biochemical and structural studies have revealed that Ada2p potentiates Gcn5p HAT activity by cooperatively binding to Gcn5p and changing its conformation to a catalytically active form. Specifically, Ada2p promotes Gcn5p binding to acetyl-CoA26. Remarkably, yeast with ADA2 deletion showed decreased telomeric silencing, indicating that Ada2p plays a specific role in maintaining genomic stability27. Deletion of GCN5 did not yield a comparable phenotype27, implying that Ada2p participates in telomeric silencing through its own targeting activity, independent of the HAT activity of Gcn5p.

Fig. 2: SAGA regulates transcription to fine-tune gene expression.

Subunits with a major role in each step are colored; the others are left uncolored. a SAGA promotes an open chromatin structure through its HAT activity, which can be allosterically regulated by the proteasome to favor transcription initiation67,110. When this occurs, Bre1/Rad6-dependent H2Bub1 triggers the methylation of histone H3 by Set1, yielding TSS-associated histone modifications that act as markers to recruit downstream effectors that facilitate transcription initiation. b Deubiquitylation of ubiquitylated H2B is mediated by DUBm and is necessary for the recruitment of Ctk1, which phosphorylates Ser2 of RNAPII CTD and allows the release of paused RNAPII104. For productive elongation, the nucleosome barrier must be overcome. Histone chaperones (FACT and Spt6) and chromatin remodelers may be essential in this process and may cooperatively regulate the transition from initiation to transcription elongation. c The Rpt2p-Sgf73p interaction leads to the separation of DUBm, and the separated Sgf73 contributes to cotranscriptional mRNP surveillance and the mRNA export pathway93,111. Sus1, which is a subunit of both DUBm and TREX-2, may mediate the targeting of genes to nuclear pore complexes (NPCs).

Humans have two paralogs of the yeast Gcn5p protein, GCN5 and p300/CBP-associated factor (PCAF)28. PCAF is found in both the SAGA complex and p300/CBP, where it contributes to the activation of transcription. The most significant difference between GCN5 and PCAF is the E3 ubiquitin ligase domain of the latter, which may enable PCAF to regulate the stability of transcription factors or signaling proteins and thereby participate in diverse regulatory mechanisms29,30.

The HAT module is shared between the SAGA and ADA complexes in S. cerevisiae4,31. The ADA complex consists of the SAGA HAT module and two additional subunits, Ahc1p and Ahc2p32. Although the ADA complex lacks an activator-targeting subunit, it is thought to be recruited to chromatin with relatively low specificity through the bromodomain of Gcn5p or the activator-domain-binding sites of Ada2p/Ada3p, whereby it helps maintain the global histone acetylation level. A recent study suggested that there may be an ADA-equivalent complex in Drosophila33; however, no homolog of Ahc1p or Ahc2p has yet been identified in this model. Thus, it is unclear whether this proposed complex can be considered homologous to the yeast ADA complex or whether it is simply a different form of the SAGA HAT module.

In metazoans, the HAT module is also found in the Ada2a-containing (ATAC) complex34,35,36, although the composition of the ATAC HAT module differs slightly from that of the SAGA HAT module. Metazoans have two paralogous ADA2 proteins, ADA2a and ADA2b, which are specific to the ATAC HAT module and SAGA HAT module, respectively37,38. They also differ in that ATAC possesses a second HAT subcomplex, ATAC2, which is conserved in flies and mammals39,40. Differences in catalytic subunits and adapter protein species direct distinct HAT activities: histone H3-specific acetyltransferase activity is observed with SAGA, whereas H3/H4-specific activity is observed with ATAC39,41,42. Other subunits of the ATAC complex (YEATS2, ATAC1/ZZZ3, MBIP, WDR5, and DR1/NC2β) have been suggested to be involved in structural or regulatory functions; a recent study partially supported this idea by showing that ATAC1 modulates the histone H3 acetyltransferase activity of the ATAC complex43.

ChIP-seq data for ATAC and SAGA showed that these two HAT complexes possess overlapping and nonoverlapping binding sites44. While SAGA prefers promoters to enhancers45, ATAC is associated with both enhancers and promoters. The following are among the yet-unanswered questions: which proteins are responsible for targeting ATAC specifically on enhancers, and which ATAC subunits interact with them?

Core module

The core structural module, which is the largest module in the SAGA complex, consists of ten subunits. It critically contributes to the assembly of the preinitiation complex (PIC) by recruiting TBP and transmitting signals from the TF-binding module to the HAT and DUB modules. Taf6p-Taf9p, Taf10p-Spt7p, Taf12p-Ada1p, and Spt3p have histone-like fold domains (HFDs) and together form an asymmetric octamer-like fold46 (Fig. 1). In yeast, five subunits are shared between the core module of SAGA and the general TF complex TFIID: Taf5p, Taf6p, Taf9p, Taf10p, and Taf12p. In Drosophila and humans, however, TAF5 and TAF6 are not shared between the two complexes; instead, the paralogs TAF5L and TAF6L are preferentially found in SAGA47, while TAF5 and TAF6 are in TFIID. The relative positions of the corresponding TAFs are conserved between SAGA and TFIID, including the TBP-binding site in these complexes and the histone octamer-like folds. These TAFs, shared between SAGA and TFIID, are thought to be important for promoting PIC formation by SAGA.

In yeast, Ada1p, Spt7p, and Spt20p are integral to the proper assembly of the SAGA complex4,48. A structural analysis of the SAGA core module organization showed that Taf5p is central for core module assembly and provides docking sites for histone fold pairs on its C-terminal WD40 domain49. Spt20p was found to critically stabilize the complex by providing a wedge-like structure that can be intercalated between the WD40 domain and N-terminal domain of Taf5p. Notably, a recent study on S. pombe showed that SPT20 deletion did not affect the assembly of SAGA complex subunits, except for DUBm and Tra150. These results suggest that, at least in S. pombe, Spt20 is not necessary for SAGA assembly.

As characterized by biochemical assays and supported by structural data, Spt3p and Spt8p are known to bind TBP, load it onto a promoter, and promote the assembly of the PIC. For genes that are inhibited by SAGA, however (e.g., HIS3 and TRP3), Spt3p and Spt8p may prevent the binding of TBP to the TATA box51. Although Spt3p is conserved in Drosophila and humans, no homolog of Spt8p has been found in metazoans. The SPT3-null mutation results in mating-defective phenotypes, sporulation in diploids, and invasive growth in haploids; in contrast, the SPT8-null mutant shows no severe defect, suggesting that the loss of Spt8p during evolution would not have had a significant effect. In this regard, metazoan SAGA can be compared with the yeast SAGA-like (SLIK) complex, which lacks Spt8p due to truncation of the Spt8p-interacting helices in Spt7p52. Uncovering the detailed roles of the SLIK complex might improve our understanding of how SAGA has changed throughout evolution.

Recent cryogenic electron microscopy (cryo-EM) images showed that the C-terminal stirrup region of TBP is bound by the Spt3p pocket, whereas the N-terminal region of TBP is in contact with Spt8p53. Consistent with early biochemical evidence, the SPT8-null mutant does not induce any alteration in the physical interaction between Spt3p and TBP54. Together, these findings suggest that Spt3p is a major player in the binding of TBP, whereas Spt8p tends to play auxiliary and/or regulatory roles.

TF-binding module

The TF-binding module comprises a single protein, Tra1p, which is the largest subunit (~433 kDa) of SAGA. Tra1p belongs to the phosphoinositide 3-kinase-related kinase (PIKK) family but lacks catalytic activity, making it the only pseudokinase of the PIKK family55,56. Similar to other PIKK family proteins, Tra1p requires chaperone and cochaperone proteins to ensure that it is properly folded and assembled on the HAT complex50,57,58. Tra1p is composed of three domains: the HEAT (Huntingtin, elongation factor 3, PR65/A, and TOR) domain, which harbors various binding sites for acidic activators, including Gcn4p, Gal4p, and Rap1p; the FAT (FRAP, ATM, and TRRAP) domain; and the PIKK domain59,60. Recent structural data suggested that Tra1p binds to the SAGA core module using a cup-shaped motif in the FAT domain50. A narrow hinge formed between Tra1p and the core module is thought to provide structural flexibility for regulatory functions60.

Tra1p is also found in the NuA4 HAT complex61 as part of the activator-targeting module of NuA462. The only exception to this pattern is found in S. pombe, where two paralogous genes correspond to TRA1; designated TRA1 and TRA2, these proteins are exclusively found in SAGA and NuA4, respectively63. The nonoverlapping distribution of Tra1 and Tra2 provides a way to characterize the SAGA- or NuA4-specific roles of Tra1p. In contrast to SAGA, which has other subunits that can bind TFs, none of the NuA4 subunits is known to interact with TFs. Considering that Tra1-depleted cells are viable but Tra2-depleted cells are not, the necessity of Tra1 for the viability of other organisms likely reflects the importance of the appropriate targeting of NuA4.

Since SAGA and NuA4 from S. cerevisiae occupy distinct facets of Tra1p, the access of activators to the Tra1p-activator-binding domains would be restricted depending on the HAT complex to which Tra1p is bound64. For instance, Tra1p and Eaf1p form the structural core of NuA465, whereas Tra1p is at the periphery of the SAGA complex64, where its integrity is maintained by the core subunit. Thus, because distinct facets of Tra1p interact with HAT complexes, Tra1p might recognize different populations of TFs. A recent study on SAGA and NuA4 showed that depletion of the HAT catalytic subunits of SAGA and NuA4 led to different effects on Pol II occupancy66, implying that their targeting might be distinct. Nucleosome binding induces a structural change in the SAGA complex49 in a manner that displaces the HAT module and the DUB module from the core module, enabling them to bind to the nucleosome in the proper orientation. Considering that an activator is generally thought to be important in determining whether the coactivator binds a genomic region (and the associated nucleosomes), one possible supposition is that Tra1p may induce a conformational change in SAGA upon binding to an activator. Studies have shown that nucleosomal HAT activity is more efficient when the HAT module is in a separate ADA complex than when ADA is incorporated into SAGA31, suggesting that allowing the HAT module to become more flexible upon TF binding may enable the precise regulation of the nucleosomal HAT activity of SAGA.

DUB module

Histone H2B ubiquitylation has been shown to be essential for many chromatin-based actions, but its role in transcription regulation is undoubtedly the most thoroughly studied. Work from many laboratories has revealed the existence of cotranscriptional H2B ubiquitylation–deubiquitylation cycles and their roles in directly stimulating transcription elongation67. The SAGA DUB module (DUBm) in S. cerevisiae consists of the catalytic Ubp8p (ubiquitin protease 8) subunit along with Sgf11p (SAGA-associated factor 11), Sus1p (SI gene upstream of ySa1), and Sgf73p (SAGA-associated factor 73). The deubiquitylation of H2Bub1 by SAGA was first discovered by researchers studying S. cerevisiae Ubp8p; they demonstrated that both H2B ubiquitylation and its successive deubiquitylation are essential for gene activation14,15. A second component of DUBm, Sgf11p, was subsequently identified and found to be required for the deubiquitylation process68,69,70. Next, Sus1p was identified as a component of both SAGA and the Sac3p-Thp1p mRNA export complex71, which was later named TREX-2. Sus1p, whose recruitment to SAGA depends on Ubp8p and Sgf11p, has also been shown to be required for the H2B-deubiquitylation activity of DUBm72. Sgf73p, which was initially identified as a novel component of SAGA through a proteomic approach73, is required for the proper assembly of DUBm onto the SAGA complex and the recruitment of TREX-2 to SAGA. A physical interaction between SAGA and TREX-2 is essential for targeting the transcriptional machinery to the periphery of the nuclear pore complex (NPC) in a phenomenon known as “gene gating”74,75.

Studies have shown that Ubp8p alone cannot carry out the deubiquitylation reaction69, and its zinc-finger (ZnF)-UBP domain cannot bind to free ubiquitin68, suggesting that Ubp8p activity may be subject to allosteric regulation promoted by a nonsubstrate partner (likely another DUBm subunit). The minimal DUBm structure that confers full deubiquitylation of the nucleosome consists of Ubp8p, Sgf11p, Sus1p, and an N-terminal fragment of Sgf73p (amino acids 1–104)74. Several studies have shed light on the structural basis for the deubiquitylation activity of DUBm76,77,78. Sgf11p was found to activate Ubp8p through its C-terminal ZnF domain, which directly interacts with the C-terminal catalytic domain of Ubp8p. An interaction between Sus1p and the helix of Sgf11p and between Sus1p and the ZnF domain of Ubp8p yields an entirely closed conformation of DUBm that seems to stabilize its activity. Sgf73p facilitates the coupling of the “catalytic lobe” and the “assembly lobe” of DUBm so that they act together as a unified module; its C-terminal fragment is required to connect DUBm with the remainder of SAGA53,79. Overall, the four subunits seem to have a highly intertwined arrangement. Recent structural studies elucidated the chromatin-binding interface of DUBm80, and these findings suggested that there are interactions between the ZnF domain of Sgf11p and the conserved acidic patch in histone H2A/H2B81 and between the catalytic domain of Ubp8p and histone H2B. However, the crystal structure assessed in the abovementioned reports did not include the SCA7 domain within Sgf73p, which contains the second ZnF domain and was previously reported as a nucleosome-binding domain82. Notably, a recent in vitro study suggested that a fine-tuned regulatory mechanism for generating sophisticated ubiquitylation patterns might involve the competition of DUBm with Bre1p for binding the acidic patch in the nucleosome core particle83.

DUBm is clearly conserved in Drosophila melanogaster84,85,86 and H. sapiens87,88,89. In contrast, the ATXN7 (the human ortholog of yeast Sgf73p) gene has undergone diversification during evolution and is found as two paralogs, ATXN7L1 and ATXN7L2, in Mus musculus and H. sapiens86,87,90. Both paralogs have been described as being part of the SAGA complex, as assessed through proteomic analysis91. Another subunit of DUBm, ATXN7L3 (the human ortholog of yeast Sgf11p), also has a paralog called ATXN7L3B. However, ATXN7L3B localizes to the cytoplasm and does not associate with either SAGA or TREX-2, although it interacts with ENY2 (the human ortholog of yeast Sus1p)92.

DUBm has been shown to dissociate from SAGA under certain conditions. In S. cerevisiae, the 19S regulatory particle (19S RP) of the proteasome facilitates the separation of functional DUBm from SAGA through a physical interaction93. Consistent with this finding, studies from different laboratories have shown that DUBm can function as an independent module. For example, it reportedly remained stable following the loss of dAtxn7 (the Drosophila ortholog of yeast Sgf73p, which has been reported to structurally connect DUBm with SAGA) in D. melanogaster86 and upon knockdown of SUPT20H (human ortholog of yeast Spt20p, which is essential for maintaining the integrity of SAGA) in H. sapiens11. In striking contrast to the effect seen in yeast, a few studies showed that H2B ubiquitylation decreased following the loss of dAtxn7 and ATXN7L3 in metazoans86,87,90. Thus, DUBm may maintain its integrity without SAGA, or may even lose its initial role as a ubiquitin protease, and therefore likely has an alternative, SAGA-independent function.

The SAGA DUBm and transcription activation and elongation

Histone H2B ubiquitylation facilitates transcription elongation by cooperating with FACT to support efficient nucleosome reassembly94,95,96. H2B ubiquitylation may also directly regulate transcription elongation by changing the chromatin structure. Although the role of H2B ubiquitylation in nucleosome stabilization seems to suggest the opposite effect97,98, several biochemical studies have shown that H2B ubiquitylation disrupts chromatin compaction and promotes an open chromatin structure99,100,101. H2B deubiquitylation has also been shown to have a profound effect on subsequent H3 methylation and productive transcription14,15,88,89,102,103. Notably, Ubp8p and Rad6p/Bre1p may interact with elongating RNA polymerase II, travel along with it into the coding regions14,104, and act predominantly on gene bodies45. These findings collectively suggest that repeated and transient cycles of H2B ubiquitylation and deubiquitylation occur in the wake of RNA polymerase II during transcription elongation and function as a key checkpoint for optimal gene activation.

During the early stage of transcription, the PAF (polymerase-associated factor) complex is recruited to RNA polymerase II, with which it interacts through pSer5 in the RNA polymerase II CTD (C-terminal domain). This PAF recruitment facilitates Rad6p/Bre1p-mediated H2B ubiquitylation in S. cerevisiae. Deubiquitylation of this ubiquitylated H2B is mediated by DUBm; this modification is necessary for the subsequent recruitment of Ctk1p104, which phosphorylates Ser2 in the RNA polymerase II CTD (Fig. 2b). pSer2 crucially supports the transition to productive elongation through its regulatory role in promoter-proximal pausing, suggesting that DUBm regulates the release of paused RNA polymerase II. A more recent study showed that SAGA binding is highly correlated with the pausing site located at the 5′ end of the genes in D. melanogaster105. Another study found that the level of H2Bub1 was decreased by depletion of dDsk2, that this effect coincided with a disruption in RNA polymerase II pausing at several developmental target genes, and that the dDsk2-depletion-induced downregulation of H2Bub1 can be suppressed by the codepletion of Nonstop106.

The SAGA DUBm in mRNA export and surveillance

In addition to regulating transcription activation and elongation through histone modifications, SAGA has been shown to be important for the proper export of mRNAs. SAGA shares its DUBm subunit, Sus1p, with the TREX-2 complex;71 this functional and physical interaction between the two complexes enables SAGA to function in maintaining genome stability and coordinating mRNA export93,107 (Fig. 2c). Interestingly, several studies have suggested that the proteasome may also contribute to mRNA export through SAGA and TREX-2. The evolutionarily conserved proteasomal 19S RP subunit, Sem1p, which is another TREX-2 complex subunit108, is involved in the recruitment and deubiquitinating activity of SAGA109. Notably, the 19S RP can nonproteolytically alter the biochemical properties of SAGA to promote gene activation110. Rpt2p, which is an ATPase subunit of 19S RP, has been shown to induce the separation of a functional DUBm from SAGA by physically interacting with Sgf73p of DUBm. This separation is essential for the localization of Mex67-Mtr2 and the TREX-2 complex to the transcriptional machinery and thus successful mRNA export93. Furthermore, a recent study showed that Sgf73p contributes to mRNP surveillance and the Mex67-mediated noncanonical mRNA export pathway under stress conditions111. This study also suggested the possibility of an independent role for Sgf73p separate from the deubiquitylation process. More specifically, the study showed that Sgf73p is the sole factor required for growth restoration of the mRNA export defective mutant yra1-1, although the DUBm separated from SAGA upon Rpt2p-Sgf73p interaction has functional deubiquitylation activity in vivo93. Nevertheless, H2B ubiquitylation itself seems to be essential, as H2Bub1 and H2Bub1-dependent Swd2p ubiquitylation are required for Mex67p recruitment and export-competent mRNP biogenesis112. 
Last, Sgf73p has been suggested to promote the translocation of the transcription site to the nuclear periphery, which is a phenomenon known as “gene gating”74,113,114. This action is believed to allow efficient export of newly transcribed mRNAs. In the future, researchers should study the molecular details underlying the ability of DUBm-mediated deubiquitylation to regulate mRNA biogenesis and assess whether DUBm or Sgf73p plays a role irrespective of deubiquitylation activity.

SAGA and human disease

Owing to the importance of SAGA in proper gene expression, malfunction of SAGA subunits is likely to cause developmental defects or diseases115,116. During development, gene expression must be tightly regulated at each stage to ensure that restricted sets of required genes are activated at specific times while other sets remain silent. A defect in SAGA can disable sequential gene activation, impairing normal development. For example, Drosophila requires Gcn5 for normal metamorphosis, oogenesis, and cell proliferation in imaginal tissues103,117. In mouse, GCN5 depletion causes embryonic lethality118, GCN5 hypomorphic mutants show a defect in neural tube closure119, and conditional depletion of GCN5 results in a decrease in brain size120. Although PCAF-null mice exhibit normal embryonic development, the double deletion of GCN5 and PCAF causes a more severe phenotype than that observed following depletion of GCN5 alone121. This outcome suggests that GCN5 and PCAF have both redundant and distinct roles. A recent study in zebrafish revealed that knocking down GCN5 or PCAF disturbs cardiac and limb development122. Not confined to the HAT catalytic subunit, SAGA itself seems to be important for proper development, as mice harboring a hypomorphic mutant allele of SUPT20 (the mouse homolog of yeast Spt20p) displayed defects in axial skeletal development123. SAGA is also important for maintaining the stemness of stem cells. A well-known target of GCN5 acetyltransferase is c-Myc, which is a pluripotency factor known to be important in the early reprogramming phase of pluripotency induction124. Thus, the activation of c-Myc by GCN5 overexpression and enhanced SAGA recruitment facilitates reprogramming. Similarly, GCN5 overexpression, acting as a gain of function, has been identified as a source of c-Myc-driven cancers in various tissues125,126,127,128,129,130.
SAGA activates the transcription of c-Myc and is recruited to c-Myc-targeted genes through interactions between c-Myc and SAGA subunits (e.g., TRRAP, STAF65γ, and KAT2A)131,132,133. Since many c-Myc-dependent genes are related to cell proliferation and cell growth134, this interaction may further accelerate malignancies.

Another remarkable subunit of SAGA that appears to be highly engaged in cancer is USP22, which is the catalytic subunit of DUBm. USP22 was initially identified in microarray screens as a member of an 11-gene “death-from-cancer” signature that can be used to predict tumor recurrence, metastasis, highly aggressive tumors, and poor prognosis for people with one of several types of cancers135,136. More recent studies have shown that USP22 is not only a marker of aggressive tumors but also a tumor-inducing factor. USP22 plays an essential role in regulating gene activation and cell growth and in maintaining genome integrity. USP22 itself regulates transcription through its deubiquitylation activity and is also involved in regulating several TFs, including the androgen receptor137, the oncogene c-MYC88, and the tumor suppressor p53138. USP22 was also shown to act as a key factor in cell cycle progression through the G1 phase by controlling CCND1 ubiquitylation in non-small-cell lung cancer139. Further, USP22 has been linked to the maintenance of genome integrity by modulating the stability of TRF1, which regulates the length of a telomere and whose misregulation causes chromosomal abnormalities and cell death140. Interestingly, the depletion of GCN5 does not alter TRF1 mRNA expression, and the expression of catalytically inactive GCN5 does not affect TRF1 protein expression, but the loss of GCN5 alters the TRF1 protein level because it impairs the associations of USP22 and ATXN7L3 with SAGA. Overall, the evidence indicates that overexpression of USP22 in cancer leads to the abnormal activation of several pathways involved in cell survival and tumorigenesis. Thus, the development of new cancer therapies targeting USP22 may yield promising results.

In addition to its contribution to cancer development, the DUBm subunit ATXN7 is closely related to neurodegenerative disease. An expansion of a highly conserved polyQ motif in the amino-terminal region of ATXN7 causes spinocerebellar ataxia type 7 (SCA7)141. Several studies have indicated that polyQ-expanded ATXN7 (polyQ-ATXN7) is incorporated into SAGA142,143 and affects its function. Nuclear inclusions formed by misfolded polyQ-ATXN7 have been found in the cell nuclei of SCA7 patients; these nuclear inclusions also contain other proteins, including the SAGA components GCN5144, USP22145, and ATXN7L3146. PolyQ-ATXN7 has been suggested to further compromise the integrity of SAGA, as it was found to alter the stability of subunits such as GCN5 and STAF36142. Although another study failed to observe a change in the level of GCN5 in the polyQ-ATXN7-containing SAGA, the levels of other key SAGA subunits (i.e., Ada2, Ada3, Taf12, and Spt3) were reduced147. PolyQ-ATXN7 has also been shown to decrease the HAT activity of SAGA in some studies142,147, while others have found that polyQ-ATXN7 increases H2B ubiquitylation without decreasing H3 acetylation144,146. Further research is needed to resolve this discrepancy. Although GCN5 depletion may also contribute to the severity of SCA7 phenotypes, it is not sufficient to drive SCA7 in a mouse model148. As the loss of GCN5 affects the stability of DUBm140, this observation may suggest that the HAT-independent activity of SAGA plays a regulatory role in SCA7.

Conclusion

SAGA was initially found to be important for both transcriptional activation and elongation, and more recent studies have shown that the complex is also important for mRNA export, especially through the DUBm. SAGA has also been reported to contribute to the biogenesis, quality control, and export of mRNPs. Collectively, these findings show that the SAGA complex intricately coordinates gene expression from the activation step through elongation and export. It seems likely that the structural organization of its functional modules allows SAGA to execute comprehensive and dynamic regulation of gene expression.

It was long believed that SAGA and TFIID selectively regulate the expression of distinct gene groups149, but recent reports have suggested that the SAGA complex plays a more general role in gene expression150. Nonetheless, the highly regulated genes, whose expression has been suggested to be more dependent on SAGA than TFIID, tend to have TSS-upstream nucleosomes with higher occupancy and less well-defined positioning compared to those of housekeeping genes. For the proper expression of these genes, chromatin-level regulation is required to resolve the competition between the nucleosome and the TF. Indeed, several studies have demonstrated interactions between SAGA and chromatin remodelers. For example, the SAGA-acetylated nucleosome is displaced by SWI/SNF, and Chd1p151 is thought to physically associate with SAGA152. These studies seem to suggest that SAGA and these chromatin remodelers cooperate to regulate transcription. Additional studies are needed to examine the possible functional links between SAGA and other chromatin remodelers critical for regulating the chromatin structure of the promoter. Understanding the selectivity of specific remodelers and the detailed mechanisms underlying their coordinated action will help researchers elucidate how SAGA functions to fine-tune gene expression.

Based on the current evidence, it is tempting to speculate that mammalian SAGA, ADA, and ATAC function to acetylate histone H3 (and possibly H4) at specific genomic locations through the distinct recruitment mechanisms and/or subunit contexts of each complex. Genome-wide studies to investigate the global localization of each complex in a mammalian system will provide the fundamental information needed to verify this hypothesis. Furthermore, comparative genomics of the localization of yeast SAGA and mammalian HAT complexes may provide insight into how the roles of SAGA have been modified throughout evolution.

To improve our understanding of the functions of SAGA, researchers should seek to resolve the complete structure of SAGA. The structures of some modules have been published, but the literature lacks their functional configurations when bound to chromatin as part of the whole complex and in different cellular contexts. In addition, the process and mechanisms through which SAGA is assembled from individual subunits to an intact 19-protein complex require much more research. Information on whether each subunit undergoes a conformational change upon binding to one another, upon binding of a TF, and/or during the assembly of the PIC will provide invaluable insights into the overall function of SAGA.