Divergent architecture of the heterotrimeric NatC complex explains N-terminal acetylation of cognate substrates

The heterotrimeric NatC complex, comprising the catalytic Naa30 and the two auxiliary subunits Naa35 and Naa38, co-translationally acetylates the N-termini of numerous eukaryotic target proteins. Despite its unique subunit composition, its essential role for many aspects of cellular function and its suggested involvement in disease, structure and mechanism of NatC have remained unknown. Here, we present the crystal structure of the Saccharomyces cerevisiae NatC complex, which exhibits a strikingly different architecture compared to previously described N-terminal acetyltransferase (NAT) complexes. Cofactor and ligand-bound structures reveal how the first four amino acids of cognate substrates are recognized at the Naa30–Naa35 interface. A sequence-specific, ligand-induced conformational change in Naa30 enables efficient acetylation. Based on detailed structure–function studies, we suggest a catalytic mechanism and identify a ribosome-binding patch in an elongated tip region of NatC. Our study reveals how NAT machineries have divergently evolved to N-terminally acetylate specific subsets of target proteins.

kingdom, downstream targets of NatC Nt-acetylation include photosystem II core proteins D1 and CP47 in Arabidopsis thaliana, with implications for photosynthesis and plant growth 42 . In addition, knockdown of NatC subunits in human cells results in reduced cell growth and p53-dependent apoptosis 31 .
A strong upregulation of NAA30 has been observed in glioblastoma, and a NAA30-knockdown in glioblastoma-initiating cells (GICs) reduced their viability, sphere-forming ability, and hypoxia tolerance 43 . Moreover, mice transplanted with GICs in which NAA30 was knocked down show prolonged survival compared to control animals transplanted with unmodified GICs, indicating that NatC may serve as a therapeutic target in cancer. Interestingly, a nuclear localization of Naa30 was observed in GICs and, more sporadically, in neuronal stem cell cultures 43 . The nuclear localization is specific to a splice variant of human NAA30, which encodes a truncated protein missing parts of the GNAT-fold. The truncated version is also abundantly expressed in thyroid cancer tissues and other human cancer cell lines 44 . In a recent study, a potentially pathogenic de novo mutation in NAA35 was identified in patients with cerebral palsy, a heterogeneous group of disorders affecting movement and posture 45 .
Recently, crystal and cryo-EM structures of several NATs have been reported 21,24,46-52 , but the structure of the heterotrimeric NatC complex has remained elusive. Here, to obtain insights into the tertiary and quaternary assembly of NatC, we determined its structure by X-ray crystallography and elucidated the mechanism of substrate recognition. A structure-function approach yielded insights into substrate specificity and catalysis, leading us to propose a refined reaction scheme for NatC catalysis. We also identified a ribosome-binding patch on the NatC surface and suggest a model for the NatC-ribosome complex.

Results
Overall structure of NatC. To prepare the NatC complex for structural studies, the three subunits of S. cerevisiae NatC were co-expressed in Escherichia coli (Supplementary Figs. 1 and 2). The catalytic subunit Naa30 was designed as a truncation construct lacking 17 non-conserved residues at the C-terminus, analogously to a previously crystallized NatA construct 24 . During an initial purification, partial proteolytic degradation of 11 non-conserved residues at the C-terminus of subunit Naa38 was identified by MALDI-MS, and, consequently, these 11 residues were also deleted (Supplementary Fig. 3a, b). The final homogeneous NatC preparation used for kinetic and structural studies (Supplementary Fig. 3c, d) therefore contained subunits Naa30ΔC17 (residues 1-159), full-length Naa35 (residues 1-733), and Naa38ΔC11 (residues 1-77). Structures of the selenomethionine-derivatized (NatC, apo) and native, CoA-bound (NatC•CoA) complex were determined to 2.40 and 2.45 Å resolution, respectively. Both crystallized in space group P2₁2₁2₁. The derivatized structure was solved by single-wavelength anomalous diffraction (SAD) and refined to R work and R free values of 19.5% and 22.3%, respectively (Table 1). The structure of CoA-bound NatC was solved by molecular replacement using the ligand-free NatC structure and refined to R work and R free values of 20.3% and 23.6%, respectively.
Naa35 is mostly α-helical and, in addition, contains three short β-strands in the N-terminal region and two β-strands at the C-terminus (Supplementary Fig. 2). Helix α21, together with the C-terminal end of α20, protrudes by ~30 Å from the central body, thereby forming an extension, which we refer to as the "Naa35 tip". A DALI search did not yield any close structural relatives of Naa35, indicating a unique fold.
In agreement with previous predictions 54 , Naa38 adopts an Sm fold, with an N-terminal α-helix, followed by a strongly bent five-stranded β-sheet 55 . An additional short α-helix is present at the C-terminus. A DALI search revealed that Naa38 is most similar to the spliceosomal Lsm4 (RMSD of 1.8 Å over 72 Cα atoms; Supplementary Fig. 4c-f), which is part of the donut-shaped, hetero-heptameric Lsm2-8 ring that binds snRNA U6 in its center 56 . However, Naa38 differs from other Lsm proteins, as it lacks the conserved IRG motif that mediates association with small nuclear RNAs 54 .
Naa35 is the central assembly hub of the NatC complex. The large auxiliary subunit Naa35 acts as the central assembly hub of the NatC complex, forming extensive interactions with Naa30 and Naa38 (Fig. 1b). The elongated Naa35 N-terminus wraps around almost the entire circumference of Naa38, burying a surface area of 1970 Å². As Naa35 extends around Naa38, it connects it to Naa30, thereby limiting the direct contact between the catalytic and the small auxiliary subunit to 260 Å². The remainder of Naa35 wraps around Naa30 in a more condensed, ring-like structure covering three quarters of its circumference and burying another 1890 Å².
The quaternary assembly of the three NatC subunits results in the formation of a tunnel in the center of the NatC complex (Fig. 1c). This tunnel is surrounded by loop regions of Naa30 and Naa35, which comprise the peptide-binding site (see below). The tunnel extends to a deep groove that contains the active site of the complex and accommodates the acetyl-CoA cofactor-binding site.
Interactions between the three NatC subunits are mediated by several evolutionarily conserved contacts (Supplementary Fig. 5). While the interface between Naa30 and Naa35 is dominated by hydrogen bonding interactions, the β1-α2-loop-β2 segment of Naa35 forms extensive hydrophobic interactions with Naa38. Moreover, the β-sheet of Naa38 is extended by three short β-strands from the Naa35 N-terminus, forming a bifurcated, antiparallel β-sheet. Thus, within the NatC complex, the Naa38 β-sheet is not available for homo- or hetero-oligomeric interactions, unlike in the Lsm2-8 assembly (Supplementary Fig. 4e, f).
In agreement with the structural data, deletion of the first 44 residues of Naa35 disrupted the interaction with Naa38. In contrast, deletion of the non-conserved N-terminal α1 helix (Naa35ΔN17) was not sufficient to disrupt NatC integrity (Fig. 1d).
Comparison of NatC with heterodimeric NatA and NatB complexes. A comparison of the heterotrimeric NatC complex with the heterodimeric NatA 24 and NatB 50 complexes revealed remarkable differences in their tertiary and quaternary structures (Fig. 2). All NAT complexes contain a catalytic subunit, with a conserved GNAT architecture. While NatA and NatB have related, single auxiliary subunits, NatC possesses two unrelated auxiliary subunits. Strikingly, the relative position of the NatC catalytic and auxiliary subunits is opposite to their arrangement in NatA and NatB. Whereas the auxiliary subunits Naa15 (NatA) and Naa25 (NatB) primarily engulf the N-terminal part of their catalytic subunits, Naa35 (NatC) additionally encloses the C-terminal half. In NatC, the β6-β7 loop, which is necessary for substrate binding, is in direct contact with the auxiliary subunit Naa35. In NatA and NatB, the β6-β7 loop makes no contact with the corresponding auxiliary subunit. Furthermore, NatA and NatB auxiliary subunits are necessary for the proper positioning of the catalytic α1-loop-α2 region, and hence for full catalytic activity 24,50 . In NatC, helix α2 of Naa30 is in contact with both auxiliary subunits Naa35 and Naa38. However, in contrast to NatA and NatB, α1 in Naa30 forms no contacts to either of the two auxiliary subunits.
Substrate recognition in NatC. Colorimetric activity assays were performed to kinetically characterize the activity of NatC. Short decameric peptides were used as substrates, in which the five initial N-terminal residues corresponded to the sequence of cognate NatC substrates (Fig. 3a, Table 2, and Supplementary Figs. 6 and 7). A peptide containing the N-terminus of S. cerevisiae Arl3 (yArl3) was Nt-acetylated by NatC with a k cat of 4.2 ± 0.3 s⁻¹ and a K m of 140 ± 20 µM. A comparable activity was observed for a peptide with the N-terminal sequence of hARFRP1, the human ortholog of yArl3, suggesting that species-specific differences in the N-terminal sequences between yArl3 and hARFRP1 are not due to adaptation to their respective NatC ortholog. A peptide comprising the five initial N-terminal residues of the major capsid protein Gag exhibited a k cat of 12 ± 1 s⁻¹ and a K m < 20 µM. Therefore, the specificity constant of NatC toward the Gag peptide is at least 20-fold higher compared to yArl3.
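The fold difference quoted above follows directly from the reported kinetic parameters. The short sketch below recomputes it from the mean k cat and K m values given in the text (errors are ignored); because the Gag K m is only known as an upper bound (<20 µM), the result is a lower bound on the specificity constant. Function and variable names are illustrative and not taken from the study.

```python
def specificity_constant(k_cat_per_s, k_m_uM):
    """Return the specificity constant k_cat / K_m in s^-1 µM^-1."""
    if k_m_uM <= 0:
        raise ValueError("K_m must be positive")
    return k_cat_per_s / k_m_uM


# Mean values reported in the text for the yArl3 peptide.
yarl3 = specificity_constant(4.2, 140.0)

# The Gag K_m lies below the assay detection limit (<20 µM),
# so inserting 20 µM yields a lower bound for k_cat / K_m.
gag_lower_bound = specificity_constant(12.0, 20.0)

fold_lower_bound = gag_lower_bound / yarl3
print(f"yArl3: {yarl3:.3f} s^-1 µM^-1")
print(f"Gag:  >{gag_lower_bound:.2f} s^-1 µM^-1")
print(f"Gag/yArl3: >{fold_lower_bound:.0f}-fold")
```

A smaller true K m for Gag would only increase the ratio, which is why the text states "at least" 20-fold.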
To obtain insights into the substrate recognition mode and the marked difference of NatC catalytic efficiencies toward the two peptides, crystal structures of NatC in complex with CoA, and either yArl3 or Gag peptide were determined to a maximal resolution of 2.99 and 2.75 Å, respectively ( Table 1). The electron density of the peptide ligands was sufficient to model the first four and six amino acids of the yArl3 and Gag peptide, respectively ( Fig. 3b and Supplementary Fig. 8). Moreover, clear electron density was visible for a tightly coordinated water molecule in the active site of the NatC complex.
The yArl3 and the Gag peptide formed several side chain and backbone interactions, mostly with Naa30, but also with Naa35. In both structures, the methionine sulfur atom at peptide position 1 is within hydrogen bond distance to Ser28 (Fig. 3c, d). No NatC activity was measured for a known NatA substrate (yeast threonyl-tRNA synthetase, yThrRs), which lacks an N-terminal methionine 24 .
The Phe2 and Leu2 side chains of the yArl3 and Gag peptide, respectively, were positioned in a hydrophobic pocket (the "main peptide pocket"), which is formed by conserved residues from α2 and β4 of Naa30, as well as the short 3₁₀-helix η3 from Naa35 (Fig. 3c and Supplementary Fig. 9). While F2W and F2A substitutions led to comparable k cat and K m values as seen for the native yArl3 peptide, the F2R, F2Y, F2L, and F2K substitutions resulted in 3-8-fold increased K m values and the F2E substitution in a 40-fold increased K m value (Table 2 and Fig. 3a). Also, the L2E substitution in the Gag peptide led to an at least 13-fold increase of K m . Substitutions at position 2 to positively charged amino acids (F2R and F2K) in yArl3 led to strong decreases in turnover numbers to 3-8%.
At peptide position 3, Gag-Arg3 forms a hydrogen bond with the protein backbone and a cation-π interaction with Naa30-Tyr144. In addition, Naa30-Glu29 interacts with the Gag peptide backbone. In the yArl3-bound structure, in contrast, Naa30-Glu29 forms an alternative hydrogen bond with yArl3-His3 and does not contact the peptide backbone. Glutamate substitutions at position 3 massively increased K m and reduced k cat for both peptides. Interestingly, the H3A substitution in yArl3 led to k cat and K m values that were similar to those of the Gag peptide, suggesting that an imidazole side chain at peptide position 3 interferes with full catalytic activity.
Only in the Gag peptide does the fourth peptide residue, Phe4, deeply insert into a second hydrophobic surface generated by Naa30 and Naa35, the "extended pocket" (Fig. 3c and Supplementary Fig. 9). The yArl3-Leu4 side chain is at the periphery of this pocket. Glutamate substitutions at position 4 led to strong increases in K m for both peptides. For the Gag peptide, k cat was also strongly reduced. Similar to the yArl3-H3A substitution, the yArl3-L4A variant also exhibited a strong increase in k cat and a strongly reduced K m value. Furthermore, a peptide derived from the N-terminus of yeast actin, a known NatB substrate 50 with acidic residues at positions two and four, was not acetylated by NatC.
Val5 in the Gag peptide occupies the same position as Leu4 of yArl3, whereas residue 5 is not resolved for yArl3. Glutamate substitutions at position 5 only moderately affected the specificity constant for both peptides, indicating a minor effect of this position on binding. To further explore the differences between the yArl3 and Gag peptides, yArl3/Gag hybrids were generated by introducing single amino acid substitutions at positions 2-4 of the yArl3 peptide. While the F2L hybrid exhibited a strongly reduced specificity constant compared to yArl3, the H3R and L4F hybrids approached K m and k cat values of the Gag peptide (Fig. 3a and Table 2). These results highlight the importance of positions 3 and 4 for efficient substrate acetylation.
Peptide ligand-induced conformational changes. The higher turnover number of the Gag peptide compared to yArl3 suggests favorable structural rearrangements of active site residues. With an RMSD of 0.218 Å, binding of CoA induced only minor structural changes in the catalytic subunit compared to the apo structure. However, binding of yArl3, and, especially, the Gag substrate, led to large conformational changes (Fig. 3d and Supplementary Fig. 10a). In particular, binding of the Gag peptide induced a marked constriction of the central tunnel in NatC (Supplementary Fig. 10b). The α1-α2 loop of Naa30 moved ~2.8 or 3.5 Å upon binding of the yArl3 or Gag peptide, respectively. In the peptide-free NatC structures, Leu27 in the α1-α2 loop is positioned in a pocket in between helices α1 and α2. Binding of Gag induces a flipping of Leu27, allowing recognition of the N-terminal methionine, which is sandwiched between Leu27 and Tyr145 of Naa30. In contrast, the Leu27 side chain appears to be flexible in the yArl3 structure (Supplementary Fig. 8b).
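For readers less familiar with the metric, the RMSD quoted above is the root-mean-square distance between equivalent atoms of two structures after superposition. The following is a minimal sketch, assuming the two coordinate sets are already superposed and listed in the same atom order; the coordinates are made-up toy values and not data from the NatC structures, and the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input) between two
    equally long lists of (x, y, z) tuples that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom is displaced by 0.2 Å along y.
toy_reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
toy_shifted = [(0.0, 0.2, 0.0), (1.5, 0.2, 0.0), (3.0, 0.2, 0.0)]
toy_rmsd = rmsd(toy_reference, toy_shifted)
print(f"RMSD = {toy_rmsd:.3f} Å")
```

In practice, the superposition itself (for example a least-squares fit) is carried out by the structure analysis software before the deviation is computed; that step is deliberately left out here.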
In contrast to the yArl3 peptide, binding of the viral Gag peptide induced a large conformational rearrangement of the β6-β7 loop in Naa30. Naa30-Asn147 forms a hydrogen bond with Naa35-Gln106 in the apo and CoA/yArl3-bound NatC structures, but traversed a distance of 8.8 Å in the Gag-bound structure to form an intramolecular hydrogen bond with Naa30-Glu120. Moreover, Naa30-Leu146 moved ~5 Å toward Gag-Phe4, which complements the extended peptide pocket. Thus, NatC exhibits a sequence-specific, peptide ligand-induced conformational change of the β6-β7 loop.
Interestingly, the Naa30 β6-β7 loop in the Gag-bound structure adopts a conformation that strongly resembles that of Naa10 in the NatA complex (Supplementary Fig. 10c). This Naa10 conformation is similar in the presence and absence of peptide substrate. As a result, the β6-β7 loop of NatA appears already primed for efficient catalysis in the peptide-free state.
Fig. 3 (caption, beginning missing). [...] Table 2. Four peptides exhibited K m values below the detection limit of the colorimetric assay and were assumed to be <20 µM (Supplementary Fig. 6). Thus, the bar graphs for these peptides (marked with ">") represent the lower bound of the specificity constant and no error bar is provided. b Substrate peptides in the NatC•CoA•MFHLV (orange) and NatC•CoA•MLRFV (magenta) structures. c Magnified views into the active sites of the two peptide-bound NatC complexes, shown in the same orientation. Hydrogen bonds are indicated as black dashed lines. d Superposition of all four NatC structures focusing on structural changes in the α1-α2 and β6-β7 loops in Naa30.
Catalytic mechanism of the NatC complex. A structure-based mutagenesis approach was performed to characterize the catalytic mechanism of NatC. All mutations of residues close to the active site of the Naa30 subunit led to a reduction of k cat of at least 40% (Table 2 and Fig. 4a, b). Reductions to ≤10% of wild-type (WT) activity were observed for residues Glu29, Glu118, and Tyr130, indicating their crucial role for catalysis. The L27A mutant showed the most severe reduction in k cat with a minor impact on K m , suggesting that affinity toward the substrate was not negatively affected. Mutations in the large auxiliary subunit Naa35, K59A (Supplementary Fig. 10a), and F47A (Fig. 3c) had only moderate effects on catalytic parameters (Table 2). A NatC construct expressed and purified without its small auxiliary subunit Naa38 (ΔNaa38) showed a strong decrease in k cat to 6% of WT activity, while the K m remained almost identical to WT, indicating that the small subunit Naa38 is crucial for NatC activity. On the basis of this mutagenesis study and previous functional data of other NAT complexes, we propose a catalytic mechanism for the NatC complex (Fig. 4c), which is detailed in the discussion.
Table 2 footnotes. WT wild type.
a N-terminal five residues of the decameric substrate peptides ending on -GSRRR. Peptides with N-terminal sequences of cognate yeast NatC substrates (yArl3 and Gag) are printed in bold letters. Single amino acid substitutions in related yArl3- and Gag-peptide variants are underlined.
b WT NatC construct (Naa30ΔC17, full-length Naa35 and Naa38ΔC11).
c K m values are for the substrates in the substrate column. The acetyl-CoA K m was calculated for NatC WT using the yArl3 substrate (K m = 30 ± 6 µM for n = 4 independent experiments). Four K m values (*) were below the detection limit of the colorimetric assay and were assumed to be <20 µM (see Supplementary Fig. 6).
d All normalizations are relative to NatC WT with the yArl3 peptide, except for the Gag-L2E, -R3E, -F4E, and -V5E peptides, which were compared against Gag.
e The Naa35-Tip1 mutant contains point mutations: K500A, K501A, K503A, and K504A.
f The Naa38 deletion construct consists of Naa30ΔC17 and full-length Naa35 only.
Where k cat is ND (not determined), activity could not be detected. Values for k cat and K m represent means ± SD of n independent experiments (see last table column).
We also identified several distinct electropositive regions (EPRs) in NatC (Fig. 5a). The four most prominent ones, EPR1-EPR4, were selected for mutational studies. EPR2, located on the Naa35 tip, contains 11 positively charged residues, which were divided into two sets (Tip1 and Tip2). For each NatC mutant, three or four positively charged residues were substituted with alanine. Our co-sedimentation assay showed that the electropositive Naa35 tip region is necessary for association of NatC with the ribosome, whereas ribosome binding was not affected for the other EPR mutants (Fig. 5b, c and Supplementary Fig. 11). The Tip1 mutant showed no significant difference in k cat and K m compared to NatC, arguing against major structural disturbances in NatC induced by the mutations. Interestingly, Naa38 does not seem to be required for the NatC-ribosome association, as the Naa35-Naa30 heterodimer associated with ribosomes similarly to NatC. S. cerevisiae NAA35-deletion strains (naa35Δ) were reported to exhibit reduced growth on non-fermentable carbon sources 30,58,59 , and we confirmed these results (Fig. 5d and Supplementary Fig. 12). [...] NAA35-WT, and the Tip1 or Tip2 variants rescued the growth phenotype of the naa35Δ strain on glycerol. Thus, the EPR2 region in the Naa35 tip appears not to be essential for yeast growth on glycerol.
Fig. 4 (caption, beginning missing). [...] Table 2. c Proposed catalytic mechanism for the NatC complex (see discussion). d Sequence logos of regions near the active site, generated from multiple sequence alignments of 18 different species (see "Methods"). Black circles indicate proposed catalytic residues and + signs, substrate-binding residues. e Superposition of NatC•CoA•MLRFV, Naa50•CoA-Ac-MGLP (3TFY) and Naa60•CoA-Ac-MKAV (5ICV). For simplicity, only NatC ligands are shown. The proposed catalytic water (sphere) is indicated in all NATs.
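The co-sedimentation readout behind Fig. 5c, the NatC fraction in the pellet P/(S + P) reported as mean ± SD of n = 3 independent experiments, can be sketched as follows. The band intensities are invented placeholders for illustration only, and the use of the sample standard deviation is an assumption, since the text does not state which SD estimator was applied.

```python
from statistics import mean, stdev


def pellet_fraction(supernatant_intensity, pellet_intensity):
    """Fraction of the protein found in the pellet: P / (S + P)."""
    total = supernatant_intensity + pellet_intensity
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return pellet_intensity / total


# Hypothetical (S, P) band intensities from three replicate blots.
replicates = [(40.0, 60.0), (50.0, 50.0), (30.0, 70.0)]

fractions = [pellet_fraction(s, p) for s, p in replicates]
avg_fraction = mean(fractions)
sd_fraction = stdev(fractions)  # sample SD (n - 1 in the denominator)
print(f"pellet fraction = {avg_fraction:.2f} ± {sd_fraction:.2f} (n = {len(fractions)})")
```

Normalizing each lane to S + P makes the value independent of the total amount loaded, so constructs with different expression levels can be compared directly.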

Discussion
In this work, we report the crystal structure of the conserved heterotrimeric NatC complex. We show that the large auxiliary subunit Naa35 is the central scaffold of NatC, which forms extensive interactions with Naa30 and Naa38. The architecture of the complex is strikingly different from that of other NAT complexes, which is mainly due to the unique architecture and interaction network of the Naa35 subunit. By characterizing the peptide-binding site located in between the catalytic Naa30 and the Naa35 subunits, our work explains the substrate specificity of NatC. Furthermore, we reveal how substrate binding leads to correct positioning of the catalytic machinery, and finally identify residues important for catalysis and ribosome interaction.
Naa35 mediates the unique assembly of NatC into a highly intertwined complex. Its N-terminus mediates the interaction with Naa38, while the C-terminal residues engage in an intricate interaction network with the catalytic subunit. Besides this scaffolding function, Naa35 also mediates the interaction with the ribosome via an elongated tip region. We did not find close relatives of Naa35 in the PDB, suggesting that Naa35 has evolved a specialized function in the NatC complex. Deletion of any of the three NatC subunits in yeast leads to a complete loss of Nt-acetylation of a NatC model peptide 30 . On the other hand, Nt-acetylation of yArl3 is abolished in a naa30Δ strain, but still present in naa38Δ yeast cells 38 . Accordingly, we show that a Naa38-deletion construct of NatC still displayed residual activity for a yArl3 peptide. The influence of Naa38 on catalytic activity is likely indirect, as Naa38 is far away from the catalytic subunit and the two subunits share only a small interaction interface. Instead, Naa38 seems to stabilize the Naa35 N-terminus, which runs in between Naa30 and Naa38, and forms the distal end of the extended peptide-binding pocket. When expressed on its own, yeast Naa30 aggregated in our hands, highlighting its dependence on Naa35. However, the Arabidopsis homolog of Naa30 can functionally replace all three NatC subunits in yeast, and AtNaa35 knockout plants show no obvious phenotype 42 . Thus, in contrast to the yeast counterpart, plant Naa30 can either act alone or as part of a different multiprotein complex.
Fig. 5 (caption, beginning missing). [...] Residues within electropositive regions (EPRs) that were substituted with alanine in NatC mutant constructs are specified in brackets. b Representative western blot of a NatC/ribosome co-sedimentation assay. NatC constructs in the supernatant (S) and pellet (P) fractions were immunodetected via a FLAG-tag at the Naa35 N-terminus, yeast ribosomes via the ribosomal protein uL30. Co-sedimentation assays were performed in triplicate (all replicates are shown in Supplementary Fig. 11). c Quantification (chemiluminescence band intensities) of the NatC fraction in the pellet (P/(S + P)). Data represent mean ± SD (n = 3 independent experiments). NatC mutants were compared against NatC WT using a one-way ANOVA analysis with Dunnett's correction. d Serial tenfold dilutions of S. cerevisiae WT (BY4741) and naa35Δ (Y00294) strains, transformed with a pRS416 yeast centromere vector, carrying no insert (vector), NAA35-WT or the NAA35-Tip1 or NAA35-Tip2 mutant genes. Cells were grown at 37°C for 5 days on SD-ura agar plates, supplemented with 3% glycerol or 2% dextrose, respectively.
NatC acetylates substrates with an initial N-terminal methionine followed by a hydrophobic/amphipathic amino acid 9,25,29-34 . This substrate preference can well be explained by our two substrate-bound NatC structures. In the peptide-free NatC structures, Leu27 is positioned in a hydrophobic pocket in between helices α1 and α2. In the Gag-bound structure, we observed a repositioning of Leu27 from a hydrophobic pocket within Naa30 toward the N-terminal methionine of the substrate peptide, whereas Leu27 appeared flexible in the yArl3-bound structure. We envision that the repositioning of Leu27 upon substrate binding primes the catalytic machinery, as the L27A mutant exhibited the strongest catalytic reduction in our assay. The substrate's second amino acid is placed in the hydrophobic main peptide pocket and shows a preference for phenylalanine, tryptophan, or alanine. The latter is unlikely to exist in vivo, as MetAP usually cleaves the amino-terminal methionine when it precedes a small amino acid 60 , unless subsequent inhibitory residues (e.g., Pro at position 3) restrain MetAP activity 61 . Tercero et al. have already qualitatively shown in yeast that substrate positions 3 and 4 are also important for acetylation by NatC, as glutamate substitutions at these positions prevented Nt-acetylation 34 . Our peptide-bound structures reveal that Gag-Arg3 interacts with Tyr144 of the β6-β7 loop. Furthermore, a contact of Naa30-Glu29 with the Gag peptide backbone stabilizes active site loops α1-α2 and β6-β7 (Fig. 3c). For the yArl3 peptide, this interaction is sterically prevented by a hydrogen bond of yArl3-His3 with Naa30-Glu29, which may partially account for the reduced specificity constant of yArl3 compared to Gag. When the steric restraints are released by the H3A substitution and, possibly, the L4A substitution in yArl3, efficient catalysis is restored. Furthermore, the Gag peptide explores an extended peptide pocket, which can accommodate another large hydrophobic amino acid at peptide position four. Correspondingly, the yArl3-L4F substitution leads to a reduction in K m . Our data thus suggest that amino acids 3 and 4 of the Gag peptide induce sequence-specific structural rearrangements in the active site that are favorable for catalysis. This can explain the high specificity constant of the Gag peptide, which likely reflects an evolutionary adaptation of the L-A virus to ensure that all Gag protomers are Nt-acetylated to allow an error-free, seamless assembly of the viral capsid. Interestingly, three mitochondrial proteins acetylated by NatC in yeast share the same four N-terminal amino acids as Gag 34 , and may therefore be modified in a similarly efficient manner.
Even though NatC, NatE, and NatF share substantially overlapping substrate specificity in vitro 9 , they may have a smaller overlap in vivo. In yeast, the Naa50 subunit of the NatE complex lacks the optimal acetyl-CoA-binding motif and is enzymatically inactive 21,22 , and NatF is completely absent 13,17 . In humans, NatF is tethered to the Golgi membranes, where it specifically acetylates transmembrane proteins 17 and is thus likely to exhibit only a minor substrate overlap with NatC. Human Naa50 is catalytically active in its uncomplexed 62 and NatA-complexed form 21 . A comparison of the NatC substrate pocket with those of human Naa50 and Naa60 reveals considerable differences in the shape, hydrophobicity, and electrostatic surface potential of the respective peptide-binding pockets (Supplementary Fig. 9). The NatC pocket, which is formed at the interface of subunits Naa30 and Naa35, is much deeper and more confined than the peptide-binding sites in NatE and NatF. While the NatC peptide-binding pocket is lined by several conserved hydrophobic residues, it still exhibits a moderate negative electrostatic surface potential as opposed to the positive charge in NatE. This may contribute to subtle differences in their substrate specificity. The unique substrate-binding pocket of NatC, including a substrate-binding interface in between the catalytic and auxiliary subunits, offers opportunities to design small molecules that specifically interfere with NatC function. Such compounds could be used to explore the cellular function of NatC in more detail. In light of the observation that NatC is upregulated in cancer cells 43 , a therapeutic application of NatC inhibitors may also be envisioned.
Based on our mutagenesis data and previously proposed reaction schemes for GNATs 24,50,63 and NAT complexes, we present a refined catalytic mechanism for NatC (Fig. 4c). It was proposed for human Naa50 that a tyrosine and histidine cooperate in the deprotonation of the amino group via a coordinated water molecule 46 , and a similar situation is found in Naa60 (refs. 47,49 ). A glutamate was also suggested to function as the general base in the GNAT histone acetyltransferase Gcn5 from S. cerevisiae 64 and the Salmonella typhimurium GNAT RimI, which also employs a catalytic water 65 . An active site alignment of NAT orthologs reveals that the NatC catalytic subunit Naa30 has two potential general bases: Tyr80 and Glu118 (Fig. 4d, e). The proposed catalytic water is positioned 3.5 Å away from the peptide's amino group and is coordinated by the hydroxyl group of the Tyr80 side chain, the backbone carbonyl oxygen of Ile81 and the carboxyl group of the Glu118 side chain. E118A or E118Q substitutions resulted in a ~90% reduction, and Y80A or Y80F substitutions led to a ~50% reduction of the k cat compared to Naa30 WT (Table 2). This suggests that Glu118 serves as the key general base, while Tyr80 may support its function by coordinating the catalytic water. The deprotonated peptide amino group then performs a nucleophilic attack on the carbonyl carbon of the enzyme-bound acetyl-CoA, resulting in a transient zwitterionic tetrahedral intermediate. Simultaneously, the resultant negative charge on the carbonyl oxygen (oxyanion) may be stabilized by the backbone amide of Leu84. Tyr130 likely serves as a general acid that donates a proton to break the thioester bond of the tetrahedral intermediate. Y130A and Y130F mutants exhibited a strongly reduced k cat , corresponding to ~1-2% of NatC WT turnover numbers, while showing no significant changes in K m , emphasizing the crucial role of Tyr130 in catalysis. Tyr130 is universally conserved among NAT orthologs (Fig. 4d), and its role as a catalytic acid was initially proposed for S. typhimurium RimI 65 . Mutations in a corresponding tyrosine in NatB lead to similar reductions in k cat (ref. 50 ).
After the nucleophilic attack, the nitrogen atom of the former peptide amino group carries a positive charge and an excess hydrogen has to be removed in a second deprotonation step. Point mutations in the highly conserved Glu29 exhibited reductions of k cat to 4-6% of NatC WT turnover numbers, but the amino group of the substrate peptide and the carboxyl group of Glu29 are ~9 Å apart. Glu29 may play an indirect role in NatC catalysis, through direct contact with cognate NatC substrate peptides, resulting in a stabilization of active site loops α1-α2 and β6-β7. In NatA, mutations of the corresponding glutamate showed similarly strong reductions in k cat with negligible effects on K m (ref. 24 ). However, a mutation of the corresponding glutamate in NatB increased its enzymatic activity by ~60% (ref. 50 ), pointing to alternative functions in different NATs.
A recent cryo-EM structure of the S. cerevisiae NatA/Naa50-ribosome complex revealed interactions with the ribosomal RNA near the peptide exit tunnel 66 (Supplementary Fig. 13a). To explore possible binding modes of NatC with the ribosome, the catalytic subunit of NatC was aligned with Naa10 in the NatA complex. In the resulting model, the auxiliary subunit Naa35 is in close contact with the surface of the ribosome. Moreover, the peptide-binding site is in close proximity to the ribosomal exit tunnel, and the Naa35 tip region contacts the ribosomal RNA, in agreement with its essential role in NatC-ribosome binding (Supplementary Fig. 13b, c). Mutations in the Naa35 tip region did not affect yeast growth on glycerol. However, the molecular basis for reduced growth of NatC deletion strains under non-fermentative conditions is not clear. It may be envisaged that the introduced mutations in the Naa35 tip only partially disturb the NatC-ribosome interaction in vivo. Alternatively, the NatC-ribosome interaction may not be required for the acetylation of all NatC substrates, including those implicated in yeast growth. Clearly, more work is required to fully understand the exact role of the co-translational activity of NatC in yeast and human.
Taken together, the structural and biochemical studies presented in this work provide insight into the unique architecture, substrate preference, catalysis, and ribosome interaction of NatC. They also show how sophisticated NAT machineries have divergently evolved to provide the cell with a broad repertoire of N-terminally acetylated proteins.

Methods
NatC expression and purification. Genes encoding S. cerevisiae NatC subunits Naa30 (Uniprot ID: Q03503), Naa35 (Q02197), and Naa38 (P23059) were obtained from the Dharmacon yeast ORF collection and cloned into the pRSFDuet-1 (Novagen) expression vector (Supplementary Table 1). A NatC complex construct (designated NatC WT), expressing the truncated NatC subunits Naa30ΔC17 (residues 1-159), full-length Naa35 (residues 1-733), and Naa38ΔC11 (residues 1-77) was used for all kinetic and structural studies. The genes encoding Naa38ΔC11 and Naa30ΔC17 were cloned into the second multiple cloning site (MCS) of the pRSFDuet-1 vector. An additional ribosome-binding site (AAGGAGATATACC) was added in front of the Naa30ΔC17 start codon. DNA encoding full-length Naa35, preceded by a human rhinovirus 3C (HRV 3C) cleavage site, was cloned in frame with the sequence for the His6-tag in the first MCS. For NatC-ribosome co-sedimentation assays, an additional FLAG-tag (DYKDDDDK) was inserted in between the HRV 3C cleavage site and Naa35. NatC point mutant constructs were created by site-directed mutagenesis.
Mass spectrometry. Purified NatC complex was diluted in 0.1% trifluoroacetic acid (TFA) to a final concentration of 10 µM and mixed with an equal volume of saturated α-cyano-4-hydroxycinnamic acid solution in 50% acetonitrile/0.05% TFA. A total of 0.5 µL of the sample/matrix mixture was spotted on a MALDI target plate and measured with a microflex™ LRF MALDI-TOF mass spectrometer (Bruker) using linear, positive ion mode.
NatC crystallization and structure determination. NatC was diluted to 5-6 mg mL −1 in size-exclusion buffer containing 2.5 mM 2-mercaptoethanol. The selenomethionine-substituted NatC complex was crystallized in its apo form. CoA was added at a 1:3 molar ratio to all native NatC proteins. In addition to CoA, the substrate peptides MFHLVGSRRR or MLRFVGSRRR were added to two further crystallization setups at a molar ratio of 1:5 or 1:3, respectively. Crystallization plates were set up as sitting drops in a 96-well format with a Crystal Gryphon dispensing robot (Art Robbins Instruments) by mixing 0.2 µL protein solution with 0.2 µL reservoir solution above an 80-µL reservoir. Reservoir solution contained 14.5-16.5% PEG 4000, 150 mM ammonium iodide, and 100 mM sodium citrate, pH 6.1-6.3. Diffraction-quality crystals appeared after 12-48 h and required another 5 days to reach maximum dimensions. Crystals were soaked for ~10 s in well solution containing 20% (v/v) ethylene glycol and quickly frozen in liquid nitrogen. All datasets were collected at beamline BL14.1 at BESSY II in Berlin at a temperature of 100 K.
The dataset of the selenomethionine-substituted NatC complex (NatC, apo) was taken from a single crystal at a wavelength of 0.9797 Å (3600 images, 0.1° per frame) and processed to 2.40-Å resolution using the XDS suite 68 . Experimental phasing was achieved by SAD phasing with SHELXC/D/E 69 , using HKL2Map 70 from the CCP4 interface 71 . Automated model building and refinement were performed with Buccaneer 72 , and the model was completed manually in Coot 73 and refined in Phenix 74 . Translation-libration-screw-rotation (TLS) refinement employed two TLS groups for subunit Naa30ΔC17, two groups for Naa38ΔC11 and five groups for subunit Naa35.
Datasets of native, ligand-bound NatC complexes were taken from a single crystal each, at a wavelength of 0.9184 Å and consisted of 1100 images (NatC•CoA•MFHLV) or 1800 images (NatC•CoA and NatC•CoA•MLRFV) with an oscillation of 0.1° per frame. All native datasets were processed with XDS to 2.45-Å (NatC•CoA), 2.99-Å (NatC•CoA•MFHLV), or 2.75-Å (NatC•CoA•MLRFV) resolution, respectively. Datasets were phased by molecular replacement with Phaser 75 using the refined structure of the SeMet-labeled NatC complex, followed by several rounds of manual model building (Coot) and refinement (Phenix). All ligand occupancies were refined. TLS was used in later stages of the refinement using the same TLS groups as for the selenomethionine dataset plus one additional group for CoA and another for the peptide ligand. Simulated annealing OMIT maps for CoA, substrate peptides and a water molecule in the active site were generated with phenix.maps after performing a simulated annealing refinement run in Phenix. Figures were prepared with the PyMOL Molecular Graphics System, Version 2.3.1 (Schrödinger, LLC). Secondary structure assignment for all NatC structures was done manually. NatC dimensions were calculated with the PyMOL script "Draw Protein Dimensions" (https://pymolwiki.org/index.php/Draw_Protein_Dimensions). Interfaces and buried surface areas between NatC subunits were calculated using the PDBe PISA webserver 76 . NatC subunit structures were compared against the PDB database, using the DALI webserver 77 . Structural superpositions and RMSD calculations were performed with PyMOL. The electrostatic surface potential and surface hydrophobicity were calculated with the PyMOL plugins APBS 78 or VASCo 79 . Multiple sequence alignments for Naa30, Naa35, and Naa38 were generated with Geneious (https://www.geneious.com/) and visualized using ESPript3.0 (http://espript.ibcp.fr/).
Sequence logos of the catalytic subunits of five different NAT paralogs (Naa10, Naa20, Naa30, Naa50, and Naa60) were generated from individual multiple sequence alignments using the corresponding NAT orthologs from 18 different species: S. cerevisiae, Kluyveromyces lactis, Candida albicans, Fusarium graminearum, Physcomitrella patens, Oryza sativa, Zea mays, A. thaliana, Populus trichocarpa, Ricinus communis, Drosophila melanogaster, Aedes aegypti, Tribolium castaneum, Strongylocentrotus purpuratus, Danio rerio, Xenopus tropicalis, Homo sapiens, and Mus musculus.
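The text does not state which program produced the sequence logos. As a minimal sketch of the quantity a logo conventionally displays, the following Python computes the per-column information content of a protein alignment as log2(20) minus the Shannon entropy of the observed residues. The function names and the toy alignment are illustrative assumptions, gaps are simply ignored, and no small-sample correction is applied.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, ignoring gaps.

    Uses R = log2(alphabet_size) - H, where H is the Shannon entropy of the
    observed residue frequencies. No small-sample correction is applied.
    """
    residues = [r for r in column if r not in "-."]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def alignment_information(sequences):
    """Per-column information content for equal-length aligned sequences."""
    return [column_information(col) for col in zip(*sequences)]


# Toy alignment (made-up sequences, not taken from the NAT alignments).
toy_alignment = ["MKE", "MRE", "MKD", "MRD"]
bits = alignment_information(toy_alignment)
```

In this toy case the fully conserved first column reaches the maximum of log2(20) bits, whereas the two columns split evenly between two residues each lose exactly one bit.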
Acetyltransferase assays. The acetyltransferase activity was determined using the Ellman method, adapted from Thompson, et al. 80 . Reactions were carried out at 25°C in acetylation buffer (50 mM HEPES, pH 7.5, 150 mM NaCl, 0.2 mM EDTA, and 0.003% (v/v) Tween-20). Tween-20 was added to the buffer to reduce protein surface adsorption. Catalytic parameters for NatC WT and mutants were determined using the yArl3 peptide (MFHLVGSRRR), containing the five N-terminal residues of the S. cerevisiae ADP-ribosylation factor-like protein 3 (Uniprot ID Q02804), followed by a GS linker and a triple arginine to facilitate peptide solubility. Reactions were carried out as duplicates at six different peptide concentrations and an acetyl-CoA concentration of 500 µM (≥10 × K m ). Individual reactions (110 µL total volume) were performed with final NatC concentrations between 50 and 2500 nM for 150 s for NatC WT and up to 100 min for catalytically impaired NatC mutant complexes. At regular time intervals, 20-µL aliquots were taken from the reaction and quenched in 40 µL quenching buffer (3.2 M guanidinium-HCl, 5 mM EDTA, 100 mM sodium phosphate, pH 6.8), containing freshly added 5,5′-dithiobis(2-nitrobenzoic acid) (DTNB) at a final concentration of 0.5 mM. A total of 50 µL of the quenched reactions were transferred into 384-well plates and absorbances were measured at 412 nm with a M1000 Pro Microplate reader (Tecan). TNB 2− anion product concentration was determined using the Beer-Lambert law (A = ε × c × l), assuming ε = 14,150 M −1 cm −1 (ref. 81 ). Reaction background absorbances (containing quenched enzyme) were determined for each reaction and subtracted from the absorbances of the individual reactions.
Plate background absorbances were subtracted from all reactions to account for plate-specific imperfections. Turnover of limiting substrate did not exceed 20%. Initial velocities were fitted by nonlinear least-squares regression to the Michaelis-Menten equation using GraphPad Prism version 6.01 to determine k cat and K m parameters. Each complete acetylation assay was performed at least in triplicate, and the average k cat and K m and SD were calculated with Microsoft Excel 365. The K m of acetyl-CoA for NatC WT was determined with a fixed yArl3 peptide concentration of 1340 µM and 30-500 µM acetyl-CoA. A NatA substrate peptide (SASEAGSRRR), containing the N-terminal five residues (after removal of the initiator methionine) of the S. cerevisiae ThrRS (Uniprot ID: P04801), and a NatB substrate peptide (MDSEVGSRRR) from S. cerevisiae actin (Uniprot ID: P60010) showed no activity, even after incubating for 2 h and using a large excess (2 µM) of NatC WT. The Gag peptide (MLRFVGSRRR), containing the N-terminal five residues of the major capsid protein (Gag) of the S. cerevisiae virus L-A (Uniprot ID: P32503), showed a higher specificity constant, and thus the acetyltransferase assay was adapted: final NatC concentrations: 10-80 nM; total reaction times: as short as 60 s; peptide concentrations: 30-500 µM; total reaction volume: 226 µL, with larger aliquots of 72 µL taken. Aliquots were quenched in 36 µL of 2× quenching buffer (6.4 M guanidinium-HCl, 10 mM EDTA, 200 mM sodium phosphate, pH 6.8). Two 50-µL portions of each quenched reaction were transferred into separate microplate wells to obtain duplicate absorbance readings, which were averaged. Further peptides, exhibiting a K m > 50 µM, were measured as described for the yArl3 peptide. Peptides exhibiting a K m < 50 µM were measured as described for the Gag peptide.
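The kinetic parameters in this work were obtained with GraphPad Prism. For readers who want to see the underlying idea, the following is a minimal pure-Python sketch of a least-squares fit to the Michaelis-Menten equation v = Vmax·[S]/(K m + [S]). It is not the Prism algorithm: for each trial K m the best Vmax is computed in closed form, and K m is then located by a golden-section search, which assumes a single minimum within the chosen bounds. The data, the enzyme concentration, and all function names are synthetic or hypothetical.

```python
import math


def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def _best_vmax_and_ssr(km, substrate, velocity):
    """For a fixed Km the least-squares Vmax has a closed-form solution."""
    x = [s / (km + s) for s in substrate]
    vmax = sum(v * xi for v, xi in zip(velocity, x)) / sum(xi * xi for xi in x)
    ssr = sum((v - vmax * xi) ** 2 for v, xi in zip(velocity, x))
    return vmax, ssr


def fit_michaelis_menten(substrate, velocity, km_lo=1e-3, km_hi=1e5, tol=1e-9):
    """Return (Vmax, Km) minimizing the sum of squared residuals.

    Golden-section search over log(Km); assumes one minimum within the bounds.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(km_lo), math.log(km_hi)
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        ssr_c = _best_vmax_and_ssr(math.exp(c), substrate, velocity)[1]
        ssr_d = _best_vmax_and_ssr(math.exp(d), substrate, velocity)[1]
        if ssr_c < ssr_d:
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    km = math.exp((a + b) / 2)
    vmax, _ = _best_vmax_and_ssr(km, substrate, velocity)
    return vmax, km


# Synthetic, noise-free example (values invented for illustration only).
peptide_uM = [30.0, 60.0, 125.0, 250.0, 500.0, 1000.0]
rates_uM_per_s = [michaelis_menten(s, 2.0, 150.0) for s in peptide_uM]
vmax_fit, km_fit = fit_michaelis_menten(peptide_uM, rates_uM_per_s)

enzyme_uM = 0.05                    # hypothetical enzyme concentration
kcat_per_s = vmax_fit / enzyme_uM   # kcat = Vmax / [E]
```

With real, noisy initial velocities the same functions apply unchanged; averaging k cat and K m over independent assays, as described above, then gives the reported means and SDs.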
All peptides were synthesized by ProteoGenix (Schiltigheim, France) with a purity of ≥95% and with TFA exchanged for HCl. Peptides were solubilized in acetylation buffer and peptide concentrations were determined photometrically from TNB 2− product concentration, formed in an end point acetylation reaction. For each peptide stock, five different dilutions were incubated together with an excess of CoA and 2 µM NatC WT in acetylation buffer. Reaction aliquots were quenched after 1.5 and 2 h with a twofold excess of quenching buffer, containing 0.5 mM DTNB. Measurements taken after 2 h did not show a significant increase of absorbance compared to the 1.5 h time point and both readings were within 5% of one another.
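The conversion from absorbance at 412 nm to product concentration described in this section can be sketched as follows. The extinction coefficient and the 20 µL + 40 µL quench volumes are taken from the text; the optical path length of the filled microplate well is not stated and is treated here as a hypothetical input, as are the example absorbance values and function names.

```python
EPSILON_TNB = 14150.0  # M^-1 cm^-1, extinction coefficient stated in the text


def tnb_concentration_uM(absorbance, background, path_length_cm):
    """TNB2- concentration (µM) in the measured well from A = epsilon * c * l."""
    corrected = absorbance - background
    return corrected / (EPSILON_TNB * path_length_cm) * 1e6


def product_in_reaction_uM(tnb_uM, aliquot_uL=20.0, quench_uL=40.0):
    """Undo the dilution of the reaction aliquot by the quenching buffer."""
    return tnb_uM * (aliquot_uL + quench_uL) / aliquot_uL


# Hypothetical reading: A412 = 0.1915, background = 0.05, path length = 0.5 cm
# (the real path length depends on plate geometry and fill volume).
well_uM = tnb_concentration_uM(0.1915, 0.05, path_length_cm=0.5)
reaction_uM = product_in_reaction_uM(well_uM)
```

For these made-up numbers the well contains about 20 µM TNB 2−, corresponding to about 60 µM product in the undiluted reaction, because each CoA thiol released by acetyl transfer yields one TNB 2− anion on reaction with DTNB.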
Purification of yeast 80 S ribosomes. Yeast 80 S ribosomes were purified according to a protocol adapted from Magin, et al. 57 . S. cerevisiae strain YA2488 was plated out on YPD-agar plates (1% yeast extract, 2% peptone, and 2% glucose), and used to inoculate a 20 mL preculture of YPD, which was incubated at 30°C for 7 h. A volume of 8 L of YPD was inoculated to an OD 600 of 0.001 and grown at 30°C to an OD 600 of 2 (stationary phase). Cells were pelleted at 8000 × g, 5 min, 4°C (Beckman Rotor JLA9.1000), resuspended in YP medium (1% yeast extract, 2% peptone, and no glucose), and incubated for 10 min, 250 r.p.m., 30°C, to ensure that all ribosomes would be in the "apo" form, without nascent chain or tRNA. The following cell lysis steps were all carried out at 4°C: cells were pelleted by centrifugation at 8000 × g, 5 min (Beckman Rotor JLA9.1000) and resuspended in buffer A (30 mM HEPES, pH 7.5, 50 mM KCl, 10 mM MgCl 2 , 8.5% sorbitol, 2 mM DTT, and 0.5 mM EDTA). Cells were centrifuged again at 8000 × g, 5 min, the pellet was weighed, and resuspended in lysis buffer (buffer A + one cOmplete mini EDTA-free protease inhibitor tablet (Roche 11836170001), 1 unit µL −1 RNAsin (N261B, Promega), and 800 µg mL −1 heparin) to a final concentration of 200% (w/v). Glass beads (G8772, Sigma) were added to the resuspension and lysis was performed by vortexing for 30 s, followed by 30 s on ice, repeated four times. To obtain the yeast lysates, the resuspension was centrifuged at 9000 × g, 10 min (Beckman rotor TA-10.250), and the supernatant was collected. The absorbances at 260 and 280 nm were measured. To isolate the 80 S monosomal ribosome fraction, yeast lysate (A 260 = 200) was underlaid with 1 mL 30% sucrose in 80 S buffer (20 mM HEPES, pH 7.6, 100 mM potassium acetate, 5 mM MgCl 2 , and 2 mM DTT) and centrifuged at 70,000 × g, 18 h (Beckman ultracentrifuge, rotor MLA-80).
The pellet was collected and added, at a concentration of A 260 = 100, to the top of a 15-30% sucrose gradient, which was prepared by underlaying 15% sucrose in 80 S buffer with 30% sucrose in 80 S buffer, and mixing (Gradient Master, BIOCOMP, short settings). The gradients were then spun at 92,703 × g, 6 h, using a SW32 Ti Beckman swing-out rotor, and then loaded onto a polysome collector (LKB). Fractions containing 80 S monosomes were pooled and centrifuged at 127,959 × g, 14 h (Beckman MLA-80 rotor). The pellet containing the 80 S yeast ribosomes was resuspended in 20 mM HEPES pH 7.6, 50 mM potassium acetate, 5 mM MgCl 2 , and 2 mM DTT to give a final concentration of 2 µM.
Yeast genetics. S. cerevisiae WT (BY4741) and Naa35-deletion (Y00294) strains were obtained from Euroscarf. The NAA35 gene (YEL053C), including 460 bp upstream and 494 bp downstream of the NAA35 ORF, was amplified by PCR from S. cerevisiae genomic DNA (strain BY4741), and cloned into the BamHI cloning site of the pRS416 yeast centromere vector. Plasmids encoding Naa35-mutants Tip1 (K500A, K501A, K503A, and K504A) and Tip2 (K511, R515, and R519) were created using overlapping fragments, containing the desired mutations. An additional 27 bases (ATGGATTATAAAGATGATGATGATAAA) encoding a FLAG-tag were inserted in front of the start codon of NAA35-WT and mutant genes for immunodetection of the NAA35 gene products. For vector transformations, S. cerevisiae WT (BY4741) and NAA35-deletion (Y00294) strains were grown overnight in 5 mL YPD (10 g L −1 yeast extract, 20 g L −1 peptone, and 2% (w/v) glucose) medium at 30°C. Cells were sedimented at 3000 × g, washed with 1 mL of LiOAc mix (100 mM lithium acetate, 50 mM EDTA, 100 mM Tris-HCl, pH 7.6), and resuspended in 100 µL LiOAc mix. To this, 10 µL of freshly boiled herring sperm DNA (Sigma), 1 µg of plasmid DNA, and 700 µL of PEG mix (40% (w/v) PEG 4000, 100 mM lithium acetate, 50 mM EDTA, 100 mM Tris-HCl, pH 7.6) were added and the suspension was incubated for 30 min at 30°C, followed by addition of 48 µL DMSO and 15 min incubation at 42°C. After addition of 3 mL YPD, cells were grown for 1 h at 30°C and plated on SD-ura (6.7 g L −1 yeast nitrogen base, 1.92 g L −1 SD medium supplement without uracil) agar plates containing 2% (w/v) glucose, which were grown for 2-3 days at 30°C. FLAG-tagged Naa35 could not be detected by western blot, likely due to low Naa35 expression levels from the endogenous promoter, as previously suggested 58 .
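As a quick consistency check on the inserted tag sequence, the following Python sketch translates the 27-base insert quoted above with the standard genetic code. The helper names are illustrative and not part of the published workflow.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {}
for index, codon in enumerate(product(BASES, repeat=3)):
    CODON_TABLE["".join(codon)] = AMINO_ACIDS[index]


def translate(dna):
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


flag_insert = "ATGGATTATAAAGATGATGATGATAAA"  # sequence quoted in the text
flag_peptide = translate(flag_insert)
```

The translation reads MDYKDDDDK: a start methionine followed by the FLAG epitope DYKDDDDK, matching the tag sequence given for the expression construct.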
Yeast dilution spot assays. Single colonies of transformed S. cerevisiae WT and NAA35-deletion strains were used for overnight cultivation at 30°C in SD-ura medium, containing 2% (w/v) glucose. After 16 h, cells were sedimented (5 min, 3000 × g) and resuspended in SD-ura medium, containing 3% (v/v) glycerol to an OD 600 of 1. Cells were grown at 30°C for 3 h to deplete remaining glucose. For dilution spot assays, yeast cells were initially diluted to an OD 600 of 1. Subsequent tenfold dilutions were made and a 48-pin multi-blot replicator was used to spot cells onto SD-ura agar plates containing 2% (w/v) glucose or 3% (v/v) glycerol, respectively. Plates were incubated at 37°C for 3-5 days.
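The spotting scheme amounts to a tenfold dilution series starting at an OD 600 of 1. A minimal sketch follows; the number of spots per strain is not stated in the text, so the value used here is a hypothetical choice.

```python
def tenfold_series(start_od, n_spots):
    """OD600 of each spot in a tenfold serial dilution, most concentrated first."""
    return [start_od / 10 ** i for i in range(n_spots)]


# Five spots per strain is an assumption made only for illustration.
spot_ods = tenfold_series(1.0, 5)
```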
Statistics and reproducibility. Statistical parameters including the definitions and exact value of n, deviations, p values, and the types of the statistical tests are reported in the figures and corresponding figure legends or hereafter. The SDS-PAGE analysis in Fig. 1d was repeated twice with identical results. The underlying NatC complex purifications were performed three times for the NatC complex containing full-length Naa35 (residues 1-733) and twice for the NatC complex containing Naa35ΔN17 (residues 18-733). Replicate purifications showed identical results. The NatC complex containing Naa35ΔN44 (residues 44-733) was purified once. Statistical analyses were carried out using GraphPad Prism version 6.01.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The atomic coordinates of NatC have been deposited in the Protein Data Bank with accession numbers 6YGA (NatC, apo), 6YGB (NatC•CoA), 6YGC (NatC•CoA•MFHLV), and 6YGD (NatC•CoA•MLRFV). The authors declare that the main data supporting the findings of this study are available within the article and its Supplementary Information files. All other supporting data, plasmids, and yeast strains developed in this study are available from the corresponding author upon reasonable request. Source data are provided with this paper.