Structure of human RNA polymerase III

In eukaryotes, RNA Polymerase (Pol) III is specialized for the transcription of tRNAs and other short, untranslated RNAs. Pol III is a determinant of cellular growth and lifespan across eukaryotes. Upregulation of Pol III transcription is observed in cancer and causative Pol III mutations have been described in neurodevelopmental disorders and hypersensitivity to viral infection. Here, we report a cryo-EM reconstruction at 4.0 Å of human Pol III, allowing mapping and rationalization of reported genetic mutations. Mutations causing neurodevelopmental defects cluster in hotspots affecting Pol III stability and/or biogenesis, whereas mutations affecting viral sensing are located in proximity to DNA binding regions, suggesting an impairment of Pol III cytosolic viral DNA-sensing. Integrating x-ray crystallography and SAXS, we also describe the structure of the higher eukaryote specific RPC5 C-terminal extension. Surprisingly, experiments in living cells highlight a role for this module in the assembly and stability of human Pol III.

Transcription of the eukaryotic genome is mediated by three highly specialized nuclear RNA polymerase (Pol) enzymes. Pol III transcribes short untranslated RNAs that are essential for cellular functions, such as the entire pool of transfer RNAs, the precursor of the 5S ribosomal RNA and the U6 spliceosomal RNA 1.
Pol III is a multi-subunit complex composed of 17 subunits. A central ten-subunit core, which harbours the catalytic site, and a peripheral heterodimeric stalk are structurally conserved among the three eukaryotic Pols. The TFIIF-like RPC4/5 and the TFIIE-like RPC3/6/7 subcomplexes are Pol III specific and can be regarded as built-in general transcription factors that play a fundamental role in Pol III transcription initiation and termination [2][3][4].
Across the eukaryotic kingdom, Pol III displays a high degree of conservation both in terms of subunit composition and sequence homology of the individual components. A notable exception is the subunit RPC5, which in metazoans encompasses a long C-terminal extension (RPC5EXT, ~450 residues long), whose function is currently unknown.
Pol III activity is highly regulated in a cell-cycle- and cell-type-dependent manner 5, and is a determinant of lifespan in eukaryotes 6. In recent years, a large number of disease-causing mutations have been assigned to Pol III subunits, with a particular incidence of allele variants that strongly affect the correct development of the central nervous system (CNS), resulting in severe neurodegenerative diseases [7][8][9][10][11][12][13][14]. Furthermore, causative Pol III mutations have also been described in patients affected by hypersensitivity to viral infection 15,16.
To date, yeast Pol III has been extensively characterized both structurally and functionally, whereas its human counterpart has remained largely unexplored because of the inherent technical challenges in obtaining yields amenable to structural biology. However, understanding the specific influence of pathological mutations and the role of regulatory elements unique to the human enzyme relies on such structural information. Here, we report the cryogenic electron microscopy (cryo-EM) reconstruction of human Pol III. We further study the enzyme's complete architecture using a hybrid structural biology approach integrating two crystal structures of the human RPC5 C-terminal extension, as well as SAXS data and molecular modelling. Results of our comparative structural analysis rationalize the effect of pathological mutations and yield unexpected insights into Pol III regulation.

Results
Purification of human RNA Pol III. To obtain a high-resolution structure of human Pol III, we isolated the endogenous complex from HeLa cells. To this end, we employed CRISPR/Cas9 genome editing in human cells to create a homozygous knock-in of a cleavable green fluorescent protein (GFP)-tag at the C terminus of subunit RPAC1 (shared between Pol I and Pol III) (Fig. 1a). Fractionation experiments followed by immunopurification using an anti-GFP nanobody revealed that Pol III is present in both nuclear and cytoplasmic fractions (Fig. 1b), in agreement with previous reports highlighting a Pol III cytosolic DNA-sensing activity 17,18. An optimized large-scale purification, including an ion-exchange step to separate Pol I and Pol III, enabled the isolation of active human Pol III (Fig. 1c, d and Supplementary Fig. 1) from total cell extracts with yields and quality amenable to further EM studies.
Cryo-EM structure of human Pol III. The non-crosslinked purified human Pol III sample was applied to carbon-coated cryo-EM grids and imaged on a Titan Krios transmission electron microscope equipped with a Falcon III camera. Two data sets were collected at 0° and 30° tilt angles, to overcome preferred orientation of the sample on the cryo-grids, resulting in a merged data set of 172,678 particles after two-dimensional (2D) class averaging (Supplementary Fig. 2 and Table 1). The majority of imaged particles represented the intact 17-subunit human Pol III, but a sizeable fraction with a similar angular distribution displayed no density for the RPC3/6/7 heterotrimer, which had possibly dissociated during purification, in agreement with earlier reports 19, or during cryo-EM specimen preparation. Hierarchical three-dimensional (3D) classification led to a reconstruction of the intact human Pol III from 25,369 particles at an overall resolution of 4.0 Å (Supplementary Figs. 2 and 3, and Table 1).
The core of the enzyme is characterized by a very detailed EM map where side chains are clearly discernible (Fig. 2). The RPC8/9 stalk and the RPC3/6/7 subcomplex are more flexible than the core; hence, their local resolution is lower compared to the rest of the complex (Supplementary Fig. 3). Interestingly, the coiled-coil region of the clamp subdomain within the largest subunit RPC1, which is in direct contact with the RPC3/6/7 heterotrimer, also displays a high degree of flexibility. This finding suggests that the coiled-coil region of the clamp together with the heterotrimer form a discrete structural and functional unit, which in yeast has been shown to be able to sense melting of the upstream side of the transcription bubble 4 .
Fig. 2 The structure of human RNA polymerase III. Shown is the electron density map filtered according to the local resolution with the fitted model shown in ribbon representation (above). Regions of the electron density map are coloured according to the subunit structure as labelled. Shown below are selected regions of several subunits showing the fit with the filtered electron density (mesh).
As can be expected from the high degree of sequence conservation, the overall structure of human Pol III resembles the yeast counterpart. Structure-based alignments and comparison revealed that most subunits share a high degree of similarity and low root-mean-squared deviation values (Supplementary Fig. 4). However, local differences highlight specific features that might be relevant for human-specific regulation and correct assembly of the Pol III enzyme. Three relevant deletions were [...] In human RPC4, a small deletion removed a helix (residues 269-285), which in the yeast RPC4 protrudes back towards the Pol core and contacts RPC2 (Supplementary Fig. 5). Deletion of this region may therefore highlight a weaker association between the human RPC4/5 heterodimer and core when compared to the yeast enzyme. In the RPC5 dimerization module, structural alignment detected the insertion of a small loop in humans in addition to the large C-terminal insertion (RPC5EXT), which together suggest a slightly rearranged heterodimer module in human Pol III (Supplementary Fig. 5). Furthermore, comparison of the yeast and human stalk subunit RPC9 identified two additional deletions in the human structure which remove unstructured loops (not present in the cryo-EM map of the corresponding yeast subunit). This comparative analysis of the Pol peripheral subcomplexes was limited in the human structure due to the flexibility of the RPC5EXT and heterotrimer-clamp module.
As a result, both the reported RPC3 iron-sulphur cluster 22 and the RPC5EXT, elements absent in the Saccharomyces cerevisiae Pol III structures 3,4,23 , were not visible in our EM map.
Fig. 3 (caption, partial) Inset shows detail of the electron density map. c Overall structure of RPC5-tWHD2 crystallographic packing in ribbon cartoon. RPC5-WHD3 and RPC5-WHD4 are shown in dark and light green, respectively, and the inset shows detail of the electron density map. d Fitting of RPC5-tWHD1 (red, inset) into the SAXS experimental data (black). e Fitting of RPC5-tWHD2 in 'closed' (green, inset), 'open' (dashed green) and dimer (orange) conformations into the SAXS experimental data (black). f Docking of RPC5-tWHD1 (red) and RPC5-tWHD2 (green) into the ab initio SAXS envelope generated from RPC5EXT SAXS data collection.
ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-20262-5
Structure of the RPC5 C-terminal extension. To gain insight into the structure and function of RPC5EXT, we determined the structure of its individual domains by X-ray crystallography (Supplementary Fig. 6, Fig. 3, and Tables 2 and 3). The RPC5EXT is formed by two consecutive tandem winged helix domains (tWHD1, residues 259-440; tWHD2, residues 556-708) connected by a 115-residue-long flexible linker. Such an architecture has not been reported for other components of the eukaryotic transcription apparatus and appears to be found exclusively in metazoan RPC5. Of the two tWHDs, tWHD1 is the more conserved, whereas tWHD2 is absent in Caenorhabditis elegans and Drosophila melanogaster (Supplementary Fig. 7). The tWHD1 is formed by two juxtaposed winged helix domains that form a compact globular domain with one of the two recognition helices, typically involved in DNA binding, buried within the structure (Fig. 3b). The compact conformation of tWHD1 is also observed in solution, as highlighted by small-angle X-ray scattering (SAXS) data (Fig. 3d and Supplementary Fig. 6). The tWHD2 structure revealed a dimer formed by domain swapping (Fig. 3c). This arrangement is likely caused by the crystallization conditions and, in agreement with this hypothesis, SAXS data showed a monomeric conformation as the most likely in solution (Fig. 3e and Supplementary Fig. 6). Nevertheless, the two possible conformations of tWHD2, compact or elongated, suggest a degree of flexibility within this domain. Finally, SAXS analysis of a construct encompassing the full-length RPC5EXT supports the model of two globular compact tWHD domains connected by a long flexible linker, spanning up to approximately 175 Å in length (Fig. 3f and Supplementary Figs. 8 and 9).
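The domain boundaries quoted above can be cross-checked with simple residue arithmetic. The following is a minimal sketch, assuming that all residue ranges are inclusive and that the linker is exactly the stretch between the end of tWHD1 and the start of tWHD2; the variable and function names are illustrative and not taken from the study.

```python
# Residue ranges quoted in the text; both ends are treated as inclusive (assumption).
TWHD1 = (259, 440)
TWHD2 = (556, 708)


def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive residue range."""
    return last - first + 1


# The linker is taken to be everything between the two tandem winged helix domains.
linker = (TWHD1[1] + 1, TWHD2[0] - 1)
linker_length = span_length(*linker)

# The crystallized part of the extension runs from the start of tWHD1 to the end of tWHD2.
ext_length = span_length(TWHD1[0], TWHD2[1])

print(f"tWHD1: {span_length(*TWHD1)} residues")
print(f"linker {linker[0]}-{linker[1]}: {linker_length} residues")
print(f"tWHD2: {span_length(*TWHD2)} residues")
print(f"tWHD1 start to tWHD2 end: {ext_length} residues")
```

Under these assumptions the linker comes out at 115 residues and the span from tWHD1 to tWHD2 at 450 residues, consistent with the figures given in the text for the linker and for RPC5EXT.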
Comparison with existing protein structures using the Dali server (http://ekhidna.biocenter.helsinki.fi/dali_server/) suggested similarities between tWHD1 and the WHD of S. cerevisiae Pol II general transcription factor TFIIF Rap30 subunit [24][25][26] and with the tWHD of Pol I A49 subunit [27][28][29][30][31]. Both subunits are orthologs of RPC5 and involved in stabilization of the pre-initiation complexes (PICs), suggesting a putative functional link. However, although the position of TFIIF Rap30 WHD in the Pol II PIC clashes with the Bdp1 subunit of transcription factor TFIIIB in the Pol III PIC 3,4 (Fig. 4), the equivalent position of A49 tWHD in the Pol I PIC 27 is accessible in the Pol III PIC (Fig. 4). Thus, one possibility is that, analogously to A49 tWHD, the RPC5-tWHD1 participates in an interaction with the upstream DNA and bound transcription factors, thereby stabilizing the human Pol III PIC. In addition, the Dali server analysis retrieved similarities between the individual WHDs of RPC5EXT tWHD2 and the WHDs of cullin and cullin-like proteins, which are involved in ubiquitin-dependent proteolysis 32.
The RPC5 extension is required for RNA Pol III stability. To gain insight into the functional role of RPC5EXT, we used small interfering RNA (siRNA) to knock down RPC5 in HEK293T cells and rescued it with ectopic expression of haemagglutinin (HA)-tagged RPC5 constructs encompassing the full-length protein (RPC5FL) or a version of RPC5 devoid of either tWHD2 (RPC5ΔtWHD2) or the entire RPC5EXT (RPC5ΔC) (Supplementary Fig. 10a, b). Immunoprecipitation using anti-HA magnetic beads revealed that all three constructs (RPC5FL, RPC5ΔtWHD2 and RPC5ΔC) are able to integrate into and pull down a bona fide intact Pol III complex, as probed by RPC1, RPC2 and RPC4 antibodies (Supplementary Fig. 10c). However, the corresponding immunoblots of whole-cell extracts, prior to the immunoprecipitation, indicate lower steady-state levels of RPC5ΔC compared to RPC5FL, pointing towards a direct role of RPC5EXT in enhancing RPC5 stability. To further explore the role of RPC5EXT in regulating RPC5 stability in the context of an intact Pol III complex, we employed a cycloheximide chase assay (Fig. 5). Levels of RPC5FL, as well as those of subunits RPC1 and RPC2, remained stable for the course of the experiment (8 h), suggesting a relatively long half-life of the Pol III complex (Fig. 5a). In contrast, RPC5ΔtWHD2 and RPC5ΔC were rapidly degraded, with RPC5ΔC almost completely depleted only 2 h after cycloheximide treatment (Fig. 5b, c). Surprisingly, subunits RPC2 and, to a minor extent, RPC1 and RPC4 were also rapidly depleted, suggesting that RPC5EXT is essential for the stability of the whole Pol III complex.
Pathological genetic mutations map to Pol III subunit interfaces. Many studies have reported mutations of the Pol III enzyme that are related to human diseases, in particular heritable diseases that affect the correct development of the CNS. Specifically, allele variants encoding mutated versions of the Pol III subunits RPC1, RPC2, RPAC1 and RPAC2 have been established as causative mutations of hypomyelinating leukodystrophy (HL) 7-10,33-37, Treacher-Collins syndrome (TCS) 11,12 and Wiedemann-Rautenstrauch syndrome (WRS) 13,14.
To rationalize these findings, we mapped known Pol III mutations on our high-resolution structure (Fig. 6 and Supplementary Data 1). Reported mutations affecting CNS development tend to cluster in specific hotspots, very often at the interface of several Pol III subunits. For example, TCS mutations L51R and T50I in RPAC2 result in disruption of hydrophobic and salt-bridge interactions, respectively, at the interface with the RPAC1 subunit, suggesting a strong destabilizing effect that might impair correct assembly of the enzyme (Fig. 6). Analogously, most reported HL mutations lie at the interface of several subunits and have disruptive effects on these interfaces (Fig. 6 and Supplementary Data 1). Interestingly, WRS mutation R1069Q in subunit RPC1 disrupts a charged interaction with residue N1249 in the same subunit. This residue is itself mutated in HL, possibly altering the interface between subunits RPC1 and RPABC1 in both pathologies. Overall, these findings indicate a general molecular mechanism for mutations resulting in CNS disorders, namely the partial loss of Pol III activity through destabilization of the enzyme core.
Recently, Pol III mutations have also been described in patients affected by acute severe response to Varicella zoster virus (VZV) infection 15,16,38 . Most of these mutations map at the periphery of the Pol III enzyme and result in neutralizing basic charged residues exposed to the solvent in proximity of DNA-binding regions (Fig. 6). As human Pol III has been shown to display cytosolic DNA-sensing activity, it is conceivable that VZV mutations indeed impair proper DNA binding and transcription factor-independent RNA synthesis in the cytoplasm.
Overall, these findings are in agreement with previous work using the homologous yeast Pol III, and/or Pol I and Pol II enzymes to map disease-causing mutations [8][9][10]15,39. However, the availability of a high-resolution structure of human Pol III enables the comprehensive mapping and rationalization of these allele variants with high confidence.

Discussion
Here we describe the 4.0 Å resolution cryo-EM structure of apo human Pol III. The structure confirms the overall high degree of structural homology with its S. cerevisiae counterpart but also highlights specific differences, such as a rearranged foot domain (Supplementary Fig. 5). Integrating cryo-EM data with X-ray crystallographic and SAXS data for the metazoan-specific RPC5EXT, detailed structural information has been obtained for the whole Pol III complex (Supplementary Fig. 11). Surprisingly, experiments in living cells highlighted a prominent role of RPC5EXT for the integrity of the Pol III complex. Absent in lower eukaryotes, RPC5EXT thus represents an additional metazoan-specific module, impacting on the correct assembly of Pol III. Several abundant phosphorylation sites have been identified in RPC5EXT, which, together with the evidence of structural similarities between RPC5EXT tWHD2 and factors involved in targeted degradation, suggests the intriguing hypothesis of an RPC5EXT-mediated layer of regulation impacting overall Pol III abundance in response to environmental cues that may have evolved in higher organisms.
Furthermore, the high-resolution structure of human Pol III enabled the mapping of more than 85% of reported Pol III genetic mutations with high precision, rationalizing their effects at a molecular level (Supplementary Data 1). Mutations affecting CNS development tend to cluster spatially, and it seems likely that the severity of the phenotypes observed in HL, TCS and WRS correlates with the disruptive effect of such variants. For example, TCS mutations appear to be particularly disruptive at the interface of RPAC1 and RPAC2, two subunits shared between Pol I and Pol III. Interestingly, the RPAC1 mutations N32I and N74S, which are associated with HL, lead to reduced Pol III assembly and nuclear import without affecting Pol I 7. This is consistent with our model, as these residues mediate interactions with Pol III-specific RPC1 and RPC2, respectively (Supplementary Data 1).
Finally, as Pol III represents a central nexus involved in the regulation of organismal growth, development and lifespan in eukaryotes and is often deregulated in cancer, the structure of the human enzyme will represent an invaluable tool to aid the design of small molecules capable of specifically targeting Pol III transcription for therapeutic purposes.
was a gift from Feng Zhang (Addgene plasmid #62987: https://n2t.net/addgene:62987; RRID: Addgene_62987). A donor plasmid carried a short GS-linker sequence with an embedded Human Rhinovirus (HRV) 3C protease cleavage site and the sfGFP ORF surrounded by two large sequence segments homologous to the insertion locus in the genome.
HeLa cells were transfected with a 1 : 1 : 1 mix of gRNA1 and gRNA2 vectors together with the donor plasmid using PolyJet transfection reagent (SL100688, SignaGen Laboratories) according to the manufacturer's instructions. Several days later, the GFP-expressing cells were enriched by flow cytometry using a Bio-Rad S3e cell sorter. GFP-positive cells were seeded on large culture dishes such that they could grow as single cell colonies. After 2-3 weeks, colonies were transferred manually into multi-well slides for live-cell imaging and were screened under identical microscope settings. The brightest clones were selected for expansion. These monoclonal populations were validated by PCR on extracted genomic DNA (using the Blood & Tissue Kit, Qiagen).
The selected cell line was cultivated adherently and adapted to suspension growth as follows: Cells from 8 flasks (about 70 × 10⁶ cells total; 83.3912.302, Sarstedt) were detached by incubation with trypsin (25,300, Gibco) at 37°C for 5 min, transferred to a spinner flask (250 mL total volume; 4500, Corning) and cultured in suspension with high-glucose DMEM (11965, Gibco) supplemented with 1% FBS (10,270, Gibco) and 1% Penicillin/Streptomycin (P0781, Sigma Aldrich) under moderate stirring at 37°C in a 5% CO₂ atmosphere. To expand the culture, 1× the current volume of fresh media including all supplements was added when cells reached a density of ~3.5 × 10⁵ cells/mL, and the culture was transferred to spinner flasks of increasing volume when required. Cells were collected by centrifugation and washed with phosphate-buffered saline (PBS) before flash-freezing the pellet.
For fluorescence imaging, cells were grown adherently on cover slips to 50% confluency. After washing the cells with pre-warmed (37°C) PBS, they were fixed with 3.7% paraformaldehyde in PBS for 10 min at 37°C. The fixation was stopped by addition of 100 mM glycine in PBS for 5 min at 37°C and cells were washed twice with PBS. The cells on the cover slips were mounted on the specimen slide with a drop of Prolong Gold Antifade Mountant with 4′,6-diamidino-2-phenylindole (DAPI) (P36941, Thermo Fisher Scientific) and dried for at least 3 days in the dark.
The fluorescent specimens were imaged using a Zeiss Axio Observer.Z1/7 microscope and a ×63 oil-objective lens. DAPI staining was detected using a 405 nm excitation laser, with a wide band-pass filter (300-720 nm) for the emission. For sfGFP detection, a 488 nm laser and the same band-pass filter (300-720 nm) were used. The images were captured with the Airyscan mode and detector. Processing was done using the Zeiss AxioVision software, the Zeiss ZEN 3.0 (ZEN lite) software and Fiji.
Anti-GFP pulldown from nuclear and cytosolic fractions. A total of ~7 L POLR1C-GFP HeLa cells were grown in a spinner flask (Corning) and collected by centrifugation, yielding a cell pellet of 8.1 g (estimated total of 2.3 × 10⁹ cells). Nuclear extract production was based on a published protocol 41. The cell pellet was resuspended in 9.8 ml of MC Buffer (10 mM HEPES-KOH pH 7.6, 10 mM KOAc, 0.5 mM Mg(OAc)₂, 5 mM dithiothreitol (DTT), 0.5 mM phenylmethylsulfonyl fluoride (PMSF)) and incubated for 5 min on ice. Following dounce homogenization, nuclei and cytosolic fractions were separated by centrifugation using a Sorvall SS34 rotor at 18,000 × g at 4°C for 5 min. This resulted in a nuclear pellet of 5.1 g, which was resuspended in 6.7 ml 'Roeder C Buffer' (25% v/v glycerol, 20 mM HEPES-KOH pH 7.9, 1.5 mM MgCl₂, 0.2 mM EDTA pH 8.0, 420 mM NaCl, 0.5 mM DTT, 0.5 mM PMSF), corresponding to 12 ml of nuclear extract. Nuclei were lysed with a dounce homogenizer, slowly stirred at 4°C and centrifuged in a Sorvall SS34 rotor at 16,000 × g at 4°C for 30 min. Nuclear extracts and cytosolic fraction were split into fractions of 0.5 mL each.
Large-scale human RNA Pol purification. Large-scale cell growth was carried out at the Cell Services Scientific Technical Platform at The Francis Crick Institute, London. Adherent HeLa POLR1C-GFP cells were grown in DMEM-4 medium supplemented with 1% fetal calf serum (FCS), 1% Glutamax and 1% Penicillin/Streptomycin. Confluent cells were collected by trypsin treatment followed by gentle centrifugation. The cell pellet was subsequently resuspended in RPMI-1640 supplemented with 5% FCS, 1% Glutamax and 1% Penicillin/Streptomycin, to allow for cell growth in suspension. Cells were expanded in suspension using a small glass spinner flask flushed with CO₂ at 37°C. Cells were expanded to a maximum volume of 1.2 L per growth in a 3 L glass spinner flask. Cells were grown to a density of 1 × 10⁶ cells/ml with viability maintained at >90%. Following growth, cells were collected by gentle centrifugation at room temperature. The resulting cell pellets were washed with PBS and cells pelleted again via centrifugation. The final cell pellets were stored at −80°C prior to purification.
For large-scale purification of human RNA Pol, whole-cell lysate was produced from a cell pellet derived from 20 L of HeLa cells grown to a density of 1 × 10⁶ cells/ml. The cell pellet was resuspended in lysis buffer (50 mM Tris-HCl pH 8.0, 250 mM (NH₄)₂SO₄, 20% v/v glycerol, 1 mM MgCl₂, 10 μM ZnCl₂, 10 mM β-mercaptoethanol) and two protease-inhibitor tablets (Roche) were added. Lysis was performed by repeated passage of the cell suspension through a dounce followed by sonication. The resulting lysate was cleared by centrifugation at 28,000 × g at 4°C for 40 min, followed by filtration of the soluble fraction through gauze. The cleared lysate was incubated with 1 ml of GFP selector beads 50% slurry (Nanotag) pre-equilibrated in lysis buffer. Beads were incubated for 3 h at 4°C under continuous rotation. Beads were washed with 60× slurry volume in lysis buffer and eluted through overnight incubation at 4°C with 160 μl of HRV-3C protease (Millipore) in a final volume of 2-3 ml. Following elution, the eluate was collected through gentle centrifugation of the beads at 1000 × g and collection of the supernatant. The beads were then washed in double the eluate volume with wash buffer (50 mM Tris-HCl pH 8.0, 50 mM (NH₄)₂SO₄, 1 mM MgCl₂, 10 μM ZnCl₂, 10 mM β-mercaptoethanol) and the resulting wash fraction was combined with the eluate to dilute the (NH₄)₂SO₄ to ~120 mM. The eluate mixture was further diluted through addition of an equivalent volume of Tris buffer (50 mM Tris-HCl, 1 mM MgCl₂, 10 μM ZnCl₂, 10 mM β-mercaptoethanol) to reduce the final (NH₄)₂SO₄ concentration to ~60 mM. Next, the eluate was loaded onto a MonoQ GL 5/50 column (GE Healthcare) and eluted in a linear gradient from 60 mM to 1 M (NH₄)₂SO₄ in 50 mM Tris-HCl pH 8.0, 1 mM MgCl₂, 10 μM ZnCl₂, 10 mM β-mercaptoethanol. MonoQ purification produced two peaks corresponding to RNA Pol I (eluting at ~380 mM (NH₄)₂SO₄) and RNA Pol III (eluting at ~550 mM (NH₄)₂SO₄).
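The two dilution steps before the MonoQ column follow from a simple volume-weighted average of salt concentrations. The sketch below is illustrative only: it assumes that the eluate is still at the lysis-buffer concentration of 250 mM (NH₄)₂SO₄, ignores the small volume of added protease, and uses a hypothetical helper `mix`; volumes are expressed in units of the eluate volume.

```python
def mix(parts):
    """Combine (volume, concentration_mM) pairs; return (total_volume, final_mM)."""
    total_volume = sum(volume for volume, _ in parts)
    total_solute = sum(volume * conc for volume, conc in parts)
    return total_volume, total_solute / total_volume


ELUATE_MM = 250.0  # assumed: eluate is at the lysis-buffer salt concentration
WASH_MM = 50.0     # wash buffer (NH4)2SO4 concentration from the text
TRIS_MM = 0.0      # the Tris dilution buffer contains no (NH4)2SO4

# Step 1: one eluate volume plus a wash of double that volume.
v1, c1 = mix([(1.0, ELUATE_MM), (2.0, WASH_MM)])

# Step 2: addition of an equivalent volume of salt-free Tris buffer.
v2, c2 = mix([(v1, c1), (v1, TRIS_MM)])

print(f"after wash pooling: {c1:.1f} mM in {v1:.0f} eluate volumes")
print(f"after Tris dilution: {c2:.1f} mM in {v2:.0f} eluate volumes")
```

Under these assumptions the pooled fraction comes out at roughly 117 mM and the loaded sample at roughly 58 mM, in line with the approximate values of ~120 mM and ~60 mM quoted in the protocol.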
Human RNA Pol III fractions were collected and diluted to a final (NH₄)₂SO₄ concentration of ~110 mM. The sample was then concentrated using a Vivaspin 500 (100,000 molecular weight cut-off) to a final concentration of 0.05-0.1 mg/ml. The concentrated sample was used immediately for grid preparation.
RNA elongation and cleavage assay. Human Pol III (0.5, 1 or 2 pmol) was preincubated with 0.25 pmol of pre-annealed minimal nucleic acid scaffold (template DNA: 5′-CGAGGTCGAGCGTTGTCCTGGT-3′, non-template DNA: 5′-CGCTCGACCTCG-3′; RNA: 5′-FAM-AACGGAGACCAGGAC-3′) in transcription buffer (20 mM Hepes pH 7.8, 42-168 mM (NH₄)₂SO₄ (hs Pol III buffer), 8 mM MgSO₄, 10 µM ZnCl₂, 10% (v/v) glycerol, 10 mM DTT) for 1 h at 20°C in a 45 µl reaction. For RNA elongation, 10 µmol of each nucleotide triphosphate (NTP) was added and the reaction was incubated for 1 h at 28°C. To examine cleavage activity, the preincubated reaction was incubated for 1 h at 28°C without the addition of NTPs. Nucleic acids were then precipitated by adding 5 M NaCl to a final concentration of 0.5 M and 800 µl 100% ethanol. After precipitation for at least 1 h at −20°C, the sample was centrifuged for 30 min at 20,000 × g and 4°C. The pellet was washed with 80% ethanol and, after drying, resuspended in 1× RNA loading dye (4 M urea, 1× Tris-borate-ethylenediaminetetraacetic acid, 0.01% bromophenol blue, 0.01% xylene cyanol). The sample was heated to 95°C for 5 min. As a control, 0.25 pmol of scaffold was treated identically, without addition of Pol and NTPs. FAM-labelled RNA product (0.125 pmol) was separated by gel electrophoresis (20% polyacrylamide gel containing 7 M urea) and visualized with a Typhoon FLA9500 (GE Healthcare).
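The base-pairing implied by the minimal scaffold can be checked directly from the three sequences. The following is a small sketch rather than part of the published protocol: the sequences are copied from the text with whitespace removed, the helper `reverse_complement` is hypothetical, and the check only considers perfect Watson-Crick pairing at the ends of the template strand.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA or RNA sequence, returned in the DNA alphabet."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


TEMPLATE = "CGAGGTCGAGCGTTGTCCTGGT"   # template DNA, 5'->3'
NON_TEMPLATE = "CGCTCGACCTCG"         # non-template DNA, 5'->3'
RNA = "AACGGAGACCAGGAC"               # RNA primer, 5'->3' (FAM label omitted)

# Downstream duplex: the non-template strand should pair with the template 5' end.
downstream_paired = TEMPLATE.startswith(reverse_complement(NON_TEMPLATE))

# DNA-RNA hybrid: the longest RNA 3' stretch that pairs with the template 3' end.
rna_rc = reverse_complement(RNA)
hybrid_length = max(
    (k for k in range(1, min(len(RNA), len(TEMPLATE)) + 1)
     if TEMPLATE.endswith(rna_rc[:k])),
    default=0,
)

# Template bases left single-stranded between the hybrid and the downstream duplex.
unpaired_template = len(TEMPLATE) - len(NON_TEMPLATE) - hybrid_length

print(f"non-template strand pairs with template 5' end: {downstream_paired}")
print(f"DNA-RNA hybrid length: {hybrid_length} bp")
print(f"unpaired template bases ahead of the RNA 3' end: {unpaired_template}")
```

With these sequences the check reports a 12 bp downstream DNA duplex, an 8 bp DNA-RNA hybrid at the RNA 3′ end and two unpaired template bases in between, which is the arrangement expected for an elongation scaffold; the 5′ portion of the RNA is not complementary to the template.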
Cryo-EM sample preparation and data collection. Human Pol III cryo-EM samples were prepared on C-Flat 1.2/1.3 (400 mesh) grids coated with a thin film of continuous carbon prepared in house. Grids were glow discharged at 15 mA for 30 s using a PELCO EasyGlow instrument prior to sample addition. A 3 μl volume of sample at ~0.06 mg/ml concentration was applied and incubated for 30 s at 18°C and 100% humidity. Grids were blotted for 1 s at blot force 1 with a 0.5 s drain time and plunge frozen in liquid ethane using the VitroBot Mark IV system (FEI).
Data collection was carried out using a FEI Titan Krios transmission electron microscope (Thermo Fisher) operating at 300 keV and equipped with a Falcon III direct electron detector (Astbury Biostructure Laboratory, University of Leeds). Separate data collections were carried out for untilted and 30°-tilted data sets. All data sets were imaged using EPU automated acquisition software with the Falcon III operating in electron counting mode at a nominal magnification of ×75,000 and a calibrated sampling of 1.065 Å/pixel. For untilted data collection, 3115 movies were collected. Movies were collected over 45 frames with a 70 s exposure time and a total dose of 44.1 e⁻/Å², giving a dose per frame of 0.98 e⁻/Å² and a dose rate of 0.63 e⁻/Å²/s. Data were collected over a defocus range of −1 μm to −3 μm. Tilted data collection was carried out at 30° in two separate sessions. The first session collected 921 movies, with a total dose of 37.8 e⁻/Å² fractionated over 38 frames during a 70 s exposure, yielding a dose per frame of 0.99 e⁻/Å² and a dose rate of 0.54 e⁻/Å²/s. The second session collected 1703 movies, imaged with a total dose of 40.6 e⁻/Å² fractionated over 38 frames during a 70 s exposure. This gave a dose per frame of 1.07 e⁻/Å² and a dose rate of 0.58 e⁻/Å²/s. In both tilted data collections, micrographs were collected using a −1.2 to −3 μm defocus range.
Cryo-EM image processing. Frame alignment and dose weighting were carried out on-the-fly using MotionCor2 42. Following motion correction, CTFFIND4 implemented in the cisTEM software package was used for contrast transfer function (CTF) estimation 43. Particle picking was carried out using the ab initio particle picking option in cisTEM 44 and the resulting particles were exported to Relion 3.1 45. Subsequent 2D and 3D classification, refinement and post-processing steps were carried out using Relion 3.1, and ab initio model generation using Cryosparc v2 46. For the untilted data set, 332,238 particles were selected, yielding a final particle set of 139,891 particles corresponding to hPol III following multiple rounds of 2D classification. This particle subset was used to generate an initial model of the hPol III structure using the ab initio model functionality in Cryosparc v2. Similarly, 87,075 particles were selected from 2624 30°-tilted micrographs, yielding 32,787 particles following 2D classification. Both particle sets were combined, generating the merged particle set of 172,678 particles. This was subjected to 3D classification in Relion 3.1 using the Cryosparc ab initio model as a reference. Classification produced 5 classes, with a single class (class 4, containing 68,291 particles) corresponding to the complete Pol molecule. This class was refined and subjected to CTF refinement. This was a sequential procedure, first correcting for trefoil and fourth-order aberrations, followed by correction for magnification anisotropy in the second step. In the final step, the defocus was refined on a per-particle basis to correct for errors in CTF estimation for tilted particles. Following this, a further refinement was carried out, yielding a model at 3.7 Å resolution at the gold-standard 0.143 Fourier shell correlation (FSC) cut-off criterion.
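The per-frame dose and dose rate quoted for each session follow from the total dose, the number of frames and the exposure time. The sketch below recomputes them; it assumes that the total doses are given in e⁻/Å² (the unit used for the per-frame values), and the session labels are illustrative.

```python
# Acquisition parameters as stated in the text (total dose assumed to be in e-/A^2).
sessions = {
    "untilted": {"total_dose": 44.1, "frames": 45, "exposure_s": 70.0},
    "tilted_1": {"total_dose": 37.8, "frames": 38, "exposure_s": 70.0},
    "tilted_2": {"total_dose": 40.6, "frames": 38, "exposure_s": 70.0},
}


def dose_stats(total_dose: float, frames: int, exposure_s: float):
    """Return (dose per frame in e-/A^2, dose rate in e-/A^2/s)."""
    return total_dose / frames, total_dose / exposure_s


results = {name: dose_stats(**params) for name, params in sessions.items()}

for name, (per_frame, rate) in results.items():
    print(f"{name}: {per_frame:.2f} e-/A^2 per frame, {rate:.2f} e-/A^2/s")
```

The recomputed values agree with those reported in the text to two decimal places.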
Following refinement and post processing, the map was filtered according to the local resolution of each region using the local resolution functionality implemented in Relion 3.1. Inspection of the resulting density revealed poor density and low resolution of the heterotrimer region. To improve this region, a local mask was generated and 3D classification carried out without alignment localized to the heterotrimer. This produced 3 classes, of which 1 (containing 25,369 particles) produced a model with improved heterotrimer density following consensus refinement. This was subject to further CTF refinement, consensus refinement and local resolution estimation to produce the final model reporting 4.0 Å global resolution at the 0.143 FSC cut-off criterion.
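The particle numbers reported across the processing steps can be tallied to see how much of the data contributes to each map. This is a bookkeeping sketch only: the stage names are illustrative, and the Nyquist limit is simply twice the calibrated pixel size, included as a sanity check against the reported resolutions.

```python
PIXEL_SIZE_A = 1.065  # calibrated sampling in Angstrom per pixel

picked = {"untilted": 332_238, "tilted": 87_075}     # particles after picking
after_2d = {"untilted": 139_891, "tilted": 32_787}   # particles kept after 2D classification

merged = sum(after_2d.values())  # combined set used for 3D classification
class4 = 68_291                  # 3D class corresponding to the complete Pol III
final = 25_369                   # class with improved heterotrimer density

# Fraction of picked particles retained after 2D classification, per data set.
retained_2d = {name: after_2d[name] / picked[name] for name in picked}

# Highest spatial frequency representable at this sampling (Nyquist limit).
nyquist_A = 2 * PIXEL_SIZE_A

print(f"merged particle set: {merged}")
for name, fraction in retained_2d.items():
    print(f"{name}: {fraction:.1%} of picked particles kept after 2D classification")
print(f"class 4: {class4 / merged:.1%} of merged set")
print(f"final map: {final / merged:.1%} of merged set")
print(f"Nyquist limit: {nyquist_A:.2f} A")
```

The merged count reproduces the 172,678 particles stated in the text, and both reported resolutions (3.7 Å and 4.0 Å) lie well above the 2.13 Å Nyquist limit for this pixel size.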
Cryo-EM model building and refinement. As an initial step, homology models were generated for all core (RPC1, RPC2, RPC10, RPAC1, RPAC2, RPABC1, RPABC2, RPABC3, RPABC4 and RPABC5), heterodimer (RPC4 and RPC5) and stalk (RPC8 and RPC9) subunits using the PHYRE2 webserver 47 . These were rigidly fitted into the locally filtered map in UCSF Chimera 48 , using the fitted yeast apo-RNA Pol III structure (RCSB Protein Data Bank (PDB) code: 6EU2) as a guide for the relative positioning of the subunits. The placed homology models were then fitted manually to the density using the COOT software package 49 . At this stage, regions not present in the EM density were removed from the model. Following manual fitting, the model was refined using the real-space refinement functionality in PHENIX 50 . The lower resolution of the more dynamic heterotrimer region did not permit the use of this strategy for these subunits. Therefore, the existing crystal structure of human RPC3 in complex with a fragment of RPC7 51 (RCSB PDB code: 5AFQ) was structurally aligned in UCSF Chimera with the heterotrimer of the yeast apo-RNA Pol III structure. Comparison revealed a highly similar structure and relative arrangement for the RPC3 and RPC7 regions present in both structures. In addition, a homology model for RPC6 was generated using PHYRE2 and structurally aligned in UCSF Chimera to the RPC6 subunit of the yeast apo-Pol III structure. Residues 174-289 of human RPC6 were selected for inclusion in the model, as this region corresponds to residues 171-271 of yeast RPC6, which are visible in the yeast RNA Pol III apo state 4 . Following selection of the relevant models, they were positioned using the yeast apo structure as a guide and then fitted to the human EM map using the sequential fit option in UCSF Chimera.
RPC5 cloning, expression and purification. Based on secondary structure predictions using the PsiPred 52 and HHPred 53 programmes, we designed 13 different constructs of the C-terminal extension of the RPC5 subunit (Uniprot ID Q9NVU0). A PCR-based strategy was used to amplify fragments of RPC5-tWHD1, RPC5-tWHD2 and RPC5EXT from its genomic DNA (Genscript). The constructs were subsequently cloned into pOPINF or pOPINJ plasmids for bacterial expression or into the pACEBac1 plasmid for baculovirus-insect cell expression. Two hexahistidine-tagged constructs that gave high-yield expression of undegraded protein (RPC5 (259-440) and RPC5 (556-708)) were selected for large-scale production. Both protein constructs were expressed and purified following the same protocol. Cells were grown at 37°C, 200 r.p.m. in Terrific Broth to OD600 = 1.5 and protein expression was induced with 1 mM isopropyl β-D-1-thiogalactopyranoside at 20°C overnight. All subsequent steps were performed at 4°C. Collected cells were resuspended in 20 mM HEPES pH 7.9, 150 mM NaCl, 10 mM imidazole and 10 mM β-mercaptoethanol supplemented with DNase I and two protease-inhibitor tablets (Roche). After a 30 min incubation, the sample was sonicated and fractionated by centrifugation at 20,000 r.p.m. for 40 min. The soluble fraction was then loaded onto a HisTrap HP 5 ml affinity column (GE Healthcare) pre-equilibrated with lysis buffer. After extensive washing of the column, the protein was eluted with lysis buffer supplemented with 250 mM imidazole. The sample was diluted to 70 mM NaCl and injected into a HiTrap Heparin HP 5 ml column (GE Healthcare) equilibrated with 20 mM HEPES pH 7.9, 70 mM NaCl and 10 mM β-mercaptoethanol. The protein was subsequently eluted using an isocratic gradient from 70 mM to 2 M NaCl in 30 column volumes. Fractions containing RPC5 constructs were identified by SDS-PAGE analysis.
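The dilution step before the heparin column follows from conservation of mass (C1·V1 = C2·V2). The sketch below is only a worked example under stated assumptions: the eluate is taken to be at the 150 mM NaCl of the lysis buffer, the diluent is assumed to contain no NaCl, and the 10 ml sample volume and the function name are hypothetical; none of these are stated in the protocol.

```python
def diluent_volume(c_start, c_target, v_start, c_diluent=0.0):
    """Volume of diluent to add so that the mixture reaches c_target.

    Concentrations share one unit (here mM); volumes share one unit (here ml).
    Derived from c_start*v_start + c_diluent*v_add = c_target*(v_start + v_add).
    """
    if not (c_diluent < c_target < c_start):
        raise ValueError("expected c_diluent < c_target < c_start")
    return v_start * (c_start - c_target) / (c_target - c_diluent)


# Assumed example: 10 ml of eluate at 150 mM NaCl, diluted with salt-free buffer.
v_add = diluent_volume(c_start=150.0, c_target=70.0, v_start=10.0)
print(f"add {v_add:.2f} ml of salt-free buffer")
```

Under these assumptions the sample volume slightly more than doubles, which is worth keeping in mind when planning column loading times.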
The His-tag was cleaved by overnight incubation of the protein with 3C protease at a 1:50 molar ratio (3C protease:RPC5). Uncleaved His-tagged proteins were removed by incubation of the sample with 1 ml HisPur™ (Thermo Fisher) nickel resin for 1 h at 4°C. The cleaved protein was concentrated to 5 ml and loaded onto a HiLoad 16/600 Superdex 75 pg gel-filtration column (GE Healthcare) equilibrated with 50 mM Tris-HCl pH 7.5, 150 mM NaCl and 10 mM β-mercaptoethanol. Purified RPC5 (259-440) and RPC5 (556-708) were concentrated to 30 and 80 mg/ml, respectively, flash-frozen and stored at −80°C.
Selenomethionine-derivatized proteins were expressed in the methionine-auxotroph Escherichia coli B834(DE3) strain (Novagen) using SelenoMet™ medium (Molecular Dimensions) supplemented with SelenoMet Nutrient Mix and 40 mg/l L-selenomethionine (SeMet). Purification of the proteins was performed as described for the native proteins.
The whole C-terminal extension of RPC5 (referred to as RPC5EXT) was expressed in the insect cell/baculovirus expression system. Large-scale suspension cultures (300 ml) of High Five insect cells at 0.5 × 10⁶ cells/ml were grown in Insect-Xpress media (Lonza) and inoculated with P2 baculovirus solution containing the RPC5 constructs. Proliferation arrest was assessed by measurement of GFP production until fluorescence reached a plateau. Cells were collected at 800 × g for 5 min and the pellets were stored at −20°C. After a milder sonication step, purification was performed following the protocol described above. Finally, the protein was concentrated to 10 mg/ml, flash-frozen in liquid nitrogen and stored at −80°C.
Crystallization, data collection and structure determination. Crystals used for structure determination were grown from a 1:1 ratio solution (protein:reservoir) using the vapour diffusion technique at 4°C. RPC5 (259-440) crystals in the P6₁22 space group were obtained at 30 mg/ml after 3-4 days of equilibration in 3.2 M NaCl, 100 mM sodium acetate pH 4.6 and 10 mM ZnCl₂. SeMet-RPC5 (259-440) crystals in the same space group were obtained under similar conditions but required streak seeding with diluted native crystals to favour nucleation. RPC5 (556-708) crystals in the P2₁ space group grew at 35 mg/ml in 3.2-3.8 M ammonium acetate and 100 mM Bis-Tris propane pH 6.5-7 after 2 weeks. Crystallization of SeMet-RPC5 (556-708) protein was performed under identical conditions. All crystals were flash-frozen in liquid nitrogen using perfluoropolyether oil (Hampton Research) as a cryoprotectant.
A dataset from native RPC5 (259-440) was collected at a wavelength of 0.9198 Å at the I24 beamline of Diamond Light Source (DLS). In addition, multi-wavelength anomalous dispersion (MAD) data were collected at the peak, remote and inflection wavelengths from SeMet-derivatized crystals of RPC5 (259-440) and RPC5 (556-708) at the I03 beamline of DLS. Using the MAD dataset, an initial model of SeMet-RPC5 (259-440) at 2.7 Å was determined with the SHELXC/D/E suite from the HKL2Map programme (for phase determination) and the Buccaneer software (for model building). The native structure at 2.2 Å was solved by molecular replacement using the initial SeMet model as a search reference in PHENIX.automr. Subsequent refinement was performed using the COOT and PHENIX suites. The structure of RPC5 (556-708) at 1.48 Å was solved from the SeMet datasets using the HKL2Map and Buccaneer programmes, and further refined to acceptable Rfree/Rwork values with COOT and PHENIX. Protein secondary structure assignment from the atomic coordinates was performed using STRIDE 54 .
RPC5 SAXS data collection and processing. SAXS data collection was carried out at the SWING small- and wide-angle scattering beamline, SOLEIL Synchrotron, Saint Aubin, France. Purified RPC5EXT (164 μM), tWHD1 (922 μM) and tWHD2 (3691 μM) were passed through a Bio SEC-3 HPLC column (Agilent) at 0.2 ml/min in 50 mM Tris-HCl pH 7.5, 150 mM NaCl, 10 mM β-mercaptoethanol, with protein elution monitored using A280. Data were collected using an Eiger X 4M detector (Dectris) at a distance of 2 m, using a q-range of 0 < q < 0.68 Å⁻¹. Data were reduced and buffer subtraction performed at the beamline. Data analysis was carried out using the ScÅtter software package for determination of the radius of gyration (Rg), P(r) distribution and particle maximum dimension (Dmax), and for qualitative flexibility analysis (through generation of Rg-normalized Kratky, SIBYLS and Porod-Debye plots) 55 . Volumetric bead modelling was performed using the DAMMIN software package 56 . Briefly, ab initio bead models were calculated using DAMMIN ten times for each construct, fitting over the q-range of 0 < q < 0.25 Å⁻¹. The resulting bead models were averaged and filtered using the DAMAVER package 57 , generating the final bead model reconstruction.
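As an illustration of how Rg is commonly obtained from SAXS data, the sketch below applies the Guinier approximation, ln I(q) = ln I(0) − (Rg²/3)·q², restricted iteratively to the region where q·Rg ≤ 1.3. It is a minimal example and not the ScÅtter implementation; the function name, the 1.3 limit used as a default and the iteration scheme are assumptions made for illustration.

```python
import numpy as np


def guinier_fit(q, intensity, q_rg_max=1.3, max_iter=20):
    """Estimate (Rg, I0) from a linear fit of ln I(q) against q^2.

    q: scattering vector magnitudes in 1/Å; intensity: I(q) on any scale.
    The fit range is narrowed iteratively until q * Rg <= q_rg_max holds
    for every point used in the fit.
    """
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    mask = intensity > 0  # the logarithm needs positive intensities
    rg = i0 = None
    for _ in range(max_iter):
        if mask.sum() < 3:
            raise ValueError("too few points in the Guinier region")
        slope, intercept = np.polyfit(q[mask] ** 2, np.log(intensity[mask]), 1)
        if slope >= 0:
            raise ValueError("non-negative slope: no Guinier region found")
        rg = np.sqrt(-3.0 * slope)
        i0 = np.exp(intercept)
        new_mask = (intensity > 0) & (q * rg <= q_rg_max)
        if np.array_equal(new_mask, mask):
            break  # fit range is self-consistent
        mask = new_mask
    return rg, i0
```

On real data the lowest-q points often need to be inspected and trimmed by hand (for example because of aggregation or beamstop effects), which this sketch does not attempt.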
For structural validation, the theoretical scattering profiles of the tWHD1 and tWHD2 crystal structures were compared with the experimental SAXS data using the CRYSOL package 58 . Modelling of the entire RPC5 C terminus (RPC5EXT) was performed by Ensemble Optimisation Method (EOM) analysis, from the ATSAS package 59 , of the RPC5EXT SAXS data using the q-range 0 < q < 0.2 Å⁻¹. The determined tWHD1 and tWHD2 crystal structures were defined as rigid bodies in the RPC5 C-terminal sequence, with the intervening 115 amino acid linker modelled as dummy atoms. EOM generated a pool of 10,000 random structural conformations, from which the ensemble was selected to sample the continuous structural heterogeneity.
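The agreement between a theoretical and an experimental scattering profile is usually summarized by a reduced χ² after scaling the model to the data. The sketch below shows that calculation in simplified form, assuming both profiles are sampled on the same q grid; the function name is hypothetical, and CRYSOL itself additionally adjusts hydration-shell and excluded-volume parameters, which are not modelled here.

```python
import numpy as np


def scaled_chi2(i_exp, sigma, i_model):
    """Return (reduced chi^2, scale) for a model profile scaled to the data.

    i_exp, sigma: experimental intensities and their errors.
    i_model: theoretical intensities on the same q grid (assumed).
    The scale factor is the weighted least-squares optimum.
    """
    i_exp, sigma, i_model = (
        np.asarray(a, dtype=float) for a in (i_exp, sigma, i_model)
    )
    if not (i_exp.shape == sigma.shape == i_model.shape):
        raise ValueError("all profiles must share the same q grid")
    weights = 1.0 / sigma ** 2
    scale = np.sum(weights * i_exp * i_model) / np.sum(weights * i_model ** 2)
    residuals = (i_exp - scale * i_model) / sigma
    # One fitted parameter (the scale), hence N - 1 degrees of freedom.
    chi2 = np.sum(residuals ** 2) / (i_exp.size - 1)
    return chi2, scale
```

A value close to 1 indicates that the model explains the data within the experimental errors; much larger values point to a mismatch in shape or to underestimated errors.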
Co-immunoprecipitation and western blotting. HEK293T cells were seeded into 10 cm plates in the presence of siRNA. After 24 h, they were transfected with RPC5 (HA-tagged full-length, ΔtWHD2 or ΔC) and maintained for a further 24 h. Cells were lysed in RIPA buffer and co-immunoprecipitation was performed using Pierce™ anti-HA magnetic beads (Thermo Fisher Scientific). HA-tagged proteins were eluted from the beads by addition of NuPAGE™ LDS 4× sample buffer and boiling of the samples for 10 min. For whole-cell lysates, cells were lysed in RIPA buffer, and NuPAGE™ LDS 4× sample buffer (Thermo Fisher Scientific) plus NuPAGE™ 10× sample reducing agent (Thermo Fisher Scientific) was added before the samples were boiled for 5 min. Lysates were subsequently resolved by SDS-PAGE in 4-12% Bis-Tris protein gels, transferred to nitrocellulose membrane, blocked for 1 h in 5% milk/Tris-buffered saline/0.1% Tween20 and probed with primary antibody overnight at 4°C. Secondary antibodies were incubated for 1 h at room temperature in the dark and detected using the Odyssey-CLx fluorescence imaging system (LI-COR Biosciences). All uncropped gel images are available in the Source Data file.
Cycloheximide chase assay. HEK293T cells were seeded into six-well plates in the presence of siRNA. After 24 h, they were transfected with RPC5 (either HA-tagged full length or ΔC) and maintained for a further 24 h. Cycloheximide was then added at a concentration of 300 μg/ml and cells were lysed at regular time points (every 2 h, up to a maximum of 8 h) using RIPA buffer. Lysates were analysed by western blotting, as described above. All uncropped gel images are available in the Source Data file.
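The text does not state how band intensities from the chase were quantified. As one common and purely illustrative option, the sketch below estimates a protein half-life by fitting single-exponential decay, I(t) = I0·exp(−k·t), to normalized band intensities at the sampled time points. The function name and the example intensities are hypothetical and are not data from this study.

```python
import numpy as np


def half_life_hours(times_h, intensities):
    """Fit I(t) = I0 * exp(-k t) by log-linear least squares; return ln(2)/k."""
    t = np.asarray(times_h, dtype=float)
    y = np.asarray(intensities, dtype=float)
    if t.shape != y.shape or t.size < 2:
        raise ValueError("need matching time points and intensities")
    if np.any(y <= 0):
        raise ValueError("intensities must be positive")
    slope, _ = np.polyfit(t, np.log(y), 1)
    if slope >= 0:
        raise ValueError("no decay detected")
    return np.log(2.0) / -slope


# Hypothetical, made-up intensities at the 0, 2, 4, 6 and 8 h time points.
times = [0, 2, 4, 6, 8]
example = [100.0 * 0.5 ** (t / 4.0) for t in times]
print(f"estimated half-life: {half_life_hours(times, example):.1f} h")
```

Comparing the fitted half-lives of two constructs is only meaningful if the intensities are first normalized to a loading control, which this sketch assumes has already been done.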
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The electron density reconstructions and final model were deposited with the Electron Microscopy Data Bank under accession code EMD-11904 and with the Protein Data Bank (PDB) under accession code 7AST. The PDB accession codes for the atomic coordinates and structure factors of the RPC5EXT tWHD1 and tWHD2 crystal structures reported in this paper are 7ASU and 7ASV, respectively. The RNA Pol models used in this study are available from the PDB under accession codes 5FYW, 5W66, 6EU0, 6EU2 and 6EU3. Source data are provided with this paper.