Main

Influenza A virus (FluA) mainly infects water and domestic fowl, although some strains cause disease in mammals such as humans, pigs, horses, seals and bats. The viral genome, composed of eight segments of negative-sense single-stranded RNA packaged in separate ribonucleoprotein particles, is transcribed and replicated by the heterotrimeric viral RNA (vRNA)-dependent RNA polymerase (RdRp), which comprises subunits PA, PB1 and PB2. The high mutation rate of the polymerase and the generation of novel viruses through reassortment of genome segments between different strains ensure rapid evolution of the virus with resultant seasonal epidemics and occasional, potentially devastating, pandemics. Although the polymerase has been studied extensively since the late 1960s, detailed understanding of its many functions both in vitro and in the context of the infected cell remains elusive (reviewed in refs 1 and 2), largely owing to the lack of atomic resolution structural information on the full-length polymerase. Nevertheless, in recent years, several crystal structures of fragments of the polymerase subunits have yielded important insights (reviewed in refs 1 and 3). These include the two domains involved in the unique cap-snatching mechanism of transcription used by the virus4—the PA amino-terminal endonuclease domain (PA-Nter)5,6, and the central PB2 cap-binding domain7—structures that have contributed to a renaissance in anti-influenza drug design targeting the polymerase8,9. In addition, structures are available of the inter-subunit interfaces between the PA carboxy-terminal domain (PA-C) and PB1-Nter (refs 10,11), between PB1-Cter and PB2-Nter (ref. 12), and of the PB2 C-terminal double 627-NLS domain13, which carry the host-specific PB2 residue 627 (Lys and Glu in human and avian strains, respectively) (reviewed in ref. 14) and the PB2 nuclear localization signal (NLS)15, respectively.

Here we describe the crystal structure of the complete heterotrimeric FluA polymerase bound to the vRNA promoter. To bypass difficulties in expression of recombinant human or avian polymerases, we used polymerase from the recently discovered bat-specific influenza virus (bat FluA)16, which is evolutionarily close to human/avian A strains with 70.0 (78.2), 79.5 (87.7) and 68.0 (78.6) per cent identity (similarity) for PA, PB1 and PB2, respectively (Supplementary Fig. 1). Bat polymerase can replicate efficiently in human cells16 and vice versa17, suggesting that the bat structure will be a good model for all FluA polymerases. Here we describe the overall architecture of the polymerase, the structure of each subunit and their interfaces, and how the conserved 3′ and 5′ sequences of the vRNA promoter are bound. In the accompanying manuscript18, using two additional crystal structures of influenza B polymerase, implications of the structures for the mechanisms of de novo vRNA replication and cap-dependent transcription are presented.

Structure determination and overall architecture

Heterotrimeric influenza polymerase from A/little yellow-shouldered bat/Guatemala/060/2010(H17N10) was expressed in insect cells as a self-cleaving polyprotein and purified in milligram quantities to homogeneity (Extended Data Fig. 1). Using short templates, such as a 39-nucleotide vRNA mini-panhandle containing the conserved extremities or separated 3′ (template) and 5′ (activator) sequences, the recombinant bat polymerase is active in cap-dependent transcription as well as ApG-primed and, less efficiently, unprimed replication assays (Extended Data Fig. 2) without the need for the viral nucleoprotein, consistent with previous work19. Co-crystals of FluA polymerase were obtained with nucleotides 1–16 from the vRNA 5′ end (5′-pAGUAGUAACAAGAGGG-3′), and nucleotides 1–18 or 3–18 from the 3′ end (3′OH-UCGUCUUCGUCUCCAUAU-5′OH). The structure was solved by molecular replacement at 2.65 Å resolution using the structure of FluB polymerase18 (Extended Data Table 1). The FluA polymerase structure is 97.8% complete with 699 out of 714 (for PA), 750 out of 756 (for PB1), and 733 out of 760 (for PB2) residues modelled (2,182 out of 2,230 total).
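
The completeness figure quoted above can be checked from the per-subunit residue counts. The following is a minimal sketch in Python; the dictionary layout and the function name `completeness` are illustrative choices and not part of the original analysis.

```python
# (modelled, total) residue counts per subunit, as quoted in the text.
modelled = {"PA": (699, 714), "PB1": (750, 756), "PB2": (733, 760)}


def completeness(counts):
    """Return (residues built, residues expected, per cent complete)."""
    built = sum(m for m, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return built, total, 100.0 * built / total


built, total, percent = completeness(modelled)
print(f"{built} of {total} residues modelled ({percent:.1f}%)")

# Per-subunit completeness, computed the same way.
for name, (m, t) in modelled.items():
    print(f"{name}: {100.0 * m / t:.1f}%")
```

The summed counts reproduce the 2,182 out of 2,230 residues and the 97.8% completeness stated in the text.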

The FluA polymerase has a U-shaped structure, with approximate height, width and depth of 115 × 100 × 75 Å, respectively (Fig. 1, Extended Data Fig. 3 and Supplementary Videos 1 and 2). The two protruding arms are formed by the PA-Nter endonuclease and PB2 cap-binding domains, which face each other across a solvent channel. The bottom of the U is formed by the large PA-C domain and one of the sides by the C-terminal two-thirds of PB2 (PB2-C) including the cap-binding domain. The body of the trimer is formed by PB1, decorated on one side by the N-terminal third of PB2 (PB2-N) (Fig. 1a, b) and on the other side by the linker (PA-linker) that connects the PA endonuclease (PA-Nter) with PA-C (Fig. 1b). Previous studies have revealed crucial but limited tail (Cter) to head (Nter) interactions between PA and PB1 (refs 10 and 11) and PB1 and PB2 (refs 12, 20 and 21). The actual inter-subunit interactions are much more extensive than this owing to an extremely complex intertwining of the subunits. The total buried surface area between PB1 and PA is 17,330 Å2 and between PB1 and PB2 is around 14,100 Å2, whereas the area between PA and PB2 is only 2,880 Å2, confirming the central scaffolding role of PB1. The trimer contains a large, internal, catalytic and RNA-binding cavity formed by PB1 and PB2-N that is partially open at the top to the solvent channel between the PA endonuclease and PB2 cap-binding domains (putative template/product exit channel), as well as being accessible via two narrow side tunnels, the putative NTP and template entrance channels (see below). For sequence alignments of bat and human FluA polymerase and secondary structure assignments, see Supplementary Fig. 1. A schematic of each subunit domain structure is given in Fig. 1d.
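
As a rough illustration of the scaffolding argument, the share of the total inter-subunit buried surface that involves each subunit can be computed from the three areas quoted above. This is only a sketch: the PB1–PB2 value is given as approximate in the text, and the helper name `share_involving` is hypothetical.

```python
# Buried surface areas (in square angstroms) between subunit pairs, from the text.
# The PB1-PB2 value is quoted as "around 14,100", so results are approximate.
interfaces = {
    ("PB1", "PA"): 17330,
    ("PB1", "PB2"): 14100,
    ("PA", "PB2"): 2880,
}


def share_involving(subunit, areas):
    """Fraction of the total buried area contributed by interfaces with `subunit`."""
    total = sum(areas.values())
    involved = sum(area for pair, area in areas.items() if subunit in pair)
    return involved / total


for name in ("PB1", "PA", "PB2"):
    print(f"{name}: {100.0 * share_involving(name, interfaces):.1f}% of buried area")
```

On these numbers, interfaces involving PB1 account for the large majority of the buried area, in line with its description as the central scaffold.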

Figure 1: Overall structure of the bat influenza A polymerase complex with the vRNA promoter.

a, b, Two ribbon views colour-coded according to the domain structure in d, except that PA-C, PB1 and PB2-N are uniformly green, cyan and red, respectively. The vRNA 5′ and 3′ ends are pink and yellow tubes, respectively. c, Side-view in space-filling representation showing emergence of vRNA duplex at the interface of all three subunits. d, Subunit domain structure with subdomain names and colour scheme and showing the location of the conserved polymerase motifs in PB1.

PB1 subunit

Apart from the 15 N-terminal and 80 C-terminal residues, which form tight inter-subunit contacts with PA-C (refs 10, 11) and PB2-N (ref. 12), respectively, the detailed structure of the PB1 subunit has until now been completely unknown. However, sequence analysis revealed the presence of motifs pre-A (also known as F) and A–E characteristic of RNA-dependent RNA polymerases22,23,24 and correspondingly PB1 contains in its central region (residues 21–669) a typical right-handed RdRp fold, comprising fingers, fingertips, palm and thumb domains (Fig. 2a, b). A three-dimensional similarity search shows that hepatitis C virus (HCV) polymerase is structurally most like the polymerase region of PB1 (Fig. 2c), but many other RNA virus polymerases are also similar. Structural analysis has shown that Flaviviridae polymerases (for example, HCV, Dengue virus, West Nile virus)25,26,27 as well as bacteriophage Φ6 (ref. 28) contain a ‘priming loop’ to promote initiation of unprimed RNA synthesis29. In PB1, residues 641–657 form a conserved anti-parallel β-loop (Fig. 2b) structurally analogous to the HCV priming loop (Fig. 2d), which could be involved in unprimed genome or anti-genome replication by influenza polymerase.

Figure 2: PB1 structure and comparison with other RNA virus polymerases.

a, Ribbon diagram of PB1, coloured as in Fig. 1d, highlighting idiosyncratic elements including the PB1-Cter extension (wheat), the β-ribbon (orange, with NLS1 and NLS2 motifs shown) and the β-hairpin (grey). b, As in a but rotated roughly 90° to show the internal cavity occupied by the putative priming loop (residues 640–657, magenta) and the PB1-Nter extension (yellow-orange). c, Same view as in b of Norwalk virus (PDB code 3BSO; top) and HCV (PDB code 2XI3; bottom) polymerases after superposition with PB1, and coloured equivalently. Norwalk and HCV polymerases both have two fingertip loops (blue) but only HCV has a priming loop.

There are several idiosyncratic features of PB1. First, there are the N- and C-terminal extensions (N-ext and C-ext; Fig. 1d) that make inter-subunit contacts with PA and PB2, respectively. Second, there is an unusually long (55 Å), solvent-exposed, flexibly hinged β-ribbon (strands β6 and β7, residues 177–212) (Fig. 2a, b). Interestingly, this element contains the PB1-NLS motifs, two separated basic patches (NLS1, 187-Lys/Arg-Lys-Lys/Arg-Arg-190 (bat/human) on β6; NLS2, 207-Lys-Lys-Arg/Lys-Val/Gln-Lys/Arg-211 on β7; Fig. 2a) that have been shown to be important for binding RanBP5, the PA–PB1 heterodimer nuclear import factor30. A third special feature of PB1 is a β-hairpin insertion (strands β12 and β13, residues 352–360; Fig. 2a) in the finger domain, which, notably, is inserted through an extended loop in PA (the ‘PA-arch’; Fig. 3a). Both structures form an integral part of the 5′ vRNA-binding site (see below). The C-terminal extension of PB1 after the putative priming loop is involved in direct 3′-template binding (residues 671–676, see below).

Figure 3: PA and PB2 structure and the PA-linker–PB1 interface.

a, The PA subunit in rainbow colouring from N-ter (dark blue) to C-ter (red). The PA-linker, PA-arch and 550-loop, which contains a putative host-specific residue, are highlighted. b, Ribbon diagram of the PB2 subunit with sub-domains coloured as in Fig. 1d. c, As in b but rotated roughly 90° and showing only the arc of the PB2-C domain.

PA and PB2 subunits

The two structurally known domains of PA, the PA-Nter endonuclease domain (residues 1–195) and the large PA-C domain (258–714), are on opposite sides of the molecule, connected by the previously uncharacterized PA-linker (196–257) (Figs 1b and 3a), which wraps around the external face of the PB1 fingers and palm domain. In particular, residues 201–257, which include three helical segments (α7–α9), lie across the surface of PB1 making numerous, often conserved, inter-subunit contacts that are both hydrophobic and polar in nature (Extended Data Fig. 4a). The endonuclease domain is anchored to the rest of the polymerase through contacts with the same helical region of PB1-Cter that interacts with PB2-Nter, so that all three subunits are involved in positioning the endonuclease (Fig. 1a, b). The main contacts are via the packing of endonuclease helix α4 against both the penultimate PB1 helix α21 and the PB2 ‘170-loop’ (169–174), and via the endonuclease insertion (67–74) with the last PB1 helix α22 (Extended Data Fig. 4b). The endonuclease active site is solvent-exposed and facing the cap-binding domain (Fig. 1a, b), as discussed elsewhere in relation to the mechanism of cap-snatching18.

The PB2 subunit is divided into the N-terminal third (PB2-N, residues 1–247) and the C-terminal two-thirds (PB2-C, residues 248–760), each formed by several folded subdomains (Figs 1d and 3b, c). PB2-N comprises a series of linked modules that wrap around one edge and face of PB1, interacting mainly with the PB1 C-terminal extension and the polymerase thumb domain, opposite to where the PA linker binds (Figs 1 and 3b). After the well-characterized helical bundle interface with PB1-Cter, residues 35–54 of PB2-Nter are in an extended conformation followed by helix α4 that interacts with the template as it enters the polymerase active site (see below). Residues 55–103 (β1, α5, β2, β3 and α6) form a more compact subdomain (PB2-N1) that buttresses the PB1 thumb domain (for example, PB2 helix α6 packs parallel against PB1 helix α17). Another linker leads to the PB2-N2 subdomain (residues 110–247), which has an extended shape (Fig. 3b). At one end a helical bundle (α9–α11, residues 160–212) is inserted, denoted the PB2 helical lid. This includes the 170-loop (around 169–174), which contacts the endonuclease (Extended Data Fig. 4b), and the projecting helix α10, the N terminus (residue Asp 180) of which closely approaches the cap-binding domain. At the other extremity of the N2 domain are two anti-parallel β-ribbons (β4–β7 and β5–β6) with a helix inserted between them (α12–α13). These make hydrophobic contacts with PA-Cter and with the thumb and palm domains of PB1.

PB2-C (residues 248–736) forms a single, arc-shaped unit (Fig. 3c), divided into five sub-domains, which constitutes one arm of the polymerase U-shape (Fig. 1). At one end of the arc is the cap-binding domain (319–481), and, at the other end, is the NLS domain (685–760), which is disordered beyond the NLS1 motif (736-Lys-Arg-Lys-Arg)15. The NLS domain is juxtaposed to the 627-domain (539–675) as observed in crystal structures of the isolated double 627-NLS domain13,31. The loop carrying the host-specific residue 627, normally lysine in human and glutamate in avian strains but serine in bat, is in a solvent-exposed position remote from the PB1 active site. A possible role of the 627-domain is discussed elsewhere18 (see also Supplementary Information). The central part of the PB2-C arc is composed of two disconnected but interacting sub-domains: the PB2 mid-domain (248–319) that directly precedes the cap-binding domain, and the cap-627 linker (483–538). The mid-domain is a four helix bundle with one of the inter-helical linkers containing a short β-strand (β8) that makes a stabilizing two-stranded parallel sheet with the cap-627 linker (β24) (Fig. 3b). The bat cap-binding domain is very similar to that of human or avian FluA32, but Phe 357 forms one side of the methylated base sandwich rather than a histidine (Supplementary Fig. 1). The cap-627 linker proceeds from the C terminus of the cap-binding domain into a small three-stranded β-sheet (495–515, β21–β23) that packs on the last helix (α17) of the PB2 mid-domain. This sheet has a distinctly concave, solvent-facing surface that could be involved in protein–protein interactions. The mid, cap and cap-627 linker domains do not make extensive interfaces with other polymerase subunits.

PB1 functional regions

The catalytic centre responsible for template-directed nucleotide addition is located in the PB1 internal cavity and formed mainly by the highly conserved RdRp motifs pre-A/F and A–E. Comparison with known polymerase structures allows modelling of the template, substrate RNA and incoming NTPs into the PB1 active site, and deduction of the roles of certain key conserved residues (Fig. 4 and Extended Data Fig. 5). Motif pre-A/F is partly contained in the fingertips, a loop (residues 222–246) that extends from the fingers towards the thumb domain and the tip of which is stabilized by contacts with PA helix α20 (Fig. 2b and Extended Data Fig. 5a). Whereas HCV and Norwalk virus polymerases have two fingertip loops (one corresponding to motif F and the other closer to the polymerase N terminus) (Fig. 2c, d), influenza polymerase PB1-Nter is analogous to the second loop with residues 24–38 crossing from thumb to fingers in intimate association with the fingertips. Several conserved basic residues from motif pre-A/F are likely to be involved in template binding, and NTP channelling and binding33 (Fig. 4a). Motif A contains the conserved active site Asp 305, which, together with Asp 445 and Asp 446 on motif C, coordinates two divalent metal ions (Fig. 4a) and promotes catalysis33. These residues have been shown to be essential for PB1 activity23. Motif B has a characteristic methionine-rich loop in PB1 (406-GMMMGMF), and is probably involved in stabilizing the base pair between the incoming NTP and the template. Motif D contains conserved Lys 480 and Lys 481 residues (involved in NTP binding) and is stabilized by contacts with PA helix α20 (656–663) and the PA peptide 671–684. Motif E forms another β-hairpin containing conserved residues thought to stabilize the position of the substrate/priming NTP (Fig. 4a).

Figure 4: PB1 functional regions.

a, View into the PB1 catalytic site showing conserved polymerase motifs and key functional residues colour-coded according to: motif pre-A/F (residues 229–257, orange-yellow), motif A (296–314, lime), motif B (401–422, light blue), motif C (436–449, magenta), motif D (474–486, green-cyan) and motif E (487–497, orange). Template, substrate/priming nucleotide and incoming NTP (green) and two divalent cations (black spheres, coordinated by Asp 305, Asp 445 and Asp 446) are modelled after superposition with the Norwalk polymerase primer-template structure (PDB code 3BSO). Directions of NTP and template entrance tunnels and the template/product exit are indicated by arrows. Motif A contains the conserved active site Asp 305 as well as Asn 310 (probably to bind the 2′ OH of incoming NTP) and Lys 308 (NTP tunnel). Motif B has a characteristic methionine-rich loop (406-GMMMGMF) and probably stabilizes the base pair between the incoming NTP and the template. Motif C forms a β-hairpin containing Ser 444 (2′ OH of priming NTP) and active site aspartates Asp 445 and Asp 446. Motif D contains conserved Lys 480 and Lys 481 residues (NTP channel). Motif E forms another β-hairpin containing conserved Glu 491, Phe 492 and Ser 494; it probably stabilizes the position of the substrate/priming nucleotide. b, Context of the vRNA promoter relative to the PB1 polymerase domain. PB1 is coloured as in Fig. 1d.

As in other polymerases, a narrow tunnel, lined with positively charged residues, connects the internal cavity to the outside and this is presumed to attract and channel NTPs into the active site electrostatically (Extended Data Fig. 5a, b). In PB1, this putative NTP tunnel directly leads to the tip of the putative priming loop and involves highly conserved PB1 basic residues Arg 45, Lys 235, Lys 237 and Arg 239 (motif F3), Lys 308 (motif A), and Lys 480 and Lys 481 (motif D). A second tunnel constitutes the putative template entrance channel that is lined by conserved residues from all three subunits (Extended Data Fig. 5c, d).

Promoter binding

For initiation of RNA synthesis, the influenza polymerase needs to be bound to a promoter that comprises both conserved extremities of the pseudo-circularized vRNA or complementary RNA (cRNA)34,35. The pyrimidine-rich 3′ (template) and purine-rich 5′ (activator) extremities are partially complementary and can form a non-canonical double helix, usually referred to as the panhandle36. However, they are thought to bind the polymerase in a partially single-stranded conformation35, either as a ‘corkscrew’37,38 or a ‘fork’39,40, or as a combination of both41. These models concur on the presence of a distal base-paired region between nucleotides 11–14 of the 5′ and 10–13 of the 3′ ends, but differ in whether the individual proximal strands have internal structure or not. The polymerase–promoter crystal structure shows that the distal region is indeed base-paired, and that nucleotides 1–10 of the 5′ end form a compact stem–loop (hook) structure (Fig. 4b).

The hook structure, formed by nucleotides 1–10 of the 5′ vRNA (5′-pAGUAGUAACA), has two central canonical base pairs (G2–C9 and U3–A8) flanked by mismatch base pairs A1–A10 and A4–A7 (Fig. 5a). The stem is capped by G5, which is stacked antiparallel on A4 and U6 whose base faces outward. The sequence characteristics of the 5′ hook are conserved in all known influenza virus vRNAs and cRNAs, the only variations, reflecting the imperfect complementarity of the two extremities, being the nature of the 2–9 and 3–8 Watson–Crick base pairs (G–C and A–U in vRNA, and G–C and C–G in cRNA, respectively) and the loop nucleotides 5 (usually a G) and 6 (usually an A). This hook structure is also likely to be conserved in orthomyxoviruses of the Thogoto lineage, except that G4–A7 would replace the A4–A7 mismatch42.
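
The pairing pattern of the hook can be read directly off the sequence. The sketch below, in Python, classifies the four stem pairs named in the text as canonical (Watson–Crick) or mismatched; the pair list is taken from the text rather than predicted, and the simple classifier (which does not treat G–U wobble pairs specially) is an illustrative assumption.

```python
# Nucleotides 1-10 of the 5' vRNA end, written 5'->3' (from the text).
HOOK_5P = "AGUAGUAACA"

# Watson-Crick pairs only; G-U wobble is deliberately not treated as canonical here.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

# Stem pairs as described in the text (1-based positions), outermost first.
STEM_PAIRS = [(1, 10), (2, 9), (3, 8), (4, 7)]


def classify(seq, i, j):
    """Label the pair between 1-based positions i and j and say whether it is canonical."""
    a, b = seq[i - 1], seq[j - 1]
    kind = "canonical" if (a, b) in WATSON_CRICK else "mismatch"
    return f"{a}{i}-{b}{j}", kind


for i, j in STEM_PAIRS:
    label, kind = classify(HOOK_5P, i, j)
    print(f"{label}: {kind}")

# The two loop nucleotides that cap the stem.
loop = [f"{HOOK_5P[k - 1]}{k}" for k in (5, 6)]
print("loop:", ", ".join(loop))
```

Run on the crystallized sequence, this gives two central canonical pairs (G2–C9, U3–A8) flanked by the A1–A10 and A4–A7 mismatches, with G5 and U6 in the loop.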

Figure 5: Structure of the vRNA promoter and how it binds to the polymerase.

a, Stick representation of the vRNA promoter highlighting internal hydrogen bonds (green dotted lines) within the 5′ hook structure (pink) and the distal duplex region with the 3′ end (yellow). The non-canonical A1–A10 and A4–A7 pairs are both of the N6 amino (A1, A7)-N3 (A4, A10) type. b, Ribbon diagram of the 5′-hook binding site between the PA β-sheet and PA-arch (plum) and the inserted PB1 β-hairpin (grey). PA is otherwise green and PB1 cyan. The 3′–5′ duplex region contacts the PB1 β-ribbon (orange) notably via residues Lys 188, Thr 201 and Arg 203. c, Detail of the interactions at the 3′ (yellow) and 5′ (pink) strand junction showing the role of conserved PA residues Met 472, Arg 503 and His 505 in splaying apart the duplex. His 505 stacks on base A11 of the 5′ strand, and contacts the O6 of unpaired G9 of the 3′ strand, which in turn stacks on PA Met 472. PA Arg 503 and the phosphate-binding loop (367–370) within the PA-arch (plum) interact with the phosphate of 5′ A11. PB1 β-hairpin residue Arg 365 (cyan) makes hydrogen bonds to the phosphates of 5′ nucleotides C9, A10 and G12 as well as to the N7 of A10, and Glu 358 (cyan) contacts the N6 of A10. d, Protein interactions of the 5′ hook involving highly conserved PB1 N-terminal residues His 32 and Tyr 38, and PA basic residues Lys 281, Arg 279 and Arg 561. Pro 392 and Pro 393 from the PA-arch (plum) stack with 5′ nucleotides U6 and G5, respectively; only the second proline is universally conserved in all influenza strains.

The 5′ hook is sandwiched in a pocket formed on one side by strands β17–β18 and β20 of the main β-sheet of PA, and on the other by the PA-arch (366–397) and the PB1 β-hairpin (353–370) that inserts through the arch (Fig. 5b). The buried surface area of the 5′ end totals 4,044 Å2 (60% with PA, 40% with PB1). Numerous polar interactions to the backbone (Extended Data Table 2) sense the shape of the stem–loop, including contacts to all phosphates (except 6–7) as well as to several ribose 2′ OHs. Base contacts are made to invariant 5′ residues G2, A7, A10 and A11 as well as to G5 and U6. Key interacting and highly conserved residues from PA are His 326, the peptide 366–370, 388-Tyr-Lys, 503-Arg-Leu-His, Lys 534, Arg 561 and Lys 569. From PB1 they include His 32, Thr 34 and Tyr 38 (conserved in all influenza strains) and 356-Met-Phe-Glu (Fig. 5c, d and Extended Data Fig. 6). An especially dense series of interactions binds and stabilizes the sharp turn between 5′ A10–A11 (Fig. 5c). The PA-arch motif 366-Gly-Glu-Gly-Gln-Ala-370 forms a phosphate-binding loop, which interacts tightly with the backbone of A10–A11. His 505 (His 510 in human/avian strains) stacks on base A11 and hydrogen bonds to unpaired G9 of the 3′ strand, which in turn stacks on PA Met 472. This histidine has previously been shown to be a crucial residue in regulating transcription43. PA Arg 503 and PB1 Arg 365 make multivalent interactions with the RNA backbone (Fig. 5c). Conserved PB1 residues His 32 and Tyr 38 contact the phosphates of G5 and U6 and the double prolines 392-Pro-Pro in the PA-arch stack on the bases of these same nucleotides (Fig. 5d).

There are five base pairs in the duplex region of the promoter, 3′ 10-UCUCC-14 with 5′ 11-AGAGG-15, which projects away from the polymerase (Fig. 1c). The self-complementary four-nucleotide overhang 15-AUAU-18 of the crystallized 3′ end base-pairs with a crystal symmetry-related equivalent, thus forming a pseudo-continuous double-stranded RNA of 14 base pairs between two two-fold-related polymerases (Extended Data Fig. 7). The duplex region of the promoter is contacted by the central section of the long PB1 β-ribbon and by residues 672–676 of PB1-Cter (Extended Data Fig. 6). The PA peptide 503-Arg-Leu-His, reinforced by 466–475, forms a wedge that separates the 5′ and 3′ strands into binding pockets (Extended Data Fig. 6). Only the proximal single-stranded 3′ nucleotides 6-UUCG-9 are visible in the structure, and these are directed towards the polymerase template entry tunnel before turning away towards the solvent. There is a sharp turn between unpaired 3′ end nucleotides G9 and C8 (Extended Data Fig. 6). Residues, very highly conserved in all influenza strains, from all three subunits (PA 505–509 and Lys 567, PB1-Cter 671–676 and PB2 36–49) are involved in binding the 3′ nucleotides 6-UUCG-9 (Extended Data Fig. 6). At the apex of the sharp turn, the phosphate of 3′ C8 is bound by PA Lys 567 and PB2 Arg 46, the latter being positioned by salt bridges with PA Asp 509 and PB2 Glu 40. PA Arg 507 and PB1 C-terminal extension residues Asn 671, Arg 672 and Ser 673 interact extensively with the backbone of 3′ U7 and U10.
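
The base-pairing arithmetic in this paragraph can be verified from the two crystallized oligonucleotides. The following Python sketch assumes the numbering conventions used in the text (5′ strand numbered from its 5′ end, 3′ strand numbered from its 3′ end); the helper names `segment` and `pairs_with` are hypothetical.

```python
# Crystallized oligonucleotides, from the text.
FIVE_PRIME = "AGUAGUAACAAGAGGG"     # nt 1-16, written 5'->3'
THREE_PRIME = "UCGUCUUCGUCUCCAUAU"  # nt 1-18, written 3'->5' (numbered from the 3' end)

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def segment(seq, start, end):
    """Return nucleotides start..end (1-based, inclusive)."""
    return seq[start - 1:end]


def pairs_with(a, b):
    """True if a and b form Watson-Crick pairs position by position.

    One strand must be written 3'->5' and the other 5'->3' (antiparallel).
    """
    return len(a) == len(b) and all(COMPLEMENT[x] == y for x, y in zip(a, b))


duplex_3p = segment(THREE_PRIME, 10, 14)  # 3' strand, written 3'->5'
duplex_5p = segment(FIVE_PRIME, 11, 15)   # 5' strand, written 5'->3'
overhang = segment(THREE_PRIME, 15, 18)   # 3'-strand overhang, written 3'->5'
visible_ss = segment(THREE_PRIME, 6, 9)   # ordered single-stranded 3' nucleotides

print(duplex_3p, duplex_5p, pairs_with(duplex_3p, duplex_5p))

# A symmetry-related copy of the overhang runs antiparallel, so written 5'->3'
# it reads as the reversed string; self-complementarity means the two pair.
print(overhang, pairs_with(overhang, overhang[::-1]))

# Two promoter duplexes joined through the shared overhang duplex.
total_bp = 2 * len(duplex_3p) + len(overhang)
print("pseudo-continuous duplex length:", total_bp)
```

The five promoter base pairs from each of the two symmetry-related complexes plus the four overhang pairs give the 14 base pairs stated in the text.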

Conclusions

The structure of influenza polymerase, the first from any negative-strand RNA virus, reveals the enormous complexity of the molecule and highlights the fact that all three subunits are intricately involved in many of the most important functional regions. This undoubtedly explains why 40 years of polymerase biochemistry has often led to confusing and contradictory results. For instance, numerous studies have tried to identify the vRNA 3′- and 5′-end binding sites by crosslinking and/or mutagenesis44,45,46,47 but have failed to reveal the critical residues (see Supplementary Information). Conversely, the vRNA promoter structure itself is essentially as predicted41, although the A–A mismatches in the 5′-end hook were not foreseen. Indeed, the hook, tightly bound in a pocket formed by PA and PB1, is an integral part of the polymerase structure and this binding is required to enhance or activate polymerase functions48,49,50 (Extended Data Fig. 2). Without an apo-structure, this cannot be fully rationalised yet, but it is likely that without the stabilization promoted by 5′-end binding the nearby polymerase active site will be disorganized. Whereas, in the bat polymerase structure, the 3′ end of the template is not completely visible, in the FluB polymerase structure the complete 3′ strand is well ordered18. However, rather than being directed into the PB1 active site, the vRNA 3′ end seems to have an alternative, but specific, binding site lying on the surface of the polymerase in the vicinity of the long PB1 β-ribbon. This is discussed further in the accompanying paper, along with other insights into polymerase function derived from the structure18.

There is considerable interest in understanding the exact role of polymerase residues that have been implicated in host adaptation, notably between avian and human influenza A strains14. Such mutations, identified by analysis of natural sequences or serial adaptation of viruses to mice, typically have a neutral effect in avian cells but enhance polymerase activity in mammalian cells. Because the positions of implicated residues can henceforth be mapped onto the full polymerase structure, an initial distinction can now be made between those residues that are more likely, because of their internal location, to affect the intrinsic rate of polymerase functions (which could be important for species-dependent physiological reasons), and others, which, because of their surface location, possibly act through direct interaction with other viral or cellular factors. Some initial observations are made in the Supplementary Information, but further structural studies of the polymerase in different functional conformations and eventually with bound host factors are required to determine the exact role of these putative host-specific residues.

Finally, the unexpectedly good resolution of this crystal structure gives hope that structure-based drug design targeting the PB1 active site, vRNA binding or numerous potential allosteric sites, will soon become possible.

Methods

Construct

The influenza A/little yellow-shouldered bat/Guatemala/060/2010(H17N10) polymerase heterotrimer was expressed as a self-cleaving polyprotein (Extended Data Fig. 1a). A codon-optimized synthetic construct (DNA2.0) with the composition GNH[BstEII] GSGSENLYFQ[TEV] GSHHHHHHHH[8×His-tag] GSGS-PA (GenBank ID AFC35437.1) GSGSGENLYFQ[TEV] GSGSGSGSG-PB1 (GenBank ID AFC35436) GSGSGENLYFQ[TEV] GSGSGSGSG-PB2 (GenBank ID AFC35435.1) GWSHPQFEK[Strep-tag] GGGSGGGSGGSAWSHPQFEK[Strep-tag] GRSG[RsrII] was cloned via BstEII and RsrII sites into the vector pKL-PBac51, which also contains coding sequences for tobacco etch virus (TEV) protease (5′) and cyan fluorescent protein (CFP) (3′). (Labels in square brackets follow the restriction site, TEV-site, His-tag or Strep-tag residues that they annotate.)

Expression and purification

The bat FluA polymerase was produced in HighFive insect cells using the baculovirus expression system. Cells were collected by centrifugation, re-suspended in buffer A (50 mM Tris-HCl, 500 mM NaCl, 10% (v/v) glycerol and 5 mM β-mercaptoethanol, pH 8) supplemented with protease inhibitors (Roche, complete mini, EDTA-free), and lysed by sonication. Cell debris was spun off (30 min, 4 °C, 35,000g) and ammonium sulphate added to the clarified supernatant (0.5 g ml−1) to force the protein out of solution. The precipitated protein was collected by centrifugation (30 min, 4 °C, 70,000g) and re-suspended in buffer A. After a final centrifugation step (30 min, 4 °C, 70,000g) the polymerase was purified from the fraction of soluble proteins via immobilized metal ion affinity chromatography and a strep-tactin resin (IBA, Superflow), using buffer A as running buffer in both cases. Fractions containing the target protein were pooled and diluted with an equal volume of buffer B (50 mM HEPES/NaOH, 10% (v/v) glycerol and 2 mM TCEP, pH 7.5) before loading on a heparin column (HiPrep Heparin HP, GE Healthcare). Polymerase was eluted by a gradient of buffer B supplemented with 1 M NaCl, concentrated, and subjected to size-exclusion chromatography (S200, GE Healthcare) in buffer C (50 mM HEPES/NaOH, 500 mM NaCl, 5% (v/v) glycerol and 2 mM TCEP, pH 7.5). Monomeric and RNA-free polymerase was concentrated, flash-frozen and stored at −80 °C. The typical yield of pure heterotrimer is about 1 mg l−1 of insect cells.

Crystallization, data collection and structure solution

Polymerase protein in buffer C was adjusted to a concentration of 10 mg ml−1, mixed in a 1:1 ratio with vRNA, which was an equimolar mixture of nucleotides 1–16 from the 5′ end (5′-pAGUAGUAACAAGAGGG-3′) and nucleotides 1–18 or 3–18 from the 3′ end (3′OH-UCGUCUUCGUCUCCAUAU-5′OH) (IBA). Crystallization trials were performed by vapour diffusion at 4 °C using a Cartesian robot. The best crystals grew in mother liquor containing 0.7–1.5 M sodium/potassium phosphate at pH 5.0. For data collection, crystals were flash-frozen in well solution supplemented with 25% glycerol. Diffraction data were collected at 100 K with an X-ray wavelength of 0.9763 Å on beamline ID23-1 of the European Synchrotron Radiation Facility equipped with a Pilatus 6M-F detector and integrated and scaled with XDS52. Initial phases were obtained by molecular replacement with the structure of the influenza B polymerase18. The model was improved by making use of the five known high-resolution structures of FluA polymerase fragments (endonuclease53, PA-Cter-PB1-Nter (PDB codes 2ZN1 and 3CM8), PB1-Cter/PB2-Nter (PDB code 3A1G), PB2-cap and 627-NLS domains (PDB code 2VY6)). Refinement was performed with Refmac54. A putative zinc ion is found bound between PB1 His 562 and PA Asp 421. Figures were drawn with Pymol55. The vRNA and most protein regions have very good electron density apart from a few connecting peptides and the PA endonuclease domain, which has poor density except where it contacts the rest of the polymerase. Ramachandran statistics, as calculated by Molprobity56, are 94.2% (favoured), 0.7% (disallowed).

Polymerase activity assays

A T7-transcribed 39-nucleotide mini-panhandle or an equimolar mixture of separated synthetic 3′ and 5′ ends was used as vRNA (Extended Data Fig. 2a, b).

For the ApG-primed replication assay, 0.5 μM protein, 0.5 μM vRNA, 0.5 mM ApG, 0.4 mM GTP/CTP, 1 mM ATP, 0.04 mM UTP, 32P-UTP and 0.8 U μl−1 Ribolock, in buffer (150 mM NaCl, 50 mM HEPES, pH 7.5, 5 mM MgCl2 and 2 mM TCEP) were mixed and incubated at 30 °C for 2 h.

For the cap-dependent transcription assay, 0.5 μM protein, 0.5 μM vRNA, 0.4 mM GTP/CTP/UTP, 1 mM ATP and 32P-labelled capped RNA in the same buffer (150 mM NaCl, 50 mM HEPES, pH 7.5, 5 mM MgCl2 and 2 mM TCEP) were mixed and incubated at 30 °C for 2 h. For this purpose, a 5′ diphosphate synthetic 20-base RNA, 5′-ppAAUCUAUAAUAGCAUUAUCC-3′ (Chemgenes), was capped by incubating with vaccinia virus capping enzyme (purified in house following ref. 57) and 20 µM SAM, 32P-GTP, 50 mM Tris, pH 8.0, 6 mM KCl, 1.25 mM MgCl2 and 0.8 U μl−1 Ribolock.

For the endonuclease assay, transcription mix without any NTPs was incubated at 30 °C for 2 h. Samples were separated on 7 M urea, 20% acrylamide gel in TBE buffer, exposed on a storage phosphor screen and read with a Typhoon scanner.

For the time course of unprimed and ApG-primed vRNA replication, 0.5 μM bat FluA polymerase was mixed with 1 μM 39-nucleotide vRNA mini-panhandle template, NTPs (1 mM ATP, 0.4 mM GTP, 0.4 mM CTP and 0.04 mM UTP) and 0.12 μCi μl−1 32P-UTP, in the absence or presence of 0.5 mM ApG. Reactions were incubated at 30 °C and samples were analysed on a 20% acrylamide, 7 M urea denaturing gel after 0, 2, 5, 10, 15, 20, 30, 40 and 50 min, and after 1, 2 and 3 h.