Main

The influenza virus genome comprises eight segments of single-stranded RNA (viral RNA, or vRNA), each packaged in separate ribonucleoprotein particles (RNPs). Both conserved 3′ and 5′ ends of the vRNA (the promoter) are bound to the RNA-dependent RNA polymerase, and the rest of the pseudo-circularised vRNA is coated with nucleoprotein. The polymerase is a heterotrimer composed of subunits PA, PB1 and PB2 and, in the context of the RNP, it performs the distinct processes of transcription and replication using the same template vRNA (reviewed in refs 1 and 2). Transcription of viral mRNA occurs through a unique process called cap-snatching, in which short capped oligomers, derived from host pre-mRNA, are bound by the PB2 subunit3,4,5, cleaved by an endonuclease in the PA subunit6,7 and then used to prime mRNA synthesis by the PB1 subunit. Stuttering of the polymerase at an oligo-U stretch near the vRNA 5′ end leads to auto-polyadenylation8. Thus, translation-competent viral mRNAs are generated without the need for a viral-encoded capping enzyme or the host polyadenylation machinery, which is shut down by viral-encoded NS1 protein9. By contrast, replication involves unprimed synthesis of an exact, full-length copy of the vRNA into complementary RNA (cRNA) and subsequently the inverse process back to progeny vRNA. Nascent replicates are immediately packaged with nucleoprotein into new viral RNPs (vRNPs) or complementary RNPs (cRNPs). In contrast, viral mRNA is not so packaged but is treated as host pre-mRNA10 and further spliced (in the case of NS and M segments) and/or exported to the cytoplasm by host cell machineries. Interestingly, cRNPs do not perform transcription in infected cells and may require a second polymerase to replicate11,12. Despite many years of study, the mechanism by which RNPs are able to perform these different functions and what determines the type of RNA synthesis that occurs are still obscure.
Here we infer, using complementary information from atomic resolution structures of influenza A and B polymerases in complex with the vRNA promoter together with known structures of other viral RNA polymerases, the mechanism by which the polymerase can perform either cap-dependent transcription or unprimed (de novo) RNA synthesis. The structures thus open the way to a detailed description of how the influenza transcription/replication machine works in a context-dependent manner.

Structure of FluB polymerase compared with FluA

Full-length heterotrimeric influenza polymerase from B/Memphis/13/03 (FluB) was obtained by expression in insect cells as a self-cleaving polyprotein13 (Extended Data Fig. 1). Recombinant FluB polymerase was active in the absence of nucleoprotein in cap-dependent transcription and both ApG-primed and, less efficiently, unprimed replication assays using short model vRNAs (Extended Data Fig. 2). Two different crystal forms of FluB polymerase were obtained with consensus promoter sequences for influenza B14 (Extended Data Table 1). Both contain nucleotides 1–14 from the vRNA 5′ end (5′-pAGUAGUAACAAGAG-3′OH) and either nucleotides 5–18 (FluB1 form) or nucleotides 1–18 (FluB2 form) from the 3′ end (3′OH-UCGUCUUCGUCUCCAUAU-5′OH). The FluB1 form yielded a fully interpretable experimental map (Extended Data Fig. 3a–c) at 3.4 Å resolution, allowing an almost complete model of FluB polymerase to be built (Fig. 1a). The 2.7-Å resolution FluB2 structure, solved by molecular replacement using the FluB1 structure, is extremely well ordered (Extended Data Fig. 3d, e). Owing to crystal contacts, it has the best defined endonuclease domain, which, however, is in the same position as in all other structures. By contrast, the C-terminal two-thirds of PB2 (PB2-C) completely lacks electron density in the FluB2 form (Fig. 1b), although intact PB2 is present in the crystal.

Figure 1: Structure of influenza B polymerase.

a, Surface view of FluB1 structure colour-coded according to domain structure (Extended Data Fig. 4) except that PA-C, PB1 and PB2-N are uniformly green, cyan and red, respectively. The vRNA 5′ and 3′ ends are pink and yellow, respectively. b, Surface view of the polymerase in the FluB2 crystal form that lacks the entire PB2-C domain but includes the full-length 3′ end of the vRNA (black arrow). c, Bat FluA PB2-C colour-coded according to domain structure (Extended Data Fig. 4). d, The complete PB2 subunit as in the FluB1 crystal form in the same orientation as in c, highlighting the 70° difference in orientation of the cap-binding domain.

Sequence alignments with bat FluA, the structure of which is described elsewhere15, show that influenza B/Memphis/13/03 has 36.0 (48.6), 59.5 (71.0) and 37.0 (50.9) per cent amino acid identity (similarity) for PA, PB1 and PB2, respectively (with higher than average conservation in the functionally important regions) (Supplementary Fig. 1). The FluA and FluB polymerase structures and their mode of binding to the vRNA promoter are remarkably similar (Extended Data Fig. 4). However, a striking difference of 70° in the orientation of the PB2 cap-binding domain (Fig. 1c, d) suggests that this domain can rotate in situ. Concerning vRNA binding, all FluA and FluB structures exhibit identical conformations of 5′ nucleotides 1–14 and 3′ nucleotides 5–13 as described elsewhere15, and most protein–RNA contacts for these regions are conserved between FluA and FluB (Extended Data Table 2). The higher resolution FluB2 structure shows that the protein–RNA interface is highly hydrated with numerous water-mediated protein–RNA interactions. In both the FluB1 and FluB2 structures, the 3′–5′ duplex region of the promoter comprises four base pairs (3′ 10-UCUC-13 with 5′ 11-AGAG-14). In the FluB1 structure, the 5-nucleotide 3′ overhang (14-CAUAU-18) forms a triple-stranded structure at a two-fold crystal contact, including two base triples with the symmetry-related duplex (Extended Data Fig. 5), whereas in the FluB2 structure the RNA does not participate in crystal contacts.
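The promoter numbering used above can be checked directly against the two consensus strands quoted earlier. The following is a minimal sketch in Python; the helper names `segment` and `pairs` are hypothetical and introduced only for illustration. It assumes that, because the 3′ strand is numbered from its 3′ end and the 5′ strand from its 5′ end, nucleotide numbers increase in the same direction along the antiparallel duplex, so that 3′ nucleotide 10 faces 5′ nucleotide 11, and so on.

```python
# Consensus FluB promoter strands as quoted in the text.
FIVE_PRIME = "AGUAGUAACAAGAG"        # written 5'->3', nucleotides 1-14
THREE_PRIME = "UCGUCUUCGUCUCCAUAU"   # written 3'->5', nucleotides 1-18

# Canonical Watson-Crick pairs only (an assumption; wobble pairs are ignored).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def segment(strand, first, last):
    """Return nucleotides first..last (1-based, inclusive), counted from the named end."""
    return strand[first - 1:last]


def pairs(seg_3p, seg_5p):
    """Pair the i-th nucleotide of each segment and flag Watson-Crick matches."""
    return [(a, b, (a, b) in WATSON_CRICK) for a, b in zip(seg_3p, seg_5p)]


single_stranded = segment(THREE_PRIME, 1, 9)    # 3' extremity
duplex_3p = segment(THREE_PRIME, 10, 13)        # 3' side of the duplex
duplex_5p = segment(FIVE_PRIME, 11, 14)         # 5' side of the duplex
overhang = segment(THREE_PRIME, 14, 18)         # 3' strand beyond the duplex
duplex_pairs = pairs(duplex_3p, duplex_5p)

for a, b, ok in duplex_pairs:
    print(f"3'-{a} : 5'-{b}  {'paired' if ok else 'mismatch'}")
```

Under these assumptions the slices reproduce the segments named in the text (UCUC, AGAG and CAUAU), and all four duplex positions are canonical pairs.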

Promoter 3′ end binding

Only in the FluB2 crystal form is the complete 3′ end of the promoter structurally ordered (including nucleotides 1–5, not visible in other structures). The single-stranded 3′ extremity, 1-UCGUCUUCG-9, perhaps unexpectedly, does not enter the polymerase active site but is bound in an alternative location on the surface of the polymerase in an arc conformation, such that U1 is not far from the distal 3′–5′ duplex region (Fig. 2a). Bases 1–4 stack on each other but other bases are bound in individual pockets. Most bases are orientated towards the protein and all except U1 make base-specific RNA–RNA or RNA–protein interactions (Fig. 2a and Extended Data Table 2). All three subunits are involved in binding 3′ nucleotides 6–9, whereas nucleotides 1–5 only interact with PB1. Residues 670–679 of PB1 are involved in binding both extremities of the 3′ end whereas the PB1 β-ribbon interacts with 3′ nucleotides 1–3 (Fig. 2a). The sequence-specific nature of the 3′-end binding and the conservation of interacting residues strongly suggest that this binding site is functionally important. This implies that there must be a mechanism for relocating the 3′ end into the PB1 active site during initiation of RNA synthesis. The observed 3′-end conformation is inconsistent with a hook conformation16, but overall the promoter structure is consistent with that proposed in ref. 17, which suggests that the sequence constraints imposed on the 3′ end by the need for almost exact complementarity to the 5′ hook make it appear that the 3′ end also adopts a hook conformation.

Figure 2: Promoter 3′-end binding and PB1 β-ribbon flexibility.

a, Diagram showing RNA–RNA and RNA–protein interactions of the complete 3′ end (nucleotides 1–13, yellow sticks) of the promoter as in the FluB2 structure. For clarity, not all interactions (nor water-mediated interactions) are depicted. All three subunits, PA (green ribbons and residues), PB1 (cyan) and PB2 (red), are involved. Nucleotides 1–9 are single-stranded, and 10–13 form a duplex with the 5′ end (not shown). The PB1 β-ribbon interacts with the proximal part of the 3′ end and PB1-Cter interacts with both proximal and distal nucleotides. Specific RNA–RNA interactions include N2 to OP2 of G9, N4 of C5 to OP2 of U6, O2′ of U1 to OP2 of C2. b, Superposition of PB1 from the FluB2 (cyan) and FluB1 (light cyan) structures, showing flexibility of the long β-ribbon. The 3′ (nucleotides 1–13, yellow) and 5′ (nucleotides 1–14, pink) ends of the promoter are as in the FluB2 structure. The 3′ deviates from the path into the PB1 active site that is depicted by the template strand (orange) from the superposed Norwalk template–primer elongation complex (PDB code 3BSO).

In all structures, the unusually long PB1 β-ribbon (residues 177–214) has a role in interacting with the vRNA on the exterior of the polymerase. In the FluB1 structure, the β-ribbon is straight and projects away from the polymerase, its tip (residues 195–196) interacting with crystal symmetry-related RNA (Fig. 2b and Extended Data Fig. 5). In the bat FluA structure, the ribbon is bent towards the polymerase and its central part contacts the duplex region of the promoter, whereas its extremity is disordered (not shown). In the FluB2 structure, the β-ribbon is the most bent and residues 184–186 and influenza-conserved Arg 203 interact with the proximal 3′ end (Fig. 2a, b). These observations show that the PB1 β-ribbon has an affinity for RNA and is flexible. It could therefore have a dynamic role in translocating the RNA into the polymerase from the RNP and/or could mediate interactions with proximal nucleoprotein molecules of the RNP. This hypothesis is supported by fitting of the polymerase–promoter structure to the available electron microscopy map of the mini-RNPs, which predicts the close proximity of the ribbon to nucleoprotein (Extended Data Fig. 6).

Mechanism of replication

Influenza virus polymerase catalyses primer-independent (de novo) replication to generate cRNA from vRNA and vice versa. It has been proposed that efficient replication requires nucleoprotein18 and/or polymerase oligomers11,12,19. RNA polymerases that perform de novo synthesis generally possess a special ‘priming’ loop that is thought to stabilize the priming and incoming NTPs in the absence of a priming oligonucleotide. This phenomenon was first structurally characterized for bacteriophage Φ6 polymerase, in which a tyrosine at the extremity of the priming loop stacks on the priming nucleotide20. Flavivirus polymerases such as those of hepatitis C virus (HCV) or dengue virus (DENV) also have an aromatic residue (a tyrosine in HCV and a tryptophan for DENV) as putative priming platforms21. For PB1, a β-hairpin loop (residues 641–657), structurally analogous to that of HCV, is observed in an ordered configuration in the FluA structure15 but is disordered in the FluB polymerase structures. The loop tip contains the 648-Ala-His-Gly-Pro motif, conserved in all influenza polymerases. Modelling, on the basis of the Φ6 initiation complex structure, shows that the loop could potentially act as a priming platform to promote correct initiation, with His 649 plausibly interacting with the initial incoming nucleotides (Fig. 3a, b). More details of the active site configuration, which largely involves the canonical polymerase motifs, are given for the FluA structure15. A model of the elongation step of influenza polymerase can be obtained by superposing the primer–template complex of poliovirus polymerase22 on PB1, the high conservation of the polymerase active site ensuring an unambiguous superposition (Fig. 3c, d). The putative priming loop would need to be displaced once elongation starts because it would sterically clash with an emerging template–product duplex (Fig. 3c).

Figure 3: Model for replication initiation and elongation by influenza polymerase.

a, FluA PB1 with bound 3′ (nucleotides 5–18, yellow) and 5′ (nucleotides 1–14, pink) vRNA superposed with the Φ6 initiation complex structure (PDB code 1HI0) with template (orange) and two initial incoming NTPs with magnesium (green sticks and black spheres). The PB1 putative priming loop is magenta with the palm (red) and fingers (cyan). The thumb is omitted for clarity. b, As in a but showing only the PB1 putative priming loop. The influenza conserved 648-Ala-His-Gly-Pro motif at the loop tip could stabilize the initiation complex (the electron density for the His-Gly residues is poor). 3′-end nucleotides 5 and 6 that deviate from the canonical template pathway (orange) are sand coloured. c, As in a but with the primer (green) and template (orange) RNA from the poliovirus polymerase elongation complex (PDB code 3OL7) after superposition of the polymerase domains. The PB1 putative priming loop clashes with the duplex RNA and therefore must be displaced. d, As in c but excluding the protein. The influenza vRNA 3′ nucleotide 8/9 is at the template entrance but corresponds to the 3′ nucleotide 5/6 template in the polio virus polymerase complex. e, As in c but end-on view and with PB1 uniformly coloured cyan. The template–product duplex can be accommodated in the PB1 cavity although the thumb domain is expected to open. f, As in e but including the PB2-N domain with subdomains coloured as in Extended Data Fig. 4. The product/primer strand (green) can potentially exit/enter (see also Fig. 5) but the template strand is blocked by the PB2 helical lid domain (red).

These results lead to the following two observations. As highlighted above, in our structures, the vRNA 3′ extremity does not enter the PB1 active site. However, comparison for instance with the polio template–product complex shows that vRNA 3′ nucleotide 8/9 is at the template tunnel entrance but corresponds to 3′ nucleotide 5/6 in the polio polymerase complex (Fig. 3d). Thus, the 3′ end, on reorientation into the PB1 active site, would have to draw back three nucleotides to initiate at the first position, perhaps concomitantly with breaking of the 3′–5′ duplex region. The mechanism to do this is unclear at present. Interestingly, it has been proposed that vRNA and cRNA initiate replication differently, either synthesizing pppApG at positions 1 and 2 (1-UC) of the 3′ end directly for vRNA, or internally at 4-UC followed by realignment at 1-UC for cRNA23. This suggests that the 3′ end is differently positioned in the active site depending on whether it is vRNA or cRNA. This could be because the c3′ end sequence differs at three positions and is one nucleotide longer than the v3′ end before the 3′–5′ duplex region. According to the modelling, it would thus be positioned correctly for internal initiation and the putative priming loop could have a role in this.

The second observation concerns the fact that modelling with the polio template–product elongation complex shows that an extended duplex cannot be accommodated in the cavity of the current structures because of a severe steric clash of the outgoing template strand. This is not primarily due to PB1 (apart from the putative priming loop) (Fig. 3e), but to elements of PB2-N that lie directly on top of the duplex (Fig. 3f). As discussed below, the current structures do provide an open channel into the PB1 active site for a capped primer to initiate transcription but the outgoing template is blocked by helices α8–α10 of the PB2 lid domain. In the case of HCV polymerase, the structure of a product–template duplex complex revealed a 20° rotation of the thumb domain that opened up the product–template binding cavity24. In analogy to this, we expect elongation to be accompanied by equivalent conformational changes in which thumb opening could be coupled to displacement of the priming loop and rotation out of the way of the PB2 lid domain. The length of the product–template duplex that is accommodated by influenza polymerase, what causes strand separation (although the PB2 N2 and lid domains are plausibly involved) and which exit path the two strands take are open questions. As discussed below, in the case of cap-dependent transcription, a likely exit pathway for the nascent mRNA is away from the nuclease domain and towards the PB2 627-domain (containing the host-specific amino acid residue 627).

Mechanism of cap-dependent transcription

Cap-snatching is uniquely performed by segmented negative-strand RNA viruses including orthomyxoviruses, bunyaviruses and arenaviruses25,26. The PB2 cap-binding and PA endonuclease domains involved in this process were previously characterized structurally and functionally5,6,7. The complete polymerase structures now allow a plausible mechanism for cap-snatching and cap-dependent priming to be proposed. All structures show the PA-Nter endonuclease in the same position and orientation, anchored to the PB1-Cter–PB2-Nter interface (Fig. 1a). By contrast, comparison of the FluA and FluB1 structures after superposition suggests that the PB2 cap-binding domain is able to rotate as a rigid body in situ. Whereas PB2 residues before Ile 319 (Ile 321 in FluB) and after Arg 495 (Lys 496 in FluB) align very well, the entire cap-binding domain in between differs in orientation by 70° between the two structures, suggesting that it is flexibly hinged at these anchor points (Fig. 1c, d and Supplementary Video 1). In the FluA structure, the cap-binding site faces the endonuclease active site directly across a solvent channel at a distance of about 50 Å (Fig. 4a, b). This configuration is consistent with a cap-bound host pre-mRNA being cleaved 10–15 nucleotides downstream by the nuclease27, bearing in mind that the observed cap-binding domain orientation, probably constrained by crystal contacts, is not necessarily optimal for cap-snatching. The observed variability of the primer length27 would be explained by flexibility in both the cap-binding domain orientation and RNA conformation and possibly the sequence preference of the nuclease cleavage site28. Cleaved primers would then be further selected by their efficiency in priming mRNA synthesis, which probably correlates with their complementarity to the extremity of the 3′ template27,29,30,31,32. In the FluB1 structure, the rotated position of the cap-binding domain both shields the bound capped primer from the endonuclease (Fig. 4c, d) and directs it down into the polymerase RNA catalytic cavity (Fig. 4e). This model is supported by the observation in FluB1 crystals (but not other crystal forms) of residual difference electron density, strongly suggestive of RNA, that descends precisely from the Trp 369–Phe 406 sandwich in the FluB cap-binding site into the throat of the polymerase, which leads to the PB1 active site (Fig. 4e and Extended Data Fig. 7). The nature and origin of this RNA is unclear, making it difficult to fit a precise model. But whatever the RNA origin, its fortuitous occurrence in the FluB1 structure gives a very plausible model of how a capped primer might be configured during transcription initiation. The 424-loop of the cap-binding domain seems to have key roles in channelling the capped primer into the polymerase throat (the integrity of this loop was previously shown to be important for transcription5), as well as the projecting amino-terminal end of PB2 lid domain helix α9, and in particular the double prolines 157-Pro-Pro that force the RNA into a 90° bend (Fig. 4e and Extended Data Fig. 7). The observed RNA density corresponds to about six nucleotides plus the cap, and extends over a straight-line distance of around 26 Å to the bend. The remaining distance to the polymerase active site is around 28 Å, which is compatible with a primer of around 12–14 nucleotides (Extended Data Fig. 7). An overall model of how cap-dependent priming is likely to occur in influenza polymerase is given in Fig. 5.
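The distance argument above can be made explicit with a short back-of-the-envelope calculation. The sketch below (Python; all function and variable names are hypothetical) assumes that the roughly six nucleotides seen in the residual density are evenly spaced over the 26 Å to the bend, ignores the cap itself, and assumes that the same spacing holds for the remaining 28 Å and, more speculatively, for the 50 Å solvent channel in the FluA structure. It is a rough consistency check, not a structural model.

```python
# Approximate values quoted in the text.
OBSERVED_NT = 6            # nucleotides seen in the residual density (cap not counted)
OBSERVED_DIST_A = 26.0     # straight-line distance from the cap site to the bend (Å)
REMAINING_DIST_A = 28.0    # distance from the bend to the PB1 active site (Å)
CHANNEL_DIST_A = 50.0      # cap-binding site to endonuclease site in FluA (Å)


def spacing_per_nt(distance_a, n_nt):
    """Average straight-line spacing per nucleotide (Å), assuming even spacing."""
    return distance_a / n_nt


def nucleotides_spanning(distance_a, spacing_a):
    """Number of nucleotides needed to span a distance at a given spacing."""
    return distance_a / spacing_a


spacing = spacing_per_nt(OBSERVED_DIST_A, OBSERVED_NT)
primer_nt = OBSERVED_NT + nucleotides_spanning(REMAINING_DIST_A, spacing)
channel_nt = nucleotides_spanning(CHANNEL_DIST_A, spacing)

print(f"spacing  ~{spacing:.1f} Å per nucleotide")
print(f"primer   ~{primer_nt:.1f} nucleotides to reach the active site")
print(f"channel  ~{channel_nt:.1f} nucleotides to span 50 Å")
```

With these inputs the spacing is about 4.3 Å per nucleotide, giving a primer of about 12.5 nucleotides, within the 12–14 range stated above, and about 11.5 nucleotides across the 50 Å channel, within the 10–15 nucleotide cleavage window.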

Figure 4: Cap-snatching and cap-dependent priming of transcription.

a, Cap-snatching configuration. Top view of the relative orientations in the FluA structure of the cap-binding domain (orange) with bound cap analogue (m7GTP, yellow spheres, obtained by superposition with PDB code 4CB4) and endonuclease (green) with active site indicated by a bound inhibitor (purple spheres). PA is uniformly green, PB1 cyan and PB2 subdomains coloured as in Extended Data Fig. 4. Cap-bound host pre-mRNA can reach the endonuclease active site unimpeded across a solvent channel (red arrow) at a straight-line distance of around 50 Å (although the cap-binding domain orientation observed is not necessarily optimal for primer cleavage). b, As in a but side view. c, d, As in a and b but for the FluB1 structure. The rotated orientation of the cap-binding domain shields cap-bound RNA from the endonuclease. e, The FluB1 cap-binding domain configuration is compatible with cap-dependent priming. Yellow spheres represent model of bound capped primer derived from RNA-like residual electron density (Extended Data Fig. 7). The primer is channeled towards the PB1 active site by the 424-loop of the cap-binding domain and the N-terminal end of PB2 lid domain helix α9. f, The putative exit channel for the capped transcript is between the PB2 cap-627 linker and 627-domains towards host-specific residue Lys 627, and away from the nuclease.

Figure 5: Model for cap-dependent transcription.

The FluB1 structure is superposed with the template–primer (orange/green) duplex and incoming NTP (black sticks) from the poliovirus complex (PDB code 3OL7). PA is uniformly green, PB1 cyan and PB2 subdomains coloured as in Extended Data Fig. 4. The capped RNA primer (yellow spheres) is as in Fig. 4e and connects with the primer strand in the polio complex. The polio template strand connects with the vRNA 3′ end (yellow tube) at the template tunnel entrance. During elongation the emerging template strand would clash with the PB2 helical lid (red), which therefore has to move. When the vRNA template has mostly passed through the polymerase there will be a minimal loop remaining with the tightly bound 5′ hook (pink tube), which will generate the poly(A) tail on the transcript by stuttering on the oligo-U sequence at 17–22 nucleotides from the 5′ end. During elongation, the polymerase will sequester 20–25 nucleotides of the template. In the context of transcription by an RNP, at least this amount of RNA would have to dissociate from nucleoprotein and re-associate after exiting the polymerase.

One can hypothesize about subsequent steps in cap-dependent transcription (Extended Data Fig. 8). Once the capped primer 3′ end engages the vRNA template in the PB1 active site, primer elongation occurs by template-directed nucleotide addition. Further rotation in situ of the cap-binding domain could initially accommodate the growing mRNA while still maintaining cap-binding (Extended Data Fig. 8c). However, at some stage the buckling out of the lengthening mRNA would force cap release, which has previously been estimated to be after 11–15 nucleotides33 (Extended Data Fig. 8d). The transcript would naturally emerge into the basic channel between the cap-binding domain and the cap-627 linker/627-domain in the vicinity of host-specific residue 627 (Fig. 4f), possibly explaining why capped RNA was crosslinked to the 627-domain34,35. This exit pathway avoids the endonuclease, consistent with reports that the polymerase protects its own mRNAs from degradation by transiently binding to the conserved AGCAAAGCAGG sequence, which occurs just downstream of the host mRNA-derived primer sequence and is transcribed from the conserved 3′ end of the template36. The 627-domain may have a role in this as it has a binding preference for 5′ vRNA-like sequences37. When eventually released, the 5′ cap structure itself is bound by the nuclear cap-binding complex and the mRNA subsequently processed by host machinery10 (Extended Data Fig. 8d). More generally, the same exit pathway could be used for cRNA or vRNA replicates, and the 627-NLS domain (a double domain containing host-specific PB2 residue 627 and the PB2 nuclear localization signal (NLS)) could have a role in their packaging with incoming nucleoprotein into nascent cRNPs or vRNPs38.

Concerning auto-polyadenylation of viral mRNA by the polymerase, the tight binding of the hook at the 5′ end of the template is thought to cause stuttering at the oligo-U stretch typically 17–22 nucleotides from the 5′ end, resulting in the addition of several adenosine residues8,17. Because a minimum of ten 5′ nucleotides are required to form the 5′ hook, and, on the basis of the structural alignment with the polio polymerase primer–template complex, a minimum of seven extra nucleotides is required to reach the site of nucleotide addition (Fig. 3e), the crystal structure is fully compatible with the proposed polyadenylation model (Extended Data Fig. 8d).
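The nucleotide count behind this compatibility argument is simple enough to write out. The following Python sketch (names are hypothetical) adds the two minimum lengths quoted above and checks that the result falls within the reported position range of the oligo-U stretch; whether the sum should be read as the 17th or the 18th template position is not specified in the text and is left open here.

```python
# Minimum lengths quoted in the text.
HOOK_MIN_NT = 10      # 5' nucleotides needed to form the tightly bound hook
SPACER_MIN_NT = 7     # extra nucleotides needed to reach the site of nucleotide addition

# Reported position range of the oligo-U stretch, counted from the 5' end.
OLIGO_U_FIRST, OLIGO_U_LAST = 17, 22


def earliest_stall_position(hook_nt, spacer_nt):
    """Smallest 5' distance at which the template can still reach the active site."""
    return hook_nt + spacer_nt


stall = earliest_stall_position(HOOK_MIN_NT, SPACER_MIN_NT)
compatible = OLIGO_U_FIRST <= stall <= OLIGO_U_LAST

print(f"earliest stall position: {stall} nt from the 5' end")
print(f"within oligo-U range {OLIGO_U_FIRST}-{OLIGO_U_LAST}: {compatible}")
```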

Conclusions

The FluA and FluB polymerase structures presented here seem to be in an inactive pre-initiation state requiring relocation of the 3′ end into the PB1 active site before RNA synthesis can begin. However, we think the observed 3′-end binding site on the polymerase surface could have functional importance in, for instance, providing an additional docking site for the 3′ end (on the same polymerase or a different polymerase) after it has been copied and exited from the active site. This would be an efficient way to allow several rounds of primary transcription from the same vRNP in the early stages of infection. Alternatively, the 3′ end bound to the surface of one polymerase could translocate into the active site of a second, empty polymerase as has been proposed in some models of replication that imply polymerase oligomerization11. It is also clear, as discussed above, that additional conformational changes must occur to allow progression from the initiation to the elongation stages of RNA synthesis. Although it is expected, in analogy to HCV, that the PB1 RNA-binding cavity widens during this step and the lid made by the PB2 N2 domain should open (see above), several other lines of evidence suggest that PB2 as a whole is the most mobile part of the polymerase. First, docking of the polymerase crystal structure into the mini-RNP electron microscopy map39 shows that the PA–PB1 heterodimer fits well but the extra density assigned to PB2 is detached from the rest and cannot be fitted without a gross conformational change of PB2 (Extended Data Fig. 6). Second, detachment of a large fragment of PB2 is compatible with the polymerase structure in the FluB2 crystal form, in which two-thirds of PB2 (PB2-C) is not visible at all although there is space in the crystal for it. Similarly, in the electron microscopy reconstructions of native RNPs19, part of the polymerase (the ‘arm’), was observed to be detached and flexible. 
Although this was assigned to PA-C, we think it is most likely to be PB2-C, for the reasons just given and because our structures suggest that the integrity of the PA–PB1 heterodimer is very unlikely to be disrupted at least while both are intimately binding the vRNA 5′ end.

Although there are still very many open questions, our three complementary structures already give considerable new insight into the mechanism of replication and transcription by influenza polymerase. They provide a solid structural framework for future studies aimed at refining understanding of this complex and dynamic molecular machine, not only in isolation but also in the more complicated physiological context of the RNP and host factors.

Methods

Construct

The influenza B/Memphis/13/03 polymerase heterotrimer was expressed as a self-cleaving polyprotein (Extended Data Fig. 1). A codon-optimized synthetic construct (DNA2.0) with the composition GNH[BstEII] GSGSENLYFQ[TEV]GSHHHHHHHH[8xHis-tag] GSGS-PA (GenBank ID AAU94844) GSGSGENLYFQ[TEV]GSGSGSGSG-PB1 (GenBank ID AAU94857) GSGSGENLYFQ[TEV]GSGSGSGSG-PB2 (GenBank ID AAU94870) GWSHPQFEK[Strep-tag] GRSG[RsrII] was cloned via BstEII and RsrII into the vector pKL-PBac13, which also contains coding sequences for tobacco etch virus (TEV) protease (5′) and cyan fluorescent protein (CFP) (3′). (Labels in square brackets annotate the adjacent sequence elements: restriction sites, TEV cleavage site, His-tag and Strep-tag.)

Expression and purification

High Five insect cells expressing the target protein complex were resuspended in buffer A (50 mM Tris-HCl, 500 mM NaCl, 10% (v/v) glycerol and 5 mM BME, pH 8) supplemented with protease inhibitors (Roche, complete mini, EDTA-free), lysed by sonication and centrifuged at 30,000 r.p.m. for 30 min at 4 °C (rotor type 45 Ti, Beckman Coulter). Ammonium sulphate was added to the clarified supernatant (0.5 g ml−1), the resulting precipitate collected by centrifugation as above and re-dissolved in buffer A supplemented with 20 mM imidazole. Soluble proteins were loaded on a nickel nitrilotriacetic acid (NTA) column (GE, FF crude) and bound proteins were eluted by 500 mM imidazole in buffer A. The target protein was loaded on a strep-tactin matrix (IBA, Superflow) and bound proteins eluted by 2.5 mM d-desthiobiotin in buffer A. Fractions containing the target protein were pooled and diluted with an equal volume of buffer B (50 mM HEPES/NaOH, 10% (v/v) glycerol and 2 mM Tris(2-carboxyethyl)phosphine (TCEP), pH 7.45) before loading on a heparin column (HiTrap Heparin HP, GE Healthcare). Proteins were eluted by a gradient of buffer B supplemented with 1 M NaCl, concentrated (Amicon Ultra, 50 kDa molecular mass cut-off) and further purified by size-exclusion chromatography (S200, GE Healthcare) in buffer C (50 mM HEPES/NaOH, 500 mM NaCl, 5% (v/v) glycerol and 2 mM TCEP, pH 7.45). Homogeneous monomeric FluB polymerase was concentrated as above and stored in aliquots at −80 °C. Protein concentration was determined by measuring the absorbance at 280 nm using the extinction coefficient 287,300 M−1 cm−1.
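The concentration determination in the last step follows the Beer–Lambert relation, c = A / (ε · l). A minimal sketch in Python is given below; the function name, the 1 cm path length default and the example absorbance are assumptions made for illustration, and only the extinction coefficient is taken from the text.

```python
EXTINCTION_COEFF = 287_300.0   # M^-1 cm^-1, value quoted in the text


def concentration_micromolar(a280, path_cm=1.0, epsilon=EXTINCTION_COEFF):
    """Beer-Lambert: c = A / (epsilon * l), returned in micromolar."""
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    molar = a280 / (epsilon * path_cm)
    return molar * 1e6


# Hypothetical reading, not a measurement from the paper.
example_a280 = 0.5
print(f"A280 = {example_a280} -> {concentration_micromolar(example_a280):.2f} µM")
```

Readings taken on a diluted sample would need to be multiplied by the dilution factor, which is not included in this sketch.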

Crystallization

FluB polymerase was concentrated to 9 mg ml−1 (37 μM) in a buffer containing 500 mM NaCl, 50 mM HEPES, pH 7.5, 5% glycerol and 2 mM TCEP, and mixed with 40 μM vRNA for crystallization in hanging drops at 4 °C. A trigonal crystal form (FluB1) was obtained by mixing polymerase with nucleotides 5–18 of the 3′ end and 1–14 of the 5′ end of the vRNA (IBA) in a condition containing 0.1 M bicine, pH 9.0, 10% MPD. Large (up to 150 μm) diamond-like crystals grew within a few days and diffracted to around 3.4 Å resolution but were very radiation-sensitive. The structure was solved with data at 6.5 Å resolution from a single heavy metal derivative obtained by soaking native crystals with 1 mM K2PtCl4 for 1 h. Seleno-methionylated protein crystals were obtained in the same conditions as native ones. Polymerase with nucleotides 1–18 of the 3′ end and 1–14 of the 5′ end of the viral RNA gave thin hexagonal plates (form FluB2) in 1 M LiCl, 10% PEG 6000 and 0.1 M bicine, pH 9.0, that took 3–4 weeks to grow and diffracted to 2.7 Å resolution. All crystals were cryo-protected in mother liquor supplemented with 20% glycerol and flash-frozen in liquid nitrogen. Data were collected at 100 K on beamline ID23-1 at the European Synchrotron Radiation Facility (ESRF), equipped with a Pilatus 6M-F detector, at wavelengths of 0.9730 and 0.9792 Å for FluB1 and FluB2 crystals, respectively. All data were integrated and scaled with XDS40.
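The paired mass and molar concentrations quoted for the crystallization sample imply a molar mass for the heterotrimer and a slight molar excess of vRNA. The Python sketch below (function names are hypothetical) simply performs these unit conversions from the numbers in the text; the resulting molar mass is an implied value, not one reported by the authors.

```python
def implied_molar_mass_kda(mg_per_ml, micromolar):
    """Molar mass implied by a mass concentration and a molar concentration.

    mg/ml is numerically equal to g/l; micromolar is converted to mol/l.
    """
    grams_per_litre = mg_per_ml
    mol_per_litre = micromolar * 1e-6
    return grams_per_litre / mol_per_litre / 1000.0


def molar_ratio(ligand_um, protein_um):
    """Molar ratio of RNA to protein in the crystallization mix."""
    return ligand_um / protein_um


mass_kda = implied_molar_mass_kda(9.0, 37.0)
rna_excess = molar_ratio(40.0, 37.0)

print(f"implied molar mass ~{mass_kda:.0f} kDa")
print(f"vRNA : polymerase  ~{rna_excess:.2f} : 1")
```

With the quoted values this gives roughly 243 kDa and a vRNA-to-polymerase ratio of about 1.08 to 1.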

Structure determination

A partial molecular replacement solution (LLG 334) was found with PHASER41 using the known PA-C–PB1-Nter (PDB codes 2ZN1 and 3CM8) and PB2 627 (PDB code 2VY7) domain structures initially both from FluA. The cap-binding and endonuclease domains could not be located even using the actual FluB domain structures (unpublished data). Nevertheless, 22% of the complete structure was sufficient to identify around 20 platinum sites by inspection of a model-phased difference anomalous map. Several of the platinum peaks coincided with known positions of methionine residues. After scaling the platinum and native data sets, the platinum substructure was refined in SHARP42 to 7 Å and treated as SIRAS (single isomorphous replacement with anomalous scattering), using the partial molecular replacement phases in the form of Hendrickson–Lattman coefficients. The final phasing statistics were phasing power PP(anomalous) = 0.716, PP(iso, centric) = 0.609, PP(iso, acentric) = 0.714, figure of merit FOM(centric) = 0.21, FOM(acentric) = 0.36. Solvent flattening and phase extension to the full resolution of the native data (initially 3.7 Å and subsequently 3.4 Å) were then performed with SOLOMON43, benefitting from the high solvent content of 73%. The resultant map had an overall correlation on |E|² of 81.4% and an R-factor of 23.8%. The exceptionally good continuity of the map (Extended Data Fig. 3a–c) allowed immediate placing of known structures of the cap-binding, endonuclease and PB1–PB2 interface domains and revealed numerous additional secondary structures that could eventually be linked to trace almost the entire chain of each subunit as well as the vRNA. During model building and refinement with REFMAC44, map sharpening with a B-factor of −50 Å² was used to improve visibility of side chains.
Accurate model building was aided by using four high resolution structures of FluB polymerase domains determined during the course of this work (PA endonuclease at 2.1 Å resolution, PA-C/PB1-Nter at 2.4 Å resolution, PB2 cap-binding domain with m7GTP at 1.5 Å resolution, and the PB2 627-domain at 1.05 Å, unpublished data). Sequence assignment was verified by using the methionine positions located using the anomalous differences measured at 4.1 Å resolution from a seleno-methionylated polymerase crystal (Extended Data Fig. 3). The FluB2 crystal structure was determined by molecular replacement using the FluB1 structure. The C-terminal two-thirds of PB2 (PB2-C) is completely absent in the electron density map in this crystal form, although gel analysis shows PB2 to be intact and the crystal packing can accommodate PB2-C. When they became available, the higher resolution bat FluA and FluB2 structures enabled improvement in the quality of the FluB1 model with the help of secondary structure constraints derived using PROSMART45. Full crystallographic statistics are given in Extended Data Table 1. Figures were drawn with PyMOL46. Ramachandran statistics, as calculated by MolProbity47, are 93.7% (favoured), 0.6% (disallowed) for the FluB1 structure and 97.5% (favoured), 0.1% (disallowed) for the FluB2 structure.

Polymerase activity assays

A T7-transcribed 39-nucleotide mini-panhandle or an equimolar mixture of separated synthetic 3′ and 5′ ends was used as vRNA (Extended Data Fig. 2), corresponding to the consensus promoter sequences for influenza B polymerase14.

For the ApG-primed replication assay, 0.5 μM protein, 0.5 μM vRNA, 0.5 mM ApG, 0.4 mM GTP/CTP, 1 mM ATP, 0.04 mM UTP, 32P-UTP and 0.8 U μl−1 Ribolock, in buffer (150 mM NaCl, 50 mM HEPES, pH 7.5, 5 mM MgCl2 and 2 mM TCEP) were mixed and incubated at 30 °C for 2 h.
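
The final concentrations above can be turned into pipetting volumes with the usual dilution relation V_stock = C_final × V_total / C_stock. The sketch below is illustrative only: the 10 mM stock concentrations and the 20 μl reaction volume are hypothetical placeholders, not values from this work; only the final concentrations are taken from the text.

```python
# Final concentrations (mM) of ApG and NTPs in the ApG-primed replication assay.
FINAL_MM = {"ApG": 0.5, "GTP": 0.4, "CTP": 0.4, "ATP": 1.0, "UTP": 0.04}


def stock_volumes_ul(final_mm, stock_mm, total_ul):
    """Volume of each stock (μl) needed to reach its final concentration."""
    volumes = {}
    for name, c_final in final_mm.items():
        c_stock = stock_mm[name]
        if c_stock < c_final:
            raise ValueError(f"{name}: stock is more dilute than the target")
        volumes[name] = c_final * total_ul / c_stock
    return volumes


# HYPOTHETICAL stock concentrations and reaction volume, for illustration only.
HYPOTHETICAL_STOCK_MM = {name: 10.0 for name in FINAL_MM}
HYPOTHETICAL_TOTAL_UL = 20.0

volumes = stock_volumes_ul(FINAL_MM, HYPOTHETICAL_STOCK_MM, HYPOTHETICAL_TOTAL_UL)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} μl")
```

With these placeholder stocks the UTP volume is impractically small, which suggests that in practice a more dilute working stock would be prepared for the limiting nucleotide.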

For the cap-dependent transcription assay, 0.5 μM protein, 0.5 μM vRNA, 0.4 mM GTP/CTP/UTP, 1 mM ATP, 32P-labelled capped RNA in the same buffer (150 mM NaCl, 50 mM HEPES, pH 7.5, 5 mM MgCl2 and 2 mM TCEP) were mixed and incubated at 30 °C for 2 h. For this purpose, a 5′ diphosphate synthetic 20-base RNA, 5′-ppAAUCUAUAAUAGCAUUAUCC-3′ (Chemgenes), was capped by incubating with vaccinia virus capping enzyme (purified in house following ref. 48) and 20 μM SAM, 32P-GTP, 50 mM Tris, pH 8.0, 6 mM KCl, 1.25 mM MgCl2 and 0.8 U μl−1 Ribolock.

For the endonuclease assay, transcription mix lacking NTPs was incubated at 30 °C for 2 h. Samples were separated on 7 M urea, 20% acrylamide gel in TBE buffer, exposed on a Storage Phosphor screen and read with a Typhoon scanner.

For the time course of unprimed and ApG-primed vRNA replication, 0.5 μM FluB polymerase was mixed with 1 μM 39-nucleotide vRNA mini-panhandle template, NTPs (1 mM ATP, 0.4 mM GTP, 0.4 mM CTP and 0.04 mM UTP) and 0.12 μCi μl−1 32P-UTP, in the absence or presence of 0.5 mM ApG. Reactions were incubated at 30 °C and samples were analysed on a 20% acrylamide, 7 M urea denaturing gel after 0, 2, 5, 10, 15, 20, 30, 40 and 50 min and after 1, 2 and 3 h.

Fitting to the mini-RNP electron microscopy map

Influenza polymerase (FluB2 model) and nine influenza A nucleoproteins (PDB code 2IQH)49 were rigidly fitted into the 18 Å mini-RNP cryo-EM reconstruction39 using the Chimera fit-in-map module50 and VEDA51. Map scaling was optimized by the cross-correlation between the model and map for different pixel sizes as implemented in VEDA. Down-scaling the electron microscopy map from 2.8 Å/pixel to 2.4 Å/pixel improved the cross-correlation and fit quality considerably. The fitting of the nine nucleoproteins follows the model previously proposed39, with each nucleoprotein and loop 402–428 of its neighbour being considered as a rigid entity to maintain the nucleoprotein–nucleoprotein interaction mode. For polymerase fitting, different starting positions of the PA–PB1 heterodimer with only 1–32 of PB2 (FluB2 model) were used for rigid body fitting using the Chimera fit-in-map module, which identified one preferred rigid-fit position. Finally, the model was refined with a simultaneous rigid fit of the polymerase and the nine nucleoproteins using VEDA.
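
The idea of choosing the map pixel size by maximizing the model-to-map cross-correlation can be sketched with a toy one-dimensional example. This is not the VEDA procedure itself: the Gaussian "model", the grid size and the candidate pixel sizes are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def cross_correlation(a, b):
    """Mean-subtracted, normalised correlation of two equal-length density lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def model_density(n_voxels, pixel_size, sigma=30.0):
    """Toy 1D model: a Gaussian blob (sigma in Å) sampled at the given voxel spacing."""
    centre = (n_voxels - 1) / 2.0
    return [
        math.exp(-(((i - centre) * pixel_size) ** 2) / (2.0 * sigma ** 2))
        for i in range(n_voxels)
    ]


def best_pixel_size(map_values, candidates):
    """Score each candidate pixel size and return the best one with all scores."""
    scores = {
        p: cross_correlation(map_values, model_density(len(map_values), p))
        for p in candidates
    }
    return max(scores, key=scores.get), scores


# Toy "experimental" map whose true sampling is 2.4 Å/pixel.
toy_map = model_density(64, 2.4)
candidates = [2.0, 2.2, 2.4, 2.6, 2.8]
best, scores = best_pixel_size(toy_map, candidates)
print(f"best pixel size: {best} Å/pixel")
```

In a real fit the model density would be computed from the atomic coordinates in three dimensions and the rigid-body placement would be re-optimized at each candidate pixel size.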