Molecular basis of host-adaptation interactions between influenza virus polymerase PB2 subunit and ANP32A

Avian influenza polymerase undergoes host adaptation in order to efficiently replicate in human cells. Adaptive mutants are localised on the C-terminal (627-NLS) domains of the PB2 subunit. In particular, mutation of PB2 residue 627 from E to K rescues polymerase activity in mammalian cells. A host transcription regulator ANP32A, comprising a long C-terminal intrinsically disordered domain (IDD), is responsible for this adaptation. Human ANP32A IDD lacks a 33 residue insertion compared to avian ANP32A, and this deletion restricts avian influenza polymerase activity. We used NMR to determine conformational ensembles of E627 and K627 forms of 627-NLS of PB2 in complex with avian and human ANP32A. Human ANP32A IDD transiently binds to the 627 domain, exploiting multivalency to maximise affinity. E627 interrupts the polyvalency of the interaction, an effect compensated by an avian-unique motif in the IDD. The observed binding mode is maintained in the context of heterotrimeric influenza polymerase, placing ANP32A in the immediate vicinity of known host-adaptive PB2 mutants.

Influenza A virus (IAV) is responsible for 3-5 million severe cases every year, resulting in 250,000-500,000 deaths 1. Most influenza strains evolve exclusively in the large reservoir of water birds, but some highly pathogenic avian strains (e.g., H5N1, H5N8 and H7N9) can infect humans with lethal consequences (up to 60% mortality) and are potential pandemic threats for humanity if they develop human-to-human transmissibility 2. However, for these avian (av) viruses to replicate efficiently in mammalian cells, host adaptation of the viral polymerase is necessary. Replication of IAV is carried out by the RNA-dependent RNA viral polymerase that functions as a heterotrimeric complex, formed from separate components PA, PB1 and PB2. Few mutations are required for avian to human adaptation [3][4][5][6], and a number of these cluster on the surface of the C-terminal 221 amino acid section of PB2, comprising separate '627' and 'NLS' domains 4,7. In particular, mutation of residue 627 from E to K in avPB2 rescues polymerase activity and viral replication in mammalian cells [8][9][10][11]. Members of the host transcription regulator family ANP32 12, comprising a long low-complexity acidic intrinsically disordered domain (IDD; sometimes known as LCAR) at the C-terminus, have been shown to be responsible for this viral adaptation 13. Human ANP32A (hANP32A) lacks an insertion of 33 disordered residues compared to avANP32A, restricting avH5N1 polymerase activity in mammalian cells. This restriction is lifted by E627K mutation, suggesting an essential role for ANP32A through interaction with PB2 [14][15][16][17][18][19][20][21][22][23][24], although there are currently no molecular descriptions of these interactions. The interaction between members of the ANP32 family and influenza polymerase is critical in supporting IAV replication, and is attracting increasingly intense interest 16,18,19.
Recent studies also demonstrate that the interaction occurs in the nucleus 24 , and further studies point to the importance of related members of the ANP32 family, in particular ANP32B [20][21][22] , as well as the role of surface residues in the folded leucine-rich region (LRR) of ANP32A 23 .
Conformationally, the 627-NLS region of PB2 exhibits intriguing behaviour 7,25. X-ray crystallographic structures of both h and av-adapted 627-NLS revealed a compact two-domain structure 4,7, a conformation also found in full-length transcriptionally active polymerase 26. Solution NMR, however, revealed the coexistence of two forms of 627-NLS, corresponding to 'open' and 'closed' states that interchange in a highly dynamic equilibrium 27, while crystallographic investigation of the transcriptionally inactive polymerase complex suggested a role for this open form in viral replication or polymerase assembly 28,29.
Here, we combine NMR and quantitative ensemble analysis to describe and compare the complexes formed between avANP32A and av-adapted 627-NLS (627-NLS(E)), and between hANP32A and h-adapted 627-NLS (627-NLS(K)). Although the complexes are both found to be highly dynamic, they exhibit significant differences. Polyvalent combinations of transient interactions between the acidic IDD and the positively charged 627(K) domain stabilise the hANP32A:627-NLS(K) complex. This polyvalency is less efficient for avANP32A:627-NLS(E), due to the interruption of an exposed basic surface on the 627 domain by the presence of E627. The weaker interaction is, however, compensated by the recruitment of additional sequences on the longer avIDD, in particular an avian-specific hexapeptide motif that interacts with the linker between 627 and NLS. Notably, the cross-interaction between hANP32A and 627-NLS(E) exhibits neither of these possible stabilisation mechanisms, which may be related to the inability of IAV polymerase to function in human cells without the E627K mutation. Importantly, we show that the interaction exhibits the same properties in the presence of heterotrimeric influenza polymerase, providing insight into the role of the IDD in the function of putative ANP32A:polymerase complexes.

IDDs of h and avANP32A and their interactions with 627-NLS.
We have compared the two complexes using solution state NMR, initially from the side of ANP32A. The IDDs of h and avANP32A comprise 63/96 and 79/129 Asp or Glu residues, respectively, leading to extensive spectral overlap. h and avANP32A IDDs differ principally due to a 33 amino acid insert in avANP32A (176-209) comprising an av-unique hexapeptide, 176 VLSLVK 181 , followed by a duplication of 27 amino acids also present in hANP32A IDD. Backbone resonance assignment was completed to 78% and 58%, respectively, revealing that h and avANP32A IDDs are indeed both intrinsically disordered (Supplementary Figs. 1 and 2), with a slight tendency (20%) towards helical conformation for the hexapeptide, but negligible secondary structural tendency elsewhere.
Upon addition of the 627(K) domain, 1H and 15N chemical shift perturbations (CSPs) are seen for a large number of resonances in the IDD of hANP32A (Fig. 1b, Supplementary Fig. 1d). NMR relaxation rates measured at increasing titration admixtures (Fig. 1c) show maximal effects for the acidic strand 180 DEDA 183 , while the largest CSPs are seen for the adjacent hydrophobic residues 184 QVV 186 (Fig. 1b, Supplementary Fig. 1d). Additional interactions are seen throughout the chain, in particular at 164 VE 165 and 214 YND 216 . Comparison of backbone 13C shifts in the free and fully bound states reveals that only 179 YDED 182 shows any evidence of folding upon binding (Supplementary Fig. 1d), in this case into an extended β-sheet conformation, while the remainder of the chain remains highly flexible in the complex, retaining its random coil nature (Supplementary Fig. 1b). Similar evidence of multiple interaction sites is seen for the IDD of avANP32A in complex with the 627(E) domain (Supplementary Fig. 1d, Fig. 1d). Relaxation properties (Fig. 1c, h) and more specifically CSPs (Fig. 1e) of the hexapeptide are strongly influenced by the presence of both domains of 627-NLS, compared to constructs each comprising only the 627 or NLS domains and the linker. This implies that the specific conformational behaviour of the linker in integral 627-NLS, or the relative position of the two domains, is essential for the interaction with the hexapeptide of avANP32A IDD.
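As an illustration of how 1H and 15N perturbations are often combined into a single per-residue CSP, the sketch below uses a weighted Euclidean distance. The 15N scaling factor of 0.1 and the function name are assumptions made for illustration; the text does not state which combination formula was applied.

```python
import math

def combined_csp(delta_h_ppm, delta_n_ppm, n_scale=0.1):
    """Combine 1H and 15N shift changes (ppm) into one perturbation value.

    n_scale down-weights the 15N dimension; 0.1 is an assumed, commonly
    quoted choice, not a value taken from the text.
    """
    return math.sqrt(delta_h_ppm ** 2 + (n_scale * delta_n_ppm) ** 2)

# Hypothetical residue shifting by 0.03 ppm in 1H and 0.4 ppm in 15N.
print(combined_csp(0.03, 0.4))
```

Plotting such values against residue number gives profiles of the kind described above, where clusters of large values mark candidate interaction sites.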
Interaction of 627-NLS and ANP32A reveals similar profiles to those measured for the IDDs alone, with additional interactions involving the LRR for both h and avANP32A (Fig. 1f-i). In both h and avANP32A, the largest CSPs in the LRR are centred on 120 LFN 122 , with additional shifts induced following the spine of the beta-helix (Fig. 1g). Notably, the cross-interaction between hANP32A and 627-NLS(E) shows much smaller shifts than hANP32A:627-NLS(K), while comparison with avANP32A:627-NLS(E) clearly identifies increased shifts centred on the hexapeptide (Fig. 1i).
Although most of the IDD is involved in the interaction, the strongest binding, or highest populations of binding interactions, occur at a distance of 25-35 amino acids from the LRR. In the case of avANP32A, this concerns the avian-unique hydrophobic hexapeptide, and in hANP32A the sequence 180 DEDAQVV 186 . Further interactions are observed downstream of these interaction sites until the nuclear localisation sequence (KRKR), situated 15 amino acids from the end of the chain; beyond that point no significant interactions are observed.
Despite the evidence of clear interaction between ANP32A and 627-NLS, the complex is polyvalent, and in all cases weak (Supplementary Table 1, Supplementary Fig. 3), with none of the individual interactions exhibiting a stronger affinity than 800 μM for the interaction of hANP32A with 627(K) (1700 μM for avANP32A with 627(E)). Notably, the interactions between hANP32A and 627(E) are also weaker (>1400 μM), suggesting that the single E627K mutation plays an important role. Interaction with integral 627-NLS is weaker due to the open-closed equilibrium reducing the population of available binding states. Interaction of the LRR of ANP32A alone with the 627 domain reveals weaker affinity (>1500 μM). Although these interactions are weak, the extended interaction surface involving 80-100 disordered amino acids in h and avANP32A nevertheless results in tighter binding as experienced by 627-NLS.
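To put dissociation constants of this magnitude in perspective, the following minimal sketch computes the fraction of an observed protein that is bound at given total concentrations. It assumes a simple 1:1 binding model, which is an illustrative simplification only (the interaction described here is polyvalent), and the concentrations used in the example are hypothetical.

```python
import math

def fraction_bound(p_total, l_total, kd):
    """Fraction of P in complex for P + L <-> PL (all values in the same unit).

    Solves the quadratic mass-balance equation for the complex concentration.
    """
    b = p_total + l_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0
    return complex_conc / p_total

# Hypothetical sample: 200 uM observed protein, 400 uM partner.
for kd in (800.0, 1700.0):  # uM, site-specific values quoted in the text
    print(kd, round(fraction_bound(200.0, 400.0, kd), 3))
```

Under these assumptions, each individual site is only partially populated at typical NMR concentrations, which is consistent with the picture of many weak contacts contributing together.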
Interactions of av and h-adapted 627-NLS with av and hANP32A. The interaction was also investigated from the side of 627-NLS domains of PB2. The two domains exhibit an open-closed equilibrium (Fig. 2a) that is populated ~40:60 at 293 K, leading to two sets of resonances for the majority of the protein 27.
Addition of hANP32A to 627-NLS(K) resulted in CSPs throughout the protein (Supplementary Fig. 4). The largest shifts are observed in the open form, suggesting that ANP32A interacts preferentially with this conformation. The closed form of 627-NLS(K) is stabilised by a tripartite salt bridge, and can be removed from the equilibrium by mutation of the implicated amino acids (R650 or D730/E687) 27. CSPs induced upon interaction with the ANP32A IDD are illustrated for clarity using an open-only mutant (Fig. 2b), alongside the distribution of CSPs as measured on the wild-type proteins (Fig. 2c). Figure 2d illustrates the major shifts observed for 627-NLS(K) upon addition of hANP32A IDD. Although basic sidechains are found on both sides of the domain (Fig. 2e), CSPs are observed mainly on one face of the 627 domain, again suggesting that the interaction is not uniquely driven by electrostatic attraction with the acidic IDD. Despite strong similarity of the CSP profiles, clear additional shifts are measurable for 627-NLS(K):hANP32A compared to 627-NLS(E):avANP32A (Fig. 2c, Supplementary Fig. 4), particularly in the vicinity of R630, L636 and 587-591, forming a continuous interaction surface that is strongly enhanced in 627-NLS(K) (Fig. 2f). It is interesting to note that residues 590 and 591 provide an alternative pathway to host adaptation, as evidenced in the 2009 pandemic strain 5,30. The long extended loop comprising 627E/K, which is highly flexible in both 627E and 627K (Supplementary Fig. 5), wraps around the 588-605 helix, and interaction with hANP32A appears to enhance the accessibility of the N-terminus of this helix to the ANP32A IDD uniquely when 627K is present. The affinity measured when observing resonances from 627-NLS(K) is considerably tighter than from the side of hANP32A, with values of 20 μM for 627(K) and 50 μM for 627-NLS(K). This increased affinity apparently occurs due to avidity with the extensive interaction surface presented by hANP32A.
Although the stoichiometry cannot be determined accurately, titration curves imply that it is significantly different from 1:1 from the side of 627-NLS (Supplementary Fig. 3), again suggesting that the increased affinity occurs due to polyvalent binding of multiple, weak binding sites on ANP32A to each site on the surface of 627-NLS(K). Notably, the affinity for avANP32A IDD measured when observing resonances of 627(E) is ~20 times weaker than for the equivalent human IDD:627(K) complex, with values >600 μM. In combination with observations measured from the side of ANP32A, it therefore appears that the absence of K627, which disrupts the continuity of the positively charged surface on 627, strongly abrogates this component of the interaction.
In addition, the interaction with ANP32A strongly favours the open form of 627-NLS (Fig. 2g), especially for the avian interaction, with the closed form essentially disappearing from the equilibrium at a 1:2 ratio of 627-NLS(E):avANP32A and falling from 65 to 40% at the same stoichiometry in 627-NLS(K):hANP32A.
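The population shift quoted above can be translated into an apparent free-energy difference between the open and closed forms. The sketch below assumes a strict two-state equilibrium and a temperature of 293 K (the temperature given for the equilibrium populations); both are assumptions for illustration, and the function name is hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant, kJ mol^-1 K^-1

def delta_g_open(p_closed, temperature=293.0):
    """Free energy of the open form relative to the closed form (kJ/mol).

    Negative values mean the open form is favoured. Assumes only two
    states, so p_open = 1 - p_closed.
    """
    p_open = 1.0 - p_closed
    return -R_KJ * temperature * math.log(p_open / p_closed)

# Closed-form populations for 627-NLS(K) before and after adding hANP32A.
free, bound = delta_g_open(0.65), delta_g_open(0.40)
print(round(free, 2), round(bound, 2), round(bound - free, 2))
```

Under these assumptions, the change from 65% to 40% closed corresponds to a shift of roughly 2.5 kJ/mol in favour of the open form, a modest value in keeping with the weak, dynamic binding described here.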
The polyvalent nature of the interaction of the IDD with 627-NLS is further substantiated by the comparison of the CSP-derived profiles measured when individual adjacent peptides from huANP32A IDD interact with the open-mutant of 627-NLS(K) (Supplementary Fig. 6). These results demonstrate that both peptides interact with residues on the surface of 627-NLS, some that are common to both peptides and to the full-length IDD (Fig. 2b), and others that are unique to one or other of the peptides, confirming the multivalent, transient and dynamic nature of the interaction of the full-length huANP32A IDD with 627-NLS(K). One of these peptides, which comprises 11 consecutive acidic amino acids, also binds much more tightly to the open form of 627-NLS(K) than to the open form of 627-NLS(E) (Supplementary Fig. 7), again supporting the observation that the contribution of K627 to the positively charged surface of the 627(K) domain is the essential factor for the tighter interaction with the negatively charged regions of huANP32A IDD. These data therefore further support our model of differentiation of the binding modes of the two complexes.
Ensemble descriptions of dynamic ANP32A:627-NLS complexes. To develop a more detailed description of the dynamic interaction between 627-NLS and ANP32A, we have incorporated eight cysteine mutants into both 627-NLS(K) and (E), and individually labelled the proteins with a single TEMPO-based paramagnetic spin-label. This allows the detection of paramagnetic relaxation enhancements (PREs), from the perspective of five positions on the 627 domain and three on the NLS domain (Fig. 3a-c). PREs report on weak, or sparsely populated, contacts between the two proteins, providing long-range positional constraints that are complementary to the short-range modulation of the electronic and chemical environment probed by CSPs, and the modulation of the dynamics of each site as measured by spin relaxation. These orthogonal experimental probes reveal different aspects of the same interfaces, while providing a high level of confirmatory experimental validation (vide infra).
The experimental results show strong PREs, reporting on tighter or more populated interactions, distributed over long stretches of the IDD for certain spin-label positions, and little broadening for other positions. Such profiles again suggest a polyvalent interaction, whereby distinct sites dispersed along the IDD visit the same sites on 627-NLS. Interestingly, some labels (in particular 587 and 631) induce a well-defined pattern over the long β-sheet on the concave face of the LRR of ANP32A, allowing the determination of its orientation with respect to the 627 domain (Fig. 3d, see 'Methods' section), despite the weak interaction between the two domains (Supplementary Table 1). This orientation is in full agreement with the observed CSP on the surface of the LRR of ANP32A (Fig. 1), where the largest chemical shifts are seen for F121, Y122 and C123. These residues are positioned in closest proximity to the 627 domain in the ensemble of conformers that best fit the entire set of experimental data. The interaction is apparently stabilised by hydrophobic interactions, involving residues on the surface of 627 and ANP32A, and an electrostatic interaction between D130 (ANP32A) and R646 (627) (Fig. 3e). It seems likely that the observed interaction relates to observations that have recently implicated D130 in host adaptation 20,21,23. The optimal poses determined for the 627-NLS(K):hANP32A and 627-NLS(E):avANP32A complexes are very similar. The interface between 627 and ANP32A is bordered by a nearly continuous ridge of solvent-accessible basic sidechains, including K589, R630, R641 and R650, that is completed by the presence of K627 in the case of 627-NLS(K) (Fig. 3f). Inspection of the PRE data from the IDD regions of h and avANP32A reveals weaker effects in the region immediately following the LRR for avANP32A, and more contacts with the dislocated NLS domain (Fig. 3a, b); in particular, the spin-label at position 699 in the NLS broadens the IDD maximally at the position of the hexapeptide of avANP32A. This again confirms that the interaction with the immediately proximal basic face of 627, defined by the differential CSPs (Fig. 2f), is weaker in the case of avANP32A.
Ensemble analysis of the PRE data using the ASTEROIDS approach 31, accounting for the flexibility in the 627-NLS linker and IDD domains of ANP32A (Supplementary Fig. 8), allows us to propose a molecular description of the conformational sampling of the entire complex. Representative ensembles of conformations of the multi-domain complex are shown in Fig. 4.
Importantly, comparison of PRE profiles of hANP32A:627-NLS(E) demonstrates that the cross-interaction, that represents the case encountered when a non-adapted avian IAV infects human cells, shows less extensive and in general weaker contacts compared to hANP32A:627-NLS(K) and avANP32A:627-NLS(E) (Fig. 5). The absence of K627 is again seen to diminish the numerous polyvalent contacts present in hANP32A:627-NLS(K) as illustrated by the profiles induced by probes attached to 587 and 631, while the shorter IDD and lack of the hexapeptide abrogates the extensive interaction surface and the linker-specific contact present in avANP32A:627-NLS(E) (compare profiles induced by probe attached at 699). The required components for interaction modes specific to either 627-NLS(K):hANP32A or 627-NLS(E):avANP32A are therefore both weakened when avian polymerase interacts with hANP32A.
hANP32A:627-NLS in the context of integral FluB polymerase.
To determine whether the interactions characterised here are relevant in the context of a human-adapted full-length influenza polymerase, we initially repeated the chemical shift titrations using the heterotrimeric influenza B (FluB) polymerase (bound to vRNA), with the IDD of hANP32A (Fig. 6a). The interaction sites are found to be the same as for 627-NLS from IAV (Fig. 6b). In the case of full-length hANP32A (Fig. 6c), the first 50 amino acids of the IDD disappear from the spectrum, likely because the large particle results in extreme line broadening for the residues that are closest to the LRR domain due to slow tumbling in solution.
Given the spectroscopic similarities of the interaction of ANP32A with IAV 627-NLS and full-length FluB polymerase, it is interesting to speculate whether the interaction described in detail for IAV 627-NLS can be accommodated within known structures of the polymerase. Superposition of the 627 domains of the ensembles describing the complex onto the 627 domain in the transcription-active conformation of FluB polymerase (the form used in the interaction study described above) 26,32 reveals that the LRR of ANP32A can indeed be inserted into a broad cylindrical pocket formed by 627, the mid and cap-binding domains of PB2, also bordered by the C-terminal domain (CTD) of PB1, allowing ANP32A to adopt the pose determined in solution (Fig. 6d). Interestingly, the sites of recently described adaptive mutations of PB2 at 521 and 355 33 lie in the immediate vicinity of the surface of ANP32A in this pose (inset Fig. 6d), suggesting that the importance of these mutations involves interaction with ANP32A. Figure 6e, f illustrates the expected conformational space sampled by the IDDs of av and hANP32A within the 627-NLS:ANP32A complexes, indicating a broader capture radius for the avian complex. A similar procedure was applied to the conformationally very distinct transcriptionally inactive IAV polymerase bound to the cRNA 5′ terminus 29, where the 627 domain sits on the surface, and is displaced and rotated relative to the polymerase core by ~70° 34. In this case, the 627-NLS:ANP32A complex can be easily accommodated (Fig. 6g, h). The position of NLS on the surface of the polymerase, and its observed positional variability in existing structures, suggests that the open form of 627-NLS is sampled in both transcriptionally active and inactive polymerases. Finally, the recently determined dimeric structure of apo IAV polymerase accommodates the ensemble equally well (Supplementary Fig. 9) 35.

Discussion
In this study, we describe and compare the molecular complexes formed by the human-adapted or avian-adapted 627-NLS domains with the respective ANP32A host proteins, in order to understand the nature and specificity of these interactions. All of the implicated proteins exhibit extensive intrinsic disorder. The elaboration of atomic resolution descriptions of such highly disordered complexes requires methodologies that can account for the ensemble of structures sampled by the two proteins in solution 36,37. Investigation of the hANP32A:627-NLS(K) and avANP32A:627-NLS(E) complexes using complementary structural data from NMR chemical shifts, spin relaxation and paramagnetic relaxation combined with quantitative ensemble modelling reveals the existence of highly dynamic molecular assemblies that exhibit very different interaction modes.
The LRR of ANP32A interfaces with the 627 domain of PB2 at the C-terminal end of its concave surface, apparently stabilised via hydrophobic and electrostatic interactions. This interaction is IDD of hANP32A results in an effective increase in affinity to ~50 μM for 627-NLS(K), an effect of avidity that has been observed in a number of systems exhibiting extensive intrinsic disorder 38,39. The critical E627K mutation completes a continuous ridge of solvent-exposed positively charged residues 4 that are available for interaction with the highly dynamic acidic IDD. This ridge colocalizes interacting residues of the IDD in the vicinity of the surface, that differentially implicates residues 589 and 591, 629, 631, 635 and 637 when the E627K mutation is present.
By contrast, the IDD of avANP32A interacting with 627-NLS(E) populates fewer conformations in the vicinity of the surface of 627. This lack of interaction, due to the interruption of the positively charged surface by the presence of E627, is compensated by an even broader conformational sampling of the IDD that exploits a more extensive interaction surface, implicating the hexapeptide motif specific to avANP32A and the NLS domain, and in particular the linker region of 627-NLS. It is again interesting to note that two adaptive mutations (V683T and A684S) 40 Investigating the interaction of individual peptides isolated from the IDD, we are able to confirm the multivalent nature of the interaction of the huIDD with 627-NLS, and to strongly substantiate the differential binding modes of the two complexes.
Notably, the 'cross-interaction' between hANP32A and 627-NLS(E) exhibits neither the stabilisation properties mediated by the avIDD in the avANP32A:627-NLS(E) complex, nor the polyvalent binding specific to 627(K) as observed in the hANP32A:627-NLS(K) interaction. This lack of adapted molecular mechanisms results in fewer and weaker contacts between human ANP32A and avian-adapted 627-NLS. In combination, these effects may explain the inefficiency of avian polymerase in human cells in the absence of avANP32A or 627-NLS(K).
It is interesting to compare the interaction results measured by different NMR-based techniques with existing studies, where results obtained using other techniques seem to paint a slightly different picture. While the individual interaction sites reported in our study indicate a weaker binding between avANP32A and 627-NLS(E) as compared to hANP32A and 627-NLS(K), some biochemical and cell biology studies report a higher affinity of avANP32 for avian-adapted polymerase 17,24. This apparent contradiction may result from different tagging techniques used in the reported pulldown assays, but we also note that overall affinities between proteins are not necessarily comparable to the affinities of individual multivalent interactions, and it will be interesting to resolve these dependencies in the future.
Importantly, NMR indicates that the complex characterised for the minimal 627-NLS:ANP32A interaction is maintained in the context of the integral FluB polymerase in its transcriptionally active form, showing very similar NMR-binding characteristics to hANP32A. Superposition of the binding pose of ANP32A with respect to 627 onto the 627 domain in the associated transcriptionally active polymerase structure 26 would place the LRR domain in a similarly dimensioned pocket bounded by PB2 domains (Fig. 6d). Intriguingly, recently characterised host-adaptive PB2 mutants 33 lie in the immediate vicinity of the surface of the folded domain of ANP32A in this binding conformation. Influenza polymerase exhibits extensive plasticity in solution, as demonstrated by recent electron microscopy and X-ray crystallographic studies describing multiple states of the polymerase 26,28,29,34,35,41,42, and it appears highly likely that many of these states will be in conformational exchange in solution. It is also possible that the highly dynamic and transient nature of the interaction with ANP32A enables or enhances exchange between these forms during the viral cycle. It is therefore of interest to investigate the possible impact of our observed binding mode on other known states of influenza polymerase. The proximity of ANP32A IDD to the interface between the 627 and mid-domains of PB2, both of which undergo large-scale reorientations and dislocation between apo-, cRNA-bound and vRNA-bound polymerase, also raises the possibility that ANP32A interaction is associated with these conformational changes. In the transcriptionally inactive structure, the 627 domain appears almost dislocated from the core, a conformation that can easily accommodate the binding pose of ANP32A determined for the minimal complex.
The information contained in our ensemble descriptions of the ANP32A:627-NLS interactions therefore allows us to speculate further on the role of ANP32A in viral function. Recent observations have established that dimerisation of influenza polymerase is essential for the initiation of vRNA synthesis during replication 35. The apo-polymerase structure of IAV polymerase that was recently solved in its dimeric form is also found to be compatible with the binding pose determined here (Supplementary Fig. 9), again with the 627 domain dislocated and sitting on the surface of the catalytic core domains. It has been suggested that ANP32A plays a role in assembly or regulation of this dimerisation process, for example, by recruitment of a second apo 'packaging' polymerase to a replicating polymerase to initiate formation of the progeny viral RNP 43. In this context, the polyvalent nature of the interaction between ANP32A and 627-NLS may be of functional relevance, allowing more than one polymerase to bind simultaneously to ANP32A, thereby colocalizing two polymerases to facilitate viral replication. In all of these aspects, the more extensive effective capture radius of the IDD of avANP32A may be important.
It is known that the intrinsically disordered, phosphorylated CTD of host RNA polymerase II (Pol II) binds to the surface of PA of influenza polymerase to facilitate the cap-snatching mechanism 44,45 . Given the highly negatively charged nature of the IDD of ANP32A it seems possible that it may play a regulatory role in this interaction, for example, by competing with the phosphorylated CTD to inhibit the interaction with Pol II. Notably, the extremely long IDDs, allied to the fact that ANP32A is bound to the 627 domain that exhibits extensive mobility with respect to the rest of the polymerase core, would appear to facilitate the kind of flexible chaperoning action seen in other highly dynamic viral proteins 46 .
In summary, the description of these highly dynamic species-specific assemblies reveals unique mechanistic insight into the role of the ANP32 family in host adaptation of avian influenza polymerase to human cells, and provides a molecular framework for understanding the considerable volume of experimental observation measured on this complex system, as well as informing the identification of novel targets for IAV inhibition.
Protein expression and purification. Plasmids were transformed into E. coli Rosetta cells, and the cultures were grown in LB and induced with IPTG for 16 h at 18°C. Bacteria were harvested by centrifugation, resuspended in buffer A (50 mM Tris-HCl pH 7.5 and 200 mM NaCl) with protease inhibitors (complete, Roche) and bacterial lysis was performed by sonication. All proteins were purified by affinity chromatography on Ni-NTA agarose (Qiagen), followed by incubation with TEV protease at 4°C coupled with dialysis into buffer A. A second affinity column with Ni-NTA agarose was performed and the flow-through was loaded into a Superdex 75 column (GE Healthcare) for size-exclusion chromatography in buffer A.
To produce 15N-labelled or 15N, 13C-labelled proteins for NMR spectroscopy, bacteria were grown in M9 minimal medium containing MEM vitamins (Gibco), supplemented with 1 g L−1 of 15NH4Cl and 2 g L−1 of unlabelled or 13C-glucose. To produce additionally 2H-labelled proteins, the M9 minimal medium was prepared in D2O and supplemented with 2 g L−1 of deuterated 13C-glucose. Protein purity was checked by SDS-PAGE and mass spectrometry. Single-point mutations of 627-NLS were introduced using the QuikChange method 48, using Phusion high-fidelity DNA polymerase and DpnI (Thermo Scientific). Cysteine mutants were purified as described above for wild-type protein; however, 10 mM of dithiothreitol (DTT) was added after the second Ni-NTA column to keep proteins in a reduced state until labelling. The heterotrimeric human influenza polymerase from B/Memphis/13/03 (FluB) was expressed as a self-cleaving polyprotein and purified using NTA affinity and heparin columns followed by size-exclusion chromatography 32. ANP32A peptides were purchased from Caslo, Denmark.
NMR spectroscopy. All samples for NMR were measured in 50 mM Tris-HCl buffer pH 6.5, 200 mM NaCl and 10% D2O. Assignments of the intrinsically disordered regions of avian and human ANP32A were obtained from 15N, 13C-labelled samples (700 μM) using BEST-TROSY three-dimensional experiments recorded on a Bruker spectrometer equipped with a cryoprobe operating at 20°C and a 1H frequency of 850 MHz. All spectra were processed using NMRPipe 49 and analysed in Sparky 50. MARS 51 was used for spin system identification, followed by manual verification. The folded domain of ANP32A has been assigned previously 52,53. 13Cα chemical shifts of the intrinsically disordered regions were compared to random coil values using the software SSP 54. 15N R1ρ relaxation rates were measured at 293 K and a 1H frequency of 850 MHz using a spin lock of 1.5 kHz as described 55. A typical set of relaxation delays included points measured at 1, 15, 30, 50, 100, 140, 200 and 230 ms, including repetition of one delay. Relaxation rates were determined using in-house software and errors were estimated on the basis of noise-based Monte Carlo simulation. Interaction experiments with full-length polymerase were acquired with 15N-labelled hIDD or full-length 15N-hANP32A at a concentration of 4 μM after the addition of 32 μM of human FluB polymerase bound to the 5′ terminal viral RNA promoter (5′-pAGUAGUAACAAGAG-3′ OH). These experiments were recorded at 293 K and a 1H frequency of 850 MHz.
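The rate extraction and noise-based Monte Carlo error estimate mentioned above can be sketched as follows. This is not the in-house software referred to in the text: it assumes a mono-exponential decay, substitutes a simple log-linear least-squares fit for whatever fitting routine was actually used, and all function names and the example intensities are hypothetical.

```python
import math
import random
import statistics

# Relaxation delays quoted in the text, converted to seconds.
DELAYS_S = [0.001, 0.015, 0.030, 0.050, 0.100, 0.140, 0.200, 0.230]

def fit_rate(delays, intensities):
    """Return (I0, R) for I(t) = I0 * exp(-R * t) from a log-linear fit.

    Non-positive intensities cannot be log-transformed and are skipped.
    """
    pts = [(t, math.log(i)) for t, i in zip(delays, intensities) if i > 0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_t), -slope

def monte_carlo_error(delays, intensities, noise, n_trials=500, seed=0):
    """Fit the data, then refit noisy copies of the fitted curve.

    Returns (rate, standard deviation of the refitted rates).
    """
    rng = random.Random(seed)
    i0, rate = fit_rate(delays, intensities)
    model = [i0 * math.exp(-rate * t) for t in delays]
    rates = []
    for _ in range(n_trials):
        noisy = [m + rng.gauss(0.0, noise) for m in model]
        rates.append(fit_rate(delays, noisy)[1])
    return rate, statistics.stdev(rates)

# Synthetic peak intensities decaying at 10 s^-1, spectral noise of 0.01.
data = [2.0 * math.exp(-10.0 * t) for t in DELAYS_S]
print(monte_carlo_error(DELAYS_S, data, noise=0.01))
```

A log-linear fit weights low-intensity points more heavily than a direct non-linear fit would, so the sketch is best read as an outline of the procedure rather than a recommended estimator.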
PRE effects used to model the complex formed by ANP32A (15N labelled, human or avian) and 627-NLS (E627 or K627) were measured from the peak intensity ratios between a 2D 15N-HSQC spectrum recorded on a sample containing TEMPO-labelled 627-NLS and a spectrum of a reference diamagnetic sample that had previously been incubated with 5 mM DTT. For these experiments, single-cysteine mutants at positions 539, 587, 605, 631, 643, 699, 707 and 717 were tagged using 4-maleimido-TEMPO. Briefly, purified 627-NLS single-cysteine mutants were reduced with 10 mM DTT at 4 °C for 12 h and then dialysed thoroughly into 50 mM phosphate buffer pH 7.0 containing 150 mM NaCl without DTT. A fivefold molar excess of 4-maleimido-TEMPO dissolved in DMSO was added to the reduced 627-NLS cysteine mutants. The reaction was incubated for 12 h at 4 °C and then injected onto a Superdex S75 column to remove excess TEMPO by size-exclusion chromatography. Complete labelling with TEMPO was verified by mass spectrometry. PRE effects were measured in samples containing 200 μM of 15N-labelled hIDD and 100 or 200 μM of the respective TEMPO-labelled 627-NLS (K627) mutants. Measurements on full-length hANP32A were performed with 300 μM of 15N-hANP32A and 150 μM of the 627-NLS (K627) mutants, and measurements on full-length avANP32A were carried out with 220 μM of 15N-labelled protein and 400 μM of the 627-NLS (E627) mutants.
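The intensity-ratio calculation described above can be sketched as follows. This is a minimal illustration, assuming that peak heights and baseline noise levels are available for each residue in the paramagnetic and diamagnetic spectra; the function name and the first-order error propagation are assumptions of the example rather than a description of the processing used here.

```python
import math

def pre_ratio(i_para, i_dia, noise_para, noise_dia):
    """Return (I_para / I_dia, propagated uncertainty) for one resonance.

    i_para, i_dia: peak intensities in the TEMPO-labelled and DTT-reduced samples.
    noise_para, noise_dia: baseline noise (standard deviation) of each spectrum.
    """
    ratio = i_para / i_dia
    # First-order propagation of uncorrelated relative errors
    err = abs(ratio) * math.sqrt((noise_para / i_para) ** 2 + (noise_dia / i_dia) ** 2)
    return ratio, err

# Example with hypothetical intensities for a single residue
ratio, err = pre_ratio(50.0, 100.0, 1.0, 1.0)
```

A ratio close to 1 indicates a residue that remains far from the spin label, whereas values well below 1 indicate transient proximity.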
Determination of the relative position of ANP32A and 627. Experimentally determined PREs measured on hANP32A and avANP32A in the presence of different spin-labelled forms of h627-NLS and av627-NLS, respectively, were used to determine the position of the folded domain of ANP32A relative to the 627 domain. Two thousand different positions of the two domains were generated using the program HADDOCK 56, varying over a wide range of distances and orientations. Positions of spin-label-bearing sidechains were generated on the basis of rotameric libraries (see Supplementary Fig. 8) 31,37, and for each label an ensemble of sidechain positions was used to calculate the PREs expected on ANP32A for a given position of the 627 domain. Admixtures were adjusted to ensure a bound-state population of 10% for both complexes. The position of each of the 2000 starting conformations was varied over a range of ±10 Å along three orthogonal Cartesian axes at a resolution of 0.1 Å, and the best-fitting position was retained. The ten best-fitting structures are shown in Fig. 3d.
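The translational search can be illustrated with the minimal Python sketch below. It assumes fast exchange with a 10% bound population, an r^-6 average over spin-label rotamers and the commonly used Solomon-Bloembergen expression for a nitroxide-proton pair. The correlation time, diamagnetic relaxation rate, transfer delay, uncertainty and all function names are hypothetical values chosen for the example and are not taken from this work.

```python
import itertools
import math

K_NITROXIDE = 1.23e-32          # cm^6 s^-2, nitroxide-1H dipolar constant
P_BOUND = 0.10                  # bound-state population stated in the text
TAU_C = 10e-9                   # s, assumed correlation time (hypothetical)
OMEGA_H = 2 * math.pi * 850e6   # rad s^-1, 1H Larmor frequency at 850 MHz
R2_DIA = 15.0                   # s^-1, assumed diamagnetic 1H R2 (hypothetical)
T_TRANSFER = 0.010              # s, assumed total transfer delay (hypothetical)

def gamma2(amide, rotamers):
    """Bound-state PRE rate (s^-1), <r^-6>-averaged over label positions (in Å)."""
    inv_r6 = sum(1.0 / (math.dist(amide, p) * 1e-8) ** 6 for p in rotamers)
    inv_r6 /= len(rotamers)
    spectral = 4 * TAU_C + 3 * TAU_C / (1 + (OMEGA_H * TAU_C) ** 2)
    return K_NITROXIDE * inv_r6 * spectral

def intensity_ratio(amide, rotamers):
    """Expected I_para/I_dia when only a fraction P_BOUND of molecules is bound."""
    g = P_BOUND * gamma2(amide, rotamers)
    return R2_DIA * math.exp(-g * T_TRANSFER) / (R2_DIA + g)

def shifted(points, offset):
    """Translate a list of (x, y, z) points by offset."""
    return [tuple(c + o for c, o in zip(p, offset)) for p in points]

def chi2(amides, datasets, offset, sigma=0.05):
    """Misfit summed over all labels; datasets is a list of (rotamers, observed)."""
    total = 0.0
    for rotamers, observed in datasets:
        moved = shifted(rotamers, offset)
        total += sum(((intensity_ratio(a, moved) - o) / sigma) ** 2
                     for a, o in zip(amides, observed))
    return total

def refine_translation(amides, datasets, half_range=10.0, step=0.1):
    """Exhaustive grid search over translations of the labelled domain."""
    n = round(half_range / step)
    axis = [i * step for i in range(-n, n + 1)]
    best = None
    for offset in itertools.product(axis, repeat=3):
        score = chi2(amides, datasets, offset)
        if best is None or score < best[0]:
            best = (score, offset)
    return best  # (lowest chi2, best offset in Å)
```

With the default ±10 Å range and 0.1 Å step the grid contains 201³ points, so a pure-Python loop of this kind is slow; a coarser `step` or a vectorised implementation would be preferable in practice.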
Ensemble descriptions of ANP32A:627-NLS complexes. Having determined the relative position of the two domains, the flexible parts were built onto this conformation. For both h and av complexes, the statistical coil model flexible-meccano 57 was used to generate 10,000 conformations of the linker region of 627-NLS, the NLS domain, the NLS peptide that terminates 627-NLS, and the 96- or 128-amino-acid IDD of hANP32A or avANP32A, respectively. Conformers were calculated using amino acid-specific potentials that reproduce the experimentally observed behaviour of the IDDs, while avoiding steric overlap between any of the domains (Supplementary Fig. 8). PREs over the entire ANP32A molecule (folded and unfolded domains) were calculated for each conformer of each complex, and these conformations were used as a basis set from which ensembles were selected using the ASTEROIDS approach 36,58. Ensemble size was estimated on the basis of direct and cross-validated PRE profiles (60 conformers were used for both h and av complexes).
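The principle of selecting a sub-ensemble from a large pool of conformers can be illustrated with the sketch below. It is not the ASTEROIDS algorithm, which uses a genetic algorithm; here a simple stochastic swap search is substituted to keep the example short. The function names are hypothetical, and `pool` is assumed to hold one predicted per-residue profile for each conformer.

```python
import random

def ensemble_average(pool, members):
    """Per-residue average of the predicted profiles of the selected conformers."""
    n_res = len(pool[0])
    return [sum(pool[m][i] for m in members) / len(members) for i in range(n_res)]

def misfit(pool, members, target):
    """Sum of squared differences between ensemble-averaged and target profiles."""
    avg = ensemble_average(pool, members)
    return sum((a - t) ** 2 for a, t in zip(avg, target))

def select_ensemble(pool, target, size=60, n_steps=5000, seed=0):
    """Swap-based selection of `size` distinct conformers from the pool.

    A random starting ensemble is improved by replacing one member at a time
    and keeping the swap whenever the misfit does not increase.
    """
    rng = random.Random(seed)
    members = rng.sample(range(len(pool)), size)
    best = misfit(pool, members, target)
    for _ in range(n_steps):
        slot = rng.randrange(size)
        candidate = rng.randrange(len(pool))
        if candidate in members:
            continue  # keep conformers unique within the ensemble
        trial = members[:]
        trial[slot] = candidate
        score = misfit(pool, trial, target)
        if score <= best:
            members, best = trial, score
    return members, best
```

Repeating the selection for different values of `size`, with part of the data left out, gives one possible way to compare direct and cross-validated agreement as a function of ensemble size.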
Distance matrices were obtained by averaging, over the selected ensemble of each complex, the distance between Cα atoms of the two proteins. The distance difference matrix shown in Fig. 5a was obtained by subtracting the hANP32A:627-NLS(K) matrix from the avANP32A:627-NLS(E) matrix.
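A minimal sketch of this calculation is given below. It assumes that each conformer is supplied as a pair of Cα coordinate lists (ANP32A and 627-NLS) and that the two matrices being subtracted have already been restricted to a common set of residues; how equivalent residues were matched between the human and avian proteins is not specified here, so that step is left out. The function names are hypothetical.

```python
import math

def mean_distance_matrix(ensemble):
    """Ensemble-averaged intermolecular Cα-Cα distances.

    ensemble: list of (ca_anp32a, ca_627nls) pairs, one per conformer, where
    each element is a list of (x, y, z) coordinates in Å.
    """
    n_a, n_b = len(ensemble[0][0]), len(ensemble[0][1])
    total = [[0.0] * n_b for _ in range(n_a)]
    for ca_a, ca_b in ensemble:
        for i, p in enumerate(ca_a):
            for j, q in enumerate(ca_b):
                total[i][j] += math.dist(p, q)
    n = len(ensemble)
    return [[v / n for v in row] for row in total]

def difference_matrix(av_matrix, h_matrix):
    """Element-wise av minus h difference for matrices of identical shape."""
    return [[a - b for a, b in zip(row_av, row_h)]
            for row_av, row_h in zip(av_matrix, h_matrix)]
```

With this sign convention, negative entries mark residue pairs that are on average closer in the avian complex than in the human complex.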
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.