STAT6 belongs to a family of transcription factors known as the signal transducers and activators of transcription (STAT). STAT family members share a similar protein structure, which is essential for their activation and function. They are composed of an N-terminal coiled-coil domain1, a centrally located DNA-binding domain2, a linker region, an SH2 domain for dimerization3 and a transactivation domain at the C-terminus4. STAT proteins mediate signaling from activated cytokine receptors to the nucleus5. After phosphorylation at a specific tyrosine by a receptor associated Janus kinase, STATs form homo- or heterodimers and translocate into the nucleus where they modulate transcription after binding to specific DNA sequence elements4,5. STAT6 becomes activated in response to IL-4 and IL-13 and mediates most of the gene expression regulated by these cytokines6.

By direct interaction with specific parts of its transactivation domain, STAT6 recruits the co-activators p300/CBP and NCoA1 (also called steroid receptor coactivator-1, SRC-1), which are essential for transcriptional activation7. In particular, the interaction between STAT6 and NCoA1 is modulated by a short region of the transactivation domain that includes the motif LXXLL (where L is leucine and X is any amino acid)7. The crystal structure of a STAT6-derived peptide (Leu794-Gly814) in complex with the NCoA1 PAS-B domain257–385 (PDB ID: 1OJ5) (Fig. 1) revealed that the leucine side chains of the motif (Leu802, Leu805 and Leu806) are deeply embedded in a hydrophobic groove on the surface of NCoA18. More recently, Robinson and coworkers9 demonstrated by fluorescence polarization (FP) binding assays that a peptide comprising STAT6 residues 783–814 (for sequence see Fig. 2) binds about 6.5 to 8 times more strongly than Leu794-Gly814 (Kd = 0.04 µM vs. 0.32 µM from direct FP, Ki = 0.04 µM vs. 0.26 µM from competitive FP). Evidently, additional residues located N-terminally of the LXXLL motif in STAT6 play an important role in stabilizing binding to NCoA19. Yet, the molecular recognition mechanism of NCoA1 by the STAT6 transactivation domain is still poorly understood. Here, we report the structural characterization of the complex between a STAT6-derived peptide encompassing the region from Gly783 to Gly814 and the NCoA1 PAS-B domain257–385 using NMR and X-ray crystallography. The structural characterization of the NCoA1257–385/STAT6783–814 complex demonstrates that the STAT6783–814 peptide binds to the NCoA1 PAS-B domain257–385 through additional amino acids from its N-terminal region, resulting in a more extended binding interface with NCoA1 than that identified before in the crystal structure with the STAT6794–814 peptide8.
Overall, the data indicate that the conformational propensity of free STAT6783–814 peptide in solution on the level of secondary and tertiary structure supports conformational selection as the key mechanism driving the molecular recognition of the coactivator by STAT6.
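As a quick check of the fold changes quoted above, the following minimal sketch recomputes the affinity ratios from the reported Kd and Ki values and converts a ratio into a binding free-energy difference via ΔΔG = RT ln(ratio). The function names and the temperature of 298.15 K are assumptions made for illustration only; the temperature of the FP assay is not stated in this section.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 1.987204e-3


def fold_change(k_weak_um, k_strong_um):
    """Ratio of two dissociation (or inhibition) constants given in the same unit."""
    return k_weak_um / k_strong_um


def ddg_kcal_per_mol(k_weak_um, k_strong_um, temperature_k=298.15):
    """Free-energy gain of the tighter binder, RT*ln(ratio).

    temperature_k is an assumed value, not taken from the text.
    """
    return R_KCAL * temperature_k * math.log(fold_change(k_weak_um, k_strong_um))


# Values quoted in the text (µM): short peptide 794-814 vs. long peptide 783-814
direct_fp = fold_change(0.32, 0.04)       # direct FP, Kd
competitive_fp = fold_change(0.26, 0.04)  # competitive FP, Ki
```

Under these assumptions the two ratios reproduce the quoted range of about 6.5 to 8, which corresponds to a free-energy difference on the order of 1 kcal/mol.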

Figure 1

Overview of the NCoA1/STAT6794–814 X-ray structure (PDB ID: 1OJ5). Ribbon drawing representation of the NCoA1/STAT6794–814 complex. The helical region containing the LXXLL motif of the STAT6 derived peptide is depicted in red. The side chains of the STAT6 residues Leu802, Leu805 and Leu806 are shown as sticks.

Figure 2

Secondary structure propensity analysis of STAT6783–814. (A,B) Secondary chemical shifts of Hα (A) and Cα (B) for the STAT6783–814 peptide in the free (blue) and bound (red) state. (C) 3JHNHα coupling constants (upper) and the intensity ratio of αH-HN(i,i+1)/HN-HN(i,i+1) NOEs (middle) for the STAT6783–814 peptide in the free (blue) and bound (red) state. The open bars indicate the reference 3JHNHα values in the random coil conformation17 and the reference values of the intensity ratio of αH-HN(i,i+1)/HN-HN(i,i+1) NOEs are: 1.4 for random coil, 0.25 for α-helix and 55 for β-strand17. The SSP score (Secondary structural propensity) (lower) of the free STAT6783–814 peptide is also reported.


Structural characterization of the STAT6783–814 peptide

To gain insight into the structural features of the protein-protein recognition mechanism between STAT6783–814 and the NCoA1 PAS-B domain we first investigated by NMR the STAT6783–814 peptide in the free form. The 1H,15N-HSQC spectrum of 15N-13C-labeled STAT6783–814 (SI Fig. 1) shows narrow dispersion of signals in both proton and nitrogen dimensions indicating that the peptide adopts an unstructured conformation in absence of the binding partner. This finding was further supported by the analysis of the 3D 1H,15N-NOESY-HSQC spectrum that does not show any medium and long range NOEs (SI Fig. 2). To better understand the structural details of the conformational ensemble sampled in solution by the STAT6783–814 peptide we also analyzed the backbone chemical shifts that are sensitive reporters of the secondary structure content10. Hence, we obtained the complete assignment of proton, Cα and Cβ chemical shifts of STAT6783–814 in the free state (see Materials and Methods section) (Table SI 1a).

To obtain secondary structure propensities through the chemical shifts, we used the random coil values suitable for IDPs11,12. Overall, the secondary Cα and Hα chemical shifts are relatively small and no patterns could be discerned, consistent with a lack of ordered secondary structure (Fig. 2A,B).

Nevertheless, the positive Cα and negative Hα secondary chemical shifts suggest the presence of a significant amount of transiently formed helix for the residues located in the region containing the LXXLL motif (Glu799-Glu808) (Fig. 2A,B). Additionally, 3JHNHα coupling constants and the intensity ratio of αH-HN(i,i + 1)/HN-HN(i,i + 1) NOEs (Fig. 2C) were analyzed. Altogether the data for the residues located at the N-terminal part of the peptide in the region Ile786-Ile790 deviate from random coil values indicating helical propensity that is lower than the helical propensity for the helix between residues Glu799-Glu808.
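The sign pattern used in this analysis can be sketched as follows: a secondary chemical shift is the observed shift minus a random coil reference, and a positive Cα deviation together with a negative Hα deviation points towards helical character. The function names and all numerical values below are hypothetical placeholders; they are not the measured shifts or the reference set used in this work.

```python
def secondary_shifts(observed, random_coil):
    """Observed minus random coil shift (ppm) for every residue present in both maps."""
    return {res: observed[res] - random_coil[res]
            for res in observed if res in random_coil}


def has_helical_signature(delta_ca, delta_ha):
    """True when the Cα secondary shift is positive and the Hα secondary shift is negative."""
    return delta_ca > 0.0 and delta_ha < 0.0


# Hypothetical example values in ppm (illustration only)
obs_ca = {"resA": 57.1, "resB": 55.0}
rc_ca = {"resA": 56.6, "resB": 55.2}
obs_ha = {"resA": 4.20, "resB": 4.40}
rc_ha = {"resA": 4.35, "resB": 4.30}

d_ca = secondary_shifts(obs_ca, rc_ca)
d_ha = secondary_shifts(obs_ha, rc_ha)
helical = {res: has_helical_signature(d_ca[res], d_ha[res]) for res in d_ca}
```

A sign test of this kind is only qualitative; the magnitude of the deviations is what the SSP analysis described next converts into a population.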

To further investigate the conformational properties of STAT6783–814 in terms of helical populations, we utilized the secondary structure propensity (SSP) approach13. Our quantitative analysis (Fig. 2C) reveals more than 30% α-helical propensity in the region from Glu799 to Glu808 containing the LXXLL motif, which is in quantitative agreement with the analysis of the coupling constants (Table SI 1b). In the region from Ile786 to Ile790 in the N-terminal part of the peptide, 10% α-helical propensity is predicted from the SSP analysis as well as from the 3JHNHα coupling constant analysis (Table SI 1b). Taken together, the data clearly indicate that the free STAT6783–814 peptide does not adopt a completely random coil conformation but contains two regions (Ile786-Ile790 and Glu799-Glu808) with a significant α-helical propensity of around 10% for the first and 30% for the second sequence.
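One simple way to relate a measured 3JHNHα coupling to a helical population is a two-state average between a random coil value and a fully helical value. The sketch below illustrates that idea only; it is not necessarily the procedure behind Table SI 1b, and the helical reference of 4.0 Hz as well as the example couplings are assumptions.

```python
def helix_population(j_obs_hz, j_coil_hz, j_helix_hz=4.0):
    """Two-state estimate of the helical fraction from a 3J(HN,Hα) coupling.

    Assumes j_obs = p * j_helix + (1 - p) * j_coil.
    j_helix_hz = 4.0 is an assumed typical helical value, not taken from the text.
    The result is clipped to the physically meaningful range [0, 1].
    """
    if j_coil_hz <= j_helix_hz:
        raise ValueError("random coil coupling must exceed the helical coupling")
    fraction = (j_coil_hz - j_obs_hz) / (j_coil_hz - j_helix_hz)
    return max(0.0, min(1.0, fraction))


# Hypothetical example: coil reference 7.0 Hz, observed 6.1 Hz
example_population = helix_population(6.1, 7.0)
```

In this picture a coupling that drops towards the helical reference on binding maps directly onto a population approaching 100%.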

Structure of the NCoA1257–385/STAT6783–814 complex by X-ray crystallography

In order to understand why the STAT6783–814 peptide binds NCoA1 approximately 6.5 to 8 times more strongly than the STAT6794–814 peptide, we first solved the crystal structure of the NCoA1 PAS-B domain in complex with the STAT6783–814 peptide (PDB ID: 5NWX) (SI Tables 2 and 3). In this crystal structure the residues located in the N-terminal part (Gly783-Pro793) are not ordered, indicating that this region is too dynamic to be resolved in the crystal (SI Fig. 3). Nevertheless, the new X-ray structure of the complex clarifies the structural role of Leu794, whose side chain fits into a deep pocket on the cofactor surface formed by Phe314, Phe300 and Ala310. Although this amino acid was present in the construct crystallized previously8, it was not visible in that X-ray structure. Yet, no interaction between NCoA1 and the residues N-terminal of Leu794 is visible in the crystal structure. Therefore, we decided to perform a structural characterization of the NCoA1257–385/STAT6783–814 complex by NMR.

Mapping of the NCoA1 PAS-B domain binding site on STAT6783–814

The STAT6783–814-NCoA1 PAS-B interaction was first described by investigating the structural changes of the STAT6783–814 peptide upon binding with the coactivator. Therefore, the interaction was monitored by acquiring the 1H,15N-HSQC spectrum of 15N-13C-labeled STAT6783–814 in complex with the unlabeled PAS-B domain of the coactivator. A subset of STAT6783–814 backbone amide resonances becomes well dispersed in the presence of the binding partner (Fig. 3A). These changes are indicative of a well-structured region within the bound STAT6783–814 peptide. In particular, the resonance peaks of the residues from Thr798 to Gly811 next to the LXXLL motif are markedly perturbed when STAT6783–814 is bound to the NCoA1 PAS-B domain, and among these, the three leucine residues (Leu802, Leu805 and Leu806) are strongly perturbed upon binding (Fig. 3B,C). In agreement with the X-ray data reported previously8, these findings indicate that the recognition mechanism of NCoA1 cofactor by STAT6 is principally mediated by the region containing the LXXLL motif with the three leucine residues playing an important role in the complex formation. In addition, in agreement with the X-ray structure (PDB ID: 5NWX), the residues Leu794 and Leu795 located at the N-terminal region of the peptide show small but significant chemical shift variations upon binding. Furthermore, Phe791 also shows chemical shift changes suggesting that the complex may be further stabilized by additional residues flanking the LXXLL motif.

Figure 3

NMR analysis of STAT6783–814 binding to NCoA1 PAS-B domain. (A) Overlay of the 1H,15N-HSQC spectra of STAT6783–814 in free form (blue) and in complex with the NCoA1 PAS-B domain (red). (B) Chemical shift perturbations (ppm) of STAT6783–814 upon binding plotted versus the primary sequence; the orange line indicates the mean value. (C) Signal intensity ratios of STAT6783–814 bound to NCoA1 PAS-B domain (Ibound) and free (Ifree).

Chemical shift assignments and conformational analysis of the NCoA1 PAS-B domain in complex with STAT6783–814

In the complex, the PAS-B domain of NCoA1 adopts a stable folded structure (Fig. 4A). A nearly complete 1H, 13C and 15N assignment of the NCoA1257–385 domain was obtained using standard triple resonance experiments (see Materials and Methods). More than 91% of the backbone resonances (1HN, 15N, 13Cα and 13CO) and 89% of all side chain 13C and 1H resonances were assigned. Secondary structure elements of NCoA1257–385 were identified by analysis of the chemical shift index14 and then confirmed by 3JHNHα coupling values and hydrogen exchange experiments (Fig. 4B). Specifically, the 3JHNHα coupling values drop for the two helices identified before in the free peptide to values indicating close to 100% helix formation. Indeed, as shown in Fig. 4B, NCoA1257–385 in complex with the STAT6783–814 peptide preserves all secondary structure elements reported in the crystal structure of the PAS-B domain in complex with the shorter peptide STAT6794–814 (PDB ID: 1OJ5) as well as in the structure with STAT6783–814 (PDB ID: 5NWX). The exception is the STAT6 helix Ile786-Ile790, which is fully populated in solution but has no electron density in the crystal structure. To check this result independently and to gain more insight into the secondary and especially the tertiary structure adopted by the NCoA1 PAS-B domain in complex with STAT6783–814, we analyzed RDCs, which are sensitive probes of local structure as well as of protein motions15,16. In particular, we used the 1DNH RDCs to detect whether the PAS-B domain of the coactivator undergoes local structural variations in order to bind the STAT6783–814 peptide with high affinity. Weak alignment of the selectively 15N-labeled NCoA1257–385/STAT6783–814 complex was achieved by the addition of filamentous bacteriophage Pf117. Large 1DNH RDCs (40 Hz) (Fig. 4B) were obtained for the complex, which indicated substantial alignment and allowed for high-sensitivity RDC measurement.
In order to understand whether the PAS-B domain bound to STAT6783–814 adopts in solution a different conformation with respect to that observed in the crystal of the NCoA1257–385/STAT6794–814 complex, we calculated theoretical RDC values from the X-ray structure reported in this manuscript (PDB ID: 5NWX). Alignment tensors were determined employing a linear fit procedure18 using the X-ray structure (PDB ID: 5NWX) and the measured RDCs, but considering only the residues located in regions having a secondary structure in accordance with the chemical shifts, 3JHNHα coupling and hydrogen exchange data. Using these alignment tensors together with the crystal coordinates, RDC values were predicted for the entire protein and then compared with the experimental RDC values measured on the NCoA1257–385/STAT6783–814 complex. All RDC values throughout the entire protein, apart from the additional C-terminal tail, which is missing in the crystal structure, are in good agreement with the crystal structure, as reflected by the Q factor value (Q = 0.20) and by the Pearson’s correlation coefficient (R = 0.96) (Fig. 4C). This demonstrates that the NCoA1 PAS-B domain in complex with the longer STAT6783–814 adopts a conformation similar to the crystal structure (PDB ID: 5NWX) and that any dynamic variations compared to the crystal structure coordinates must be small.
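The logic of this comparison (linear fit of an alignment tensor on a subset of bond vectors, prediction for all vectors, then Q factor and Pearson correlation) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the PALES implementation: the tensor is expressed directly in Hz with the dipolar prefactor absorbed, the Q factor is taken as rms(Dobs − Dcalc)/rms(Dobs), and all function names are hypothetical.

```python
import math


def design_row(vector):
    """Coefficients linking the five independent tensor elements to one RDC.

    For a traceless symmetric tensor (Szz = -Sxx - Syy) and unit vector (x, y, z):
    D = Sxx*(x^2 - z^2) + Syy*(y^2 - z^2) + 2*Sxy*x*y + 2*Sxz*x*z + 2*Syz*y*z
    """
    x, y, z = vector
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    return [x * x - z * z, y * y - z * z, 2 * x * y, 2 * x * z, 2 * y * z]


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("vectors do not span five independent orientations")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - tail) / m[i][i]
    return solution


def fit_tensor(vectors, rdcs):
    """Least-squares fit of [Sxx, Syy, Sxy, Sxz, Syz] via the normal equations."""
    rows = [design_row(v) for v in vectors]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(5)] for i in range(5)]
    atd = [sum(r[i] * d for r, d in zip(rows, rdcs)) for i in range(5)]
    return solve_linear(ata, atd)


def predict_rdcs(vectors, tensor):
    """Back-calculate RDCs for any set of bond vectors from a fitted tensor."""
    return [sum(c * t for c, t in zip(design_row(v), tensor)) for v in vectors]


def q_factor(observed, calculated):
    """Q = rms(observed - calculated) / rms(observed)."""
    num = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    den = sum(o * o for o in observed)
    return math.sqrt(num / den)


def pearson_r(observed, calculated):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(observed)
    mo, mc = sum(observed) / n, sum(calculated) / n
    cov = sum((o - mo) * (c - mc) for o, c in zip(observed, calculated))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sc = math.sqrt(sum((c - mc) ** 2 for c in calculated))
    return cov / (so * sc)
```

In use, `fit_tensor` would be called with the NH vectors and RDCs of the structured residues only, and `predict_rdcs` with the NH vectors of the whole protein, mirroring the two-step procedure described above.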

Figure 4

Conformational analysis of the NCoA1 PAS-B domain bound to STAT6783–814 by using NMR and X-ray (PDB ID: 5NWX) data. (A) 1H,15N-HSQC spectrum of the 15N-13C NCoA1257–385 in complex with unlabeled STAT6783–814. (B) RDC values of NCoA1257–385 in bound form measured at 900 MHz. A scheme of the secondary structure elements of NCoA1257–385 along the protein sequence, as derived from the Chemical Shift Index (CSI) based on Cα and Hα resonance assignments, is shown on top. 3JHNHα coupling constants are also reported and indicated by filled circles and filled squares for values of 3JHNHα < 4.5 Hz or > 8 Hz, respectively. The slowly exchanging amide protons are indicated by stars in the H/D line. (C) Plot of measured RDCs versus those calculated from the X-ray structure reported in this manuscript (PDB ID: 5NWX) using PALES49.

Identification of the NCoA1 PAS-B/STAT6783–814 binding interface

The structural features of the binding mode of the coactivator NCoA1 with STAT6 were determined from intermolecular NOEs measured using a 13C-edited/12C-filter NOESY-HSQC experiment19 on the 15N-13C enriched STAT6783–814 peptide in complex with the natural abundance NCoA1 PAS-B domain. Based on the intermolecular NOEs, the interface formed by NCoA1 with STAT6783–814 appears to be considerably larger than that formed with the shorter peptide STAT6794–814. A large number of intermolecular NOEs is observed between the side chains of Leu802, Leu805 and Leu806 of STAT6794–814 and the NCoA1257–385 residues Ile272, Ile273, Ser274, Thr277, Trp288, Val292, Arg293 and Tyr297 (Fig. 5A). These findings, confirming the crystal structure of the NCoA1257–385/STAT6794–814 complex, indicate that the three leucine residues of the LXXLL motif indeed play a crucial role in the molecular recognition process. Interestingly, the intermolecular NOE analysis indicates that the NCoA1257–385/STAT6783–814 complex is stabilized by interactions of additional residues located in the N-terminal region of the peptide. In particular, in agreement with the X-ray structure of the NCoA1257–385/STAT6783–814 complex, the side chain of Leu794 shows intermolecular NOEs with the residues Gly270, Phe300, Ser308, Ala310, Arg311 and Ile358.

Figure 5

Mapping of the STAT6 binding surface on the NCoA1 PAS-B domain (PDB ID: 1OJ5). (A) Strips from the 3D 13C-edited/12C-filter NOESY experiment measured on the unlabeled NCoA1 PAS-B domain in complex with 15N-13C STAT6783–814 showing intermolecular NOEs. The strips correspond to the STAT6 residues Leu806, Leu805, Leu802, Leu794, Ile790 and Ile786, which are directly involved in the interaction. (B) Mapping of the NCoA1 residues involved in the interaction onto the X-ray structure (PDB ID: 1OJ5) in two orientations rotated by 45° around the z-axis.

The side chains of Ile786 and Ile790, which were not observed in the X-ray structure (PDB ID: 5NWX), interact with the NCoA1 residues Leu346, Tyr348, Gln355 and Pro356 (Fig. 5A). Notably, the NCoA1 residues showing intermolecular NOEs with the STAT6-derived peptide create a continuous patch on the surface of the PAS-B domain, defining an extended binding interface (Fig. 5B). Residues Ile786 and Ile790 were not present in the peptide crystallized in the complex with the NCoA1 domain8 (PDB ID: 1OJ5) and are also not visible in the X-ray structure described above (PDB ID: 5NWX), such that their interaction is only observed in the solution structure and may be responsible for the enhanced binding.

Structure of the NCoA1257–385/STAT6783–814 complex by NMR

The NMR structure of the NCoA1257–385/STAT6783–814 complex (PDB ID: 5NWM) is of high quality (Table SI 4). In the complex the PAS-B domain adopts a well-defined globular fold ranging from Glu260 to Glu367 (rmsdBackboneAtoms 260–367 = 0.485 Å) (Fig. 6A) and a dynamically disordered tail at the C-terminus, as confirmed by 15N-1H heteronuclear NOE values (SI Fig. 4). The PAS-B domain shows all structural features already known from the two crystal structures (PDB IDs: 1OJ5, 5NWX) (SI Fig. 5), with a five-stranded anti-parallel β-sheet and three α-helices that connect the second and third β-strand (Fig. 6B).

Figure 6

The NMR structure of the NCoA1257–385/STAT6783–814 complex (PDB ID: 5NWM). (A) Overlay of the 20 lowest energy structures of the NCoA1257–385/STAT6783–814 complex. (B) Ribbon drawing of one representative conformer of the NMR structure NCoA1257–385/STAT6783–814 complex. (C) Solvent accessible surface of the NCoA1257–385/STAT6783–814 complex. The NCoA1 PAS-B domain is depicted in light grey while the STAT6 peptide is reported in dark green. (D) Electrostatic surface of NCoA1257–385 bound to the STAT6783–814 peptide. The positively charged residues are depicted in blue while the negatively charged residues are in red. STAT6783–814 is reported in dark green.

The STAT6783–814 peptide in complex with the NCoA1 PAS-B domain presents a short flexible N-terminal tail followed by a one-turn α-helix (α1) in the region Ile786-Ile790, as confirmed by the low values of the 3JHNHα couplings and by the NOEs (Fig. 2C). This helix is connected by a linker in an extended conformation to the second α-helix (α2), formed by the residues Glu799 to Glu808. The STAT6 peptide binds into a shallow groove at the surface of the NCoA1 PAS-B domain (Fig. 6C,D) with the specific contacts already known from the crystal structures (SI Fig. 5). Briefly, Leu802, Leu805 and Leu806 form the major hydrophobic side chain contacts (Fig. 7A), while Pro796 and Pro797 make contact with the residues Ile272 and Phe300 located in the C-terminal part of helix α3 of the PAS-B domain (Fig. 7B). These findings are also in line with the alanine scanning mutagenesis data of Robinson and coworkers9, who found that the L802A, L806A and P797A single mutants abolished NCoA1257–385/STAT6794–814 complex formation in their FP assay and that the L805A mutant reduced the affinity considerably.

Figure 7

Structural details of the NCoA1257–385/STAT6783–814 interface as reported by the NMR structure (PDB ID: 5NWM). (A) STAT6 Leu802, Leu805 and Leu806 fit into the binding groove of the NCoA1 PAS-B domain. (B) STAT6 Pro796 and Pro797 interact with residues Ile272, Ile296, Tyr297 and Phe300 located in the C-terminal part of helix α3 of the NCoA1 PAS-B domain. (C) STAT6 Leu794 is embedded in a shallow hydrophobic depression constituted by the residues Phe314, Phe300 and Ala310 of the NCoA1 PAS-B domain. Ile790 is inside a less deep hydrophobic pocket formed by Phe314, Met318 and Ile358. (D) STAT6 Ile786 interacts with Met318, Leu346, Pro356 and Ile358 of the NCoA1 PAS-B domain.

Importantly, the interaction between the NCoA1 PAS-B domain and STAT6783–814 is further stabilized by the interaction of STAT6 Leu794 with a shallow hydrophobic indentation constituted by the PAS-B domain residues Phe314, Phe300 and Ala310 (Fig. 7C), seen also in the X-ray structure and in agreement with the alanine scanning mutagenesis data, where L794A reduced the affinity about 50-fold9. In addition, the two isoleucine residues (Ile786, Ile790) located in the N-terminal part of the STAT6 peptide bind into a shallow hydrophobic pocket on the surface of the PAS-B domain formed by Phe314, Met318, Leu346, Pro356 and Ile358 (Fig. 7D). Moreover, STAT6 Trp785 also contributes slightly to stabilizing this interaction by making contacts with Pro356 of the PAS-B domain. Altogether, the NMR structure indicates that the coactivator recognition mechanism by STAT6 occurs by the formation of a complex in which two folded regions are connected by a linker that adopts an extended conformation. This backbone conformation is further stabilized by the side chain interaction of Leu794 with NCoA1. Thus, although the peptide N-terminal to Leu794 is invisible in the X-ray structure, the complex formation between the NCoA1 PAS-B domain and STAT6783–814 is well defined by specific interactions in solution. This is in contrast to a fuzzy complex20,21, as has been identified, for example, for the interaction between the transcription activator GCN4 and a subunit of the Mediator complex22.

Last, in both X-ray structures as well as in the NMR structure the C-terminal region of the STAT6 peptide downstream of Glu808 is disordered, in line with the FP assay data with C-terminally truncated peptides that still bound with high affinity to the NCoA1 PAS-B domain9.

Tertiary structural preorganization of free STAT6783–814 before forming the NCoA1257–385/STAT6783–814 complex

We then set out to investigate whether, in addition to the preformation of secondary structure elements, there is also a preformation of tertiary structure, i.e. a preorganization of the relative orientation of the two α-helices in the free STAT6 peptide, to facilitate binding to NCoA1257–385. Therefore, we evaluated the RDCs measured for STAT6783–814 in the free (Table SI 5) and bound forms and compared them to the values predicted from the NMR structure of the complex (PDB ID: 5NWM). Remotely related are earlier studies in which proteins were unfolded in urea or guanidinium hydrochloride, the RDCs were compared with the RDCs from the folded forms23,24, and the 3D topology was found to be retained in part. Indeed, RDCs are faithful reporters on the relative orientation of structural segments to each other. For the bound form we find a Q factor of 0.17, which is slightly higher than for the PAS-B domain (Q = 0.14) but, as reported by the normalized scalar product (NSP), with a similar alignment tensor (Table SI 6). Accordingly, fitting of the experimental RDCs of STAT6783–814 bound to PAS-B by using the tensor derived from the RDCs of the PAS-B domain resulted in a Q factor of 0.23 (Table SI 6). Taking now the experimental RDCs of the free form (ranging from 1 to 10 Hz) (Table SI 5), the Q factor increased only to 0.32. In addition, for several structural models (SI Fig. 5) in which the N-terminal helix was rotated by 30° about any axis perpendicular to the axis of the C-terminal helix, the Q factor increased to values between 0.36 and 0.54 (Table SI 7). Finally, in order to account for the conformational heterogeneity of STAT6783–814 in the free form, we generated, as reported in the Materials and Methods and illustrated in the supplementary information (SI Fig. 7A–C), a pool of random conformers to properly describe the conformational space sampled by the peptide in solution. Interestingly, for all models of the conformational ensemble (SI Fig. 8A), we find Q factors higher than 0.32 for fitting the RDCs of the free form of STAT6783–814 (SI Fig. 8B).

Even in this much more exhaustive ensemble, we find a Q factor close to 0.32 only for two structural models, namely SM20 and SM21. For these models the angle between the α-helices (interhelical angle θ) is θSM20 = 51° and θSM21 = 89°, compared to θNMR = 134° (SI Fig. 8C), which will be further discussed below (see Discussion).
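An interhelical angle of this kind can be estimated from Cα coordinates in a few steps. The sketch below is one simple way of doing so and is not necessarily the definition used for SI Fig. 8C: each helix axis is approximated by the vector between the centroids of the first and last few Cα atoms, and the function names and the window of four residues are assumptions.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def helix_axis(ca_coords, window=4):
    """Approximate helix axis from the centroids of the first and last `window` Cα atoms."""
    if len(ca_coords) <= window:
        raise ValueError("need more Cα atoms than the averaging window")
    start = centroid(ca_coords[:window])
    end = centroid(ca_coords[-window:])
    return tuple(e - s for s, e in zip(start, end))


def angle_deg(u, v):
    """Angle between two vectors in degrees, in the range 0 to 180."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def interhelical_angle(ca_helix1, ca_helix2, window=4):
    """Angle between the approximate axes of two helices."""
    return angle_deg(helix_axis(ca_helix1, window), helix_axis(ca_helix2, window))
```

Because each axis runs from the N- to the C-terminal end of its helix, the angle distinguishes parallel from antiparallel arrangements; a window of four also works for a one-turn helix of five residues.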

These findings indicate that the tertiary structural elements are indeed preorganized in the free form to facilitate binding. This finding differs from the studies mentioned above23,24 in the sense that the “unfolding conditions” for STAT6783–814 are constituted by the absence of the binding partner PAS-B and do not require chemical denaturants.


How molecular recognition and binding occur between highly flexible protein domains is not yet well understood. The conformational selection theory provides a very elegant explanation for molecular recognition, especially in the context of partially structured protein regions25,26,27,28.

A detailed understanding of the fundamental mechanisms of the molecular recognition of the coactivator NCoA1 by STAT6 is central to understanding the biology of this interaction at the molecular level. The NMR structure of the NCoA1257–385/STAT6783–814 complex indicates that the coactivator recognition mechanism by STAT6 occurs by the formation of a partially ordered complex in which two α-helical regions (Ile786-Ile790 and Glu799-Glu808) are connected by a linker that adopts an extended, albeit dynamic, conformation. The dynamics in the linker may act more strongly on the N-terminal region of the STAT6783–814 peptide, which has fewer interactions with the NCoA1 PAS-B domain than the C-terminal region, ultimately rendering residues 783–793 disordered in the X-ray structure.

As mentioned above, the conformational characterization of STAT6783–814 in the free form indicates that the peptide is prestructured in the two regions Ile786-Ile790 and Glu799-Glu808, with a significant propensity to occupy α-helical conformation of 10% and 30%, respectively. Regarding the STAT6783–814 peptide in the bound form, the analysis of chemical shifts, 3JHNHα couplings and the relative weight of interresidual NOEs (Fig. 2C) suggests that the STAT6783–814 peptide presents a flexible N-terminal tail, followed by an α-helix in the region Ile786-Ile790 whose population increases from around 10% in the free form to close to 100% upon binding to NCoA1 PAS-B (Fig. 2C) (SI Fig. 9). A short extended linker then follows, and finally the second α-helix (Glu799-Glu808) (Fig. 2A,B), whose population increases from 30% to close to 100%. Given that existing prestructured motifs are enhanced in the bound form, the data suggest that the recognition of the coactivator by STAT6 occurs by a conformational selection25 mechanism regarding the secondary structure. We then investigated whether there is also preorganization of the binding motif, i.e. the two helices, on the tertiary structural level, and fitted experimental RDCs of the free and bound forms to the bound structure and to a distorted bound structure. We find indeed that the experimental RDCs of the free peptide fit well to the bound structure and that a rigid body rotation of the N-terminal helix away from this orientation, as well as the use of additional conformers with larger conformational heterogeneity, deteriorates the fit. These findings indicate that not only on the secondary but also on the tertiary level, i.e. in the arrangement of the two helices, there are prestructured motifs which are then also found in the bound form.

Overall, the data support conformational selection25,26,27,28 in this region as the key mechanism driving the molecular recognition of the NCoA1 PAS-B domain by STAT6, in which the coactivator strongly shifts the α-helical propensity in the regions Ile786-Ile790 and Glu799-Glu808 to an even more populated helical state (Fig. 8). In particular, the two helices are present at 10% and 30% population in the free form and become fully populated in the bound form, making conformational selection the probable binding mechanism. Based on the RDCs, the 3D arrangement of the two helices found in the bound form is also very likely to prevail in the free form. This suggests conformational selection not only for the secondary but also for the tertiary structure. This similarity of the 3D arrangement of secondary structures in the free and bound forms was further investigated by exhaustive sampling of the conformational space. Among 77 clusters that deviated sufficiently from the bound structure, we found only two that had a similar quality factor for the RDCs. This is still very strong support for the similarity of the 3D arrangement in the free and bound structures, for the following reason. The binding peptide comprises two helices in which the NH vectors point along the helix axis, and indeed there is little variation of the respective RDCs (+6 Hz for the C-terminal helix and around −1 Hz for the N-terminal helix). In addition, the peptide contains an extended stretch (F791, L794 and L795) where also little variation of the dipolar couplings is observed (around −9 Hz). An accurate tensor determination normally requires at least 5 independent orientations. Thus, it is not surprising that there are symmetry-related degeneracies of possible helix orientations that fit the RDCs. Even if these additional structures were populated, the conformational space would still be restricted to an ensemble with fewer members than that of a totally disordered peptide, supporting the mechanistic conclusion that the bound 3D conformation is preformed in the free form.
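The degeneracy argument can be made explicit with the commonly used expression for a one-bond RDC in the principal axis frame of the alignment tensor. This is a textbook relation rather than a formula taken from this work; the axial component D_a, the rhombicity R and the polar angles θ and φ of the NH bond vector are symbols introduced here for illustration only:

$$D_{\mathrm{NH}}(\theta,\varphi)=D_{a}\left[(3\cos^{2}\theta-1)+\tfrac{3}{2}R\,\sin^{2}\theta\,\cos 2\varphi\right]$$

Because the NH bonds within one α-helix are nearly parallel to the helix axis, they share almost the same (θ, φ) and therefore report almost the same coupling. Each helix thus contributes roughly one independent orientation, and several helix orientations related by the symmetry of this expression yield identical RDCs.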

Figure 8

Molecular recognition mechanism. Cartoon representation of the conformational selection mechanism. The binding of the coactivator changes the free-energy landscape of STAT6783–814.

Conformational selection has been observed for quite a number of interactions between structurally well-ordered domains and disordered transactivation domains of transcription factors containing an amphipathic α-helical binding motif (ΦXXΦΦ, Φ being a bulky hydrophobic residue), e.g. the interaction of the transactivation domain of c-Myb with the KIX domain of CBP/p30029. In this complex, as in other such complexes, additional interactions outside the core binding motif that confer specificity have been structurally resolved. To our knowledge, however, the combination of a highly ordered amphipathic α-helical binding motif with a second, N-terminally located α-helical binding motif of equally high specificity, together with a prearranged 3D arrangement of these helices, has so far been observed only in the NCoA1 PAS-B/STAT6 transactivation domain complex described in this work, and it likely explains the higher affinity of STAT6783–814 compared to STAT6794–814 for the NCoA1 PAS-B domain.

Materials and Methods

Protein expression and purification

The recombinant NCoA1 PAS-B domain comprising amino acids 257–385 of NCoA1 was expressed and purified as previously published8. The fragment containing amino acids 783–814 of STAT6 was produced recombinantly as an N-terminal Z-tag fusion protein with a His7-tag and a TEV cleavage site between the Z-tag and the target sequence. After TEV cleavage and removal of the Z-tag by Ni-NTA resin (Qiagen), the STAT6 peptide was further purified by reversed-phase HPLC. Expression of labeled protein was performed in Toronto minimal medium with 15N ammonium chloride as nitrogen source and U-13C6-D-glucose as carbon source. The STAT6 peptide was added in 1.5-fold molar excess, and the complex of the PAS-B domain with this peptide was purified by gel filtration on a Superdex 75 column (GE Healthcare).

NMR spectroscopy

NMR samples were prepared in the following buffer: 50 mM HEPES, pH 7.0, 150 mM NaCl, 2 mM DTT and 10% 2H2O. Complex samples were prepared by adding STAT6 peptide in 1.5-fold molar excess. All spectra were recorded at a total protein concentration of 1 mM and at 309 K on Bruker 600, 700, 800 and 900 MHz NMR spectrometers equipped with a cryogenic probe and on Bruker 600 and 700 MHz NMR spectrometers equipped with a triple resonance probe head. The following experiments were recorded:

- on samples of 15N- or 15N-13C-labeled STAT6783–814 in the free form: 1H,15N-HSQC, 1H,13C-HSQC, HNHA, HNCA, HNCACB, CBCACONH, 1H,15N NOESY-HSQC and 1H,15N TOCSY-HSQC30.

- on a sample of 15N-13C-labeled STAT6783–814 in complex with unlabeled NCoA1257–385: 1H,15N-HSQC, 1H,13C-HSQC aliphatic, 1H,13C-HSQC aromatic, HNHA, HNCA, HNCACB, CBCACONH, 1H,15N NOESY-HSQC, 1H,15N TOCSY-HSQC, HCCH-TOCSY, 1H,13C NOESY-HSQC aliphatic, 1H,13C NOESY-HSQC aromatic30 and 1H,13C-edited/12C-filter NOESY-HSQC19.

- on a sample of 15N-13C-labeled NCoA1257–385 in complex with unlabeled STAT6783–814: 1H,15N-HSQC, 1H,13C-HSQC aliphatic, 1H,13C-HSQC aromatic, HNHA, HNCA, HNCACB, CBCACONH, 1H,15N NOESY-HSQC, 1H,15N TOCSY-HSQC, HCCH-TOCSY, 1H,13C NOESY-HSQC aliphatic and 1H,13C NOESY-HSQC aromatic.

The 15N-edited NOESY-HSQC and 13C-edited NOESY-HSQC experiments were acquired with mixing times of 100 ms and 80 ms, respectively.

Slowly exchanging amide protons were identified in a 1H,15N-HSQC spectrum recorded immediately after exchanging the protein into a buffer prepared with 2H2O.

Vicinal (three-bond) HN-Hα coupling constants (3JHNHα) were evaluated from cross-peak intensities in quantitative J-correlation (HNHA) spectra31. Backbone torsion angles were estimated from Cα, CO, Cβ, N, HN and Hα chemical shifts using the program TALOS+32.
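As an illustration of how a coupling constant can be obtained from HNHA peak intensities, the sketch below uses the commonly quoted relation for quantitative J-correlation, I_cross/I_diag = −tan²(2π·J·ζ). The source does not state the relation or the de/rephasing delay ζ that was used, so the function name `j_hnha`, the parameter `zeta_s` and the numbers in the usage line are assumptions for illustration only.

```python
import math

def j_hnha(i_cross, i_diag, zeta_s):
    """Estimate 3J(HN,Ha) in Hz from HNHA cross- and diagonal-peak intensities.

    Assumes I_cross / I_diag = -tan^2(2*pi*J*zeta), with zeta_s in seconds.
    The cross peak is expected to have the opposite sign to the diagonal peak.
    """
    ratio = i_cross / i_diag
    if ratio > 0:
        raise ValueError("cross and diagonal peaks should have opposite signs")
    return math.atan(math.sqrt(-ratio)) / (2.0 * math.pi * zeta_s)

# Hypothetical intensities and delay (not taken from the paper).
print(round(j_hnha(-18.0, 100.0, 0.013), 2))
```

No relaxation correction is applied here; such corrections are sometimes used in practice and would scale the result.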

Residual dipolar couplings (1DNH) for the 15N-labeled STAT6783–814/U-NCoA1257–385 and U-STAT6783–814/15N-labeled NCoA1257–385 complexes were measured by taking the difference in the one-bond 1H-15N splittings (1JNH + 1DNH) in aligned (~20 mg/ml phage Pf117) and isotropic media using an in-phase/anti-phase (IPAP) HSQC experiment33.
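The subtraction described above can be sketched as follows. This is a minimal example that assumes the splittings have already been extracted per residue from the IPAP spectra; the function name, the dictionary layout and the numerical values are hypothetical.

```python
def rdc_from_splittings(aligned_hz, isotropic_hz):
    """Return 1D_NH per residue as (1J_NH + 1D_NH) - 1J_NH.

    Both arguments map residue number -> one-bond 1H-15N splitting in Hz.
    Only residues measured in both media are returned.
    """
    common = sorted(set(aligned_hz) & set(isotropic_hz))
    return {res: aligned_hz[res] - isotropic_hz[res] for res in common}

# Hypothetical splittings in Hz (not data from the paper).
aligned = {801: -85.5, 802: -101.0, 803: -90.0}
isotropic = {801: -93.0, 802: -93.5}
print(rdc_from_splittings(aligned, isotropic))
```

Residue 803 is dropped in this example because no isotropic splitting is available for it.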

The observed chemical shift change (Δδobs) for each backbone amide between the STAT6783–814 in the free and bound form was measured as the weighted average of the proton and nitrogen chemical shift changes by using equation (1)34:

$${{\rm{\Delta }}\delta }_{{\rm{o}}{\rm{b}}{\rm{s}}}={[({{{\rm{\Delta }}\delta }^{2}}_{{\rm{H}}{\rm{N}}}+{{{\rm{\Delta }}\delta }^{2}}_{{\rm{N}}}/25)/2]}^{1/2}$$
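A short numerical sketch of equation (1) is given here. The function name and the example shift changes are illustrative and not taken from the data.

```python
import math

def delta_obs(d_hn_ppm, d_n_ppm):
    """Weighted average chemical shift change of equation (1), in ppm.

    d_hn_ppm: change of the amide proton shift (bound minus free)
    d_n_ppm:  change of the amide nitrogen shift (bound minus free)
    """
    return math.sqrt((d_hn_ppm ** 2 + d_n_ppm ** 2 / 25.0) / 2.0)

# Hypothetical shift changes for one residue.
print(delta_obs(0.10, 0.50))
```

Dividing the squared nitrogen term by 25 corresponds to scaling the 15N shift change by 1/5, so that changes in the two dimensions contribute on a comparable scale.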

For the evaluation of the 15N-[1H] steady-state heteronuclear NOE, two 1H,15N-HSQC spectra were acquired, one without proton saturation and one with proton saturation for 3 s.
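The heteronuclear NOE is conventionally taken as the ratio of the peak intensity in the saturated spectrum to that in the unsaturated spectrum. The sketch below assumes this standard definition and a simple error propagation from the spectral noise. The source states neither explicitly, and the function name and numbers are hypothetical.

```python
import math

def het_noe(i_sat, i_unsat, noise_sat=0.0, noise_unsat=0.0):
    """Return (NOE, uncertainty) for one residue.

    NOE = I_sat / I_unsat; the uncertainty is propagated from the
    noise levels of the two spectra (first-order error propagation).
    """
    if i_unsat == 0:
        raise ValueError("unsaturated peak intensity must be non-zero")
    noe = i_sat / i_unsat
    err = math.sqrt((noise_sat / i_unsat) ** 2
                    + (i_sat * noise_unsat / i_unsat ** 2) ** 2)
    return noe, err

# Hypothetical intensities and noise levels.
print(het_noe(80.0, 100.0, 2.0, 2.0))
```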

For the secondary structure analysis and the secondary structure propensity (SSP), the random coil chemical shifts of Zhang et al.12 were used. The SSP scores were calculated with the random coil chemical shifts and the average secondary shifts for fully formed secondary structure12 as described previously13.

The evaluation of the alignment tensors was performed by using the normalized scalar product (NSP) defined by equation (2):

$$\mathrm{NSP}=\frac{\langle S^{\mathrm{sample1}}|S^{\mathrm{sample2}}\rangle }{\sqrt{\langle S^{\mathrm{sample1}}|S^{\mathrm{sample1}}\rangle \,\langle S^{\mathrm{sample2}}|S^{\mathrm{sample2}}\rangle }}$$

where the scalar product between the two vectors formed from the Saupe matrix elements can be defined according to equation (3):

$$\langle S^{\mathrm{sample1}}|S^{\mathrm{sample2}}\rangle =\sum _{\begin{array}{c}i=x,y,z\\ j=x,y,z\end{array}}S_{ij}^{\mathrm{sample1}}\,S_{ij}^{\mathrm{sample2}}$$

in which Sij are the elements of the 3 × 3 Saupe matrices.

NSP values close to 1.0 indicate that the two alignment tensors differ only by a scaling factor, whereas values around 0.0 suggest that the alignment frames are orthogonal. An NSP of −1 indicates that the two alignment tensors are antiparallel.
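Equations (2) and (3) can be illustrated with a few lines of code. This is a minimal sketch in which the Saupe matrices are plain 3 × 3 nested lists; the function names and the example tensors are hypothetical.

```python
import math

def saupe_dot(a, b):
    """Scalar product of equation (3): sum over i, j of a[i][j] * b[i][j]."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

def nsp(a, b):
    """Normalized scalar product of equation (2) for two Saupe matrices."""
    return saupe_dot(a, b) / math.sqrt(saupe_dot(a, a) * saupe_dot(b, b))

# Hypothetical traceless, symmetric Saupe matrices.
s1 = [[-1e-4, 0.0, 0.0], [0.0, -2e-4, 0.0], [0.0, 0.0, 3e-4]]
s2 = [[2.0 * x for x in row] for row in s1]   # same tensor, scaled by 2
s3 = [[-x for x in row] for row in s1]        # sign-inverted tensor
s4 = [[0.0, 1e-4, 0.0], [1e-4, 0.0, 0.0], [0.0, 0.0, 0.0]]  # off-diagonal only

print(nsp(s1, s2), nsp(s1, s3), nsp(s1, s4))
```

The three example pairs correspond to the three limiting cases described above: a pure scaling, a sign inversion and an orthogonal tensor.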

The quality factor (Q) is defined by equation (4)35:

$$Q=\,\frac{{({\sum }_{i=1}^{N}{({D}_{i}^{exp}-{D}_{i}^{calc})}^{2}/N)}^{1/2}}{{({\sum }_{i=1}^{N}{({D}_{i}^{exp})}^{2}/N)}^{1/2}}$$
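A direct transcription of equation (4) is sketched below. The function name and the example couplings are illustrative, and the calculated RDCs are assumed to have been back-calculated elsewhere.

```python
import math

def q_factor(d_exp, d_calc):
    """Quality factor of equation (4): rms(D_exp - D_calc) / rms(D_exp)."""
    if len(d_exp) != len(d_calc) or not d_exp:
        raise ValueError("need two non-empty lists of equal length")
    n = len(d_exp)
    rms_dev = math.sqrt(sum((e - c) ** 2 for e, c in zip(d_exp, d_calc)) / n)
    rms_exp = math.sqrt(sum(e ** 2 for e in d_exp) / n)
    return rms_dev / rms_exp

# Hypothetical experimental and back-calculated RDCs in Hz.
print(q_factor([7.5, -7.5, 3.0, -12.0], [7.0, -8.1, 2.5, -11.2]))
```

Q is 0 for perfect agreement and 1 when the calculated couplings are all zero.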

The spectra were processed using NMRPipe36 and analyzed using SPARKY37 and CARA38. 1H, 13C and 15N chemical shifts were calibrated indirectly by external DSS references.

Structure Calculation

NOE-derived distance constraints, coupling constants, TALOS dihedral angles, hydrogen bonds and residual dipolar couplings were used to calculate the structure of the NCoA1257–385/STAT6783–814 complex with the program CYANA 3.039. The input data for the final structure are reported in Table SI 4. A total of 100 structures were calculated, and the 20 conformers with the lowest CYANA target function were selected. The small number of residual constraint violations indicates that the input data represent a self-consistent set and that the constraints are well satisfied in the calculated conformers. The structures were visualized and evaluated using the programs MOLMOL40, CHIMERA41, PROCHECK-NMR42 and MOLPROBITY43. The Adaptive Poisson-Boltzmann Solver (APBS)44 was used to calculate spatial distributions of electrostatic potentials using the linearized Poisson-Boltzmann equation and parameters from the PQR files obtained using the PDB2PQR server45. The electrostatic map was generated with CHIMERA41.

Generation of the conformational ensemble for RDCs evaluation

A conformational sampling approach was used to generate a conformational ensemble of free STAT6 in order to describe more comprehensively the conformational space sampled by the peptide in the absence of the binding partner. The Normal Mode-based Simulation (NMSim)46 approach has been shown to be a computationally efficient alternative to molecular dynamics simulation for conformational sampling of proteins. Therefore, starting from the NMR structure of the STAT6 peptide in complex with the co-activator, an ensemble of 2500 conformers was generated with the program NMSim46 using the default parameters for large-scale motions. The conformational ensemble was then clustered using the software NMRCLUST47 as implemented in the program CHIMERA41. A total of 153 clusters were found, and the representative structure of each cluster was taken as a reference model. Subsequently, the representative models were filtered by retaining only those conformers in which the angle θ between the first and the second α-helix deviated from the corresponding angle in the NMR structure by more than 30°, i.e. all structures in which this angle was between 104° and 164° were excluded. This procedure resulted in the selection of 77 conformers, which defined the final conformational ensemble for the evaluation of the RDCs. The structural models (SM1, SM2, SM3, SM4) of the STAT6 peptide, obtained from the NMR structure of the complex by rotating the α1 helix by 30° with respect to the α2 helix, were also included in the final ensemble.
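The angle-based filtering step can be sketched as follows. The reference angle of 134° is inferred here as the midpoint of the excluded 104°–164° window and is not stated explicitly in the text. How the helix axis vectors are derived from the coordinates is also not described, so they are treated as given inputs, and conformers exactly on the window boundary are treated as excluded. All names and example vectors are hypothetical.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two 3D vectors (e.g. helix axis vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

def keep_conformer(theta_deg, theta_ref_deg=134.0, tol_deg=30.0):
    """Keep a conformer only if theta deviates from the reference by more than tol."""
    return abs(theta_deg - theta_ref_deg) > tol_deg

# Hypothetical helix axis vectors (alpha1, alpha2) for three cluster representatives.
representatives = {
    "c001": ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),    # 90 degrees, kept
    "c002": ((1.0, 0.0, 0.0), (-1.0, 1.0, 0.0)),   # 135 degrees, excluded
    "c003": ((1.0, 0.0, 0.0), (-1.0, 0.05, 0.0)),  # close to 177 degrees, kept
}
selected = [name for name, (a1, a2) in representatives.items()
            if keep_conformer(angle_deg(a1, a2))]
print(selected)
```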

Data availability

Coordinates and structural restraints for the NMR structure and the X-ray structure of the NCoA1257–385/STAT6783–814 complex have been deposited in the PDB (PDB ID: 5NWM and PDB ID: 5NWX, respectively). The chemical shifts have been deposited in the BioMagResBank (BMRB ID: 34131). The conformational ensemble for RDC evaluation generated during the current study is available from the corresponding author upon request.