Main

Antitermination in bacteriophage λ is a process in which the phage-encoded N protein and host-encoded Nus factors modify RNA polymerase to a termination resistant form at the nut site in the transcribed RNA1,2,3,4,5,6. The nut site contains a 12-nucleotide strand (boxA), and a stem-loop structure (boxB). Two of the Nus proteins, NusB and NusE, form a heterodimer that specifically binds to boxA RNA and enhances antitermination7,8. In vitro, antitermination is decreased in the absence of NusB, in that it is limited to terminators close to the promoter9,10. NusB also competes with a cellular inhibitor that binds to boxA and prevents antitermination11. In vivo, a null mutation in the nusB gene produces a cold sensitive phenotype (no cell growth below 32 °C)12 that is often associated with defects in ribosome assembly. NusB is also required for antitermination of the E. coli ribosomal (rrn) RNA operon13, where it increases the rate of ribosomal rrn boxA-mediated transcription elongation14.

Description of the structure

The solution structure of NusB determined by NMR is shown as a superposition of 15 of the lowest energy structures in Fig. 1a. Structural statistics are given in Table 1. Complete resonance assignments are available at the BMRB (http://www.bmrb.wisc.edu, accession code 4737) and the list of restraints and the coordinates are available at the RCSB (http://www.rcsb.org/pdb). NusB can be viewed as two subdomains with helices α1–α3 forming the N-terminal subdomain (Fig. 1b,c, gold) and α4–α7 forming the C-terminal subdomain (purple). The orientation of the two subdomains relative to each other is 130°, measured as the angle between α1 and α5. Subdomain orientation was determined from observed NOEs between α1 and α5 (Fig. 2) and between α3 and α5 (Table 2). Relative helix orientations and the NOEs that define the fold are listed in Table 2. In addition, several NOEs locate the C-terminal end of α4 near α7: Ala 130 CHβ3–Met 66 Hα, Ala 130 CHβ3–Tyr 69 Hδ, Ala 130 CHβ3–Tyr 69 Hε and Ala 130 HN–Tyr 69 Hε. These NOEs are well resolved and their unambiguous assignment was key to defining the global fold of NusB in the early stages of the structure calculation.
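An interhelix angle such as the 130° quoted above can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used for Table 2: it assumes each helix axis can be approximated by the vector joining the centroid of the first four Cα atoms to the centroid of the last four. The function names and the `window` parameter are hypothetical.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def helix_axis(ca_coords, window=4):
    """Approximate helix axis (assumption: centroid of the first `window`
    C-alpha atoms to centroid of the last `window` C-alpha atoms)."""
    if len(ca_coords) < 2 * window:
        raise ValueError("helix too short for the chosen window")
    start = centroid(ca_coords[:window])
    end = centroid(ca_coords[-window:])
    return tuple(e - s for s, e in zip(start, end))

def angle_deg(u, v):
    """Angle between two vectors in degrees (0 to 180)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))

def interhelix_angle(ca_a, ca_b, window=4):
    """Angle between the approximate axes of two helices."""
    return angle_deg(helix_axis(ca_a, window), helix_axis(ca_b, window))
```

With this convention, an angle near 0° corresponds to parallel helices and an angle near 180° to antiparallel helices, taking each helix in the N- to C-terminal direction.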

Figure 1: Solution structure of E. coli NusB.

a, Stereo view showing the superposition of 15 of the lowest energy structures of NusB. b, Stereo view of a ribbon trace of NusB. The N-terminal subdomain is colored gold, and the C-terminal subdomain is colored purple. Helices and loops are labeled as discussed in the text. c, Stereo view showing a 90° rotation of (b) about the y axis.

Table 1 Summary of restraints and structural statistics

Figure 2: Orientation of subdomains and monomeric state of NusB.

a, Long range NOE contacts between α1 (gold) and α5 (blue). These NOEs define the orientation of the N-terminal and C-terminal subdomains relative to each other. b, The determination of the molecular mass of E. coli NusB in 50 mM potassium phosphate, pH 6.8, 50 mM NaCl and 1 mM DTT at 20 °C. The absorbance gradient (at 280 nm) in the centrifuge cell after attaining sedimentation equilibrium at 25,000 r.p.m. is shown in the bottom panel. The solid line is the result of fitting to a single ideal species and the open circles are the experimental values. The corresponding top panel shows the difference between the fit and the experimental values as a function of radial position (residuals).

Table 2 Interhelix angles1 and residues with observed NOEs between helices

Most of the hydrophobic residues in NusB are internal and involved in helix packing. A few are partially exposed, as are all four Phe side chains. Most of the hydrophilic residues are on the surface. The packing quality of the structure is acceptable as evaluated by WhatIf15 (QUACHK = −1.80 ± 0.07) and by analysis of the distribution of hydrophobic and hydrophilic residues (INOCHK = 1.0 ± 0.02). An interesting feature of the structure is a small, positively charged cavity formed by the termini of Lys 82 and Arg 86, on the backside of the protein from the view in Fig. 1b. Since these residues are highly conserved (Fig. 3a), this charged cavity is likely a conserved feature in NusB proteins. It is present in the Mycobacterium tuberculosis NusB structure16.

Figure 3: Sequence conservation in NusB.

a, Sequence alignment of NusB orthologs. Amino acids identical to the E. coli sequence are highlighted in green. Amino acids that represent a conservative substitution from the E. coli sequence are highlighted in orange. Amino acids that are identical across all NusB sequences are marked by a solid arrow. Amino acid positions with only conserved substitutions across all sequences are marked by an open arrow. b, Localization of conserved residues on NusB. The side chains of conserved or identical residues across all NusB proteins are displayed and labeled.

Comparison to another E. coli NusB solution structure

Another solution structure of E. coli NusB has been reported (Protein Data Bank accession code 1BAQ)17. Although some of the helix-to-helix contacts are similar between this structure and ours, the three-dimensional fold is remarkably different. The root mean square (r.m.s.) deviation between the two structures over the Cα atom trace of the helices is 9.9 Å. The Cα trace r.m.s. deviation between the helices in the N-terminal subdomain is 6.0 Å, while it is 8.3 Å for the C-terminal subdomain. In our structure, the subdomain orientation is 130° (roughly antiparallel; Fig. 2a), while it is 10° in 1BAQ (parallel). Additionally, the nearly perpendicular orientation of α3–α4 (80°) and α4–α5 (101°) in our structure is markedly different from the angles between these helices in 1BAQ (31° and 54°, respectively). Furthermore, the core helices in our structure are α1 and α5 while in 1BAQ they are α3 and α6. The long range contacts reported as α1–α6 NOEs in 1BAQ were not observed in our NOESY data. Key long range NOEs that define our structure are between α1 and α5 (Fig. 2 and Table 2). The solution conditions used in the two studies are reported to be similar, except for the absence of salt for 1BAQ, and the presence of 100 mM NaCl in our buffer. The 15N HSQC spectrum reported by each group appears to be very similar, although there are some differences in sequential assignments17,18. Since the full assignments and list of restraints for 1BAQ were not reported, further evaluation and comparisons could not be made. In our structure determination of NusB, 4D NOESY spectra were used to assign the contributing heteronucleus to each H-X pair (where X is either N or C). Complete side chain assignment for observed residues was critical, particularly for the 14 aromatic side chains (Table 2). Additionally, the use of ARIA19,20 contributed to a resulting completeness of assignment of 93% (2,045 of 2,189 crosspeaks) of the observed peaks in the 4D NOESY spectra. Our structure of E. coli NusB has been independently confirmed by the X-ray crystal structure of M. tuberculosis NusB16.
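The r.m.s. deviations quoted in this comparison are root mean square distances between equivalent Cα atoms. A minimal sketch of that final step is given below. It assumes the two coordinate sets have already been optimally superimposed (for example by a least-squares fit, which is not shown) and that atoms are listed in the same order; the function name is hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. angstroms)
    between two equally long lists of (x, y, z) tuples.
    Assumption: the structures are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Because the deviation depends on which atoms are included, values over all helices, over one subdomain, or over a subset of helices are not directly comparable unless the atom selection is stated, as it is in the text.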

Comparison to M. tuberculosis NusB

The structure of the M. tuberculosis NusB protein has been solved by X-ray crystallography16. The amino acid sequences of E. coli and M. tuberculosis NusB are 57% homologous and 34% identical (Fig. 3a). The two NusB structures were solved independently and compared after the refinement for each protein was completed. The M. tuberculosis NusB crystal structure reveals a dimer in the asymmetric unit. Prior characterization of the E. coli NusB protein showed it to be a monomer in solution at concentrations up to 1 mM18,21. Our data confirm these results; the E. coli NusB protein behaves as an ideal monomer during sedimentation equilibrium centrifugation (Fig. 2b). There was no tendency for aggregation or self-association over the concentration range studied, and the determined molecular mass of 15,499 Da was within a few percent of that expected from the sequence (15,679 Da).

The structures of the E. coli and M. tuberculosis NusB monomers are very similar. Helices α1 and α3–α7 superimpose well with an r.m.s. deviation between Cα atoms of 2.0 Å. Helix α2 is closer to α1 in the E. coli structure than it is in the M. tuberculosis one. This difference may be due to the conformational heterogeneity of α2 and loop ℓ2 in solution (residues Glu 31–Asp 44 are more disordered; Fig. 1a). The difference in the positions of α2 may also be due to dimer contacts in the M. tuberculosis protein, since α2 makes contacts with α2′. The similarity of the E. coli and M. tuberculosis NusB structures supports the correctness of the protein fold presented here.

Structural biology of sequence conservation

NusB homologs are present in a variety of organisms. A subset of NusB orthologs found using the COGnitor program (COG0781, http://www.ncbi.nlm.nih.gov/COG/)22 is aligned in Fig. 3a. There is a high degree of identity to the E. coli sequence across helices α1, α5, α6 and α7. The location of conserved residues is shown in Fig. 3b. Four conserved residues are on the surface of the protein (Arg 10, Asp 63, Glu 106 and Asn 124) and all three conserved aromatic residues (Tyr 18, Phe 114 and Phe 122) are partially exposed to solvent. Aromatic residues exposed on the surface are often involved in recognition and specificity at intermolecular binding surfaces. The biological significance of Tyr 18 was established by defective antitermination of the nusB5 mutant23,24. Phe 114 and Phe 122 may have similar functional significance. The remaining conserved residues are in the core of the protein and are involved in multiple helix-to-helix contacts. This distribution suggests that the main role of conservation in NusB is to stabilize the protein fold. It also implies that the fold of the orthologous NusB proteins is likely to be very similar.

Evaluation of NusB mutants

Several mutations in NusB have been identified that affect the activity of the phage λ transcription complex. The NusB5 protein contains a Y18D mutation that abolishes antitermination of the native complex23,24. Tyr 18 is partially exposed at the C-terminal end of α1 and also makes important hydrophobic contacts with α5 (Fig. 2 and Table 2). Possibly, the substitution of Asp for Tyr destabilizes the protein structure by reducing the number of hydrophobic contacts between α1 and α5; alternatively, the mutation may change the electrostatic nature of the region. Another possibility is that Tyr 18 is involved in a specific interaction with RNA, perhaps stacking with nucleic acid bases. In this case, substitution of any nonaromatic amino acid at position 18 would have the same effect as the NusB5 mutation.

A second NusB mutant, NusB101, is a D118N substitution that has no effect on the native transcription complex, yet rescues defective antitermination caused by the NusA1 mutant25. NusB101 similarly rescues an antitermination defect resulting from the NusE71 mutation24. The Asp to Asn substitution is normally considered conservative, and the location of residue 118 on a surface loop makes it unlikely to destabilize the structure. More likely, Asp 118 is part of or near contact surfaces for nucleic acid interactions. The observation that the rescue of the antitermination defect by NusB101 requires the presence of boxA24 supports this hypothesis, as does the report that NusB101 has native binding affinity for NusE. This effect is specific to phage λ, since the NusB101 mutation does not have the same effect on phage 21 N-mediated antitermination. The overall implication is that the NusB101 mutation might directly enhance binding of NusB (or a NusB–protein complex) to phage λ boxA.

Structural homology

No protein with a structure similar to NusB has been found; hence, this represents a new protein fold. However, structural similarities were found for subsets of the NusB structure and several of these proteins contain homeodomains26,27. Both the N-terminal and C-terminal subdomains of NusB contain helix-turn-helix (HTH) features, including a helix rich in basic residues that corresponds to the recognition helix of the homeodomains. Although the HTH motif is prevalent in transcription factors, it is not yet known whether the HTH-like domains in NusB function as nucleic acid binding domains.

Implications for nucleic acid binding

In the context of specific single stranded RNA binding, the all helical structure of NusB is unique. Recent structures of single stranded nucleic acid–protein complexes show that the nucleic acid binding surface of the protein commonly consists of a β-sheet28,29,30,31. However, there is no obvious groove or cleft in NusB containing positive charges and exposed aromatic residues that would make specific contacts similar to those in the single stranded nucleic acid–protein complexes. Specific boxA binding to NusB has only been demonstrated in the presence of NusE8. NusE may assist recognition by either contributing part of the RNA binding site, or by changing the structure of NusB to create a specific nucleic acid binding site. Future work on the details of specificity will undoubtedly reveal more novel features in the transcription regulation machinery.

Methods

Sample preparation.

NusB was prepared as described18 to produce 2H/15N, 13C/15N and 15N labeled samples. The sample conditions were 1 mM protein concentration in 50 mM sodium phosphate buffer, 0.1 M NaCl and 2 mM dithiothreitol at pH 6.8 and 25 °C. The NusB protein was active in an in vitro antitermination assay32.

Analytical ultracentrifugation.

Analytical ultracentrifugation was performed at 25,000 r.p.m., 20 °C, using a Beckman XL-I Optima analytical ultracentrifuge with an An-60 Ti rotor and standard double sector centerpiece cells. Solvent density was calculated according to Laue et al.33. The partial specific volume of the protein (0.744 ml g−1) was calculated from the predicted amino acid composition34. Centrifugation data were analyzed using the Beckman-Origin software.
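The single ideal species model used for the fit in Fig. 2b can be illustrated as follows. This is a minimal sketch and not the Beckman-Origin analysis: it assumes a zero baseline offset, so that ln A is linear in r², and recovers the molecular mass from the slope. The solvent density (1.005 g ml−1), the radial range and the reference absorbance used in the usage example are placeholder assumptions, and all function names are hypothetical.

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1

def omega_squared(rpm):
    """Square of the angular velocity (rad^2 s^-2) for a rotor speed in r.p.m."""
    omega = 2.0 * math.pi * rpm / 60.0
    return omega * omega

def ideal_profile(radii_m, a_ref, r_ref_m, mass_kg_per_mol,
                  vbar_ml_per_g, rho_g_per_ml, rpm, temp_k):
    """Equilibrium absorbance profile of a single ideal species:
    A(r) = A_ref * exp(sigma * (r^2 - r_ref^2) / 2),
    sigma = M * (1 - vbar * rho) * omega^2 / (R * T).
    Assumption: no baseline offset."""
    buoyancy = 1.0 - vbar_ml_per_g * rho_g_per_ml
    sigma = mass_kg_per_mol * buoyancy * omega_squared(rpm) / (R_GAS * temp_k)
    return [a_ref * math.exp(sigma * (r * r - r_ref_m * r_ref_m) / 2.0)
            for r in radii_m]

def fit_mass(radii_m, absorbances, vbar_ml_per_g, rho_g_per_ml, rpm, temp_k):
    """Molecular mass (kg mol^-1) from a least-squares line through
    ln A versus r^2; the slope equals sigma / 2."""
    xs = [r * r for r in radii_m]
    ys = [math.log(a) for a in absorbances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    buoyancy = 1.0 - vbar_ml_per_g * rho_g_per_ml
    return 2.0 * R_GAS * temp_k * slope / (buoyancy * omega_squared(rpm))

# Usage with synthetic data (radii in metres; all inputs are placeholders).
radii = [0.069 + 0.0002 * i for i in range(11)]
profile = ideal_profile(radii, 0.3, 0.069, 15.679, 0.744, 1.005, 25000, 293.15)
fitted_mass_da = 1000.0 * fit_mass(radii, profile, 0.744, 1.005, 25000, 293.15)
```

In practice a nonlinear fit that includes a baseline term is usually preferred, and systematic trends in the residuals (as plotted in the top panel of Fig. 2b) are the main diagnostic for self-association or heterogeneity.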

NMR spectroscopy.

NMR spectra were acquired on a Varian Unity Plus 600 MHz spectrometer. Spectra were processed using NMRPipe35 and assigned using ANSIG 3.3 (refs 36, 37). 1H, 15N and 13C assignments have been made for NusB and are available at the BMRB. Sequential assignments are 94% complete. Out of the 139 residues in NusB, four (Arg 6, Arg 7, Arg 8 and Asp 44) are completely unassigned. In addition, the NHs of 11 residues (Met 1–Arg 10 and Asp 44) have no assignments; only one atom pair in each of four residues (Met 1, Arg 10, Asp 42 and Val 43) is assigned. For six residues (Ala 4, Ala 5, Ala 9, Phe 34, Phe 122, Lys 138), only one atom in each is unassigned. Aromatic side chains were assigned using 2D CB(CGCD)HD and CB(CGCDCE)HE experiments combined with NOESY and HSQC spectra. The aromatic side chains are completely assigned with the exception of Phe 34 CζHζ and Phe 122 CζHζ. Interproton distances were measured from the following spectra with the given mixing times (tmix): 15N edited NOESY-HSQC (tmix = 120 ms); 15N/13C (simultaneous) edited NOESY-HSQC (tmix = 100 ms); 15N/15N HSQC-NOESY-HSQC (tmix = 200 ms); 13C/15N HMQC-NOESY-HSQC (tmix = 100 ms), and 13C/13C HMQC-NOESY-HSQC (tmix = 100 ms).

Structural calculation.

Structures were calculated using X-PLOR 3.851 (ref. 38). A fully extended starting conformation was used on which 24,000 steps of simulated annealing at 1,200 K followed by 15,000 cooling steps of 0.005 ps to 100 K were carried out. Initial structures (100) were calculated using 1,891 unambiguous NOE restraints. Fifteen of the lowest energy structures (backbone r.m.s.d. 1.1 Å) were used as starting coordinates for ARIA19,20. The following ARIA protocol included 343 NOE crosspeaks as ambiguous distance restraints (‘P’ values are listed first followed by the assignment cutoff distance in parentheses): 0.999 (5), 0.999 (2), 0.99 (1), 0.99 (0.5), 0.98 (0.5), 0.96 (0.25), 0.93 (0.25), 0.90 (0.20), 0.80 (0.2).
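The role of the ‘P’ values can be illustrated with a short sketch of how an ambiguous distance restraint is commonly treated. It is not the ARIA code or the scripts used here. It assumes the usual r⁻⁶ summed distance for an ambiguous crosspeak, and it assumes that each ‘P’ value acts as a cumulative contribution threshold: candidate assignments are ranked by their normalized d⁻⁶ contribution and the smallest set reaching P is kept. The candidate names and distances in the usage example are hypothetical, and the distance cutoff listed in parentheses in the protocol is not modeled.

```python
def effective_distance(distances):
    """r^-6 summed distance for an ambiguous restraint:
    D = (sum_i d_i^-6)^(-1/6). D never exceeds the shortest d_i."""
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)

def filter_assignments(candidates, p):
    """Keep the smallest set of candidate assignments whose normalized
    d^-6 contributions sum to at least p.
    candidates: dict mapping an assignment label to its distance
    (angstroms) in the current structures (hypothetical input format)."""
    weights = {name: d ** -6 for name, d in candidates.items()}
    total = sum(weights.values())
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for name, weight in ranked:
        kept.append(name)
        cumulative += weight / total
        if cumulative >= p:
            break
    return kept

# Usage with hypothetical candidates for one crosspeak.
peak = {"candidate_a": 3.0, "candidate_b": 6.0, "candidate_c": 9.0}
loose = filter_assignments(peak, 0.999)  # early iteration: keeps all three
tight = filter_assignments(peak, 0.98)   # late iteration: keeps only candidate_a
```

Under these assumptions, lowering P from 0.999 to 0.80 over successive iterations progressively discards minor contributors, so that ambiguous crosspeaks converge toward one or a few assignments as the structures improve.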

Structural homology.

The DALI program (http://www2.ebi.ac.uk/dali) was used to search protein databases for structures similar to NusB. Selection criteria were a Z-score ≥3.0 or an r.m.s. deviation between Cα atoms ≤3.0 Å over subsets comprising ≥3 helices.

Coordinates.

Coordinates have been deposited in the Protein Data Bank (accession code 1EY1).