Introduction

The modification of proteins with polyubiquitin chains has profound consequences for their behaviour and can target proteins to the proteasome for degradation, activate signalling cascades or regulate the DNA damage response, amongst many other functions. The fate of the modified protein depends on the linkage type between ubiquitin molecules within the chain: the C-terminal carboxylate of one ubiquitin molecule can form an isopeptide bond with one of the seven lysine residues of another ubiquitin, giving seven distinct monotypic chain types, or, alternatively, form a peptide bond with the N-terminal α-amino group of Met1 to produce so-called linear or Met1-linked polyubiquitin chains1,2. These linear chains play important roles in the regulation of immune and inflammatory signalling pathways and contribute to the regulation of apoptotic signalling processes3,4. They are synthesized by the multi-component E3 ligase LUBAC that consists of three subunits termed HOIP, HOIL-1L and SHARPIN5,6,7. HOIP provides LUBAC with the ability to produce linear polyubiquitin chains in a highly specific manner using its C-terminally located RBR (RING-between-RING) domain8,9,10. RBR domain-containing E3 ubiquitin ligases form a subfamily of E3s that adopt a hybrid mechanism integrating the properties of RING and HECT-type ligases: a canonical RING domain (“RING1”) initially recognizes the E2~ubiquitin conjugate and subsequently the ubiquitin is transferred onto a conserved cysteine residue located in the RING2 domain of the RBR to form a thioester intermediate before the final transfer of ubiquitin onto a substrate11,12,13. This mechanism is similar to that of HECT-type ligases, which also form a thioester intermediate with ubiquitin, whereas RING-type E3s play a more indirect role and act as a platform to bring the E2~Ub and substrate into close proximity and stabilise a conformation of the E2~Ub conjugate that is primed for ubiquitin transfer14.

Regardless of the type of E3 ligase involved, for the transfer of ubiquitin onto the substrate to proceed efficiently the incoming nucleophile, which is either the ε-amino group of a lysine side chain or the N-terminal α-amino group of Met1, needs to be in its deprotonated form. The pKa of lysine side chains is around 10.5, meaning that they are protonated at physiological pH; for them to act as effective nucleophiles, a mechanism is therefore required to lower their pKa. Acidic residues in E2s, such as Asp127 in the SUMO-specific E2 Ubc9 and Asp117 in UbcH5, have been suggested to contribute to pKa depression during ubiquitin transfer by RING ligases15,16,17,18. Similarly, an Asp residue in the active site of the HECT ligase Rsp5 has been reported to play a role in deprotonating the substrate lysine19.

In contrast to lysine side chains, the pKa values of the N-terminal amino group in proteins are less well characterised. Values between 6.8 and 9.1 have been reported, with an average of 7.7 ± 0.520. These relatively low values raise the possibility that the synthesis of linear polyubiquitin chains may not require a general base to activate the nucleophile. However, our structural work on the catalytic core of the RBR ligase HOIP, which is the only E3 ligase capable of synthesizing linear ubiquitin chains8, provided a molecular explanation for the observed high chain linkage specificity of HOIP, and highlighted a histidine residue in the active site, His887, that was ideally positioned to carry out the role of a general base21. Indeed, substitution of this histidine residue with alanine severely suppresses catalytic activity, although activity could be rescued at high pH, strongly indicating that His887 acts as a general base to deprotonate the α-amino group of Met1. This apparent need for a general base prompted us to ask what the pKa value of the N-terminal α-amino group of Met1 might be. Although ubiquitin is one of the best studied proteins and the pKa values of most of its ionisable groups have been determined experimentally22, we could not find any reports describing Met1 pKa determination. A number of computational approaches exist to calculate pKa values based on the three-dimensional structure of proteins23,24. However, an overlay of different NMR and crystal structures of ubiquitin showed slight differences in the environment of Met1 and we therefore decided to use an NMR-based approach to determine the pKa experimentally.
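As a rough illustration of why the pKa matters for nucleophile activation, the fraction of a group present in its deprotonated form follows directly from the Henderson–Hasselbalch relationship. The sketch below applies it to the pKa values quoted above. The pH of 7.4 is an assumption made here for illustration (the text says only "physiological pH"), and the function name is hypothetical.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of an ionisable group in its deprotonated (nucleophilic) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Assumption: 7.4 is used as a representative physiological pH.
PH_ASSUMED = 7.4

# pKa values quoted in the text above.
groups = {
    "Lys side chain (pKa ~10.5)": 10.5,
    "N-terminal amine, lowest reported (6.8)": 6.8,
    "N-terminal amine, reported average (7.7)": 7.7,
    "N-terminal amine, highest reported (9.1)": 9.1,
}

for label, pka in groups.items():
    percent = 100.0 * fraction_deprotonated(pka, PH_ASSUMED)
    print(f"{label}: {percent:.2f}% deprotonated at pH {PH_ASSUMED}")
```

Under this assumption a lysine side chain is deprotonated well under 1% of the time, whereas an N-terminal amine with the average reported pKa is deprotonated roughly a third of the time, which is the basis for asking whether a general base is needed at all.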

NMR chemical shifts are exquisitely sensitive to local ionisation events and serve as excellent reporters in pH titrations. As a result, NMR titrations have been very widely used to measure pKa values in a large range of molecules, including the biologically important ionization of the side-chains of aspartic and glutamic acid and histidine residues of proteins, for which many hundreds of measurements have been reported20. In contrast, the literature contains relatively few pKa measurements of N-terminal amino groups. Direct detection of the 15N resonance has been used in a small number of studies, but suffers from poor sensitivity25,26. Higher sensitivity can in principle be achieved by exploiting indirect proton-detected methods such as HSQC and SOFAST, but these methods depend on scalar coupling between the 15N and 1H nuclei, and are therefore not applicable to free amino groups in aqueous solution where the amino protons exchange with solvent at a rate that is much faster than the scalar coupling.

Alternative strategies involve detection of resonances from nuclei that are more remote from the ionization site, and whilst this approach provides opportunities for using more sensitive NMR methods it creates the risk that the chosen reporters will be affected by ionization events from multiple sites in the protein, resulting in chemical shifts that show complex pH dependence, with concomitant difficulties in interpretation of the data and the possibility of substantial errors in derived pKa values27. It is generally recognised that the best reporter is the ionization site itself, where the frequency change associated with the ionization is likely to be largest, followed by the nucleus or nuclei that are covalently bound to the ionization site and other closely adjacent nuclei.

In view of these considerations we have adopted an indirect-detection approach to determine the pKa of the α-amino group of Met1 by monitoring the 15N resonance frequency, detected indirectly via the non-labile alpha hydrogen of the terminal residue. This approach is based on that used by André et al. to determine the pKa values of side-chain ionizable groups of lysine and arginine residues28, and by Lorieau et al. for the amino group of the influenza hemagglutinin fusion peptide29.

Results and Discussion

Data acquisition and analysis

The terminal amino 15N data were collected using a two-dimensional version of the HACAN pulse sequence30, essentially in the form of Kanelis et al.31, modified to provide discrimination against the unwanted signals from backbone amide groups and side-chain amino groups of lysine residues. The modifications (described in Methods) reduced the unwanted signals to undetectable levels, and, as a consequence, the assignment of the signal from the N-terminal amino group was self-evident.

The pH titration was carried out by progressive transfer of aliquots between a pair of samples that were initially adjusted to pH 5.8 and 10.5 respectively, as described in Methods. At each stage of the titration the pH of the samples was measured via the 1H chemical shifts of a set of indicator molecules present as co-solutes32; this avoids the difficulties that arise with conventional electrode-based pH measurement of small-volume samples. The experimental methodology and the resulting pKa values at 21.5 °C for the indicator molecules are described in Methods.

The results of the HA(CA)N experiment are shown in Fig. 1a, and although the primary purpose of this experiment was to detect the 15N chemical shift of the terminal nitrogen, the alpha hydrogen clearly also shows a pH dependence that is strongly correlated to that of the 15N. In addition to the HA(CA)N experiment, a two-dimensional 1H-15N SOFAST-HMQC data set33 was recorded at each pH point to assess the integrity of the protein, which was found to remain folded up to the highest pH used in this work (Fig. 1b). In the SOFAST spectra a number of cross-peaks migrate as a function of pH, most notably the peak from the backbone amide of Gln2, and the coordinates of this peak were also analysed.

Figure 1: Overlap of ubiquitin spectra at different pH values.

(a) Overlay of successive 2D HA(CA)N spectra from the titration of human ubiquitin over the pH range 5.8 (lower left) to 10.5 (upper right). The peak arises from the correlation of the terminal 15N with the Hα of the same residue; the pulse sequence has been adapted so that only this correlation has detectable intensity. In the interests of clarity only a subset of the recorded spectra is shown on this plot. (b) Overlay of 1H-15N SOFAST-HMQC spectra at the end-points of the titration and three intermediate pH values. The peaks that show appreciable pH dependence in this range are annotated; the arrows show the movement of the peaks with increasing pH. In the inset structural diagram (1UBQ.pdb)50 the backbone nitrogen atoms corresponding to these peaks are shown as red spheres, and the N-terminal amino group is shown in blue.

The peak positions for the ubiquitin titrations were fitted using non-linear regression to the Henderson–Hasselbalch equation, recast in terms of chemical shifts:

$$\sigma_{\mathrm{peak}} = \sigma_{\mathrm{HA}} - \frac{\Delta\sigma}{1 + 10^{(\mathrm{p}K_{\mathrm{a}} - \mathrm{pH})}} \qquad (1)$$

where σpeak is the measured NMR chemical shift of the peak of interest, σHA is the chemical shift of the protonated form and Δσ is the difference between σHA and the shift of the deprotonated form (Δσ = σHA − σA). All fitting and statistical analysis was carried out using the statistics package R, as described in Methods34. Good fits to eq. (1) were obtained for all four measured ubiquitin chemical shifts (Fig. 2).
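The fitting step can be sketched in a few lines. The example below is a minimal, pure-Python damped Gauss-Newton fit of a single-site titration curve, written with the shift change defined as protonated minus deprotonated. It is not the authors' R code: the function names, the starting-value heuristic, the step-halving safeguard and the synthetic, noise-free data (limiting shift 40.0 ppm, shift change 8.0 ppm) are all assumptions made for illustration.

```python
import math


def hh_shift(ph, sigma_ha, delta, pka):
    """Single-site model: protonated-form shift minus the titrating fraction of delta."""
    return sigma_ha - delta / (1.0 + 10.0 ** (pka - ph))


def ssr(params, phs, shifts):
    """Sum of squared residuals for params = [sigma_ha, delta, pka]."""
    return sum((y - hh_shift(ph, *params)) ** 2 for ph, y in zip(phs, shifts))


def solve(a, b):
    """Solve the small linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_hh(phs, shifts, max_iter=50):
    """Damped Gauss-Newton fit; expects points sorted by increasing pH."""
    # Starting values: end-point shifts, and the pH of the point nearest half-titration.
    sigma_ha = shifts[0]
    delta = shifts[0] - shifts[-1]
    half = sigma_ha - delta / 2.0
    pka = min(zip(phs, shifts), key=lambda p: abs(p[1] - half))[0]
    params = [sigma_ha, delta, pka]
    for _ in range(max_iter):
        jtj = [[0.0] * 3 for _ in range(3)]
        jtr = [0.0] * 3
        for ph, y in zip(phs, shifts):
            u = 10.0 ** (params[2] - ph)
            d = 1.0 + u
            # Partial derivatives of the model w.r.t. sigma_ha, delta and pka.
            grad = [1.0, -1.0 / d, params[1] * math.log(10.0) * u / d ** 2]
            resid = y - hh_shift(ph, *params)
            for i in range(3):
                jtr[i] += grad[i] * resid
                for j in range(3):
                    jtj[i][j] += grad[i] * grad[j]
        step = solve(jtj, jtr)
        current = ssr(params, phs, shifts)
        scale, improved = 1.0, False
        while scale > 1e-8:  # halve the step until the fit improves
            trial = [p + scale * s for p, s in zip(params, step)]
            if ssr(trial, phs, shifts) < current:
                params, improved = trial, True
                break
            scale /= 2.0
        if not improved:
            break
    return params


# Synthetic, noise-free titration (hypothetical values, not data from this study).
phs = [5.8 + 0.235 * i for i in range(21)]
shifts = [hh_shift(ph, 40.0, 8.0, 9.14) for ph in phs]
sigma_ha_fit, delta_fit, pka_fit = fit_hh(phs, shifts)
print(f"fitted pKa = {pka_fit:.3f}")
```

All three parameters are fitted together because the upper plateau is not fully reached at pH 10.5, so the limiting shift of the deprotonated form cannot simply be read off the last point.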

Figure 2: Chemical shift dependence on pH.

The chemical shift dependence on pH of (a) Met1-15N and (b) Met1-1Hα measured with the HA(CA)N experiment; (c) Gln2-15N and HN measured with 1H-15N SOFAST-HMQC. The solid lines are the curves fitted to the Henderson–Hasselbalch equation. The widths of the vertical bars denote the 95.45% confidence intervals of the fits, as determined by the statistical bootstrap.

pKa of the N-terminal amino group

The pKa values derived from the NMR measurements carried out on the four nuclei at or near the N-terminal residue are in close agreement (Table 1). The pKa derived from the 15N chemical shift of the terminal amino group is 9.14; this nucleus is reasonably assumed to be the most faithful reporter. The values from the other three measured nuclei fall in the range 9.15 to 9.18, with fitting confidence intervals which overlap that of the Met1-N determination. This suggests that all of the observed chemical shift changes reflect the N-terminal amino group ionisation event alone. This is unsurprising as the measured value lies in a pH window devoid of other typical ionisations in proteins. The absence of other ionization events in this pH range is further evidenced by an examination of the full set of peaks in the SOFAST 1H-15N-correlation spectrum, where all of the 1H-15N correlations which display appreciable titration behaviour in this range arise from atoms that are in close proximity to the N-terminus in the three-dimensional structure (Fig. 1b).

Table 1 pKa values derived from chemical shift measurements.

Conclusions

The pKa of 9.14 for the N-terminal amino group of ubiquitin, determined from the 15N chemical shift, is at the upper limit of experimental values for proteins reported in the literature, which span 6.8 to 9.1, and is also higher than that determined for model peptides20,35. The α-amino group of Met1 is solvent exposed, which makes it difficult to rationalize why its pKa is higher than in most other cases; we speculate that the proximity of Glu16 and Glu18 (Fig. 3a) creates a negatively charged environment that raises the pKa. Interestingly, the histidine residue in HOIP that we previously proposed to act as a general base to deprotonate the N-terminal amino group is conserved in a number of RBR ligases, all of which transfer ubiquitin onto a lysine side chain21. At present the structure of the HOIP/ubiquitin complex is the only snapshot of an RBR ligase/substrate active site. The only other structures of RBR ligases available are of HOIP bound to E2~ubiquitin, and HHARI and Parkin in the autoinhibited state, or partially active forms of Parkin36,37,38,39,40,41. The histidine is in a different conformation in those structures (Fig. 3b) and points away from the ubiquitin, which could indicate either that substrate binding induces changes around the active site in HHARI or Parkin, or that the substrate is presented to the active site in a different orientation in these RBRs. Further studies are required to fully understand how the incoming nucleophile is activated in other RBRs.

Figure 3: Structural environment of Met1 of ubiquitin and active site of RBR ligases.

(a) Structure of ubiquitin highlighting the residues close to Met1 (1UBQ.pdb)50. (b) Overlay of the structures of HOIP in purple (4LJO.pdb)21, Parkin in green (5CAW.pdb)40 and orange (5C23.pdb)38 and HHARI in yellow (4KBL.pdb)36, zoomed in on the active site. The catalytic Cys885 and His887 of HOIP, plus Met1 of the acceptor ubiquitin (in gray), are shown.

Methods

Internal pH indicator validation

We used the set of internal indicator molecules proposed by Baryshnikova et al.32 (Tris, formate, piperazine and imidazole), but first re-determined the indicator pKa and limiting chemical shift values at 21.5 °C, the temperature of interest in this study. We also re-assessed the range of applicability of each indicator for this temperature, and extended the applicability to a higher pH range by utilising the second ionisation of piperazine. The chemical shifts of the titrating signals of Tris, formate, and imidazole-H2 were analyzed using the Henderson–Hasselbalch equation (in the form shown above as eq. (1)), whereas the piperazine chemical shift was fitted to the extended form of the equation which accommodates two ionization events42:

$$\sigma_{\mathrm{peak}} = \sigma_{\mathrm{HA}} - \frac{\Delta\sigma_{1}}{1 + 10^{(\mathrm{p}K_{\mathrm{a}1} - \mathrm{pH})}} - \frac{\Delta\sigma_{2}}{1 + 10^{(\mathrm{p}K_{\mathrm{a}2} - \mathrm{pH})}} \qquad (2)$$

where σHA is here the chemical shift of the fully protonated form, Δσ1 is the variation of chemical shift associated with the titration of the deprotonation governed by pKa1 and Δσ2 the added chemical shift variation governed by pKa2.
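One common way to model a reporter with two ionization events is a sum of two single-site terms, which is a good approximation when the two pKa values are well separated. The sketch below evaluates such a curve under that assumption; it may differ in form from the equation used in the study. The function name and all numerical values are hypothetical placeholders, not the Table 2 parameters.

```python
def two_site_shift(ph, sigma_start, delta1, pka1, delta2, pka2):
    """Chemical shift of a reporter with two well-separated ionisations.

    sigma_start is the shift of the fully protonated form; delta1 and delta2
    are the shift changes (protonated minus deprotonated) of each step.
    """
    first = delta1 / (1.0 + 10.0 ** (pka1 - ph))
    second = delta2 / (1.0 + 10.0 ** (pka2 - ph))
    return sigma_start - first - second


# Hypothetical placeholder parameters, chosen only to show the biphasic shape.
params = dict(sigma_start=3.60, delta1=0.45, pka1=5.6, delta2=0.40, pka2=9.8)

for ph in (3.0, 5.6, 7.7, 9.8, 12.0):
    print(f"pH {ph:4.1f}: {two_site_shift(ph, **params):.3f} ppm")
```

The plateau between the two transitions is why a second ionisation extends the usable range of an indicator: the shift becomes pH-sensitive again around the second pKa.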

The sample used for these measurements consisted of a solution of 100 mM KCl, 2 mM Tris, 2 mM formate, 2 mM piperazine, 2 mM imidazole and 0.2 mM DSS in 95% H2O-5% D2O, matching the buffer used subsequently in the ubiquitin measurements. The pH was adjusted across the desired range and measured using a glass pH electrode.

The resulting limiting chemical shifts and pKa values (Fig. 4 and Table 2) agree well with those reported by Baryshnikova et al. at 30 °C, demonstrating the validity of their observation that this approach is quite robust with respect to changes in temperature32.

Figure 4: Titration curves of internal NMR reporters.

pH titration curves for the four internal NMR reporters: formate, piperazine, imidazole-H2 and Tris. The solid lines are the curves fitted to the Henderson-Hasselbalch equation as described in the text. The resulting pKa values are marked with vertical lines.

Table 2 Measured pKa values of four internal NMR reporters.

NMR sample preparation

Recombinant wild-type human ubiquitin was expressed as untagged protein using pET15 and E. coli strain BL21(DE3), and purified by anion exchange using a Q Sepharose resin followed by gel filtration on a Sephadex G-100 column (GE Healthcare). 13C/15N-labelled ubiquitin was obtained by expression in minimal medium containing glucose-13C6 and 15NH4Cl as sole carbon and nitrogen sources.

Two 0.5 mM ubiquitin samples for NMR were prepared by dialysis into a 95% H2O-5% D2O buffer (100 mM KCl, 2 mM Tris, 2 mM formate, 2 mM piperazine, 2 mM imidazole, 0.2 mM DSS). One sample was initially adjusted to pH 5.8 and the other to pH 10.5. Intermediate pH values were attained by transferring small aliquots between the samples after each set of NMR measurements. The pH of the sample that was initially at 5.8 thereby increased with each transfer, whilst the pH of the other sample decreased. NMR measurements were made on both samples. The sample pH was determined after each transfer from the 1H chemical shift of the appropriate buffer component, used as an internal indicator, making use of values for pKa, σHA, σA and applicability regions re-derived experimentally for 21.5 °C as shown in Table 2.
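Reading the pH back from an indicator shift amounts to inverting the single-site Henderson–Hasselbalch relationship. A minimal sketch is given below, assuming the indicator is used only inside its titrating region. The function name and the indicator parameters are hypothetical placeholders, not the calibrated values of Table 2.

```python
import math


def ph_from_indicator(sigma_obs, pka, sigma_ha, sigma_a):
    """Sample pH from the observed shift of a single-site indicator.

    sigma_ha and sigma_a are the limiting shifts of the protonated and
    deprotonated forms of the indicator.
    """
    ratio = (sigma_ha - sigma_obs) / (sigma_obs - sigma_a)
    if ratio <= 0.0:
        # The observed shift lies on or outside the limiting shifts.
        raise ValueError("observed shift is outside the indicator's titration range")
    return pka + math.log10(ratio)


# Hypothetical indicator: pKa 8.2, limiting shifts 3.70 ppm (HA) and 3.50 ppm (A).
example_ph = ph_from_indicator(3.55, pka=8.2, sigma_ha=3.70, sigma_a=3.50)
print(f"pH = {example_ph:.2f}")
```

Because the logarithm of the ratio diverges near either limiting shift, small shift errors translate into large pH errors far from the indicator pKa, which is why each indicator is restricted to an applicability region and several indicators with different pKa values are combined.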

NMR measurements

All NMR measurements were performed on a Bruker Avance 600 MHz spectrometer equipped with a 5 mm TCI cryoprobe, at a sample temperature of 21.5 °C, as determined by a prior calibration using the method of Findeisen et al.43.

Three NMR spectra were acquired at each pH value: a 1D 1H spectrum using excitation sculpting44, a 2D 1H-15N SOFAST-HMQC spectrum33, and a 2D HA(CA)N spectrum30,31. The latter was based on the sequence of Kanelis et al.31, adapted to confer selectivity and improve sensitivity for the desired N-terminal amino signal. Selectivity with respect to backbone amide signals was achieved by replacing the rectangular 15N inversion pulses of the standard sequence with Q3-shaped selective pulses45, centred at a 15N offset of 35 ppm, and having a duration of 1.3 ms. Discrimination against signals from the side-chain amino groups of lysine residues was obtained by setting the delay within the sequence that serves for evolution of proton-coupled 13C magnetization to 3.4 ms, a value that was optimal for the methine Cα group of N-terminal methionine and simultaneously minimized the signal from the methylene Cε groups of lysine side-chains. 2D spectra were processed using nmrPipe46. 1H chemical shifts were referenced to internal DSS; 13C and 15N shifts were referenced indirectly using the gyromagnetic ratios of Wishart et al.47. Chemical shifts for the signals used in the ubiquitin pKa analysis are given in Table 3.
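Indirect referencing of this kind sets the zero-ppm frequency of the heteronucleus to a fixed fraction of the 1H zero-ppm (DSS) frequency. The sketch below shows the arithmetic. The two ratios are the commonly tabulated values for 13C and 15N relative to DSS and should be checked against ref. 47 before use; the function names and the example spectrometer frequency are assumptions for illustration.

```python
# Frequency ratios relative to the 1H resonance of DSS (commonly tabulated values;
# verify against the cited reference before relying on them).
XI_13C = 0.251449530
XI_15N = 0.101329118


def zero_ppm_frequency(h1_zero_mhz, xi):
    """Zero-ppm frequency (MHz) of a heteronucleus from the DSS 1H zero-ppm frequency."""
    return h1_zero_mhz * xi


def shift_ppm(observed_mhz, zero_mhz):
    """Chemical shift in ppm of a resonance relative to its zero-ppm frequency."""
    return (observed_mhz - zero_mhz) / zero_mhz * 1.0e6


# Hypothetical example: DSS 1H signal found at exactly 600.000000 MHz.
n15_zero = zero_ppm_frequency(600.0, XI_15N)
c13_zero = zero_ppm_frequency(600.0, XI_13C)
print(f"15N zero-ppm frequency: {n15_zero:.7f} MHz")
print(f"13C zero-ppm frequency: {c13_zero:.7f} MHz")
```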

Table 3 Ubiquitin titration data.

Statistical methods

Statistical analysis and Henderson-Hasselbalch equation fitting were done using R34. Both the NMR reporters and the ubiquitin titration curves were fitted using a non-linear least-squares method (Gauss-Newton). Because of the non-linear nature of the Henderson-Hasselbalch equation, the pKa values here are given with estimated confidence intervals rather than standard deviations48. Confidence intervals of 95.45% were calculated using a bootstrap procedure49; this interval corresponds to ± 2σ for a normal distribution. The bootstrap calculation was run with N = 1000 and incorporated the bias correction and acceleration methods, which improve the accuracy by taking into account the non-normality of the bootstrap distribution48.
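The idea behind the bootstrap interval can be illustrated with a simplified sketch. Note the substitutions: the code below uses a plain percentile bootstrap with resampling of (pH, shift) pairs, not the bias-corrected and accelerated intervals used in the study, and it fits the pKa by a one-dimensional golden-section search with the two linear parameters solved by regression, not by Gauss-Newton. It assumes the sum of squared residuals has a single minimum within the measured pH range. All names and the synthetic data are hypothetical.

```python
import math
import random


def fit_pka(phs, shifts, lo, hi):
    """Least-squares pKa between lo and hi (golden-section search).

    For a trial pKa the model is linear in the deprotonated fraction, so the
    limiting shift and shift change follow from ordinary linear regression.
    """
    def ssr(pka):
        xs = [1.0 / (1.0 + 10.0 ** (pka - ph)) for ph in phs]
        n = len(xs)
        mx, my = sum(xs) / n, sum(shifts) / n
        sxx = sum((x - mx) ** 2 for x in xs)
        if sxx == 0.0:
            return float("inf")
        slope = sum((x - mx) * (y - my) for x, y in zip(xs, shifts)) / sxx
        icpt = my - slope * mx
        return sum((y - icpt - slope * x) ** 2 for x, y in zip(xs, shifts))

    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > 1e-5:
        if ssr(c) < ssr(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return (a + b) / 2.0


def percentile_bootstrap(phs, shifts, level, n_boot=1000, seed=0):
    """Percentile confidence interval for the pKa from resampled data points."""
    rng = random.Random(seed)
    lo, hi = min(phs), max(phs)
    n = len(phs)
    estimates = []
    for _ in range(n_boot):
        picks = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            fit_pka([phs[i] for i in picks], [shifts[i] for i in picks], lo, hi)
        )
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * n_boot)]
    upper = estimates[min(n_boot - 1, int((1.0 - tail) * n_boot))]
    return lower, upper


# Coverage of +/- 2 sigma for a normal distribution (about 95.45%).
level = math.erf(2.0 / math.sqrt(2.0))

# Synthetic titration with small Gaussian noise (hypothetical, not study data).
noise = random.Random(1)
phs = [5.8 + 0.235 * i for i in range(21)]
shifts = [40.0 - 8.0 / (1.0 + 10.0 ** (9.14 - ph)) + noise.gauss(0.0, 0.02) for ph in phs]

pka_hat = fit_pka(phs, shifts, min(phs), max(phs))
lower, upper = percentile_bootstrap(phs, shifts, level)
print(f"pKa = {pka_hat:.3f}, {100 * level:.2f}% interval = [{lower:.3f}, {upper:.3f}]")
```

The percentile interval is the simplest variant; the bias correction and acceleration adjust which percentiles are read off when the bootstrap distribution is skewed or biased.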

Additional Information

How to cite this article: Oregioni, A. et al. Determination of the pKa of the N-terminal amino group of ubiquitin by NMR. Sci. Rep. 7, 43748; doi: 10.1038/srep43748 (2017).
