Introduction

Protein turnover plays a key role in maintaining protein homeostasis1. It is important to healthy biological functioning2 and is often dysregulated in diseases3,4. Metabolic labeling of live animals with stable isotopes followed by liquid chromatography coupled with mass spectrometry (LC-MS) and combined with sophisticated data processing algorithms has been a powerful tool for estimating the turnover of individual proteins in high-throughput and large-scale studies5,6. The labeling agents can be divided into two groups. The amino acid-based labeling7,8,9,10 uses a diet enriched in stable isotope labeled essential amino acids such as 13C6-Lys. The label incorporation results in separate mass profiles of unlabeled and labeled forms of a peptide. Only peptides containing the heavy amino acid are useful for protein turnover rate. Non-canonical amino acids have also been used to study the protein turnover in LC-MS11,12.

The atom-based labeling uses a diet enriched in heavy13,14,15,16,17 (such as 13C, 15N, 2H) or light18 (12C) atoms. In a strategy named SILAM5 (stable isotope labeling of mammals), rats and mice have been metabolically labeled with a diet containing 15N-labeled spirulina. Turnover rates of individual proteins in mouse brain and liver have been reported14. The atom-based labeling results in composite spectra of unlabeled and labeled forms of a peptide. Therefore, the estimation of label enrichment resulting from atom-based labeling agents is computationally more complex13.

An alternative approach to protein turnover study is to characterize a protein based on its expression as a fusion protein with an appropriate tag19. Fluorescent proteins (FPs) have been widely exploited tags. Bleach-chase and photoactivation/photoconversion of FPs permit monitoring fluorescence (and estimating protein half-live) after the bleaching or change in fluorescent light color from the newly synthesized proteins in cell culture20. Tandem fluorescent proteins have also been developed and applied21. However, in living animals, photoactivation may not be easily achieved. SNAP-tag22 overcame this problem and was used to measure protein turnover in transgenic mice. FP tagging allows the detection of expression levels of target proteins. A technique termed Global protein stability, integrated fluorescence-based protein stability analysis, and the DNA microarray technology23. The stabilities of thousands of proteins in cell culture have been estimated23.

A classical method for estimating protein degradation in eukaryotes is the chemical inhabitation of the ribosome translocation with cycloheximide24. Protein abundances are measured by Western blot, using antibodies for target proteins. The dependence on antibody-based quantification limits the throughput of the method. A comprehensive review of the techniques used for protein turnover studies can be found elsewhere25,26,27.

Among the stable isotope labeling agents, heavy water (2H2O) is easy to use, biologically safe at low enrichments, relatively inexpensive, and does not require diet adaptation28. The deuterium (2H) incorporation from heavy water labeling has been monitored in vivo by stimulated Raman scattering microscopy29. The vibrational frequency of C-D bond is located in a cell-silent spectral window in which no other Raman peaks exists30. The technique is non-invasive and non-destructive. It allows for microscopic imaging of lipid and protein synthesis in vivo. However, currently, it does not provide the ability to determine the turnover rate of individual proteins.

Deuterium in heavy water is incorporated into all non-essential amino acids31. The gradual incorporation of the deuterium into a peptide decreases the relative isotope abundance (RIA) of its monoisotope. The monoisotopic RIA is computed from the complete isotope profile, which comprises up to six mass isotopomers. The time course of the RIA is modeled with an exponential decay function32,33,34,35,36. The coefficient of determination, R2, is often used as a measure of the goodness-of-fit (GOF) between the experimental data and theoretical curve fitting36,37. This approach can produce turnover rates of thousands of proteins; however, it has been observed that only 35–45% of all quantified peptides have been useful for the estimation of protein turnover rates37,38. For the rest of the quantified peptides, the R2 is too low (R2 < 0.8). One of the factors affecting the estimation of label incorporation is the complexity of the mammalian proteome; the isotope profiles of many peptides overlap in LC-MS. Since abundances of up to six mass isotopomers of a peptide are required for most approaches36,39, the chances of co-elution affecting the estimation of monoisotopic RIA are high.

This work develops an approach to increase the proteome coverage for protein turnover by using partial isotope profiles. The basis of this strategy is the estimation of label enrichment of a peptide from any pair of its mass isotopomers. We provide theoretical formulas for the time courses of six mass isotopomers. Here, the formulas are utilized to determine label enrichment, followed by protein turnover rate estimations. Previously, Papageorgopoulos32 and colleagues have shown that the label enrichment during metabolic labeling can be modeled as linear regression on the ratios of pairs of mass isotopomers. The mass isotopomer abundances, in turn, were modeled as polynomials of the enrichment. The formulas presented in this work provide the polynomials and their coefficients analytically. They allow computationally efficient estimations of label enrichment in data analysis of large-scale datasets.

We applied the method to five datasets acquired on Orbitrap Eclipse and Orbitrap Q Exactive HF mass spectrometers. The Orbitrap Eclipse dataset (acquired for this study from murine liver tissue) was used as the developmental dataset to compute RIA from partial isotope profiles and compare it with the traditional approach. The other datasets (Orbitrap Q Exactive HF) are from a recent study36, which generated LC-MS data of heavy water labeled samples of four murine tissue types. The approach using two mass isotopomers (partial isotope profiles) has consistently increased the number of quantified peptides in both datasets and across all tissue types.

Results

Protein turnover rates are computed using the monoisotopic RIA, I0(t), obtained from LC-MS data of heavy water metabolically labeled peptides. At every time point of labeling, the monoisotopic RIA is determined as the normalized abundance of the monoisotope from the complete isotope profile of a peptide36,37, Supplementary Eq. (1). The time series of the monoisotopic RIA of a peptide is modeled as an exponential decay function to obtain the turnover rate, Fig. 1.

$${{{{{{\rm{I}}}}}}}_{0}\left({{{{{\rm{t}}}}}}\right)={{{{{{\rm{I}}}}}}}_{0}^{{{{{{\rm{asymp}}}}}}}+\left({{{{{{{\rm{I}}}}}}}_{0}\left(0\right)-{{{{{\rm{I}}}}}}}_{0}^{{{{{{\rm{asymp}}}}}}}\right){{{{{{\rm{e}}}}}}}^{-{{{{{\rm{kt}}}}}}}$$
(1)
Fig. 1: An approach to estimate protein turnover rates from the abundances of two mass isotopomers.
figure 1

a Experimental and workflow of metabolic labeling, LC-MS, and data processing. b Relative abundances of mass isotopomers shifted during the gradual incorporation of deuterium from heavy water. Shown are the isotope profiles of peptide, NLDKEYLPIGGLAEFCK, at three labeling durations: 0, 3, and 21 days. The turnover rate is determined from the depletion of the monoisotopic RIA, I0(t). c The monoisotopic RIA is traditionally obtained as normalized monoisotopic abundance from the complete isotope profile (red bar). This work determines the monoisotopic RIA from only two mass isotopomers (green bars). At first, the deuterium enrichment of a peptide is determined from two mass isotopomers. Then the deuterium enrichment is used to calculate the monoisotopic RIA. The time course of I0(t) is fit to an exponential decay function to obtain the turnover rate. d The RIAs of the first five mass isotopomers of the peptide as a function of deuterium enrichment. The Ik-1(t)/Ik(t) ratios. The experimental values (black circles) and theoretical fit (purple) for the I0(t) of the peptide. I0(0) is the monoisotopic RIA of the unlabeled peptide. NEH is the number of hydrogens accessible to deuterium in drinking water. pX(t) is the deuterium enrichment of a peptide at the labeling time point t. Ai(t) is the raw (non-normalized) abundance of the ith mass isotopomer measured in LC-MS. L– labeled, Uunlabeled, LC liquid chromatography, MS mass spectrometry.

In Eq. (1), I0(0) is the monoisotopic RIA of the unlabeled peptide, I0asymp is the monoisotopic RIA achieved at the plateau of labeling, t is the labeling duration, and k is the turnover rate (also referred to as degradation rate constant26,40). I0asymp is calculated from the number of hydrogens accessible to deuterium from heavy water (NEH), the deuterium enrichment of body water (pW), and the natural abundance of deuterium (pH) as shown in Supplementary Eq. (10). GOF characteristics such as the coefficient of determination (R2), Pearson correlation, and residual standard error are used to evaluate the quality of the theoretical fit to the experimental data. Normally, R2 > 0.8 is used to filter peptides36,37. The residual standard error is used to compute the confidence interval of the turnover rate, k (Supplementary Eqs. (8), (9), and (11)).

The exponential decay model relies on the accurate estimations of the monoisotopic RIAs. Since mammalian proteome samples are complex, their peptides often coelute, and isotope profiles overlap. Examples of overlapping isotope profiles are provided in Supplementary Figs. 15. The overlaps resulted in inaccurate estimations of the monoisotopic RIA when complete isotope profiles were used. However, the label enrichment could accurately be estimated from the non-affected mass isotopomers which can then be used to reconstruct the monoisotopic RIA (as will be shown below). For example, the deuterium enrichment, pX(t), is obtained, in the equation for the ratio of abundances of the second (A2(t)) and first heavy mass isotopomers (A1(t)):

$$\frac{{{{{{{\rm{A}}}}}}}_{2}\left({{{{{\rm{t}}}}}}\right)}{{{{{{{\rm{A}}}}}}}_{1}\left({{{{{\rm{t}}}}}}\right)}=\left\{\frac{{{{{{{\rm{I}}}}}}}_{2}\left(0\right)}{{{{{{{\rm{I}}}}}}}_{0}\left(0\right)}-\frac{{{{{{{\rm{I}}}}}}}_{1}\left(0\right)}{{{{{{{\rm{I}}}}}}}_{0}\left(0\right)}{{{{{{\rm{b}}}}}}}_{1}\left(0\right)+\frac{{{{{{{\rm{N}}}}}}}_{{{{{{\rm{EH}}}}}}}+1}{{{{{{{\rm{N}}}}}}}_{{{{{{\rm{EH}}}}}}}-1}\left({{{{{{\rm{b}}}}}}}_{2}\left(0\right)-{{{{{{\rm{b}}}}}}}_{2}\left({{{{{\rm{t}}}}}}\right)\right)\right\}{{{{{{\rm{I}}}}}}}_{0}\left({{{{{\rm{t}}}}}}\right)+{{{{{{\rm{b}}}}}}}_{1}\left({{{{{\rm{t}}}}}}\right)$$
(2)

bn(t) is defined as:

$${{{{{{\rm{b}}}}}}}_{{{{{{\rm{n}}}}}}}\left({{{{{\rm{t}}}}}}\right)=\left(\begin{array}{c}{{{{{{\rm{N}}}}}}}_{{{{{{\rm{EH}}}}}}}\\ {{{{{\rm{n}}}}}}\end{array}\right){\left(\frac{{{{{{{\rm{p}}}}}}}_{{{{{{\rm{X}}}}}}}\left({{{{{\rm{t}}}}}}\right)+{{{{{{\rm{p}}}}}}}_{{{{{{\rm{H}}}}}}}}{1-{{{{{{\rm{p}}}}}}}_{{{{{{\rm{H}}}}}}}-{{{{{{\rm{p}}}}}}}_{{{{{{\rm{X}}}}}}}\left({{{{{\rm{t}}}}}}\right)}\right)}^{{{{{{\rm{n}}}}}}}$$

In the above equation, Ai(t) is the raw (non-normalized) abundance of the ith mass isotopomer. Numerically, pX(t) is determined by minimizing the absolute value of the difference between experimental, \(\frac{{{{{{{\rm{A}}}}}}}_{2}^{{{{{{\rm{expr}}}}}}}({{{{{\rm{t}}}}}})}{{{{{{{\rm{A}}}}}}}_{1}^{{{{{{\rm{expr}}}}}}}({{{{{\rm{t}}}}}})}\), and theoretical \(\frac{{{{{{{\rm{A}}}}}}}_{2}\left({{{{{\rm{t}}}}}}\right)}{{{{{{{\rm{A}}}}}}}_{1}\left({{{{{\rm{t}}}}}}\right)}\), ratios from Eq. (2):

$${p}_{X}(t)={\arg }\mathop{{{\min }}}\limits_{{{{{{{\rm{p}}}}}}}_{{{{{{\rm{X}}}}}}}\left({{{{{\rm{t}}}}}}\right)}\left|\frac{{{{{{{\rm{A}}}}}}}_{2}\left({{{{{\rm{t}}}}}}\right)}{{{{{{{\rm{A}}}}}}}_{1}\left({{{{{\rm{t}}}}}}\right)}-\frac{{{{{{{\rm{A}}}}}}}_{2}^{{{{{{\rm{expr}}}}}}}({{{{{\rm{t}}}}}})}{{{{{{{\rm{A}}}}}}}_{1}^{{{{{{\rm{expr}}}}}}}({{{{{\rm{t}}}}}})}\right|{subject\; to}:{{{{{{\rm{p}}}}}}}_{{{{{{\rm{X}}}}}}}(t)\in \left[0{{{{{\rm{;}}}}}}{{{{{{\rm{p}}}}}}}_{{{{{{\rm{W}}}}}}}\right]$$

The analytical formulas for the ratios of other mass isotopomers (A2(t)/A1(t), A1(t)/A0(t), A3(t)/A0(t), and A4(t)/A0(t)) are provided in the Supplementary Notes. We have implemented this technique in the software tool d2ome35. The updated version of the tool will be referred to as d2ome + . The main differences from the previous version are the ability to quantify the label enrichment from two mass isotopomers and the computation of the confidence intervals. d2ome+ is available at GitHub, https://github.com/rgsadygov/d2ome.

Once pX(t) is determined from a ratio of pair of raw abundances (from Eq. (2), and Supplementary Eqs. (2), (3) for A1(t)/A0(t), A2(t)/A0(t)), the RIA of the monoisotope, \(\widetilde{{{{{{{\rm{I}}}}}}}_{0}\left({{{{{\rm{t}}}}}}\right)}\), is reconstructed as:

$$\widetilde{{{{{{{\rm{I}}}}}}}_{0}\left({{{{{\rm{t}}}}}}\right)}={{{{{{\rm{I}}}}}}}_{0}\left(0\right){\left(1-\frac{{{{{{{\rm{p}}}}}}}_{{{{{{\rm{X}}}}}}}\left({{{{{\rm{t}}}}}}\right)}{1-{{{{{{\rm{p}}}}}}}_{{{{{{\rm{H}}}}}}}}\right)}^{{{{{{{\rm{N}}}}}}}_{{{{{{\rm{EH}}}}}}}}$$
(3)

The value of \(\widetilde{{{{{{{\rm{I}}}}}}}_{0}\left({{{{{\rm{t}}}}}}\right)}\) is used in the time course data to fit the exponential decay function, Eq. (1). Thus, the monoisotopic RIA is estimated from the abundances of two mass isotopomers, instead of the completed isotope profile used by the traditional approach.

Figure 2 presents an example of the implementation of the approach for the CPMS_MOUSE peptide, GTTITSVLPKPALVASR. It shows the monoisotopic RIAs estimated from the complete isotope profile (black circles), A1(t)/A0(t) (blue pluses), A2(t)/A0(t) (green stars), and A2(t)/A1(t) (yellow crosses) ratios. The rates computed using the time series of reconstructed \(\widetilde{{{{{{{\rm{I}}}}}}}_{0}\left({{{{{\rm{t}}}}}}\right)}\) by utilizing the label enrichment estimations from the ratio of raw abundance of two mass isotopomers ranged from 0.117 day−1 to 0.137 day−1. The turnover rate computed from the complete isotope profile was 0.135 day−1. In the Supplementary Data 1, there are RIA time series data for 1227 distinct peptide sequences (amino acid sequence, charge, post-translational modifications) which showed improvements in the GOF characteristics (R2) after the application of the method. The first page of the file includes improved RIAs estimations of peptides from Supplementary Figs. 15.

Fig. 2: An approach using two mass isotopomers accurately reproduces the monoisotopic RIA from the complete isotope profiles.
figure 2

The time series of monoisotopic RIAs (y-axis) is shown along the labeling duration (x-axis). The turnover rates, which were computed by using the reconstruction of the RIAs by each one of the three ratios, A1(t)/A0(t) (blue pluses), A2(t)/A0(t) (yellow crosses), and A2(t)/A1(t) (green stars), agreed to up to 14% with the turnover rate (0.135 day−1) calculated using the complete isotope profile. Solid lines show the fits from the corresponding (based on the color of the data points) label enrichment determination technique. CPSM_MOUSE – mouse protein carbamoyl-phosphate synthase (ammonia), mitochondrial.

The validations of monoisotopic RIAs and turnover rates estimated from partial isotope profiles

The Orbitrap Eclipse dataset was used to validate the use of partial isotope profiles for estimation of the monoisotopic RIAs and turnover rates. First, we compared the values of I0(t) from complete isotope profiles with those obtained by using a pair of mass isotopomers. The comparison was made for peptides with high GOF characteristics to the experimental data (R2 ≥ 0.95). Further filtering was done to restrict the peptides to those whose turnover rates fit the range of shortest (1 day) and longest durations of the labeling (21 days), e.g., 0.05 ≤ k ≤ 0.6. There were 1928 distinct peptides that passed the thresholds. It is assumed that the estimations of the monoisotopic RIAs from the complete isotope profiles were accurate, and the isotope profiles of these peptides served as a validation set for the method using only two mass isotopomers. At every time point of labeling, \(\widetilde{{{{{{{\rm{I}}}}}}}_{0}\left({{{{{\rm{t}}}}}}\right)}\) was computed using Eq. (3) and compared with that obtained from full isotope profile, Supplementary Eq. (1). The density of the relative differences, \(({{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})-\widetilde{{{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})})/{{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})\), is plotted in Supplementary Fig. 6. As the figure shows, the distribution is bell-shaped with a single mode at zero. The relative RIA differences did not exceed 10% of the RIA computed from the complete isotope profile, Supplementary Eq. (1). The final width of the distribution is attributed to the fluctuations in measurements of the mass isotopomer abundances41. The results showed that for high-quality spectral data, the values of the monoisotopic RIA, I0(t), computed from complete or partial isotope profiles, agreed well and validated our approach for using partial isotope profiles for estimating the monoisotopic RIA in experimental data.

Next, the turnover rates obtained from partial and complete isotope profiles were compared. For the high-quality dataset (R2 ≥ 0.95), we used Eq. (3) to compute turnover rates obtained from the reconstructed RIAs. The rates were compared with those obtained from RIAs computed from complete isotope profiles. Supplementary Fig. 7 shows the relative differences between turnover rates obtained by the original approach (using a full isotope profile to compute the monoisotopic RIA) and those obtained by using a pair of mass isotopomers, \(\widetilde{{{{{{{\rm{I}}}}}}}_{0}\left({{{{{\rm{t}}}}}}\right)}\). The density of the distribution was zero-centered with a width about 20%. These data show that our approach using two mass isotopomers for computing the monoisotopic RIA can, in practice, be used to estimate the protein turnover rates. Supplementary Fig. 8 is the scatter plot of the relative errors of the turnover rates and I0(t) estimations from a pair of mass isotopomers. As seen from the figure, the joint distribution is centered around zero on both axes. The Pearson correlation between the relative differences was 0.11. It indicates that there was no systematic error in estimation of the turnover rate from the reconstructed monoisotopic RIA.

The monoisotopic RIAs obtained by using the A1(0)/A0(0), A2(0)/A1(0), and A2(0)/A0(0) ratios and complete isotope profiles were tested to determine how often each one of them improved the RIA estimations. The test determines if there is any redundancy in the estimations. Isotope profiles of non-labeled peptides were used because, for them, the accurate isotope profile is that of the natural isotope distribution, and a direct comparison can be made between the theoretical prediction and experimental observation. For this test, isotope profiles of peptides with R2 larger than 0.75 were used, (15700 peptides). Supplementary Fig. 9A shows the heat map and scatter plot of the experimentally (from complete isotope profiles) and theoretically (from the natural isotope distributions of atoms) computed monoisotopic RIAs. The ideal distribution would be the identity line (shown in red). Supplementary Fig. 9B–D show the heat map and scatter plots of monoisotopic RIAs obtained by using A1(0)/A0(0), A2(0)/A1(0), and A2(0)/A0(0) ratios versus the theoretical monoisotopic RIAs. As is seen from the figures and the correlation analyses, the best matches to the theoretical RIAs are obtained by using A1(0)/A0(0). Supplementary Fig. 10A shows the result obtained from choosing the best fit from any of the ratios, and Supplementary Fig. 10B shows the best fit from either a complete isotope profile or any of the ratios. The mean and standard deviation of the relative differences of RIAs from complete isotope profiles and theoretical calculations (using atomic isotope distributions) were 0.0427 and 0.093, respectively, Supplementary Fig. 9A. After combining the data from the complete mass isotopomer profiles and the ratios of abundances of mass isotopomers, the mean and standard deviation of the relative differences (from the theoretical values) were reduced to 0.0097 and 0.037, respectively, Supplementary Fig. 10B. Pearson correlation between the RIAs computed from only complete isotope profiles was 0.95. It increased to 0.99 when the ratios and complete isotope profiles were used. The greatest number (34%) of RIA estimations came from the A1(0)/A0(0) ratio, Supplementary Fig. 11. The other ratios and complete isotope profile estimations contributed approximately equally. In the d2ome+ workflow section of Supplementary Information, we provide additional data and analyses of the performances of complete and partial isotope approaches for the unlabeled sample. In particular, we provide the density plots of the relative errors of RIA estimations from all methods (Supplementary Fig. 12), the complementarity of the various isotope ratios (Supplementary Fig. 13), and improvements for the peptides whose monoisotopic RIAs are under or over-estimated by complete isotope profiles and were improved by partial isotope profiles (Supplementary Table 1). The description also contains the substantiation for using six mass isotopomers in the protein turnover study based on the deuterium labeling, Supplementary Fig. 14. In Supplementary Data 2, we provide the results of the estimations of the monoisotopic RIAs from each of the methods for all peptides in an unlabeled sample. The proportions of erroneously estimated RIAs were higher for the low abundance peptides, Supplementary Table 1. The relative error of the RIA estimation was higher at the start and end of the chromatographic peak elution, which is shown in the box plot, Supplementary Fig. 15. The analyses provide the characteristics of the monoisotopic RIA estimation from complete and partial isotope profiles with mass accuracy (Supplementary Table 2), chromatographic elution window (Supplementary Table 3), and sample fractionation (Supplementary Table 4).

The approach using two mass isotopomers doubles the number of high-quality model fits

The Orbitrap Eclipse dataset was used to determine the performance of the approach using only two mass isotopomers, identify mass spectral characteristics of peptides for which the quantification was improved, and analyze the accuracy of the rate constant estimations for these peptides. The number of quantified peptides that passed the R2 threshold of 0.8 before and after the use of two mass isotopomer ratios are shown in Fig. 3. The time course of peptides that were quantified in at least four labeling durations were considered36,39. In the dataset, there were 22468 such peptides in total. d2ome+ increased the number of peptides with the improved theoretical fit (R2 ≥ 0.8) to the experimental data by 60%. The exact numbers of peptides are shown in Supplementary Table 5. To detail the improvements by the ranges of the R2 values, the latter were divided into four groups (R2 < 0.8; 0.8 ≤ R2 < 0.9; 0.9 ≤ R2 < 0.95; R2 ≥ 0.95). It is important to detail the improvements, as R2 ≥ 0.8 may mean 0.8 ≤ R2 < 0.85 or R2 ≥ 0.95. The latter interval indicates the most accurate fit to the experimental data. These results are shown in Fig. 4a (original d2ome) and B (d2ome + ). The improvement was in the category of peptides for which the R2 ≥ 0.95, where the number of peptides more than doubled, Supplementary Table 5. The improvements were achieved by each one of the ratios. It is shown in Fig. 5 which depicts proportions that each specific ratio (out of the three) had contributed to the improvements in the R2 coefficients. This finding indicates that all three mass isotopomers were important for improvement of the isotope enrichment estimations.

Fig. 3: The number of peptides useful for protein turnover quantification increases by 60% after processing with d2ome + (two mass isotopomer approach).
figure 3

The blue bars show the number of peptides for which the coefficient of determination (R2) of the theoretical fit to the experimental data was equal to or larger than 0.8. The red bars show the number of peptides for which R2 is less than 0.8. From the estimations using complete isotope profiles (original d2ome), theoretical fits for 10644 distinct peptides had R2 ≥ 0.8. The use of partial isotope profiles (d2ome+) increased the corresponding number to 16708. The number of distinct peptides with R2 < 0.8 was reduced from 14610 to 8546.

Fig. 4: The number of peptides for which the coefficient of determination (R2) is 0.95 or higher has more than doubled when using d2ome + (two mass isotopomer approach).
figure 4

Shown are the pie charts of detailed percentages of the results turnover rates from d2ome (a) and d2ome + (b) approaches. The percentages of quantified peptides with respect to four different R2 intervals are shown. The percentage of peptides for which R2 ≥ 0.95 (blue color) has increased from 16 to 40%. This group has become the largest category of peptides after the theoretical fit using the monoisotopic RIAs obtained from the ratios of two mass isotopomers. The percentages of peptides with R2 < 0.8 (red color) decreased by 24%. The number of peptides in the other two categories remained unchanged.

Fig. 5: Each of the three ratios contributed about equally to the improved estimations of the monoisotopic RIA.
figure 5

The pie chart shows the percentages of data points for which the estimations of labeling enrichment from each ratio type were optimal. For 34% (~61000 data points) of improved estimations of the label enrichments, the improvements were achieved by using A2(t)/A0(t) ratio.

The results were analyzed to find out for which type of mass spectral data the two mass isotopomer approach improves the GOF of the exponential decay model. The focus was on the spectral accuracy (SA) of the complete isotope profiles and its effect on the R2. For this, at first, the distributions of SA of the data for which the R2 was high (R2 ≥ 0.95, 3106 peptides) and low (R2 ≤ 0.25, 7372 peptides) were computed. Since the isotope profiles of peptides change during the labeling, theoretical natural isotope distributions are not sufficient for estimating the SA. It was assumed that at the time point of labeling, t, the mass isotopomer distribution of a peptide is a composite spectrum, \({{{{{\rm{S}}}}}}({{{{{\rm{t}}}}}})\), of unlabeled (with the proportion θ), \({{{{{{\rm{S}}}}}}}_{{{{{{\rm{unlabeled}}}}}}}\), and labeled (with the (1– θ) proportion), \({{{{{{\rm{S}}}}}}}_{{{{{{\rm{labeled}}}}}}}\), forms of the peptide42:

$${{{{{\rm{S}}}}}}({{{{{\rm{t}}}}}})={{{{{{\rm{\theta }}}}}}\,{{{{{\rm{S}}}}}}}_{{{{{{\rm{unlabeled}}}}}}}+\left(1-{{{{{\rm{\theta }}}}}}\right){{{{{{\rm{S}}}}}}}_{{{{{{\rm{labeled}}}}}}}$$
(4)

where \({{{{{{\rm{S}}}}}}}_{{{{{{\rm{unlabeled}}}}}}}\) is the isotope profile of the unlabeled (natural) form of a peptide, and \({{{{{{\rm{S}}}}}}}_{{{{{{\rm{labeled}}}}}}}\) is the isotope profile of the fully labeled form of the peptide. \({{{{{{\rm{S}}}}}}}_{{{{{{\rm{labeled}}}}}}}\) is determined from the body water enrichment (pW) and the number of exchangeable hydrogens (NEH), using the formulas for mass isotopomers abundances (Eqs. (5)–(9) in the “Methods” section). θ can be between 0 to 1; zero corresponds to the unlabeled peptide, one is the fully labeled peptide. This is an often-used form of representation of a composite spectrum43, but here isotope profile of the labeled peptide is computed from a formula instead of the simulation, e.g., by Fourier transforms. θ is obtained from the minimization of the sum of squares of differences between computed, \({{{{{\rm{S}}}}}}({{{{{\rm{t}}}}}})\), and experimental, \({{{{{{\rm{S}}}}}}}^{{{{{{\rm{expr}}}}}}}({{{{{\rm{t}}}}}})\), isotope profiles:

$${{{{{\rm{\theta }}}}}}={\arg }\mathop{{{\min }}}\limits_{{{{{{\rm{\theta }}}}}}}{\left({{{{{{\rm{S}}}}}}}^{{{{{{\rm{expr}}}}}}}({{{{{\rm{t}}}}}})-{{{{{\rm{S}}}}}}({{{{{\rm{t}}}}}})\right)}^{2}={\arg }\mathop{{{\min }}}\limits_{{{{{{\rm{\theta }}}}}}}\mathop{\sum }\limits_{k=0}^{5}{\left({{{{{{\rm{S}}}}}}({{{{{\rm{t}}}}}})}_{{{{{{\rm{k}}}}}}}^{{{{{{\rm{expr}}}}}}}-{{{{{{\rm{S}}}}}}({{{{{\rm{t}}}}}})}_{k}\right)}^{2}{subject}\,{to}:{{{{{\rm{\theta }}}}}}\in \left[0{{{{{\rm{;}}}}}}1\right]$$

The minimum sum of the squares of the differences (SSD) between the theoretical profile from Eq. (4) and the experimental mass isotopomer profile of peptides characterizes the SA of the experimental mass isotopomer at the labeling time point t. The SSDs were calculated for two groups of peptides with R2 ≥ 0.95 and R2 ≤ 0.25. Supplementary Fig. 16 shows the density plots of the natural log-transformed values of SSDs for each group. The densities of the SSDs for these two groups were well separated. It shows that high GOF characteristics are obtained from isotope profiles that have better SAs. Our approach corrects the estimations of the label incorporation for peptides with moderate-to-low GOF characteristics.

Accuracy of turnover rates obtained from the ratios

Since the real turnover rates of proteins are unknown, it is not possible to directly estimate the accuracy of the turnover rates. However, one can use the turnover rates of proteins quantified by multiple distinct peptides, as a basis for the comparison. In general, the estimations of turnover rates of proteins quantified by many peptides are robust. For this analysis, we used only proteins that had at least six distinct quantified peptides that passed the R2 ≥ 0.8 threshold (using complete isotope profiles). Turnover rates, whose GOF characteristics were improved with the computation using the ratios, were compared with the distribution of the rates obtained from the complete isotope profiles. For the comparison metric, a normalized difference between the protein turnover rate, kprot (the median of turnover rates of all peptides of the protein), and the peptide turnover rate, kpep, was used. The metric was \(({k}_{{pep}}-{k}_{{prot}})/\scriptstyle\sqrt{{{k}_{{pep}}}^{2}+{{k}_{{prot}}}^{2}}\). The histograms of the metric before and after using the ratios are shown in Supplementary Fig. 17. The addition of the peptides improved the mean and median, while the standard deviation of the distribution of the metric slightly increased. Thus, the median and mean of the test metric in the updated distribution were 0.02 and 0.04, respectively. In the original distribution, the corresponding values were 0.04 and 0.06. The standard deviation of the distribution before and after the update was 0.16 and 0.2. The numbers of distinct peptides that passed the R2 threshold before and after the use of the ratios for label enrichment were 7210 and 10543, respectively. The results show that using the ratios in peptide turnover rate calculations increased the number of distinct peptides and did not affect the accuracy of the turnover rate estimations for proteins.

Murine liver protein turnover

In the Orbitrap Eclipse dataset, there were 2392 proteins, which had at least one peptide quantified in four or more experiments. As summarized in Supplementary Table 5, for 1769 of the proteins, there were at least one peptide with R2 ≥ 0.8 from time courses computed with complete isotope profiles. The number increased to 2108 for the approach using partial isotope profiles. The number of proteins with at least one peptide with R2 ≥ 0.95 increased by 80% when the monoisotopic RIA was computed from partial isotope profiles.

The global turnover rate of murine proteins has been reported to be 2–3 days8, which corresponds to turnover rates 0.23–0.35 day−1. The median and mean of the turnover rates in the dataset were 0.28 day−1 and 0.22 day−1, respectively. The density plot of the turnover rate distribution is shown in Supplementary Fig. 18. Proteins such as Major Urinary Protein and APOE turnover very fast44—by the time the data was collected for the first labeling time point, 1 day. For these proteins, it can accurately be stated that their turnover rates are faster than 0.69 day−1. The rates of all quantified proteins are reported in Supplementary Data 3. The tables also include the CIs for protein turnover rates which are computed from standard deviations of the rate constants of their constituent peptides (Supplementary Eq. (12)).

Long-lived proteins

There existed a group of murine liver proteins with computed half-lives longer than 21 days (turnover rate smaller than 0.033 day−1). In general, long-lived proteins have received prominent attention in protein turnover studies45,46. Intracellular proteins with long lifespans have been linked to age-dependent defects, which include decreased fertility and functional decline in neurons45. We analyzed the list of proteins with half-lives longer than 21 days using the STRING47 database. The list included histones, lamins, collagens, nuclear pore proteins, hemoglobins, cytoskeletal keratins, Band 3 anion transport protein (major integral membrane glycoprotein of the erythrocyte membrane), and carbonic anhydrases. The STRING47 analysis generated the protein network shown in Supplementary Fig. 19. There are a few subnetworks that are made of interacting histones, cytoskeletal keratins, collagens, carbonic anhydrases, and hemoglobins. Figure 6 shows the representative time series data and the corresponding theoretical fits for different Histone proteins. Members (H2A1B_MOUSE, H2B1B_MOUSE, H31_MOUSE, H4_MOUSE) of the four histone families (H2A, H2B, H3, H4) that are the core components of nucleosome had slower turnover rates ranging from 0.024 day−1 to 0.043 day−1, Fig. 6A–D. Histones of the H1 family bind to the linker DNA between the nucleosomes. The H14_MOUSE variant had a faster turnover rate, 0.0715 day−1, Fig. 6E. Time series data from peptides of these and other slow turnover proteins obtained from time series of 96 distinct peptide sequences are shown in Supplementary Data 4. The approach using partial isotope profiles increased the number of distinct peptides that passed the GOF threshold by 60%.

Fig. 6: The turnover rates of histones are accurately computed by the approach using two mass isotopomers.
figure 6

Slow turnover rates of Histones are reflected in the time series of label incorporation obtained using complete (experimental) isotope profiles and reconstructed isotope profiles (using a pair of mass isotopomers). Shown are the labeling time series data of a peptide of histones: (a) H2A1B (histone 2A type 1-B), (b) H2B1B (histone 2B type 1-B), (c) H3.2 (histone H3.2), (d) H4 (histone H4), (e) H1.4 (histone H1.4), and (f) H2AY (core histone macro-H2A.1).

One group of proteins provides a positive control for zero turnover rate computations. The contaminants, e.g., trypsin and various human keratins, are introduced into proteomic samples at each sample preparation48. They are not labeled with heavy water; their “turnover rate” should be zero. Supplementary Data 5 contains labeling time series of 123 distinct peptides from the contaminant proteins. These peptide sequences are not shared with any of the murine protein sequences in the SwissProt database. As seen from the figures, the computed turnover rates were practically zero. It is noted that for very slow turnover proteins (turnover rate less 0.01 day−1), the R2 is not an appropriate measure of the GOF37. For these proteins, the standard error of the theoretical fit (≤0.05) was used as a GOF measure.

Turnover of proteins in complexes

In general, proteins that form a physical complex are expected to have similar turnover rate. CORUM49 and Gene Ontology (GO)50 databases were used to determine the protein enrichments of complexes. A GO complex was included when the complex was not defined in CORUM. Figure 7 shows the boxplots of the ten-based logarithms of protein turnover rates in each complex. The number of quantified proteins in each complex is shown inside the parentheses. The vertical dashed line is the median of protein turnover rates (0.28 day−1). It corresponds to the median half-life of 2.5 days. The median turnover rate of each one of the mitochondrial respiratory chain complexes was smaller than the liver median. The observation agrees with mouse brain protein turnover, where the mitochondrial respiratory chain proteins were reported7 to turn over slower than the median protein turnover. The ribosomal proteins, too, had slower turnover rates than the median. This is different from the reported result in mouse brain7, where the median turnover rate of the ribosomal proteins was similar to that of all proteins. The Wilcoxon rank sum test was used to identify statistically significant differences in turnover rates of the complexes. The p values were adjusted for the multiple hypothesis testing. The proteins of the mitochondrial ribosome had statistically significant turnover rates from those of the large ribosomal subunit. The differences in turnover rates of the proteins of respiratory chain complex V (mitochondrial proton transporting ATP synthase) from those of the respiratory chain complexes I and III were statistically significant. While the median turnover rate of the complex V proteins was smaller than those of the complexes II and IV, the results were not statistically significant. The sample size in complex II was too small (three proteins out of four listed in the CORUM database). Therefore, the p value of the difference is not significant. In mouse brain, proteins in the complexes III and V had been reported7 to have similar behavior of the median turnover rates. Their turnover rates were slower than those of proteins in the other complexes of the mitochondrial respirator chain.

Fig. 7: Proteins of a complex have similar turnover rates.
figure 7

Turnover rates of proteins in protein complexes obtained from Gene Ontology and CORUM databases. x-axis is the base-10 logarithm of the protein turnover rates. Shown are the boxplots of turnover rates of proteins for each complex. Each box comprises turnover rates between the 25th and 75th percentiles of the complex proteins. The vertical bar in each box is the median turnover rate of the proteins in the corresponding complex. The blue dot in each box is the mean turnover rate of the proteins in the complex. The horizontal blue line indicates the standard error of the mean. The dashed vertical line is the median protein turnover rate in the liver sample. COPI coatomer complex, ATP adenosine triphosphate, ER endoplasmatic reticulum, rRNP ribosomal ribonucleoprotein, COP9 Constitutive photomorphogenesis 9. 1—complexes obtained from the Gene Ontology database (cellular component). 2—complexes obtained from the CORUM database.

The FDR of the Wilcoxon rank sum test (p_value was equal to 0.04) for the comparison of turnover rates of the regulatory particle and core (20 S proteasome) complex of the proteasome was 0.07. The median turnover rate of the core complex proteins was smaller than that of the regulatory particle proteins. This result agrees with the corresponding result for turnover rates from several cell types51. The regulatory particle consists of the lid and base subcomplexes. It recognizes proteins destined for degradation by the proteasome. It is turned over faster than the core complex proteins (includes proteins for the catalytic subunit for the proteolysis). All Wilcoxon rank sum test results are provided in the Supplementary Data 6.

Validations of the approach on datasets from four murine tissue types acquired on Orbitrap Q Exactive HF mass spectrometer

Since the Orbitrap Eclipse dataset was used for development and testing, we also used other datasets to confirm that d2ome+ increased the proteome coverage for protein turnover rate estimations. A recent study36 provided heavy water-labeled LC-MS dataset for analyses of protein turnover in four tissues of the C57/BL6J mouse strain. The data was acquired on Orbitrap Q Exactive HF. In addition to the liver, the study included kidney, heart, and muscle tissues. The latter two have slower protein turnover and differ in that regard from liver tissue analyzed in this study. We have processed the LC-MS data from all four tissue types for improvements in the GOF characteristic of the theoretical fit to the experimental time course data. Protein turnover rates are reported in Supplementary Data 3. The changes in the numbers of peptides that passed the R2 threshold for each tissue type are shown in Supplementary Fig. 20A–D. It is noteworthy that the percent of the peptides for which R2 ≥ 0.8 using the complete isotope profiles obtained in the Orbitrap Q Exactive HF mass analyzer was at least 50%, which was higher than that in the data acquired on Orbitrap Eclipse. For these data, the percent of high (R2 ≥ 0.8) GOF peptides ranged from 51% (muscle) to 62% (liver). Nonetheless, d2ome+ improved the quantification results in all four datasets. In each dataset, it has increased the number of peptides with R2 ≥ 0.8 by at least 30%. The peptide turnover rates for each tissue type obtained using heavy water and amino acid labeling matched closely. The correlation coefficients ranged from 0.96 (muscle proteome) to 0.87 (liver proteome). The scatter plots of the peptide turnover rates are shown in Supplementary Fig. 21A–D. All peptides contained at least one Lys amino acid, as the original study36 compared the turnover rates obtained from labeling with Lys with those using heavy water. In summary, the estimation of label enrichment from abundances of a pair of mass isotopomers improved the proteome coverage of turnover rate estimations from Orbitrap Q Exactive mass spectrometer for all four tissue types (with varying turnover rates). The result indicates that the approach taken in this work generalizes to slow (heart, muscle) and fast (liver, kidney) protein turnover tissues.

Discussion

This work reports on the development of an approach to compute protein turnover from partial isotope profiles. It is based on the estimation of label enrichment from raw abundances of only two mass isotopomers. The enrichment was used to re-construct the corresponding monoisotopic RIA. The newly generated RIAs were used in the time series data to fit the exponential decay model to extract the turnover rate. The developments were necessitated by the observations37,38 that the traditional approach, which estimates the monoisotopic RIA from the complete isotope profiles, results in many quantified peptides that are not used for protein turnover rate estimations. Their GOF characteristics to experimental time series from these peptides were low, R2 < 0.8.

The ratios from three pairs of mass isotopomers were examined in the study, A1(t)/A0(t), A2(t)/A0(t), and A2(t)/A1(t). The best fit from all pairs of the ratios resulted in a 60% increase in the number of peptides that passed the R2 ≥ 0.8 thresholds. The number of peptides for which R2 ≥ 0.9 more than doubled. The examination of the contribution from each one of the ratios revealed the improvements resulted from all ratios with nearly equally contributions. It has been previously reported52,53 that contaminant peptides affect the quantification in protein turnover studies that included SILAC54 labeling. For example, a “prior ion ratio” was introduced as a metric for co-elution with the target peptide52. The prior ion was defined as an ion whose mass is one neutron less than that of the target peptide’s monoisotopic mass. The prior ion ratio was defined as the ratio of the abundance of the prior ion to the sum of the abundances in the isotopic cluster of the target. The ratio served as an indicator for contamination of the target peptides’ isotope cluster. In our approach, this (prior ion) co-elution will mainly describe the interferences with the monoisotope. In this case, the label enrichment estimation and follow-up quantification will be done using the A2/A1 ratio.

The spectral data, which improved the estimation of label enrichment from a pair of mass isotopomers, were analyzed to identify their common properties. The spectral accuracy of these peptides was poor. They are contrasted with the corresponding data for peptides with high R2, which showed good spectral accuracy. It was concluded that the approach for estimating label enrichments from a pair of mass isotopomers improves the labeling time series data for peptides whose mass profiles had inferences from contaminants. Since the complete isotope profile of a peptide requires measurements of up to six mass isotopomers, the co-elutions are likely in complex mammalian samples.

The approach was applied to other datasets from four tissue types with varying turnover rates. In all cases, the number of quantified peptides was increased, and the number of high-quality peptides was doubled. It shows that the method generalized well to slow and fast turnover proteins.

Previous studies have examined approaches for using partial isotope profiles to estimate label incorporation43,55,56. Thus, determination43 of the label enrichment from the first two mass isotopomers had been used for samples labeled with 12C. An approximation to Supplementary Eq. (2) was used to estimate the protein turnover in heavy water labeling from a single labeled sample and two mass isotopomers55. An approximate expression for the A2(t)/A0(t) ratio and a “fudge” factor for it were used in another study56. Another use of the ratios was to estimate the label enrichment from a regression model32. Regression coefficients of several ratios were determined from theoretical simulations of isotope profiles using known label enrichments. The regression coefficients then were used to determine the label enrichment from experimental isotope profile. From the accurate formulas, a single pair of mass isotopomers is enough to estimate the label enrichment. In addition, the applications in this work showed (Fig. 5 and Supplementary Figs. 913) that the estimations separately from each pair of mass isotopomers are necessary to improve the quality of the label quantification. The ratios are chosen from the first three mass isotopomers as they are often the most abundant for peptides. To the best of our knowledge, this work is the first to systematically implement the use of partial isotope profiles for computing the monoisotopic RIA in stable isotope labeling.

We compared the murine liver protein turnover rates from heavy water labeling (Orbitrap Eclipse)14 and heavy amino-acid, 13C6-Lys, labeling36. In both studies, the C57/BL6J mouse strain was used. In heavy amino acid-based labeling, protein turnover is calculated from peptides that contain at least one labelable amino acid. The scatter plot and heat map of protein turnover rates from all proteins are shown in Supplementary Fig. 22A. There were 984 proteins common to both datasets in the turnover rate range [0.01, 1.0] day−1. The turnover range was restricted based on the shortest and the longest durations of labeling. The Pearson correlation between the rates was 0.71. For proteins quantified by multiple peptides, the turnover rate estimations are expected to be accurate. When we filtered the proteins to require at least three unique peptides, the correlation increased to 0.85, and the number of common proteins reduced to 340. The corresponding scatter plot and heat map showed a structure in the data, Supplementary Fig. 22B. It was observed that for proteins whose turnover rate falls into the [0.01, 0.3] day−1 range, the differences between the turnover rates from the two labeling approaches were mainly smaller than 20%. In this range, there were 214 proteins, and for 164 of them, the turnover rates computed by the two labeling approaches differed by less than 20%. For the proteins in this turnover range, the largest relative deviation of rates between the two methods was 70%. We conclude that for abundant proteins in the [0.01, 0.3] day−1 range, both labeling methods produced similar and reproducible results. It is noted that the structure in the heat map was noticeable in the unfiltered (by the number of quantified peptides) data as well, Supplementary Fig. 22A.

In summary, we developed an approach to estimating protein turnover from two mass isotopomers. The approach improved the GOF characteristics of the protein turnover model in various tissue types with slow and fast turnover proteins. The implemented tool, d2ome + , is publicly available. It increases the proteome coverage in protein turnover studies using heavy water metabolic labeling and LC-MS. We expect that it will be useful to a broader community as a data processing tool and promote the applications of the heavy water labeling platform for protein turnover studies.

Here the approach was applied to a deterministic and one-compartment model of the depletion of monoisotopic RIA. There are more complex (such as stochastic and two-compartment) models for the time course of depletion and turnover rate57,58. The stochastic model accounts for the correlations in the time course data, and the two-compartment model incorporates label enrichment kinetics of amino acids. The suggested approach to estimating the label enrichment from a ratio of two mass isotopomers reconstructs the monoisotopic RIA. The reconstructed RIA can be used in complex models for protein turnover rate computation.

Methods

Animal experiments, sample preparation, LC-MS experiments, protein identification, and d2ome+ workflow are described in the correspondingly named sections of Supplementary Information.

The time course of the monoisotopic RIA was given in Eq. (3) in the main text. The normalized abundances of the first three heavy mass isotopomers were previously derived59:

$${{{{{{\rm{I}}}}}}}_{1}\left({{{{{\rm{t}}}}}}\right)=\frac{\left(1-{{{{{{\rm{p}}}}}}}_{{{{{{\rm{H}}}}}}}-{{{{{{\rm{p}}}}}}}_{{{{{{\rm{X}}}}}}}({{{{{\rm{t}}}}}})\right){{{{{{\rm{p}}}}}}}_{{{{{{\rm{X}}}}}}}({{{{{\rm{t}}}}}})}{\left(1-{{{{{{\rm{p}}}}}}}_{{{{{{\rm{H}}}}}}}\right)}{{{{{{\rm{N}}}}}}}_{{{{{{\rm{EH}}}}}}}{{{{{{\rm{I}}}}}}}_{0}\left({{{{{\rm{t}}}}}}\right)+{\left(1-\frac{{{{{{{\rm{p}}}}}}}_{{{{{{\rm{X}}}}}}}({{{{{\rm{t}}}}}})}{1-{{{{{{\rm{p}}}}}}}_{{{{{{\rm{H}}}}}}}}\right)}^{{{{{{{\rm{N}}}}}}}_{{{{{{\rm{EH}}}}}}}}{{{{{{\rm{I}}}}}}}_{1}\left(0\right)$$
(5)
$${{{{{{\rm{I}}}}}}}_{2}\left({{{{{\rm{t}}}}}}\right)={{{{{{\rm{I}}}}}}}_{0}\left({{{{{\rm{t}}}}}}\right)\left\{\frac{{{{{{{\rm{I}}}}}}}_{2}\left(0\right)}{{{{{{{\rm{I}}}}}}}_{0}\left(0\right)}-\frac{{{{{{{\rm{I}}}}}}}_{1}\left(0\right)}{{{{{{{\rm{I}}}}}}}_{0}\left(0\right)}\frac{{{{{{{\rm{p}}}}}}}_{{{{{{\rm{H}}}}}}}{{{{{{\rm{N}}}}}}}_{{{{{{\rm{EH}}}}}}}}{\left(1-{{{{{{\rm{p}}}}}}}_{{{{{{\rm{H}}}}}}}\right)}+{{{{{{\rm{b}}}}}}}_{2}\left(0\right)-{{{{{{\rm{b}}}}}}}_{2}\left({{{{{\rm{t}}}}}}\right)\right\}+{{{{{{\rm{b}}}}}}}_{1}\left({{{{{\rm{t}}}}}}\right){{{{{{\rm{I}}}}}}}_{1}\left({{{{{\rm{t}}}}}}\right)$$
(6)

In Eq. (7), cn denotes the following coefficient:

$${{{{{{\rm{c}}}}}}}_{{{{{{\rm{n}}}}}}}=\left(\begin{array}{c}{{{{{{\rm{N}}}}}}}_{{{{{{\rm{EH}}}}}}}+{{{{{\rm{n}}}}}}-1\\ {{{{{\rm{n}}}}}}\end{array}\right){\left(\frac{{{{{{{\rm{p}}}}}}}_{{{{{{\rm{H}}}}}}}}{1-{{{{{{\rm{p}}}}}}}_{{{{{{\rm{H}}}}}}}}\right)}^{{{{{{\rm{n}}}}}}}$$

bn(t) was defined in the main text.

For this work, we derived equations for the heavy mass isotopomers I4(t) and I5(t), and finalized the derivation of I3(t), which made it independent of the number of all hydrogens unlike the previous formula59. The heavy mass isotopomers are important because their relative abundance significantly increase even for small peptides (<1100 Da mass) after labeling with 5% enriched heavy water, Supplementary Fig. 14.

$${{{{{{\rm{I}}}}}}}_{3}({{{{{\rm{t}}}}}})= \,{{{{{{\rm{b}}}}}}}_{3}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})+{{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}})({{{{{{\rm{I}}}}}}}_{1}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})) \\ +{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}})\left\{{{{{{{\rm{I}}}}}}}_{2}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}})({{{{{{\rm{I}}}}}}}_{1}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}}))-{{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})\right\} \\ +\left\{\frac{{{{{{{\rm{I}}}}}}}_{3}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}-{{{{{{\rm{c}}}}}}}_{1}\frac{{{{{{{\rm{I}}}}}}}_{2}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}+{{{{{{\rm{c}}}}}}}_{2}\frac{{{{{{{\rm{I}}}}}}}_{1}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}-{{{{{{\rm{c}}}}}}}_{3}\right\}{{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})$$
(7)
$${{{{{{\rm{I}}}}}}}_{4}({{{{{\rm{t}}}}}})= \,{{{{{{\rm{b}}}}}}}_{4}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})+{{{{{{\rm{b}}}}}}}_{3}({{{{{\rm{t}}}}}})\{{{{{{{\rm{I}}}}}}}_{1}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})\}\\ +{{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}})\{{{{{{{\rm{I}}}}}}}_{2}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}})({{{{{{\rm{I}}}}}}}_{1}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}}))-{{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})\} +{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}})\{{{{{{{\rm{I}}}}}}}_{3}({{{{{\rm{t}}}}}})\\ -{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}})[{{{{{{\rm{I}}}}}}}_{2}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}})({{{{{{\rm{I}}}}}}}_{1}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}}))-{{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})]-{{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}})[{{{{{{\rm{I}}}}}}}_{1}({{{{{\rm{t}}}}}}) \\ -{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})]-{{{{{{\rm{b}}}}}}}_{3}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})\}+\{\frac{{{{{{{\rm{I}}}}}}}_{4}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}-{{{{{{\rm{c}}}}}}}_{1}\frac{{{{{{{\rm{I}}}}}}}_{3}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}+{{{{{{\rm{c}}}}}}}_{2}\frac{{{{{{{\rm{I}}}}}}}_{2}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}-{{{{{{\rm{c}}}}}}}_{3}\frac{{{{{{{\rm{I}}}}}}}_{1}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}+{{{{{{\rm{c}}}}}}}_{4}\}{{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})$$
(8)
$${{{{{{\rm{I}}}}}}}_{5}({{{{{\rm{t}}}}}})= \,{{{{{{\rm{b}}}}}}}_{5}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})+({{{{{{\rm{b}}}}}}}_{4}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{b}}}}}}}_{3}({{{{{\rm{t}}}}}}))\{{{{{{{\rm{I}}}}}}}_{1}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})\}\\ +({{{{{{\rm{b}}}}}}}_{3}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}}))\{{{{{{{\rm{I}}}}}}}_{2}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}})({{{{{{\rm{I}}}}}}}_{1}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}}))-{{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})\}\\ +({{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}^{2}({{{{{\rm{t}}}}}}))\{{{{{{{\rm{I}}}}}}}_{3}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}})[{{{{{{\rm{I}}}}}}}_{2}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}})({{{{{{\rm{I}}}}}}}_{1}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}}))-{{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})]\\ -{{{{{{\rm{b}}}}}}}_{2}({{{{{\rm{t}}}}}})[{{{{{{\rm{I}}}}}}}_{1}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})]-{{{{{{\rm{b}}}}}}}_{3}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})\}+{{{{{{\rm{b}}}}}}}_{1}({{{{{\rm{t}}}}}})({{{{{{\rm{I}}}}}}}_{4}({{{{{\rm{t}}}}}})-{{{{{{\rm{b}}}}}}}_{4}({{{{{\rm{t}}}}}}){{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}}))\\ +\{\frac{{{{{{{\rm{I}}}}}}}_{5}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}-{{{{{{\rm{c}}}}}}}_{1}\frac{{{{{{{\rm{I}}}}}}}_{4}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}+{{{{{{\rm{c}}}}}}}_{2}\frac{{{{{{{\rm{I}}}}}}}_{3}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}-{{{{{{\rm{c}}}}}}}_{3}\frac{{{{{{{\rm{I}}}}}}}_{2}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}+{{{{{{\rm{c}}}}}}}_{4}\frac{{{{{{{\rm{I}}}}}}}_{1}(0)}{{{{{{{\rm{I}}}}}}}_{0}(0)}-{{{{{{\rm{c}}}}}}}_{5}\}{{{{{{\rm{I}}}}}}}_{0}({{{{{\rm{t}}}}}})$$
(9)

It is emphasized that the formulas for the dynamics of RIA of the six mass isotopomers are applicable to any metabolic labeling enrichment resulting from atom-based stable isotope enrichments, such as 15N or 13C. The only change that is needed to make is to replace NEH with the number of atoms in a peptide which are accessible to the labeling agent. For example, in 15N labeling, NEH is replaced with the number of exchangeable Nitrogens in a peptide. In addition, the formulas are also applicable to accurately estimate deuterium enrichment in hydrogen/deuterium exchange mass spectrometry for studying protein structure60.

The above equations are used to obtain the ratios of raw abundances of the ith and jth mass isotopomers, Ai(t)/Aj(t), since the normalization coefficient cancels out:

$${{{{{{\rm{A}}}}}}}_{{{{{{\rm{i}}}}}}}\left({{{{{\rm{t}}}}}}\right)/{{{{{{\rm{A}}}}}}}_{{{{{{\rm{j}}}}}}}\left({{{{{\rm{t}}}}}}\right)={{{{{{\rm{I}}}}}}}_{i}{({{{{{\rm{t}}}}}})/{{{{{\rm{I}}}}}}}_{{{{{{\rm{j}}}}}}}(t)$$

The ratios are simplified and shown in the Supplementary Notes, where the outlines of the derivations for I4(t) and I5(t) are also presented.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.