Correcting for natural isotope abundance and tracer impurity in MS-, MS/MS- and high-resolution-multiple-tracer-data from stable isotope labeling experiments with IsoCorrectoR

Experiments with stable isotope tracers such as 13C and 15N are increasingly used to gain insights into metabolism. However, mass spectrometric measurements from stable isotope labeling experiments should be corrected for the presence of naturally occurring stable isotopes and for impurities of the tracer substrate. Here, we analyzed the effect that such correction has on the data: omitting correction or performing invalid correction can result in substantially distorted data, potentially leading to misinterpretation. IsoCorrectoR is the first R-based tool to offer these correction capabilities. It is easy to use and combines all correction features that comparable tools offer in a single solution: correction of MS and MS/MS data for natural stable isotope abundance and tracer impurity, applicability to any tracer isotope, and correction of multiple-tracer data from high-resolution measurements. IsoCorrectoR’s correction performance agreed well with manual calculations and with other available tools, including the Python-based IsoCor and the Perl-based ICT. IsoCorrectoR can be downloaded as an R package from: http://bioconductor.org/packages/release/bioc/html/IsoCorrectoR.html.


The necessity of performing MS/MS correction on MS/MS data
Until recently, no algorithms suitable for correcting MS/MS data for natural isotope abundance were available. Thus, MS/MS data were subjected to correction with algorithms actually intended for MS1 data. One approach to correcting MS/MS data with an MS1 correction algorithm, without accounting for fragmentation, is to sum the measured values of those MS/MS transitions of a given molecule that contain an equal amount of label in the precursor ion. For example, the measured values of a transition with 2 13C in the product ion and 0 13C in the neutral loss and of a transition with 1 13C in the product ion and 1 13C in the neutral loss would be added up because they share the same amount of label in the precursor (2 13C). In this way, the MS/MS data are converted to MS1 data, losing the information on fragmentation. Supplementary Fig. 1 shows an example in which hypothetical MS/MS data from an alanine molecule that fragments at the C1-C2 bond are converted to MS1 data. Following conversion, MS1 correction can be performed based on the molecular formula of the precursor ion, as if no fragmentation had occurred.
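The conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `collapse_to_ms1`, the keying of transitions by (label in product ion, label in neutral loss), and all intensity values are hypothetical and are not taken from IsoCorrectoR or from Supplementary Fig. 1.

```python
from collections import defaultdict


def collapse_to_ms1(transitions):
    """Sum MS/MS transition values that share the same precursor label count.

    `transitions` maps (label_in_product_ion, label_in_neutral_loss) to a
    measured value. The result maps the total label count in the precursor
    to the summed value, i.e. the fragmentation information is discarded.
    """
    ms1 = defaultdict(float)
    for (product_label, neutral_loss_label), value in transitions.items():
        ms1[product_label + neutral_loss_label] += value
    return dict(ms1)


# Hypothetical values for a three-carbon molecule whose product ion holds
# two carbons and whose neutral loss holds one carbon (assumed layout).
example_transitions = {
    (0, 0): 100.0,
    (1, 0): 20.0,
    (0, 1): 10.0,
    (2, 0): 5.0,
    (1, 1): 8.0,
    (2, 1): 3.0,
}

ms1_values = collapse_to_ms1(example_transitions)
for precursor_label in sorted(ms1_values):
    print(precursor_label, ms1_values[precursor_label])
```

After this step, the two transitions that each carry one label in the precursor can no longer be distinguished, which is the loss of positional information discussed in this section.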
However, correcting MS/MS data with an MS1 approach is only valid when one of the two fragments cannot be labeled through metabolism. This may be the case if that fragment is a derivatizing group or does not contain the tracer element. But even then, only the sum formula of the fragment that contains label should be used for correction, not that of the entire precursor ion. To show the effect of MS1 correction on MS/MS data and the parameters that drive the resulting deviations, simulated data are used. Uncorrected MS/MS data were simulated (see methods section) for carbon chains of varying length. The simulated carbon chains (see Supplementary Fig. 2a) always consist of a product ion with 10 carbons that can be metabolically labeled at 4 positions. The number of carbons that can be metabolically labeled in the neutral loss is 2 for every simulated species, resulting in a maximum of 6 labels in the precursor. However, depending on the simulated species, the total number of carbons in the neutral loss can increase up to 10. Thus, the simulated species differ in the number of neutral loss carbon atoms that cannot be labeled. The carbons that cannot be labeled in the product ion and the neutral loss can be thought of as derivatizing groups. Supplementary Fig. 2b shows the error resulting from MS1 correction of the MS/MS data of the simulated species. In these data, the true abundance (after MS/MS correction) of all isotopologues is equal for all simulated species. Deviations from this equality that are found after MS1 correction are shown relative to the expected value.
MS1 correction works perfectly for the species that contain 0 to 2 13C from labeling. However, as the amount of 13C incorporation increases beyond that, the relative error increases.
At the same time, the deviation also increases with the growth of the neutral loss carbon chain portion that cannot be labeled. It is important to note that exchanging the roles of neutral loss and product ion in this simulation, i.e., keeping the neutral loss the same while increasing the size of the product ion, provides the same results. In Supplementary Fig. 2c, the error resulting from MS1 instead of MS/MS correction of the maximally labeled species (6 13C) is shown for the different simulated carbon chains. The figure demonstrates a case in which the abundance of the 6 13C species is substantially lower than that of the species contributing to its signal via natural abundance: the true ratio expected after MS/MS correction is 1:10. As can be seen in Figure 2b (manuscript), such a ratio strongly increases the impact of natural abundance correction.
The example was chosen to demonstrate the potential magnitude of the error introduced by inappropriately applied MS1 correction: for the longest neutral loss carbon chain (NL-C10), the relative error of MS1 correction is as high as 75%.
How can these effects be explained? In the case of MS/MS data, usually only those transitions are measured that cover stable isotope incorporation into the part of the molecule that can be metabolically labeled. If, for example, the maximum amount of 13C label expected from metabolism is 4 in the product ion and 2 in the neutral loss, usually no transitions exceeding 4 13C in the product ion or 2 13C in the neutral loss will be measured. However, according to the precursor molecular formula, natural abundance contributions can also arise from labeling states that can only be produced by natural abundance and not by metabolic labeling. These states are therefore not covered by the transitions. Considering the simulated molecules in Supplementary Fig. 2a, examples of such labeling states are species carrying 3 13C in the neutral loss or 5 13C in the product ion. Although the natural abundance contributions of such species are not measured in MS/MS, they are nevertheless removed when performing MS1 correction using the precursor formula. Consequently, the data are overcorrected, resulting in values that are smaller than the true (MS/MS corrected) values. This explains why, in the example in Supplementary Fig. 2b, the MS1 corrected values match the true corrected values perfectly up to 2 13C in total: in that case, all labeling states that can possibly arise from the precursor are still covered by the transitions.
However, for the species carrying 3 13C in total, the results begin to deviate because, in this example, no transition with 3 13C in the neutral loss is measured. This is because a neutral loss with more than 2 13C is not expected to be produced through metabolism for this hypothetical molecule. The dependency of the correction error on the neutral loss chain length can be explained as follows: the number of carbons that cannot be labeled increases the natural abundance of species (e.g., molecules having 3 13C in the neutral loss) that are not covered by the transitions. Thereby, it increases the difference from the true results. However, regardless of the deviations produced by MS1 correction of MS/MS data, the loss of transition resolution alone makes the approach inappropriate in all cases where one is interested in positional/isotopomer information.
Supplementary Fig. 1. Panel a shows MS/MS data from alanine (chemical structure 1) from a hypothetical 13C stable isotope labeling experiment. The x-axis labels correspond to the number of 13C in the precursor ion and in the product ion of the given transition, separated by a dot. Panel b shows the data from panel a after conversion to MS1 data by adding the area values of transitions with an equal amount of label in the precursor ion (transitions 1.0 + 1.1 and transitions 2.1 + 2.2).
Supplementary Fig. 2c illustrates the same error as Supplementary Fig. 2b, except that only the values for the maximally labeled isotopologue (6 13C) are shown for each hypothetical molecule. Furthermore, the true abundance of the 6 13C species is only 1/10 of that of the other isotopologues.
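The chain-length dependence discussed in this section can be made tangible with a simple binomial estimate. The sketch below is not the simulation from the methods section. It assumes that each carbon not occupied by tracer label carries 13C independently with a natural abundance of about 1.07%, and it computes the share of a fragment's signal that ends up in labeling states above the highest measured transition. The function name and its parameters are hypothetical.

```python
from math import comb

P13C = 0.0107  # approximate natural abundance of 13C (assumed value)


def uncovered_fraction(n_carbons, n_tracer_label, max_measured_label, p=P13C):
    """Probability that natural 13C pushes a fragment's total 13C count
    above the highest label count covered by the measured transitions."""
    # Carbons not occupied by tracer label are subject to natural abundance.
    n_free = n_carbons - n_tracer_label
    # Number of additional 13C atoms that would still be covered.
    headroom = max_measured_label - n_tracer_label
    covered = sum(
        comb(n_free, k) * p**k * (1 - p) ** (n_free - k)
        for k in range(min(headroom, n_free) + 1)
    )
    return 1.0 - covered


# Neutral loss carrying its maximum of 2 tracer labels, with a growing
# number of additional carbons that cannot be labeled.
for total_carbons in (2, 4, 6, 8, 10):
    fraction = uncovered_fraction(total_carbons, 2, 2)
    print(f"NL-C{total_carbons}: {fraction:.4f}")
```

Under these assumptions, the uncovered share is zero when the neutral loss contains only the two labelable carbons and grows with every additional carbon that cannot be labeled, which mirrors the trend described for Supplementary Fig. 2b and 2c.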

Comparison of correction results with manual correction
To validate the correction results of IsoCorrectoR, we also performed a manual correction for PCF-derivatized glycine (chemical structure 2 in Supplementary Fig. 3). The probability matrices were generated using Microsoft Excel. They were then imported, together with simulated uncorrected values, into MATLAB to solve the system of linear equations for the corrected values. Manual calculation of the probability matrices is laborious and error-prone. However, for small core molecules like glycine it is still manageable, and the derivatization, as long as it does not introduce new elements into the molecule, does not add substantially to the complexity of the calculation.
Supplementary Fig. 3
Manual correction was also performed to validate the high-resolution multiple-tracer correction mode of IsoCorrectoR. Here, uncorrected high-resolution data for PCF-glycine were simulated in which the incorporation of 15N and 13C isotopes could be resolved. Manual correction was performed with the same workflow as for nominal mass resolution correction. Supplementary Fig. 4a compares the correction results of IsoCorrectoR, PyNAC and the manual approach for the case in which tracer purity is not considered. All results match perfectly. Supplementary Fig. 4b shows the correction of the same uncorrected values, now applying a correction for 98% tracer purity. Again, the results of IsoCorrectoR and of the manual approach match perfectly. PyNAC is not included in this diagram as it cannot correct for tracer purity.
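The manual workflow described in this section (build a probability matrix, then solve a system of linear equations) can be sketched as follows. This is a deliberately reduced illustration and not the Excel/MATLAB calculation used for PCF-glycine: it assumes nominal mass resolution, considers only carbon, ignores tracer impurity, and uses a hypothetical five-carbon molecule with made-up abundances. Under these assumptions the matrix is lower triangular, so forward substitution suffices. All function names are hypothetical.

```python
from math import comb

P13C = 0.0107  # approximate natural abundance of 13C (assumed value)


def probability_matrix(n_carbons, max_label, p=P13C):
    """Entry [i][j] is the probability that a species with j tracer-derived
    13C is observed with i 13C in total due to natural abundance."""
    size = max_label + 1
    matrix = [[0.0] * size for _ in range(size)]
    for j in range(size):
        n_free = n_carbons - j  # carbons not occupied by tracer label
        for i in range(j, size):
            k = i - j  # additional 13C from natural abundance
            if k <= n_free:
                matrix[i][j] = comb(n_free, k) * p**k * (1 - p) ** (n_free - k)
    return matrix


def solve_lower_triangular(matrix, measured):
    """Solve matrix * corrected = measured by forward substitution."""
    corrected = []
    for i, row in enumerate(matrix):
        known = sum(row[j] * corrected[j] for j in range(i))
        corrected.append((measured[i] - known) / row[i])
    return corrected


# Simulate uncorrected values from assumed true abundances, then correct.
true_values = [50.0, 30.0, 20.0]
matrix = probability_matrix(n_carbons=5, max_label=2)
measured = [
    sum(matrix[i][j] * true_values[j] for j in range(len(true_values)))
    for i in range(len(true_values))
]
corrected = solve_lower_triangular(matrix, measured)
print([round(value, 6) for value in corrected])
```

Accounting for tracer impurity or for further elements adds entries above the diagonal or enlarges the matrix, so a general linear solver would be needed in place of forward substitution. This gives an impression of why the manual approach quickly becomes laborious.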