## Introduction

The function of a cellular metabolic network is to convert nutrient molecules into biochemical energy and biomass. Metabolomics, which focuses on measuring metabolite concentrations, can offer a panoramic snapshot of the metabolic network [1]. However, such a still-image characterization offers no information on how metabolites interconvert [2]. A common analogy is road traffic: a high density of cars can reflect either smoothly flowing traffic with a large flux or a complete standstill with no flux at all. Simply counting the cars in a picture of the road cannot tell us how the traffic is moving. It is therefore important to investigate metabolic fluxes, which are the rates at which metabolites are converted to their enzymatic products. For cultured cells, simple metabolic fluxes such as the glucose uptake rate or the lactate production rate can be calculated by measuring the decrease in glucose concentration or the increase in lactate concentration in the medium over time. The intracellular fluxes, however, cannot be measured in this way because the intracellular metabolite concentrations are constant at steady state [3]. To observe and infer the intracellular metabolic fluxes, we can use stable-isotope-labeled tracers. In such an experiment, a 13C-labeled nutrient tracer is fed to the biological system, and the labeled carbon atoms are then distributed over the metabolic network [4, 5]. Once isotopic steady state is reached, the labeling patterns of intracellular metabolites are uniquely determined by only two factors: the metabolic fluxes in the network and the labeling pattern of the isotope tracer [6, 7]. The metabolite labeling patterns can be measured by mass spectrometry and used to infer the metabolic fluxes in a procedure referred to as metabolic flux analysis (MFA).

MFA has been widely applied in metabolic engineering and mammalian physiology to study metabolic networks [8, 9, 10]. MFA aims to solve for the metabolic fluxes given the metabolite isotope labeling patterns. The forward problem, which is to calculate the labeling patterns from given fluxes, is deterministic and can be solved very efficiently using the elementary metabolite unit (EMU) framework [11, 12]. MFA is the inverse of this problem and usually has no analytical solution. Instead, we look for the set of fluxes whose simulated labeling patterns best fit the measurements. The fluxes are therefore determined by optimization, in which the residual that measures the difference between the simulated and measured labeling patterns is minimized.
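
One common way to write this residual, given here as a sketch rather than as the formulation of any particular software package, is a variance-weighted sum of squared residuals. The symbols are assumptions introduced for illustration: $$v$$ is the flux vector, $$x_i^{\mathrm{meas}}$$ and $$x_i^{\mathrm{sim}}(v)$$ are the measured and simulated labeling fractions, $$\sigma_i$$ is the standard deviation of measurement $$i$$, and $$S$$ is the stoichiometric matrix that enforces mass balance at steady state.

$$
\min_{v}\; \mathrm{SSR}(v) = \sum_{i} \left( \frac{x_i^{\mathrm{sim}}(v) - x_i^{\mathrm{meas}}}{\sigma_i} \right)^{2} \quad \text{subject to} \quad S\,v = 0
$$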

For mass spectrometry-based labeling measurements, conventional MFA uses metabolite mass isotopomer distributions (MIDs) as the input information. Moreover, multiple parallel experiments using different tracers can be combined to achieve better flux determination. In such parallel labeling experiments, each tracer alone may not be sufficient to determine all the fluxes in the network, but the combined MID information can provide a comprehensive profile of the fluxes. Crown and Antoniewicz have shown that parallel labeling can dramatically improve the precision of MFA [13, 14]. It is noteworthy that the MID does not account for positional isotope labeling. For example, 1,2-13C-lactate and 2,3-13C-lactate are both described as lactate M + 2 (or M2). Tandem mass spectrometry, which breaks specific C–C bonds in the metabolite molecules, can generate information on positional isotope labeling. Recently, new computational approaches have been developed to accommodate tandem mass spectrometry data and improve the performance of MFA. Tandem MS MFA may show unique advantages over the parallel labeling strategy, especially in animal studies: parallel labeling requires more animals and, unlike cultured cells, the animals may not be perfectly parallel in their metabolism. The best strategy is probably to combine parallel labeling and tandem MS for MFA. In this review, we focus on which tandem mass spectrometry data can be used for MFA. We also analyze the flux constraints provided by tandem mass spectrometry data. Our results show that combining the MIDs of the parent and daughter ions with the tandem MID is more powerful for MFA than using the tandem MID alone.
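
The lactate example can be made concrete with a few lines of Python. This is a minimal sketch: the helper `mass_shift` and the fragment that keeps C2 and C3 are hypothetical and chosen only for illustration, not taken from a measured fragmentation spectrum.

```python
def mass_shift(labels, atoms=None):
    """Count 13C atoms in a molecule or fragment.

    labels: tuple of 0/1 flags, one per carbon, C1 first.
    atoms:  1-based carbon positions kept in the fragment (None = whole molecule).
    """
    if atoms is None:
        atoms = range(1, len(labels) + 1)
    return sum(labels[a - 1] for a in atoms)


lactate_12 = (1, 1, 0)  # 1,2-13C-lactate
lactate_23 = (0, 1, 1)  # 2,3-13C-lactate

# Hypothetical daughter ion that retains C2 and C3 (assumption for illustration).
fragment = (2, 3)

parent_masses = (mass_shift(lactate_12), mass_shift(lactate_23))
daughter_masses = (mass_shift(lactate_12, fragment), mass_shift(lactate_23, fragment))

print(parent_masses)    # (2, 2): indistinguishable at the MS1 level
print(daughter_masses)  # (1, 2): resolved by the fragment
```

Both isotopomers appear as M2 in the parent MID, but they give different daughter masses once a position-specific fragment is measured.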

### Tandem mass spectrometry

In tandem mass spectrometry, a parent ion is first selected on a quadrupole mass filter. The parent ion is then fragmented to give daughter ions, which are detected on the mass analyzer. In the example shown in Fig. 1a, a mixture of metabolite A consists of equal amounts of four isotopomers. The parent MID is 25% M0, 25% M1, 25% M2, and 25% M3. This molecule of four carbon atoms can be fragmented to a daughter ion composed of the last two carbon atoms (denoted $$A_{1234}^{34}$$). Based on the fragmentation pattern, the M1 parent will only generate m1 daughter ions. The M2 parent will only generate m0 daughter ions, and the M3 parent will only generate m2 daughter ions. The M0 parent will always generate m0 daughter ions. This tandem mass isotopomer distribution (TMID) can be expressed in matrix form (Fig. 1b) [15, 16]. The TMID is more informative than the parent and daughter MIDs combined because it also reveals the parent–daughter relationship. From the TMID matrix, we can easily calculate the MIDs of the parent and daughter ions, which are the column-wise and row-wise sums, respectively (Fig. 1b). Experimentally, the measurement of the TMID depends on the type of instrument. On triple quadrupole (QQQ) instruments, each scan event covers one parent–daughter pair and provides a single entry in the TMID matrix (e.g., M2-m0). On hybrid mass spectrometers such as Q-TOF or Q-Orbitrap instruments, all the daughter ions from a parent can be measured simultaneously, so each scan event provides a whole column of the TMID matrix [17]. Note that the lower-left corner of the TMID matrix (Fig. 1b) is blank because the number of labeled atoms in the daughter ion cannot exceed the number of labeled atoms in the parent ion. Similarly, the upper-right corner of the TMID matrix is blank because the number of unlabeled atoms in the daughter ion cannot exceed the number of unlabeled atoms in the parent ion.
Choi and Antoniewicz [18] proposed a compact tandem MS matrix in which each row is shifted to the left to remove the blank corners. Alternatively, the tandem MS data can be expressed in a vector form generated by concatenating the rows of the compact tandem MS matrix (Fig. 1b). The vector form is more convenient for simulating EMU labeling patterns [19].
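
The three representations can be illustrated with a short sketch. The specific isotopomers below are assumptions chosen to reproduce the behavior described for Fig. 1a (M1 gives m1, M2 gives m0, M3 gives m2); the figure itself may use different label positions. The function names are hypothetical. Rows are daughter masses and columns are parent masses, and an M4 column is included because a four-carbon parent can in principle carry four labels, although it is empty in this mixture.

```python
def tmid_matrix(isotopomers, n_carbons, fragment):
    """Build a TMID matrix with rows m0..mk (daughter) and columns M0..Mn (parent)."""
    n_daughter = len(fragment)
    tmid = [[0.0] * (n_carbons + 1) for _ in range(n_daughter + 1)]
    for labels, fraction in isotopomers.items():
        parent = sum(labels)                             # labeled atoms in the parent
        daughter = sum(labels[c - 1] for c in fragment)  # labeled atoms in the fragment
        tmid[daughter][parent] += fraction
    return tmid


def compact(tmid):
    """Shift row j left by j entries, keeping only the structurally feasible cells."""
    width = len(tmid[0]) - len(tmid) + 1
    return [row[j:j + width] for j, row in enumerate(tmid)]


# Assumed isotopomers of metabolite A (C1 first), 25% each.
isotopomers = {
    (0, 0, 0, 0): 0.25,  # M0
    (0, 0, 0, 1): 0.25,  # M1, label on C4
    (1, 1, 0, 0): 0.25,  # M2, labels on C1 and C2
    (0, 1, 1, 1): 0.25,  # M3, labels on C2, C3, and C4
}

tmid = tmid_matrix(isotopomers, n_carbons=4, fragment=(3, 4))
parent_mid = [sum(col) for col in zip(*tmid)]   # column-wise sums
daughter_mid = [sum(row) for row in tmid]       # row-wise sums
tmid_vector = [x for row in compact(tmid) for x in row]

print(parent_mid)    # [0.25, 0.25, 0.25, 0.25, 0.0]
print(daughter_mid)  # [0.5, 0.25, 0.25]
print(tmid_vector)   # [0.25, 0.0, 0.25, 0.25, 0.0, 0.0, 0.0, 0.25, 0.0]
```

The compact matrix has three columns per row because the parent can carry at most two labeled atoms more than the daughter in this example.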

The main motivation for introducing tandem MS data to MFA is to improve the precision of flux determination. Figure 1c shows an example using the metabolic network from Antoniewicz et al. [11]. When only the MID of metabolite F in the network is used, the flux f2 can take any value above 100. However, if we use tandem MS data that measure the labeling of F at the C-3 position, f2 is confined to the interval (138, 164). A well-designed MFA study should have tight confidence intervals on all the important fluxes. This can be achieved by using tandem MS measurements, parallel labeling, or both.

A TMID describes the conditional MIDs of the daughter ions generated from the same fragmentation reaction. However, a metabolite ion may fragment by breaking different bonds and generate multiple daughter fragments at the same time [20]. Kappelmann et al. [21] reported a comprehensive investigation of the fragmentation patterns of central metabolism intermediates. A key message of that paper is that some daughter ions may have mixed identities. Malate, for example, may lose a water molecule and become fumarate (Fig. 2). Fumarate is a symmetric molecule and can decarboxylate at either the C1 or the C4 position. Both reactions generate C3H2O2 (m/z 71.0139). Therefore, this daughter ion represents a mixture of $$Mal_{1234}^{234}$$ and $$Mal_{1234}^{123}$$ at a 1:1 ratio. Alternatively, malate may decarboxylate at C4 first and then lose a water molecule to become C3H2O2, which is exclusively $$Mal_{1234}^{123}$$ (Fig. 2). Because the probabilities of these two fragmentation routes are unknown, C3H2O2 represents a mixture of $$Mal_{1234}^{234}$$ and $$Mal_{1234}^{123}$$ at an unknown ratio. Therefore, this daughter ion should not be used for MFA. Only daughter ions with an unambiguous identity, such as $$Mal_{1234}^{34}$$ and $$Mal_{1234}^{12}$$, should be used for MFA. Tables 1 and 2 summarize the results from Kappelmann et al. [21] and show which MS fragments in the TCA cycle can and cannot be used for 13C-MFA.
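
To see why an unknown route probability makes the mixed daughter ion unusable, consider the following sketch. The 1-13C-malate input and the route probability `p_fumarate_route` are hypothetical values introduced for illustration; only the two routes themselves come from the discussion above.

```python
def fragment_mid(labels, atoms):
    """MID of a fragment for a single isotopomer (all weight on one mass)."""
    mass = sum(labels[a - 1] for a in atoms)
    return [1.0 if m == mass else 0.0 for m in range(len(atoms) + 1)]


def mix(mid_a, mid_b, weight_a):
    """Weighted average of two MIDs of equal length."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(mid_a, mid_b)]


malate = (1, 0, 0, 0)  # hypothetical 1-13C-malate, C1 first

mal_123 = fragment_mid(malate, (1, 2, 3))  # keeps the labeled C1
mal_234 = fragment_mid(malate, (2, 3, 4))  # loses the labeled C1

# Route via fumarate: symmetric, so a 1:1 mixture of Mal[234] and Mal[123].
via_fumarate = mix(mal_234, mal_123, 0.5)
# Route with decarboxylation at C4 first: exclusively Mal[123].
direct = mal_123


def observed_c3_mid(p_fumarate_route):
    """Observed MID of the C3 daughter ion for a given (unknown) route probability."""
    return mix(via_fumarate, direct, p_fumarate_route)


for p in (0.0, 0.5, 1.0):
    print(p, observed_c3_mid(p))
```

For the same labeled malate, the m1 fraction of the daughter ion ranges from 0.5 to 1.0 depending on the route probability, so the measurement cannot be attributed to the labeling pattern alone.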

### Simulating TMID using EMU

EMU is currently the most widely used mathematical framework for MFA. It is an algorithm that simulates metabolite MIDs given the tracer labeling pattern and all the fluxes in the network. The concept of EMU and the MID calculation were described by Antoniewicz et al. [11]. We wish to highlight the fact that tandem MS data are naturally simulated when the MIDs of observable metabolites are calculated using the EMU approach. To illustrate this idea, we use the example of the gluconeogenesis network shown in Choi’s paper (Fig. 3) [16]. There are 12 metabolic fluxes in this network. The tracer input to this metabolic network is aspartate (Asp), consisting of 25% 4-13C1, 25% 1,2-13C2, 25% 2,3,4-13C3, and 25% unlabeled species. The only measurable metabolite is oxaloacetate (OAC). To calculate the MID of the whole OAC molecule (denoted OAC[1234]), we need to know the labeling pattern of Fum[1234], which makes OAC[1234] through f6. Fum[1234] is in turn traced back to other EMUs of size four. Additionally, we need to know the labeling patterns of AcCoA and Glyox, which make OAC through f9. AcCoA is an unlabeled input to the metabolic network. Glyox is made from Cit[45] through f8, and Cit[45] is made from OAC[34] through f1. Therefore, in order to calculate OAC[1234], we first need to know OAC[34]. In the calculation of the OAC MID, OAC[34] is only a hypothetical EMU. However, OAC[34] is also the MID of the C3C4 fragment of OAC, which is measurable using tandem MS. Therefore, from the TMID of $$OAC_{1234}^{34}$$, we can calculate the MIDs of OAC[1234] and OAC[34] and use these values as constraints for flux optimization. This method essentially uses the existing EMU framework, which was designed for MS1 measurements. To fully use the data from tandem MS, we need to extend the EMU framework to simulate the TMID.
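
In the EMU framework, the MID of a product formed by joining two precursor EMUs is the convolution of the precursor MIDs. The sketch below applies this to the f9 condensation. The numerical MID of OAC[34] is a hypothetical value, and setting the Glyox MID equal to the OAC[34] MID assumes that Glyox and Cit[45] each have the single source named in the text.

```python
def convolve(mid_a, mid_b):
    """MID of a product whose atoms come from two independent precursor EMUs."""
    out = [0.0] * (len(mid_a) + len(mid_b) - 1)
    for i, a in enumerate(mid_a):
        for j, b in enumerate(mid_b):
            out[i + j] += a * b  # masses add, probabilities multiply
    return out


accoa_12 = [1.0, 0.0, 0.0]   # AcCoA is an unlabeled input
oac_34 = [0.5, 0.25, 0.25]   # hypothetical MID of OAC[34], for illustration only

# OAC[34] -> Cit[45] (f1) -> Glyox (f8): under the single-source assumption,
# the Glyox MID equals the OAC[34] MID at isotopic steady state.
glyox_12 = oac_34

# Contribution of f9 (AcCoA + Glyox) to OAC[1234].
oac_1234_via_f9 = convolve(accoa_12, glyox_12)
print(oac_1234_via_f9)  # [0.5, 0.25, 0.25, 0.0, 0.0]
```

The full OAC[1234] MID would additionally require the flux-weighted contribution from Fum[1234] through f6, which is omitted here. The sketch only shows why OAC[34] must be known before OAC[1234] can be computed.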

Tepper and Shlomi proposed an extended EMU framework, termed tandemer, to simulate the TMID [18, 19]. The tandemer EMU network decomposition is very similar to the normal EMU network decomposition. For MS1-level measurements, the EMU network decomposition starts from OAC[1234]. For MS2-level measurements, the tandemer EMU network decomposition starts from OAC[1234]-[34]. The tandemer decomposition may generate more EMUs than the original method because tandem MS breaks the molecular symmetry. For example, Fum[1234] and Fum[4321] are two identical EMUs and are treated as one in the calculation. However, Fum[1234]-[34] and Fum[1234]-[12] are two EMUs that are numerically the same but conceptually different. Once the EMU network decomposition is completed, the tandemer calculation is the same as the normal EMU calculation, except that TMID vectors are used instead of MID vectors.
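
One way to picture the tandemer calculation is that the one-dimensional convolution of MIDs becomes a two-dimensional convolution of TMIDs, because both the parent mass and the daughter mass of the product are sums of the precursor contributions. The sketch below works on the full matrix form rather than the vector form, and the precursors X and Y are hypothetical. It is meant as an illustration of the idea, not as the published implementation.

```python
def convolve_tmid(tmid_a, tmid_b):
    """2-D convolution of TMIDs (rows = daughter mass, columns = parent mass)."""
    rows = len(tmid_a) + len(tmid_b) - 1
    cols = len(tmid_a[0]) + len(tmid_b[0]) - 1
    out = [[0.0] * cols for _ in range(rows)]
    for ja, row_a in enumerate(tmid_a):
        for ia, a in enumerate(row_a):
            for jb, row_b in enumerate(tmid_b):
                for ib, b in enumerate(row_b):
                    out[ja + jb][ia + ib] += a * b
    return out


# Hypothetical precursor X[12]-[2]: two atoms, one of which ends up in the daughter.
# Half of the pool is unlabeled (M0, m0); half carries a label on atom 2 (M1, m1).
x_tmid = [
    [0.5, 0.0, 0.0],  # m0
    [0.0, 0.5, 0.0],  # m1
]

# Hypothetical unlabeled precursor Y[12] with no atoms in the daughter: one row (m0).
y_tmid = [[1.0, 0.0, 0.0]]

product = convolve_tmid(x_tmid, y_tmid)
parent_mid = [sum(col) for col in zip(*product)]
daughter_mid = [sum(row) for row in product]
print(parent_mid)    # [0.5, 0.5, 0.0, 0.0, 0.0]
print(daughter_mid)  # [0.5, 0.5]
```

The column sums of the product reproduce what an ordinary EMU convolution of the parent MIDs would give, while the matrix retains the parent–daughter relationship.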

### Evaluating the power of TMID for MFA

When performing MFA, it is important to find the optimal flux solution that fits the observations. It is also important to evaluate the uncertainty of the fluxes given the measured data. Many papers report the flux uncertainty as the confidence intervals of individual fluxes [22]. Such results are often erroneously interpreted as meaning that the confidence interval of each flux is independent of the others. An alternative way to illustrate the flux uncertainty is to plot the feasible region of the fluxes [23]. This approach provides an excellent visualization of the dependencies among flux confidence intervals. However, it is difficult to plot for more than two free fluxes, a limitation that is obvious for large metabolic networks. As a workaround, we can show conditional confidence regions by plotting the two free fluxes with the greatest uncertainty while leaving the other fluxes unchanged. Here, we focus on the flux feasible region for the gluconeogenesis network, which has four free fluxes: f3, f7, f9, and f10. Since the TMID is less sensitive to the two exchange fluxes f3 and f7, we plot the combination of these two fluxes and leave the other two free fluxes unchanged. In Fig. 4a, we show the 95% confidence region of flux combinations determined from the OAC[1234] MID. The optimal solution is f3 = 150, f7 = 40, which is the red center. The plot shows that both fluxes have a broad feasible region, which indicates large uncertainty in the flux estimates. In this case, using only the MID of OAC[1234] is clearly not sufficient to determine f3 and f7 to a high degree of precision. When the MIDs of both the parent ion OAC[1234] and the daughter ion OAC[34] are used, the feasible region of the fluxes is narrowed (Fig. 4b). Furthermore, when the complete TMID of OAC[1234]-[34] ($$OAC_{1234}^{34},$$ the C3–C4 fragment generated from OAC[1234]) is used, the flux feasible region becomes even smaller (Fig. 4c).
The feasible region is very narrow in the f7 dimension, suggesting that the TMID provides a strong constraint on this exchange flux. However, the f3 estimate still has a large uncertainty. In fact, the uncertainty of f3 calculated from the TMID OAC[1234]-[34] is even larger than the one calculated from OAC[1234] and OAC[34]. This result is counterintuitive because the TMID has more information than the combination of the parent and daughter ion MIDs, yet the constraint provided by the TMID is weaker on f3. This paradoxical result comes from the way the residual between simulated and measured MIDs is calculated. When the value of f3 deviates from the optimal solution of 150 to the less optimal value of 350, the M1-m0 and M1-m1 fractions of the TMID OAC[1234]-[34] increase by 0.27% and 0.24%, respectively. Since the assumed standard deviation of each fraction is 0.2% [16], these two fractions contribute 3.2 to the χ2 statistic. Meanwhile, in the MID of OAC[1234], these two fractions add up to a deviation of 0.51% in the M1 fraction, which contributes 6.38 to the χ2 statistic. The reason is that deviations of the same sign are summed within the M1 fraction before being squared, whereas in the TMID each deviation is squared separately. This larger increase in the χ2 statistic shows that f3 = 350 makes the MID of OAC[1234] deviate more from the measurements, relative to the measurement error, than it does the TMID OAC[1234]-[34]. Therefore, we can conclude that although the TMID provides more information than the MIDs, it does not necessarily provide stronger constraints on every flux. When both the TMID and the MIDs are used, the precision of the flux estimation can be further improved (Fig. 4d). Therefore, we recommend using both the TMID and the parent and daughter MIDs to constrain the fluxes.
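
The χ2 arithmetic can be checked with a few lines of Python. This sketch uses only the rounded deviations quoted in the text, so it yields roughly 3.26 and 6.50 rather than exactly the reported 3.2 and 6.38; we assume the small differences come from rounding of the quoted deviations. The variable names are introduced here for illustration.

```python
sigma = 0.2     # assumed standard deviation of each fraction, in percent [16]
d_m1_m0 = 0.27  # increase in the M1-m0 fraction of the TMID, in percent
d_m1_m1 = 0.24  # increase in the M1-m1 fraction of the TMID, in percent

# TMID: each fraction is a separate measurement, so each deviation is squared on its own.
chi2_tmid = (d_m1_m0 / sigma) ** 2 + (d_m1_m1 / sigma) ** 2

# Parent MID: both fractions fall into the single M1 bin, so the deviations add first.
chi2_mid = ((d_m1_m0 + d_m1_m1) / sigma) ** 2

# The difference is the cross term 2ab / sigma^2, which is positive for same-sign deviations.
cross_term = 2 * d_m1_m0 * d_m1_m1 / sigma ** 2

print(round(chi2_tmid, 2), round(chi2_mid, 2), round(cross_term, 2))
```

If the two deviations had opposite signs, the cross term would be negative and the TMID would give the larger χ2 contribution, which is why neither data type dominates for every flux.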

## Conclusion

Tandem mass spectrometry can reveal positional labeling of metabolites and thereby provide more information for improving the performance of MFA. To use tandem mass spectrometry data for MFA, the identity of the daughter ions must be carefully inspected, and daughter ions of mixed origin should not be used. The EMU framework has been extended to accommodate tandem mass spectrometry data. When calculating the fluxes, the TMID and the MIDs of the parent and daughter ions should all be used as constraints in order to achieve the best performance.