Abstract
Maintaining the high-temperature and pressure conditions required for sustained nuclear fusion is challenging due to the turbulent transport that naturally occurs in the plasma. Developing reliable models for turbulent transport is essential for progress in fusion research and development. This study proposes multi-fidelity modeling for the improved accuracy of regression models for turbulent transport in magnetic fusion plasma. Multi-fidelity modeling combines low-fidelity data, which have low accuracy but many data points, with high-fidelity data, which are highly accurate but have few data points or small parameter ranges, to enhance the overall predictive accuracy of a model. We used a multi-fidelity information fusion technique, Nonlinear AutoRegressive Gaussian Process regression (NARGP), to solve the regression problems associated with turbulent transport in plasma. We applied NARGP to (i) merge the low-resolution and high-resolution simulation results, (ii) apply regression of turbulence diffusivity to the experimental dataset using linear analyses, and (iii) adapt the quasi-linear transport model to nonlinear simulation results of a particular discharge. We demonstrated that NARGP improved the prediction accuracy of the plasma turbulent transport model. NARGP offers a robust and versatile method for integrating multi-fidelity data, and its broad applicability may contribute to optimizing fusion reactor design and operation.
Introduction
Nuclear fusion is attracting attention as a future sustainable energy source. Technology that is able to confine high-temperature, high-pressure plasma with stability using a strong magnetic field has been extensively investigated to realize nuclear fusion. Plasma confinement is affected by the turbulent transport that naturally occurs in the plasma, which presents a challenge to maintaining the high-temperature and high-pressure conditions that are essential to sustain nuclear fusion reactions. Therefore, an accurate understanding of turbulent transport in plasma is needed to predict the performance of plasma confinement and to optimize the design and operation of fusion reactors.
Turbulent transport in magnetically confined plasma originates from micro-instabilities that are driven by steep density or temperature gradients1. The physics of micro-instabilities and their associated turbulent transport is described by gyrokinetics2. Although gyrokinetics is a reduced kinetic theory derived under the assumption of strong magnetic fields, it is widely regarded as a near-first-principles approach that captures low-frequency fluctuations in plasma turbulence with high accuracy, and it represents the turbulence more precisely than gyro-fluid or extended magnetohydrodynamic models. Researchers routinely use gyrokinetic simulations to examine the physical characteristics of plasma turbulence in detail3,4,5. However, large-scale kinetic simulations incur high computational costs, so accurate transport models with reduced computational costs are needed for parameter surveys. In this vein, physics-based quasi-linear transport models such as TGLF6,7 and QuaLiKiz8,9 have been developed. Recent studies in the magnetic fusion community have also developed neural-network-based surrogate models for faster evaluation10,11,12,13.
We re-organize the data available for turbulent transport research in terms of the number of data points, the measurement range of parameters, and fidelity (or accuracy and reliability). Turbulent transport research uses various data types, including experimental observation data, large-scale simulation output data, and prediction data from reduced models. These data have different fidelities: for example, experimental data should have high fidelity that represents the physical phenomena although the data may contain measurement errors or systematic errors related to experimental devices. The parameter range of the experimental data is limited for existing devices and obviously lacks information about future fusion device parameters. In contrast, large-scale simulations could be used to explore future fusion devices but they generate a limited number of data points due to their high computational costs. Simple reduced models are computationally lightweight and can cover a wide range of parameters by using many data points. However, their fidelity depends heavily on the accuracy of corresponding theoretical models and approximations.
In this study, we propose multi-fidelity modeling14,15 to integrate data of differing fidelity, and we verify the effectiveness of this approach. Specifically, we aim to improve the predictive accuracy of regression models of turbulent transport by leveraging the correlation between low-fidelity and high-fidelity data using Nonlinear AutoRegressive Gaussian Process regression (NARGP)16. NARGP is a multi-fidelity information fusion algorithm that models nonlinear relationships between data at multiple hierarchical fidelity levels; here, fusion refers to the integration of such data. We apply this method to practical problems related to turbulent transport in plasma by employing various datasets: simulations having different resolutions, quasi-linear transport models, linear and nonlinear simulations, and experimental observations.
The structure of this study is as follows. The “Methods” Section presents an overview of Gaussian process (GP)-based multi-fidelity regression approaches and the theoretical framework of the GP regression and NARGP. In the “Results” Section, we verify the performance of NARGP using four numerical experiments. The first case is a basic one-dimensional benchmark for analytical data. The second case is a data combination of low-resolution and high-resolution simulation results. The third case is an application to an experimental observation dataset related to turbulent transport in plasma. The fourth case is a correction of quasi-linear transport flux for a specific discharge. The “Discussion” Section describes the conditions under which NARGP can be applied effectively and outlines a procedure for its practical applications in turbulent transport modeling. In the “Conclusion” Section, we summarize the results of this study and discuss future problems and prospects.
Methods
Overview of GP-based multi-fidelity regressions
Several regression methods have been developed for processing multi-fidelity data. Multi-output GP regression captures linear correlations among output data, and provides a framework for integrating information from different fidelity sources using the linear model of coregionalization (LMC)17. An Auto-Regressive (AR-1) model18 has been proposed as a linear method particularly suitable for hierarchical fidelity data, where high-fidelity outputs depend on low-fidelity outputs in a structured way. For such hierarchical multi-fidelity data, NARGP16 represents a more powerful alternative than the AR-1 model by capturing nonlinear relationships between fidelity levels. Although NARGP uses a disjointed learning process that trains GP models sequentially, more advanced approaches such as the multi-fidelity deep GP (MF-DGP)19 train all fidelity levels jointly, which allows more complex relationships to be captured at the cost of a substantially larger number of hyperparameters. In our study, we used NARGP, which balances effective modeling of nonlinear correlations against simplicity in parameter learning.
The remainder of this section reviews the theory of GP regression and NARGP, which we use to construct regression models for turbulent transport in plasma.
GP regression
A GP regression model20 is a probabilistic surrogate model that estimates unknown functional relationships from observation data.
Problem setup
GP regression is categorized as a supervised machine learning model trained on a set of samples of size n that consist of the input parameters \(X = \{ \varvec{x}_0, \varvec{x}_1, \ldots , \varvec{x}_{n-1} \}\) (d-dimensional vectors, \(\varvec{x}_l \in {\mathbb {R}}^d\)) and the corresponding unknown scalar function observations \(Y = \{ y_0, y_1, \ldots , y_{n-1} \}\). Each observation contains the response of an unknown function \(f(\varvec{x})\) and a white Gaussian noise \(\varepsilon\) with variance \(\sigma ^2\), i.e., \(y_l = f(\varvec{x}_l) + \varepsilon _l\). In the case of multi-output observations (i.e., \(\varvec{y}_l\), \(\varvec{f}\) and \(\varvec{\varepsilon }_l\) become vectors), we apply scalar-valued regressions independently for each component of the vector-valued output, although multi-output regression techniques17 that handle correlation among components are available.
Algorithm
In a GP regression framework, the response function is assumed to follow a Gaussian process \(f \sim {{\mathcal {G}}}{{\mathcal {P}}}(f \mid m(\varvec{x}),v(\varvec{x},\varvec{x}'))\) with a mean function \(m(\varvec{x})\) and a symmetric positive-definite covariance function \(v(\varvec{x},\varvec{x}')\) between two points \(\varvec{x}\) and \(\varvec{x}'\) in the input parameter space. If there is no prior knowledge of the mean function, a constant is often assumed for the mean function (e.g., \(m(\varvec{x})=0\)). The covariance function is assumed to be expressed as a kernel function \(k(\varvec{x}, \varvec{x}'; \varvec{\theta })\) with hyper-parameters \(\varvec{\theta }\). The expression of the kernel function k is chosen by the researcher, and it determines the prior assumptions of the function to be modeled. We use the squared exponential form for Numerical Experiments 1, 3 and 4,
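$$\begin{aligned} k(\varvec{x}, \varvec{x}'; \varvec{\theta }) = \sigma _r^2 \exp \left( -\frac{1}{2} \sum _{i=1}^{d} w_i^2 (x_i - x_i')^2 \right) , \end{aligned}$$(1)
where the hyper-parameters \(\varvec{\theta }\) consist of a variance parameter \(\sigma _r^2\) and automatic relevance determination (ARD) weights \(w_i\). The inverse of \(w_i\) corresponds to the scale length of the correlation, and it accounts for the directional anisotropy in each input dimension.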
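As an illustration, the squared exponential kernel with ARD weights can be sketched in NumPy. The exact placement of \(w_i\) inside the exponent is an assumption here: the weights multiply the coordinate differences, so that \(1/w_i\) acts as a scale length. The function name `se_ard_kernel` is hypothetical and this is not the GPy implementation used in the study.

```python
import numpy as np

def se_ard_kernel(XA, XB, variance, weights):
    """Squared exponential kernel with ARD weights (illustrative sketch).

    Assumed form: k(x, x') = variance * exp(-0.5 * sum_i (w_i * (x_i - x'_i))**2),
    so that 1 / w_i plays the role of the scale length in dimension i.
    XA has shape (n, d), XB has shape (m, d), weights has shape (d,).
    """
    # Pairwise coordinate differences, scaled per dimension by the ARD weights.
    diff = (XA[:, None, :] - XB[None, :, :]) * weights
    return variance * np.exp(-0.5 * np.sum(diff**2, axis=-1))

# Two points in a 2D input space; the second dimension has a longer scale length.
X = np.array([[0.0, 0.0], [1.0, 2.0]])
K = se_ard_kernel(X, X, variance=2.0, weights=np.array([1.0, 0.5]))
```

A small weight \(w_i\) makes the kernel nearly insensitive to the i-th input, which is how ARD expresses anisotropy between input dimensions.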
The posterior distribution of the function is estimated using Bayes’ theorem. The conditional probability distribution of y under a given dataset and given hyper-parameters follows a multivariate normal distribution \({\mathcal {N}}\). Then, a predictive inference of the response \(f_{*} \equiv f(\varvec{x}_*)\) at a new input point \(\varvec{x}_*\) is given by
$$\begin{aligned} p(f_* \mid \varvec{x}_*, X, Y, \varvec{\theta }) = {\mathcal {N}}(f_* \mid \mu _*, \sigma _*^2). \end{aligned}$$(2)
The posterior mean \(\mu _*\) and the posterior variance \(\sigma _*^2\) are calculated using
$$\begin{aligned} \mu _* = m(\varvec{x}_*) + \sum _{j,l=0}^{n-1} k_{*j} (\hat{K}^{-1})_{jl} \left( y_l - m(\varvec{x}_l) \right) , \end{aligned}$$(3)
$$\begin{aligned} \sigma _*^2 = k_{**} - \sum _{j,l=0}^{n-1} k_{*j} (\hat{K}^{-1})_{jl} k_{*l}, \end{aligned}$$(4)
where \(k_{**} = k(\varvec{x}_*,\varvec{x}_*; \varvec{\theta })\), \(k_{*j} = k(\varvec{x}_*,\varvec{x}_j; \varvec{\theta })\) and \(\hat{K}_{jl} = k(\varvec{x}_j,\varvec{x}_l; \varvec{\theta }) + \sigma ^2 \delta _{jl}\) with Kronecker’s delta \(\delta _{jl}\). Although the posterior mean above is formulated with a general nonzero mean function \(m(\varvec{x})\) [Eqs. (2.24) and (2.38) in20], numerical experiments in this study employ \(m(\varvec{x})=0\). The optimal values of the hyper-parameters are determined by maximizing the marginal log-likelihood of the model20.
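The posterior mean and variance can be computed with a Cholesky factorization of \(\hat{K}\). The following is a minimal sketch under stated assumptions: zero mean function, a squared exponential kernel whose weights multiply the coordinate differences, and hyper-parameters fixed by hand instead of optimized through the marginal likelihood. The function names are hypothetical.

```python
import numpy as np

def se_kernel(XA, XB, variance, weights):
    # Squared exponential kernel; 1 / weights are the assumed scale lengths.
    diff = (XA[:, None, :] - XB[None, :, :]) * weights
    return variance * np.exp(-0.5 * np.sum(diff**2, axis=-1))

def gp_posterior(X, y, X_new, variance, weights, noise_var):
    """Zero-mean GP posterior mean and variance of f at the points X_new."""
    # K_hat = K + sigma^2 I, the noisy training covariance.
    K_hat = se_kernel(X, X, variance, weights) + noise_var * np.eye(len(X))
    K_star = se_kernel(X_new, X, variance, weights)
    L = np.linalg.cholesky(K_hat)
    # alpha = K_hat^{-1} y, obtained from two triangular solves.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_star @ alpha
    # V = L^{-1} k_*, so that sum(V**2) = k_*^T K_hat^{-1} k_*.
    V = np.linalg.solve(L, K_star.T)
    # k(x_*, x_*) equals the variance parameter for this kernel.
    var = variance - np.sum(V**2, axis=0)
    return mean, var

# Usage on a hypothetical one-dimensional dataset.
X = np.linspace(0.0, 1.0, 8)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])
mean, var = gp_posterior(X, y, X, 1.0, np.array([10.0]), 1e-6)
```

Far from all training points the kernel vector vanishes, so the prediction reverts to the prior: zero mean with variance \(\sigma _r^2\).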
GP regression has been applied in the field of magnetic confinement fusion under several contexts. For example, it has been used to quantify uncertainty in experimental measurements21 and to perform surrogate-based optimization in flux-profile matching22.
Multi-fidelity regression by NARGP
NARGP16 is an approach to integrating datasets that have multiple fidelity levels using a combination of GP regressions. In NARGP, the correlation between low-fidelity and high-fidelity data is leveraged to compensate for the scarcity of data points in high-fidelity datasets and to improve the predictive model’s overall accuracy. Although Perdikaris et al.16 explored only noiseless cases, we considered output noise, as explained in the previous GP regression section while neglecting the input noise.
Problem setup
In the multi-fidelity regression problem, we assume that datasets have different fidelity levels. Low-fidelity data are characterized by low accuracy but many data points. High-fidelity data have high accuracy, but their parameter ranges and/or numbers of data points are small. In general, a problem may involve two or more fidelity levels. We index the fidelity levels by \(t=0, 1, \ldots\) in order of increasing fidelity and denote the t-th fidelity dataset by a pair of input parameters and observations \((X^{(t)},Y^{(t)})\). In a typical problem, the sample size of the t-th fidelity dataset \(n_t\) is smaller than that of the lower-fidelity dataset \(n_{t-1}\).
The dimension of the input vector \(\varvec{x}\) must be the same across the datasets at all fidelity levels (\(t=0, 1, \ldots\)). However, the output vectors \(\varvec{y}\) of the datasets at different fidelity levels may have different dimensions. The term “low-fidelity data” may suggest that it represents the same physical quantities as the high-fidelity data, but this is not a requirement. Low-fidelity data can instead be regarded as any observational data that correlate with the high-fidelity data.
Algorithm
The procedure for NARGP is as follows:
1. Lowest fidelity (\(t=0\)): Apply a standard GP regression to the lowest-fidelity dataset,
$$\begin{aligned} \varvec{f}_0 \sim {{\mathcal {G}}}{{\mathcal {P}}}(\varvec{f}_0 \mid \varvec{m}_0(\varvec{x}), k_0(\varvec{x}, \varvec{x}'; \varvec{\theta }_0)). \end{aligned}$$(5)
2. Higher fidelity (\(t>0\)): The t-th fidelity function \(\varvec{f}_t(\varvec{x})\) is expressed as a function of the input parameter and the output of the \((t-1)\)-th posterior \(\varvec{f}_{*,t-1}(\varvec{x}),\)
$$\begin{aligned} \varvec{f}_t(\varvec{x}) = \varvec{g}_t(\varvec{x}, \varvec{f}_{*,t-1}(\varvec{x})). \end{aligned}$$(6)
Here, the function \(\varvec{g}_t\) is evaluated using a GP regression in the extended space,
$$\begin{aligned} \varvec{g}_t \sim {{\mathcal {G}}}{{\mathcal {P}}}\left( \varvec{f}_t \mid \varvec{m}_t(\varvec{x}), k_t((\varvec{x}, \varvec{f}_{*,t-1}(\varvec{x})), (\varvec{x}', \varvec{f}_{*,t-1}(\varvec{x}')); \varvec{\theta }_t) \right) . \end{aligned}$$(7)
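We employ the separated-variable form of the kernel function proposed in16,
$$\begin{aligned} k_t((\varvec{x}, \varvec{f}_{*,t-1}(\varvec{x})), (\varvec{x}', \varvec{f}_{*,t-1}(\varvec{x}')); \varvec{\theta }_t) = k_{t\rho }(\varvec{x}, \varvec{x}'; \varvec{\theta }_{t\rho }) \, k_{tf}(\varvec{f}_{*,t-1}(\varvec{x}), \varvec{f}_{*,t-1}(\varvec{x}'); \varvec{\theta }_{tf}) + k_{tw}(\varvec{x}, \varvec{x}'; \varvec{\theta }_{tw}), \end{aligned}$$(8)
where \(k_{t\rho }, k_{tf}\) and \(k_{tw}\) are kernel functions having as hyper-parameters \(\varvec{\theta }_{t\rho }, \varvec{\theta }_{tf}\) and \(\varvec{\theta }_{tw}\), respectively. The kernel function in Eq. (8) considers the decoupled treatment of the input parameters \(\varvec{x}\) and the response variables \(\varvec{f}_{*,t-1}\) while capturing nonlinear and space-dependent correlations between the low- and high-fidelity data. We employ the squared exponential form for each kernel function, Eq. (1), for Numerical Experiments 1, 3 and 4.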
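The separated-variable kernel can be sketched as a function acting on extended inputs \((\varvec{x}, \varvec{f}_{*,t-1})\). The sketch assumes the structure proposed in16, namely the product \(k_{t\rho } k_{tf}\) plus the additive term \(k_{tw}\), with a squared exponential form for each factor; the function names and the packing of each \(\varvec{\theta }\) as a (variance, weights) pair are hypothetical choices.

```python
import numpy as np

def se_kernel(A, B, variance, weights):
    # Squared exponential kernel; 1 / weights are the assumed scale lengths.
    diff = (A[:, None, :] - B[None, :, :]) * weights
    return variance * np.exp(-0.5 * np.sum(diff**2, axis=-1))

def nargp_kernel(ZA, ZB, d, theta_rho, theta_f, theta_w):
    """Evaluate k_rho(x, x') * k_f(f, f') + k_w(x, x') on extended inputs.

    ZA and ZB have shape (n, d + p): the first d columns hold x and the
    remaining p columns hold the lower-fidelity posterior output.
    Each theta_* is a (variance, weights) pair.
    """
    xa, fa = ZA[:, :d], ZA[:, d:]
    xb, fb = ZB[:, :d], ZB[:, d:]
    return (se_kernel(xa, xb, *theta_rho) * se_kernel(fa, fb, *theta_f)
            + se_kernel(xa, xb, *theta_w))

# Hypothetical extended inputs: two input parameters plus one lower-fidelity output.
rng = np.random.default_rng(0)
Z = rng.normal(size=(6, 3))
K = nargp_kernel(Z, Z, 2,
                 (2.0, np.array([1.0, 0.5])),
                 (1.5, np.array([2.0])),
                 (0.3, np.array([1.0, 1.0])))
```

The product term lets the mapping from low to high fidelity vary across the input space, while the additive term captures structure in the high-fidelity data that the low-fidelity output does not explain.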
NARGP represents the nonlinear correlations between low-fidelity and high-fidelity data and the non-uniformity with respect to the input parameter space \(\varvec{x}\) through multistage GP regression. The learning and hyper-parameter adjustments for NARGP are obtained by performing GP regression (as explained in the previous GP section) for each fidelity level, so the computational cost is on the same order as for the standard GP.
Owing to recursive inference, the posterior probability distribution \(\varvec{f}_{*,t}(\varvec{x})\) at high fidelity levels \(t \ge 1\) is no longer a normal distribution. Prediction therefore requires evaluating the propagation of uncertainty. During prediction, we draw samples from the posterior one fidelity level lower, \(\varvec{y}_{*,t-1} = \varvec{f}_{*,t-1}(\varvec{x}_*)\), to obtain prediction samples for the present fidelity, \(\varvec{y}_{*,t} = \varvec{g}_{*,t}(\varvec{x}_*,\varvec{y}_{*,t-1})\), and marginalize this joint distribution with respect to \(\varvec{y}_{*,t-1}\) to derive the posterior distribution as a function of the input space, \(\varvec{y}_{*,t} = \varvec{f}_{*,t}(\varvec{x}_*)\), as23
$$\begin{aligned} p(\varvec{y}_{*,t}) = \int p(\varvec{y}_{*,t} \mid \varvec{x}_*, \varvec{y}_{*,t-1}) \, p(\varvec{y}_{*,t-1}) \, \textrm{d}\varvec{y}_{*,t-1}. \end{aligned}$$(9)
We evaluate the predictive mean and variance by computing the above integration using Monte Carlo sampling16.
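The two-level procedure and the Monte Carlo evaluation of the predictive mean and variance can be sketched as follows. This is a minimal NumPy illustration under several assumptions, not the GPy-based implementation of the study: hyper-parameters are fixed by hand, the mean function is zero, the toy functions `f_lo` and `f_hi` are hypothetical, and lower-fidelity samples are drawn independently at each test point, which is sufficient for pointwise mean and variance.

```python
import numpy as np

def se_kernel(A, B, variance, weights):
    # Squared exponential kernel; 1 / weights are the assumed scale lengths.
    diff = (A[:, None, :] - B[None, :, :]) * weights
    return variance * np.exp(-0.5 * np.sum(diff**2, axis=-1))

def fit_gp(Z, y, kern, noise_var):
    """Zero-mean GP with fixed hyper-parameters; returns a predict function."""
    L = np.linalg.cholesky(kern(Z, Z) + noise_var * np.eye(len(Z)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    def predict(Z_new):
        Ks = kern(Z_new, Z)
        V = np.linalg.solve(L, Ks.T)
        var = np.diag(kern(Z_new, Z_new)) - np.sum(V**2, axis=0)
        return Ks @ alpha, np.maximum(var, 0.0)
    return predict

def nargp_predict(predict_lo, predict_hi, X_new, n_samples=200, seed=0):
    """Monte Carlo marginalization over the lower-fidelity posterior."""
    rng = np.random.default_rng(seed)
    mu_lo, var_lo = predict_lo(X_new)
    means, variances = [], []
    for _ in range(n_samples):
        # Sample the lower-fidelity output at each test point.
        y_lo = mu_lo + np.sqrt(var_lo) * rng.standard_normal(len(X_new))
        # Predict the higher fidelity in the extended input space (x, y_lo).
        mu, var = predict_hi(np.column_stack([X_new, y_lo]))
        means.append(mu)
        variances.append(var)
    means, variances = np.array(means), np.array(variances)
    # Law of total variance: mean of variances plus variance of means.
    return means.mean(axis=0), variances.mean(axis=0) + means.var(axis=0)

# Hypothetical two-level toy problem with hand-picked hyper-parameters.
f_lo = lambda X: np.sin(8.0 * np.pi * X[:, 0])
f_hi = lambda X: (X[:, 0] - np.sqrt(2.0)) * f_lo(X) ** 2
X_lo = np.linspace(0.0, 1.0, 30)[:, None]
X_hi = np.linspace(0.0, 1.0, 10)[:, None]

kern_lo = lambda A, B: se_kernel(A, B, 1.0, np.array([10.0]))
def kern_hi(A, B):
    xa, fa, xb, fb = A[:, :1], A[:, 1:], B[:, :1], B[:, 1:]
    one = np.array([1.0])
    return (se_kernel(xa, xb, 1.0, one) * se_kernel(fa, fb, 1.0, one)
            + se_kernel(xa, xb, 1.0, one))

predict_lo = fit_gp(X_lo, f_lo(X_lo), kern_lo, 1e-4)
# Extended training inputs: x together with the low-fidelity posterior mean.
Z_hi = np.column_stack([X_hi, predict_lo(X_hi)[0]])
predict_hi = fit_gp(Z_hi, f_hi(X_hi), kern_hi, 1e-4)

X_test = np.linspace(0.0, 1.0, 5)[:, None]
mean, var = nargp_predict(predict_lo, predict_hi, X_test)
```

Because each fidelity level is trained by an ordinary GP regression, the only extra cost at prediction time is the loop over samples of the lower-fidelity posterior.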
Results
We showcase the effectiveness of NARGP using a range of applications for plasma turbulent transport modeling. Table 1 summarizes the training and testing datasets used in the following subsections. These applications range from solving a one-dimensional test problem and merging simulation data having different resolutions to addressing an experimental turbulent diffusion database and refining low-fidelity predictions using high-fidelity data from a specific plasma discharge. We implement standard GP regression and NARGP using the open-source Python package GPy24.
Numerical experiment 1: one-dimensional analytic problem
In this section, we apply NARGP to a simple one-dimensional test problem, as illustrated by Perdikaris et al.16. This simple example helps us understand the setup of multi-fidelity regression problems. Additionally, it confirms that our analysis reproduces the results of the test problem in previous research, which ensures the reliability of the code that was used in this study.
Dataset
Figure 1a describes the dataset used for training. The following equations also describe these low-fidelity and high-fidelity data:
The low-fidelity data \(f_l(x)\) is a simple sinusoidal wave function that produces a periodic pattern in the input x. The high-fidelity data \(f_h(x)\) is a nonlinear transformation of the low-fidelity data, and it forms a more complex pattern. Using these datasets, we compare the performance of standard single-fidelity GP regression and NARGP.
Figure 1. (a) The green daggers represent 50 points of low-fidelity data, whereas the blue circular dots depict 15 points of high-fidelity data that were used for training. (b) The red dashed line represents the regression curve generated by standard GP regression on the high-fidelity data. The uncertainty is visually indicated by shading, with a shading width that is equivalent to twice the standard deviation \(2\sigma\). (c) Multi-fidelity regression by NARGP.
Outcome
Figure 1b shows the regression results for standard single-fidelity GP regression on only high-fidelity data. The GP regression effectively estimates the target function within the parameter range x where the data points exist, and it produces a small variance. However, the variance increases in areas that have few data points. Furthermore, it is evident that the GP regression’s prediction is unreliable in the region \(x > 1.25\) due to the absence of data points. This behavior can be attributed to the hyper-parameter used for the GP regression. The optimized value of the length scale \(w^{-1}\) of the kernel function, Eq. (1), was 0.10, which limited the regression’s influence to within several times the distance of 0.10 from the data points.
In this way, it was confirmed that the prediction by a single GP regression is poor in places where there are few data points or where the parameter range of the data points is limited.
Figure 1c presents the results of applying NARGP using both low-fidelity and high-fidelity data. NARGP effectively reduces the variance in areas that have few data points and can make accurate predictions even in the region beyond \(x > 1.25\), where no data points exist. This improvement of prediction in the extrapolated region can be attributed to the scale length of the NARGP kernels. The scale length of the lowest-fidelity GP regression, \(w_0^{-1}\), was 0.28, which precisely describes the low-fidelity curve. The high-fidelity GP regression has three scale lengths, \(w_{1\rho }^{-1}\), \(w_{1f}^{-1}\), and \(w_{1w}^{-1}\), for the kernels in Eq. (8). The values of these scale lengths were 4.68, 2.37, and 5.5, respectively, which are significantly longer than the oscillation period of the high-fidelity data. This suggests that the oscillation of high-fidelity data can be effectively expressed with reference to low-fidelity data. By leveraging the relationship between the high-fidelity and the low-fidelity data in areas that have data points available, NARGP is able to use the behavior of the low-fidelity data to predict the high-fidelity data, thereby improving predictive accuracy even in regions where high-fidelity data points are absent.
Numerical experiment 2: information fusion of low and high-resolution simulation results
This section presents an example that demonstrates the improved regression accuracy for turbulent transport flux that results from combining low-resolution and high-resolution simulation data.
In this subsection, we employ a kernel function by summing the squared exponential kernel Eq. (1) and the following exponential kernel
to represent a sharp function curve.
Figure 2. (a) Dataset of ion heat transport flux Q (normalized by the so-called gyro-Bohm heat flux \(Q_\text{gB}\)) against a local plasma parameter, magnetic shear \(\hat{s}\), used for training. The blue crosses represent the results obtained from the low-resolution simulations as low-fidelity data, \(Q_\text{L}\), and the green crosses represent the results obtained from high-resolution simulations as high-fidelity data, \(Q_\text{H}\). (b) Regression results for standard GP regression on high-fidelity data only. (c) Regression curve for standard GP regression using combined low-fidelity and high-fidelity data, where high-fidelity data are used for parameters at which both data types are available. (d) Multi-fidelity regression using NARGP on both low-fidelity and high-fidelity data.
Dataset
Figure 2a describes the dataset used for training. We utilized the GKV code25, a local gyrokinetic simulation code, to study the impact of magnetic shear on the heat flux of ion transport. Due to the characteristics of the local flux-tube simulation model, a large number of grid points in the radial simulation domain \(N_x\) must be used when the magnetic shear \(\hat{s}\) is small. From the perspective of conventional numerical simulations, this is a convergence problem related to numerical resolutions. When the resolution is sufficiently high, further increasing the resolution should not change the results. However, we treat this problem as a multi-fidelity problem. Low-resolution data contain valuable physical information, despite their inherent inaccuracies in certain regions (such as near-zero magnetic shear). By treating the discrepancy in the resolution as a difference in fidelity, NARGP allows us to leverage low- and high-resolution data effectively. We conducted low-resolution (\(N_x=168\)) and high-resolution (\(N_x=336\)) numerical simulations, and we used the resultant heat fluxes from these simulations as low-fidelity and high-fidelity data, respectively.
Low-resolution simulations incur low computational costs, and they allow for extensive data points across a wide parameter range. Conversely, due to their higher computational costs, high-resolution simulations are limited in terms of the number of data points they can generate. In practical applications, it may be difficult to obtain sufficient data points due to the significant computational cost that is associated with multi-scale turbulence simulations, such as electron-ion scale turbulence simulations4. Therefore, the problem setting that is considered in this subsection (namely, a large number of low-resolution simulation results and a small number of high-resolution simulation results) represents a realistic scenario. By using NARGP to develop a predictive model utilizing the relationship between low- and high-resolution data, we aim to achieve high predictive accuracy under these constraints.
In practice, the GKV simulations described in this section employed the adiabatic electron model, which significantly reduces computational costs (less than one-thousandth of multi-scale simulations). As a result, we were able to conduct a greater number of high-resolution simulations and subsequently use them to assess predictive accuracy.
Outcome
Figure 2 compares the results obtained by applying GP regression and NARGP in different ways to the dataset. First, Fig. 2b represents the regression results using standard GP regression on only high-fidelity data. Due to the limited number of data points, the regression fits only around the data points. In particular, there is no supporting evidence for the regression results in the region of negative magnetic shear \(\hat{s}<0\) where there are few high-fidelity data points.
In contrast, Fig. 2c presents the result of standard GP regression on the combined low-fidelity and high-fidelity data. The training set used high-fidelity data for values of \(\hat{s}\) where both low-fidelity and high-fidelity data existed and low-fidelity data where no high-fidelity data existed. A standard GP regression then produced a curve fitted to this combined data. However, some points, particularly the low-fidelity data points around a magnetic shear of \(\hat{s}=-0.03\), were deemed to contain numerical errors due to the low-resolution simulation of the local flux-tube model under the small \(\hat{s}\) regime. Fitting the regression to such points degrades its accuracy, which is why the curve in Fig. 2c exhibits a sharp cusp.
By contrast, Fig. 2d demonstrated that NARGP did not overestimate the heat flux prediction around a magnetic shear of \(\hat{s}=-0.03\), because NARGP used the correlation between low- and high-fidelity data. The regression curve by NARGP is corrected to better align with the high-fidelity data.
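The combined training set used for Fig. 2c can be assembled by overwriting low-fidelity values with high-fidelity ones wherever both exist. The sketch below assumes a one-dimensional input and that shared inputs carry identical floating-point values, which holds when the high-resolution points are chosen from the low-resolution grid. The function name and the numerical values are hypothetical.

```python
import numpy as np

def combine_prefer_high(x_lo, y_lo, x_hi, y_hi):
    """Merge two 1D datasets, keeping the high-fidelity value where both exist."""
    merged = dict(zip(x_lo.tolist(), y_lo.tolist()))
    # Overwrite the low-fidelity entries at inputs that also have high-fidelity data.
    merged.update(zip(x_hi.tolist(), y_hi.tolist()))
    x = np.array(sorted(merged))
    y = np.array([merged[v] for v in x.tolist()])
    return x, y

# Hypothetical values: the low-fidelity point at zero shear is replaced.
x_lo = np.array([-0.2, 0.0, 0.2])
y_lo = np.array([1.0, 5.0, 2.0])
x_hi = np.array([0.0])
y_hi = np.array([3.0])
x_all, y_all = combine_prefer_high(x_lo, y_lo, x_hi, y_hi)
```

A single GP fitted to such a merged set treats every remaining low-fidelity point as if it were exact, which is the limitation that the multi-fidelity treatment avoids.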
Figure 3. (a) Dependence of high-fidelity data \(Q_\text{H}\) (\(=: \varvec{f}_t\)) in the extended input space of the input parameter \(\hat{s}\) (\(=: \varvec{x}\)) and the low-fidelity posterior \(Q_\text{L}\) (\(=: \varvec{f}_{*,t-1}\)). NARGP learns the function of this hypersurface \(\varvec{f}_t = \varvec{g}_t(\varvec{x},\varvec{f}_{*,t-1})\) (shown as a 3D color surface) as in Eq. (6). (b) Comparison of prediction and actual values of turbulent heat flux Q. Blue daggers, orange crosses, and green circular dots represent the results of the standard GP regression for high-resolution data only, standard GP regression on low- and high-fidelity combined data in Fig. 2c, and NARGP, respectively.
NARGP expresses high-fidelity data as a function that depends on both the input parameter and the low-fidelity regression data, as in Eq. (6). This relationship \(\varvec{f}_t = \varvec{g}_t(\varvec{x},\varvec{f}_{*,t-1})\) is illustrated in Fig. 3a, which shows the dependency of high-fidelity data \(Q_\text{H}\) on the input parameter \(\hat{s}\) and low-fidelity regression \(Q_\text{L}\). The NARGP inference procedure first estimates the value of the low-fidelity function posterior \(Q_\text{L}(\hat{s})\) that corresponds to the input parameter \(\hat{s}\). It then predicts the value of the high-fidelity data using the surface that represents the relationship against the input parameter and low-fidelity function as \(Q_\text{H} = Q_\text{H}(\hat{s}, Q_\text{L})\), rather than expressing the high-fidelity data as a function of the input parameter directly, \(Q_\text{H} = Q_\text{H}(\hat{s})\). This process corrects the tendency of low-fidelity data to overestimate the heat flux at small magnetic shear and makes the prediction align with the high-fidelity data.
Finally, Fig. 3b compares the predictions generated by standard GP regression or NARGP with the actual high-fidelity data values, where the amount of high-resolution simulation data is increased by performing an additional twenty simulations. The magnetic shear values of the high-resolution simulation points (8+20) were chosen from the 32 points used for low-resolution simulations, ensuring that the same magnetic shear values were covered. The figure shows that standard GP regression tends to generate large errors when it uses only a few high-fidelity data points. It also shows that standard GP regression with mixed low- and high-resolution data gives overestimated predictions because of the numerical errors in low-resolution simulations. NARGP successfully corrects these overestimates. We also evaluate the coefficient of determination \(R^2\)
$$\begin{aligned} R^2 = 1 - \frac{\sum _l (y_l - \hat{y}_l)^2}{\sum _l (y_l - {\bar{y}})^2}, \end{aligned}$$
where \(y_l\) are the actual values, \(\hat{y}_l\) are the corresponding predictions, and \({\bar{y}}\) is the average of the dataset. The models gave \(R^2\) values of 0.752 for standard GP with high-resolution data, 0.87 for standard GP with mixed data, and 0.96 for NARGP. These results provide quantitative evidence that supports the superiority of NARGP over the other models.
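For reference, the coefficient of determination can be computed as below. The sketch assumes the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the actual values; the function name `r2_score` is a hypothetical helper.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination of predictions against actual values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

perfect = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])    # exact predictions
baseline = r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])   # always predicting the mean
```

A value of 1 indicates exact predictions, and a value of 0 corresponds to a model that does no better than predicting the dataset average.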
Numerical experiment 3: plasma turbulent transport dataset
In this section, we use the NARGP method to analyze the plasma turbulent transport dataset that was previously utilized in related studies13 to develop surrogate models for existing magnetic fusion plasma experiments.
Figure 4. Scatter plots of turbulent diffusion coefficient \(D_\text{exp}\) versus 12 local plasma parameters. From the top left, the horizontal axes correspond to the inverse of the normalized lengths of the density gradient \(R/L_n\), electron temperature gradient \(R/L_{T_\text{e}}\), and ion temperature gradient \(R/L_{T_\text{i}}\), ion-to-electron density ratio \(n_\text{i}/n_\text{e}\), electron-to-ion temperature ratio \(T_\text{e}/T_\text{i}\), plasma beta value \(\beta\), electron-electron collision frequency \(\nu _\text{ee}\), safety factor \(q_0\), magnetic shear \(\hat{s}\), inverse aspect ratio \(\epsilon\), elongation \(\kappa\), and triangularity \(\delta\). Each panel contains 45 points of training data (plotted as green daggers) and 90 points of test data used for validation (plotted as blue crosses).
Dataset
We used an experimental dataset obtained from the Joint European Torus (JET) device26. The data came from 16 plasma discharges: 3 from L-mode plasmas and 13 from H-mode plasmas, heated by neutral beam injections. Plasma currents ranged from 0.963 to 3.20 MA; the toroidal magnetic fields ranged from 1.05 to 3.12 T; the heating power ranged from 2.64 to 17.4 MW; and the safety factor at 95% flux surface ranged from 3.71 to 5.40. All data were collected before the ITER-Like-Wall Project started. The vessel wall material was Inconel with carbon, and the divertor tiles were made of carbon or beryllium. The main ion species was deuterium. Data were selected at normalized minor radial positions from 0.30 to 0.85 at intervals of 0.05. Electron particle diffusivity \(D_\text{exp}\) was evaluated using a power balance equation. Narita et al.13 expressed the electron particle and heat fluxes as \(\Gamma = D ( R/L_n + C_\text{T} R/L_{T\text{e}} + C_\text{P})\) and \(Q = \chi ( C_\text{N} R/L_n + R/L_{T\text{e}} + C_\text{HP})\), where \(R/L_n\) and \(R/L_{T\text{e}}\) were the inverse of the normalized length of the electron density and temperature gradients, respectively. The off-diagonal coefficients \(C_\text{T}\), \(C_\text{P}\), \(C_\text{N}\), and \(C_\text{HP}\) and ratio \(\Gamma /Q\) were estimated from linear gyrokinetic calculations. Because particle flux is difficult to estimate directly due to particle recycling from the wall, \(Q_\text{exp}\) was first obtained from the electron power balance and \(D_\text{exp}\) was derived from the ratio \(\Gamma /Q\).
As a dataset for regression, the input parameters were 12 local plasma parameters that expressed the density and temperature profiles and the geometric factors of the confinement magnetic field. The high-fidelity data we sought to model were the experimentally observed turbulent diffusion coefficients of particle transport \(D_\text{exp}\). Figure 4 shows a scatter plot of the experimental turbulent diffusion coefficients against local plasma parameters. The dataset contains 135 data points. Figure 4 presents a projection of multi-dimensional input parameter data along each parameter axis. It does not represent uncertainties; each set of plasma parameters corresponds to a single value of \(D_\text{exp}\). It should be noted that the NARGP implementation can handle uncertain datasets which contain different output values y for the same input \(\varvec{x}\). To facilitate the regression analysis, we partitioned the high-fidelity dataset into a training set of 45 points and a validation set of 90 points. The use of only 45 points for training simulates a typical situation in multi-fidelity modeling, where high-fidelity data are often limited. While adding more points could improve performance, this study demonstrates the effectiveness of the multi-fidelity modeling even with constrained training data.
The dataset also contained, as the low-fidelity data, the results of linear analyses of local gyrokinetic simulations that used the corresponding local plasma parameters13. We used the poloidal wavenumber of the linearly most unstable mode \(k_\theta\) and its linear growth rate \(\gamma\) as low-fidelity information. The linear gyrokinetic calculations included electromagnetic fluctuations, with the shaping factors being incorporated using the Miller geometry27. Although the modern quasi-linear transport models include multiple wavenumbers6,7,8,9, we only used the most unstable wavenumber. To identify the most unstable mode, a wavenumber scan was conducted over the ion gyro-radius scale for each set of input parameters. In contrast to the high-fidelity data, we assumed that all 135 points of low-fidelity data were available. This setup mimicked a practical multi-fidelity problem in plasma turbulent transport modeling. Although the number of high-fidelity experimental data was limited within existing plasma discharges (e.g., JT-60U, JET, DIII-D tokamaks or stellarator/heliotron devices), the low-fidelity data were producible when many simulations were carried out (using gyrokinetic codes), even in the experimentally unexplored parameter regime (e.g., target parameters for future devices such as JT-60SA, SPARC, and DEMO reactors). Our goal was to achieve highly accurate regression against the high-fidelity experimental turbulent diffusion coefficient.
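Selecting the low-fidelity pair \((k_\theta, \gamma)\) from a wavenumber scan amounts to picking the scan point with the largest growth rate. The following is a minimal sketch under that reading; the function name and the scan values are hypothetical, and a real scan would come from a linear gyrokinetic code.

```python
def most_unstable_mode(k_theta, growth_rate):
    """Return (k_theta, gamma) of the scan point with the largest growth rate."""
    if len(k_theta) != len(growth_rate) or len(k_theta) == 0:
        raise ValueError("scan arrays must be non-empty and of equal length")
    idx = max(range(len(growth_rate)), key=lambda i: growth_rate[i])
    return k_theta[idx], growth_rate[idx]


# Hypothetical scan over the ion gyro-radius scale (normalized units).
k_scan = [0.1, 0.2, 0.3, 0.4]
gamma_scan = [0.05, 0.12, 0.09, -0.01]
k_max, gamma_max = most_unstable_mode(k_scan, gamma_scan)
```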
Outcome
Comparison of predicted and actual values of turbulent diffusion coefficient \(D_\text{exp}\). The blue daggers denote the training data, and the orange circular dots denote the test data used for validation. The horizontal axis displays the predicted values generated by the constructed regression models: (a) quasi-linear transport model of Eq. (6) in13, (b) standard single-fidelity GP regression using only high-fidelity data, and (c) multi-fidelity regression using NARGP.
For comparison, we also evaluated a quasi-linear transport model13,
where \(\tau _r\) is a model of the characteristic time of zonal flows given by local plasma parameters13. The coefficients in Eq. (14) were determined by least square fitting against the training data, given as \(c_0 = 0.716, c_1 = 0.840\), and \(c_2 = -0.592\). Figure 5a shows the regression result of Eq. (14). The coefficient of determination for the validation test data was \(R^2 = 0.222\) using the logarithmic value \(\ln D\). We also evaluated the mean squared logarithmic error,
The value of the MSLE for Eq. (14) was 0.205.
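For readers who wish to reproduce these two metrics, a minimal sketch follows. The coefficient of determination is computed on \(\ln D\), as stated above. The MSLE formula is not reproduced in this text, so the sketch assumes the commonly used form based on \(\ln(1+y)\); if the paper's definition differs, the second function should be adjusted. Function names are hypothetical.

```python
import math


def r2_log(d_true, d_pred):
    """Coefficient of determination evaluated on ln(D)."""
    y = [math.log(v) for v in d_true]
    y_hat = [math.log(v) for v in d_pred]
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    if ss_tot == 0.0:
        raise ValueError("true values have zero variance in log space")
    return 1.0 - ss_res / ss_tot


def msle(d_true, d_pred):
    """Mean squared logarithmic error, assuming the log(1 + y) convention."""
    terms = [(math.log1p(a) - math.log1p(b)) ** 2 for a, b in zip(d_true, d_pred)]
    return sum(terms) / len(terms)
```

With this convention, a model that always predicts the log-space mean of the validation data yields \(R^2 = 0\), so a positive \(R^2\) indicates an improvement over that trivial baseline.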
Next, Fig. 5b presents the result of the standard GP regression calculated using only the high-fidelity training dataset. The results on both the training and test datasets show better agreement in this case than for the quasi-linear model. The model performance metrics were \(R^2 = 0.777\) and \(\text{MSLE} = 0.0294\). These results suggest that GP regression with only 45 training points outperforms the particular quasi-linear model presented in Fig. 5a. The significant deviation from the experimental data observed for some predicted points is due to the large distance between the test points and training data in the input parameter space. Data classification using the Mahalanobis distance confirmed that points with larger distances tend to show greater deviations. Furthermore, we confirmed that increasing the training data coverage reduced the occurrence of such outliers. Finally, the results of NARGP are shown in Fig. 5c. NARGP gives \(R^2 = 0.854\) and \(\text{MSLE} = 0.0243\), showing a higher predictive accuracy than the other models. These results indicate that NARGP is an effective method for improving the predictive accuracy for experimental data of turbulent transport by utilizing the correlation between low-fidelity and high-fidelity data rather than simply regressing against high-fidelity data. The numerical experiments in this section also demonstrate that the low-fidelity data are not necessarily limited to the same physical quantities as the high-fidelity data. In this case, the linearly most unstable wavenumber and its linear growth rate contain helpful information for improving the prediction of the experimental turbulent diffusion coefficient. NARGP utilizes their nonlinear relationship.
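The Mahalanobis-distance diagnostic mentioned above can be sketched as follows. The text does not specify the reference distribution, so this sketch assumes the distance is measured from the mean of the training inputs using their sample covariance. The function name is hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def mahalanobis_to_training(x_train, x_test):
    """Mahalanobis distance of each test point from the training-input cloud.

    x_train: array of shape (n_train, n_features)
    x_test:  array of shape (n_test, n_features)
    """
    mu = x_train.mean(axis=0)
    cov = np.atleast_2d(np.cov(x_train, rowvar=False))
    cov_inv = np.linalg.pinv(cov)          # pseudo-inverse tolerates a singular covariance
    diff = x_test - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))    # clip tiny negative round-off


# Toy check with an isotropic training cloud (illustrative values only).
x_tr = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
x_te = np.array([[1.0, 1.0], [3.0, 1.0]])
dist = mahalanobis_to_training(x_tr, x_te)
```

Test points with a large distance can then be flagged as likely extrapolation cases before their predictions are interpreted.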
Numerical experiment 4: correction of a reduced model prediction for specific discharge data
This section presents an application example of NARGP, attempting to correct the disparities between a reduced turbulent transport model28 and nonlinear simulation results concerning a specific plasma discharge.
Dataset
This section showcases an analysis of turbulent transport flux using specific discharge data (#88343) from the Large Helical Device (LHD) experiments29. Figure 6a plots the density, temperature, and safety factor profiles that were measured in this discharge. Using the equilibrium magnetic geometry of the LHD and the local plasma parameters from the above measurements, it was possible to perform linear and nonlinear analyses based on local gyrokinetic simulations at each radial position. The input parameters were the local plasma density n, electron temperature \(T_\text{e}\), ion temperature \(T_\text{i}\), inverse normalized gradient lengths of the density \(R/L_n\), electron temperature \(R/L_{T_\text{e}}\), and ion temperature \(R/L_{T_\text{i}}\), safety factor q, and magnetic shear \(\hat{s}\). For the region \(r/a < 0.4\), the ion-scale micro-instabilities are stabilized, which results in a negligible turbulent flux. In the region \(r/a > 0.8\), other transport mechanisms exist, such as collisions with neutral particles. Therefore, our analysis is focused on the mid-radius region where turbulent transport is expected to dominate.
Figure 6b shows the low- and high-fidelity data for the particle fluxes caused by the plasma turbulence. Owing to the quasi-neutrality condition, particle fluxes for electrons and ions are ambipolar, i.e., \(\Gamma \equiv \Gamma _\text{e} = \Gamma _\text{i}\). The low-fidelity data are the particle fluxes estimated by the reduced model \(\Gamma _{\text{QL}}\) proposed in28, based on linear simulation results such as linear growth rates. Unlike the quasi-linear model described in Eq. (14), which is applied to JT-60U and other tokamak plasmas, this reduced model was specifically developed and validated for LHD experiments, as described by Toda et al.28 Because it requires only linear simulation results, the reduced model incurs low computational cost and can be easily evaluated at multiple radial positions. The high-fidelity data are the particle fluxes evaluated by nonlinear turbulence simulations \(\Gamma _{\text{NL}}\). In Fig. 6b, the turbulent particle flux from nonlinear simulations shows negative values, which indicate inward transport. In addition to turbulent transport, neoclassical transport plays an important role in LHD plasmas. This particular discharge, which has no significant particle source in the analyzed range, is considered to be in a quasi-steady state, where neoclassical and turbulent transport balance each other. The number of high-fidelity data points is limited because nonlinear turbulence simulations incur a higher computational cost than linear analyses. Because the reduced model’s hyper-parameters are pre-trained against nonlinear simulations in other discharges, the low-fidelity reduced model’s estimations do not necessarily match the high-fidelity nonlinear simulation results in the present discharge.
In this section, we aim to correct the discrepancies between low- and high-fidelity datasets by utilizing NARGP. Although a few data points are insufficient for the regression of the global function, the focus on the correction for this particular discharge allows us to overcome this lack of high-fidelity data points. To achieve the aforementioned objective, we used the difference of particle flux \(\Delta \Gamma = \Gamma _{\text{NL}} - \Gamma _{\text{QL}}\) between the nonlinear turbulence simulation data \(\Gamma _{\text{NL}}\) and the quasi-linear transport model data \(\Gamma _{\text{QL}}\) as the high-fidelity data for the NARGP training. This means that the low- and high-fidelity dataset was configured as \((\Gamma _{\text{QL}}, \Delta \Gamma )\), rather than \((\Gamma _{\text{QL}}, \Gamma _{\text{NL}})\), and that the usage of our NARGP module remained the same as done in the previous subsections. During the prediction phase, NARGP predicted the low-fidelity \(\Gamma _{*\text{QL}}\) in the first stage and then predicted \(\Delta \Gamma _{*}\) in the second stage. We then obtained the prediction of the nonlinear turbulent flux \(\Gamma _{\text{NARGP}} = \Gamma _{*\text{QL}} + \Delta \Gamma _{*}\). This approach of configuring the dataset was needed because GP regression often fails when prediction points \(\varvec{x}_*\) are too far from the training data points in the input parameter space \(\varvec{x}\). In such cases, particularly with a zero-mean prior \(\varvec{m}(\varvec{x})=0\), the model tends to predict near-zero outputs with large prediction variances. By focusing on predicting the difference \(\Delta \Gamma\), we safeguard predictions from unrealistic zero estimates. 
Even when high-fidelity data \(\Gamma _{\text{NL}}\) are extremely sparse, the model \(\Gamma _{\text{NARGP}}\) will asymptote toward a low-fidelity approximation \(\Gamma _{*\text{QL}}\) when prediction points \(\varvec{x}_*\) are far from the training dataset, ensuring that the result is still anchored to the available low-fidelity information.
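The anchoring behavior described above can be illustrated with a deliberately simplified sketch. It is not the NARGP module used in this study: it replaces the second NARGP stage with a plain zero-mean GP on \(\varvec{x}\) alone, uses fixed hyper-parameters, and takes the low-fidelity values as given. All names and numbers are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def rbf(a, b, length):
    """Squared-exponential kernel between row-wise point sets a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length**2)


def gp_mean(x_train, y_train, x_new, length=0.5, noise=1e-8):
    """Posterior mean of a zero-mean GP with fixed hyper-parameters."""
    k = rbf(x_train, x_train, length) + noise * np.eye(len(x_train))
    alpha = np.linalg.solve(k, y_train)
    return rbf(x_new, x_train, length) @ alpha


def corrected_flux(x_hi, gamma_nl, gamma_ql_hi, x_new, gamma_ql_new, length=0.5):
    """Low-fidelity flux plus a GP-modelled correction Delta = NL - QL."""
    delta = gamma_nl - gamma_ql_hi
    return gamma_ql_new + gp_mean(x_hi, delta, x_new, length)


# Three high-fidelity points, as in Fig. 6b; the values are illustrative only.
x_hi = np.array([[0.0], [1.0], [2.0]])
gamma_ql_hi = np.array([1.0, 1.5, 2.0])
gamma_nl = np.array([0.6, 0.9, 1.5])
x_new = np.array([[1.0], [50.0]])          # one training location, one far away
gamma_ql_new = np.array([1.5, 3.0])
pred = corrected_flux(x_hi, gamma_nl, gamma_ql_hi, x_new, gamma_ql_new)
```

At the training location the prediction reproduces the nonlinear value, whereas at the distant point the correction vanishes and the prediction falls back to the low-fidelity flux rather than to zero.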
Plasma profiles in a specific LHD discharge. (a) Blue crosses, orange daggers, green circles, and red square dots represent the profiles for density n, electron temperature \(T_\text{e}\), ion temperature \(T_\text{i}\), and safety factor q against the minor radius r/a, respectively. (b) Profiles of the turbulent particle transport flux in low-fidelity data (reduced model predictions \(\Gamma _\text{QL}\) shown in blue daggers for 10 radial positions) and high-fidelity data (nonlinear simulation results \(\Gamma _\text{NL}\) shown in green crosses for 3 radial positions).
Outcome
Figure 7a plots the comparison of predictions and nonlinear turbulent transport flux. The results of multi-fidelity modeling by NARGP and the predictions made by the reduced model in28 are shown together for comparison. First, the data points used for training are compared and it is confirmed that the NARGP model is more accurate than the reduced model. On test data, a systematic overestimation by the reduced model is observed. NARGP corrected this overestimation downward, even for the test data, which enhanced prediction accuracy over a wide parameter range.
Furthermore, Fig. 7b presents the turbulent transport flux profiles in the radial direction. Note that this regression was conducted against local plasma parameters (not against radial positions). The original low-fidelity data, which represent the transport flux of the reduced model, tend to overestimate the outcomes of the nonlinear simulations. Notably, NARGP successfully corrects this discrepancy by incorporating the low- and high-fidelity data. Consequently, integrating the low-fidelity quasi-linear model and the high-fidelity nonlinear simulation results using NARGP facilitated higher-accuracy predictions, even with a limited number of data points.
Results of the correction by NARGP for a specific discharge. (a) Comparison of predicted values and actual nonlinear turbulent flux \(\Gamma _\text{NL}\). NARGP results are shown as blue circles and red square dots for the training data (3 points shown in Fig. 6b) and the test data (additional 7 points), respectively. The predictions by the reduced model in28 are plotted as green daggers and orange crosses for the training and test data, respectively. (b) Profiles of the turbulent transport flux in the radial position r/a. Blue daggers plot the low-fidelity data \(\Gamma _\text{QL}\), whereas green crosses and orange triangular dots correspond to the high-fidelity data \(\Gamma _\text{NL}\). Red circles plot the predictions by NARGP \(\Gamma _\text{NARGP}\).
Discussion
Based on the results of the numerical experiments presented in the previous sections, we now discuss the conditions under which NARGP can be effectively applied and outline practical steps for applying this method to more practical scenarios.
Effective use cases and conditions for NARGP
Understanding the types of problems for which NARGP works effectively and improves predictions is crucial. NARGP is most effective when complex high-fidelity functions \(f_t(\varvec{x})\) can be simplified in the extended input space \(g_t(\varvec{x}, f_{*,t-1}(\varvec{x}))\). In Numerical Experiment 1, this simplification was confirmed by the longer correlation length of the kernel function, highlighting the smoother behavior of \(g_t(\varvec{x},f_{*,t-1})\) than the original function \(f_t(\varvec{x}).\) Similarly, Fig. 3a in Numerical Experiment 2 explicitly showed the function \(g_t\) in the extended input space \((\varvec{x},f_{*,t-1}),\) which is simpler than the expression in the input space \(f_t(\varvec{x})\).
For higher-dimensional data, it is more challenging to verify whether low-fidelity data simplify the function. Furthermore, because the choice of low-fidelity data is arbitrary, improved prediction can be realized using alternative low-fidelity sources. The selection of appropriate low-fidelity data remains an open area for further research.
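The role of the extended input space \((\varvec{x}, f_{*,t-1}(\varvec{x}))\) can be made concrete with a two-level sketch. This is a simplified illustration, not the implementation accompanying this paper: it uses a single isotropic squared-exponential kernel on the augmented input instead of the structured NARGP kernel, keeps hyper-parameters fixed rather than optimizing them, and propagates only the posterior mean of the low-fidelity GP. The class and function names and the toy functions are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def rbf(a, b, length):
    """Squared-exponential kernel between row-wise point sets a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length**2)


class SimpleGP:
    """Zero-mean GP with fixed hyper-parameters; posterior mean only."""

    def __init__(self, x, y, length, noise=1e-6):
        self.x, self.length = x, length
        k = rbf(x, x, length) + noise * np.eye(len(x))
        self.alpha = np.linalg.solve(k, y)

    def predict(self, x_new):
        return rbf(x_new, self.x, self.length) @ self.alpha


def fit_two_level(x_lo, y_lo, x_hi, y_hi, length_lo, length_hi):
    """Stage 1: GP on low-fidelity data. Stage 2: GP on (x, f_lo(x)) -> y_hi."""
    gp_lo = SimpleGP(x_lo, y_lo, length_lo)
    z_hi = np.column_stack([x_hi, gp_lo.predict(x_hi)])   # extended input space
    gp_hi = SimpleGP(z_hi, y_hi, length_hi)
    return gp_lo, gp_hi


def predict_two_level(gp_lo, gp_hi, x_new):
    """Predict low fidelity first, then feed it to the high-fidelity GP."""
    f_lo = gp_lo.predict(x_new)
    return gp_hi.predict(np.column_stack([x_new, f_lo]))


def f_low(x):
    """Toy low-fidelity function (illustrative only)."""
    return np.sin(2.0 * np.pi * x[:, 0])


def f_high(x):
    """Toy high-fidelity function, nonlinearly related to f_low."""
    return (x[:, 0] - np.sqrt(2.0)) * f_low(x) ** 2


x_lo = np.linspace(0.0, 1.0, 11)[:, None]   # many cheap points
x_hi = np.array([[0.1], [0.5], [0.9]])      # few expensive points
gp_lo, gp_hi = fit_two_level(x_lo, f_low(x_lo), x_hi, f_high(x_hi), 0.2, 0.2)
y_new = predict_two_level(gp_lo, gp_hi, np.linspace(0.0, 1.0, 5)[:, None])
```

In the toy example, the high-fidelity function is a simple quadratic in the low-fidelity output, so the second-stage GP has an easier regression task in the extended space than a GP on \(\varvec{x}\) alone would have.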
Concerns regarding the quality of low-fidelity data
One concern is whether low-resolution simulations or experimental data with errors might harm prediction accuracy, especially if the trends of the low- and high-fidelity data differ.
It is important to note that low-fidelity data need not closely resemble high-fidelity data. Even when the trends differ, as long as a nonlinear correlation exists between them, low-fidelity data can still contribute to improving predictions. In Fig. 1, for instance, low-fidelity data have longer oscillation periods than high-fidelity data; nevertheless, NARGP captures this nonlinear relationship and enhances accuracy. In Numerical Experiment 3, the wavenumber of the linearly unstable mode (a larger wavenumber meaning smaller eddies) tends to show an inverse correlation with turbulent transport. Even with opposite trends, low-fidelity data can aid high-fidelity predictions if such inverse correlations are captured by NARGP. The strength of NARGP lies in its ability to express complex nonlinear correlations.
A more serious concern arises when the correlation between low- and high-fidelity data learned within the training data range changes considerably in the prediction range. In such cases, the correlation learned by NARGP may no longer be useful, leading to incorrect predictions. However, this risk is not unique to multi-fidelity modeling but is a general problem with extrapolation. NARGP can flexibly adapt to new data: the model can be rebuilt as new data become available, thereby accommodating different correlations.
The choice and impact of kernel functions
The choice of an appropriate kernel function in GP regression is crucial and can strongly affect predictive accuracy. In low-dimensional cases, selecting a composite kernel tailored to the data’s trends is well known to improve model performance20. However, in high-dimensional cases, constructing such optimal kernels is generally challenging. Recent research has explored general-purpose kernels, such as the spectral mixture kernel30, the neural kernel network31, or other deep kernels. Applying such advanced general-purpose kernels in NARGP may enhance its performance further, but this lies beyond the scope of the current study and remains a topic for future investigation.
Toward practical applications
The NARGP method proposed in this study aims to improve the predictive accuracy of turbulent transport in unexplored parameter regimes by leveraging correlations between low- and high-fidelity data. It is useful to outline a practical application procedure, with the objective of combining the proposed method with advanced quasi-linear models such as TGLF or QuaLiKiz and predicting transport in future plasma devices such as JT-60SA, SPARC, and DEMO.
Specifically, to apply the approach of Numerical Experiments 3 or 4 to a specific dataset, the process is as follows:
1. Prepare particle and heat flux data as high-fidelity data.
2. Perform linear simulations to produce low-fidelity data. These simulations should not only use existing experimental parameters but also cover a broad range of future plasma scenarios to be predicted.
3. Apply low- and high-fidelity data to multi-fidelity modeling using NARGP. This can be implemented with the open-source code accompanying this paper. See details in the Data Availability statement.
4. Use NARGP to predict new data points.
This process allows readers to apply the presented methodology to their specific problems.
Conclusion
This study proposed and validated the effectiveness of NARGP for integrating multi-fidelity datasets to improve the accuracy of predicting turbulent transport in magnetic fusion plasma.
We presented numerical experiments corresponding to practical regression problems in turbulent transport in plasma. First, we showed that NARGP improved prediction accuracy by combining low- and high-resolution simulation data. Second, NARGP achieved higher predictive accuracy for plasma turbulent transport data than standard GP regression and a quasi-linear transport model by integrating experimental data with linear gyrokinetic analysis results. Third, NARGP successfully corrected overestimation in reduced-model predictions by using a dataset of a specific plasma discharge, and this produced high-accuracy predictions for the related discharge.
The primary advantage of NARGP is its comprehensive multi-fidelity modeling approach. Unlike conventional quasi-linear models that rely solely on linear analysis results, NARGP leverages a broader range of low-fidelity data, which gives more versatile and accurate predictions that can be applied to various scenarios, including the information fusion of linear/nonlinear simulation data, simulations that use different resolutions, different physical models, and experimental observations.
Future research should focus on extrapolative settings, where predictions are made outside the parameter range that is found in the training data. The development of future fusion devices is essentially an extrapolation in plasma parameter space. Improving kernel design will enhance NARGP’s performance in extrapolative settings. Additionally, exploring the impact of the correlation between low- and high-fidelity data on extrapolation will further refine NARGP’s predictive capabilities.
In conclusion, NARGP offers a robust and versatile method for integrating multi-fidelity data, and it enhanced the predictive accuracy for turbulent transport in all numerical experiments presented here. A generalized information fusion algorithm combining theoretical, simulation, and experimental datasets opens a new approach to turbulent transport modeling in the magnetic fusion research community. Its broad applicability may contribute to optimizing fusion reactor design and operation.
Data availability
The data depicted in the plots of this paper will be made available at https://github.com/smaeyama/maeyama_scirep_2024 upon publication. The gyrokinetic simulation code GKV is an open-source project available from GitHub: https://github.com/GKV-developers/gkvp. The JET experimental data and corresponding gyrokinetic simulation results originate from a previous publication13.
References
Horton, C. Turbulent Transport in Magnetized Plasmas (G-Reference, Information and Interdisciplinary Subjects Series) (World Scientific, 2012).
Brizard, A. J. & Hahm, T. S. Foundations of nonlinear gyrokinetic theory. Rev. Mod. Phys. 79, 421–468. https://doi.org/10.1103/RevModPhys.79.421 (2007).
Hatch, D. R. et al. Reduced models for ETG transport in the tokamak pedestal. Phys. Plasmas 29, 062501. https://doi.org/10.1063/5.0087403 (2022).
Maeyama, S. et al. Multi-scale turbulence simulation suggesting improvement of electron heated plasma confinement. Nat. Commun. 13, 3166. https://doi.org/10.1038/s41467-022-30852-0 (2022).
Belli, E. A., Candy, J. & Sfiligoi, I. Spectral transition of multiscale turbulence in the tokamak pedestal. Plasma Phys. Controll. Fus. 65, 024001. https://doi.org/10.1088/1361-6587/aca9fa (2022).
Staebler, G. M., Kinsey, J. E. & Waltz, R. E. A theory-based transport model with comprehensive physics. Phys. Plasmas 14, 055909. https://doi.org/10.1063/1.2436852 (2007).
Staebler, G. M. & Kinsey, J. E. Electron collisions in the trapped gyro-Landau fluid transport model. Phys. Plasmas 17, 122309. https://doi.org/10.1063/1.3505308 (2010).
Bourdelle, C. et al. Core turbulent transport in tokamak plasmas: Bridging theory and experiment with qualikiz. Plasma Phys. Controll. Fus. 58, 014036. https://doi.org/10.1088/0741-3335/58/1/014036 (2015).
Citrin, J. et al. Tractable flux-driven temperature, density, and rotation profile evolution with the quasi-linear gyrokinetic transport model qualikiz. Plasma Phys. Controll. Fus. 59, 124005. https://doi.org/10.1088/1361-6587/aa8aeb (2017).
Honda, M. & Narita, E. Machine-learning assisted steady-state profile predictions using global optimization techniques. Phys. Plasmas 26, 102307. https://doi.org/10.1063/1.5117846 (2019).
Ho, A. et al. Neural network surrogate of QuaLiKiz using JET experimental data to populate training space. Phys. Plasmas 28, 032305 (2021).
Meneghini, O. et al. Neural-network accelerated coupled core-pedestal simulations with self-consistent transport of impurities and compatible with iter imas. Nucl. Fus. 61, 026006. https://doi.org/10.1088/1741-4326/abb918 (2020).
Narita, E. et al. Modification of a machine learning-based semi-empirical turbulent transport model for its versatility. Contrib. Plasma Phys. 63, e202200152. https://doi.org/10.1002/ctpp.202200152 (2023).
Peherstorfer, B., Willcox, K. & Gunzburger, M. Survey of multifidelity methods in uncertainty propagation, inference, and optimization. SIAM Rev. 60, 550–591. https://doi.org/10.1137/16M1082469 (2018).
Brevault, L., Balesdent, M. & Hebbal, A. Overview of gaussian process based multi-fidelity techniques with variable relationship between fidelities, application to aerospace systems. Aerosp. Sci. Technol. 107, 106339. https://doi.org/10.1016/j.ast.2020.106339 (2020).
Perdikaris, P., Raissi, M., Damianou, A., Lawrence, N. D. & Karniadakis, G. E. Nonlinear information fusion algorithms for data-efficient multi-fidelity modelling. Proc. R. Soc. A: Math. Phys. Eng. Sci. 473, 20160751. https://doi.org/10.1098/rspa.2016.0751 (2017).
Alvarez, M. A., Rosasco, L. & Lawrence, N. D. Kernels for vector-valued functions: A review. Found. Trends Mach. Learn. 4, 195–266. https://doi.org/10.1561/2200000036 (2012).
Kennedy, M. C. & O’Hagan, A. Predicting the output from a complex computer code when fast approximations are available. Biometrika 87, 1–13. https://doi.org/10.1093/biomet/87.1.1 (2000).
Cutajar, K., Pullin, M., Damianou, A., Lawrence, N. & González, J. Deep Gaussian processes for multi-fidelity modeling. arXiv preprint https://doi.org/10.48550/arXiv.1903.07320 (2019).
Rasmussen, C. E. & Williams, C. K. I. Gaussian Processes for Machine Learning. Adaptive computation and machine learning (MIT Press, 2006).
Chilenski, M. A. et al. Improved profile fitting and quantification of uncertainty in experimental measurements of impurity transport coefficients using Gaussian process regression. Nucl. Fus. 55, 023012. https://doi.org/10.1088/0029-5515/55/2/023012 (2015).
Rodriguez-Fernandez, P., Howard, N. T. & Candy, J. Nonlinear gyrokinetic predictions of SPARC burning plasma profiles enabled by surrogate modeling. Nucl. Fus. 62, 076036. https://doi.org/10.1088/1741-4326/ac64b2 (2022).
Ravi, K. et al. Multi-fidelity Gaussian process surrogate modeling for regression problems in physics. Mach. Learn. Sci. Technol. 5, 045015. https://doi.org/10.1088/2632-2153/ad7ad5 (2024).
GPy. GPy: A gaussian process framework in python. http://github.com/SheffieldML/GPy (since 2012).
Watanabe, T.-H. & Sugama, H. Velocity-space structures of distribution function in toroidal ion temperature gradient turbulence. Nucl. Fus. 46, 24. https://doi.org/10.1088/0029-5515/46/1/003 (2005).
The ITER 1D Modelling Working Group et al. The international multi-tokamak profile database. Nucl. Fus. 40, 1955. https://doi.org/10.1088/0029-5515/40/12/302 (2000).
Miller, R. L. et al. Noncircular, finite aspect ratio, local equilibrium model. Phys. Plasmas 5, 973–978. https://doi.org/10.1063/1.872666 (1998).
Toda, S. et al. Modeling of turbulent particle and heat transport in helical plasmas based on gyrokinetic analysis. Phys. Plasmas 26, 012510. https://doi.org/10.1063/1.5058720 (2019).
Tanaka, K. et al. Turbulence response in the high Ti discharge of the LHD. Plasma Fus. Res. 5, S2053. https://doi.org/10.1585/pfr.5.S2053 (2010).
Wilson, A. G. & Adams, R. P. Gaussian process kernels for pattern discovery and extrapolation. Proc. 30th Int. Conf. Mach. Learn. 28, 1067–1075. https://doi.org/10.48550/arXiv.1302.4245 (2013).
Sun, S. et al. Differentiable compositional kernel learning for Gaussian processes. Proc. 35th Int. Conf. Mach. Learn. 80, 4828–4837. https://doi.org/10.48550/arXiv.1806.04326 (2018).
Acknowledgements
This work was partially supported by JST, PRESTO Grant Number JPMJPR23OB, National Institute for Fusion Science (NIFS) Collaboration Research program (NIFS22KIST023, NIFS23KIST041) in Japan. This work used the computational resources of the Fugaku supercomputer at RIKEN R-CCS through the HPCI System Research Project (Project ID: hp240166), the JFRS-1 at Computational Simulation Centre of International Fusion Energy Research Centre (IFERC-CSC), and the plasma simulator at NIFS. The authors would like to thank Enago (www.enago.jp) for the English language review.
Author information
Contributions
S.M. and M.H. contributed to conceptualization and methodology. S.M. conducted numerical experiments and analyzed the results. E.N. provided the JET dataset. S.T. provided the LHD dataset. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Maeyama, S., Honda, M., Narita, E. et al. Multi-fidelity information fusion for turbulent transport modeling in magnetic fusion plasma. Sci Rep 14, 28242 (2024). https://doi.org/10.1038/s41598-024-78394-3
DOI: https://doi.org/10.1038/s41598-024-78394-3








