Abstract
Learning operators with deep neural networks is an emerging paradigm for scientific computing. The Deep Operator Network (DeepONet) is a modular operator learning framework that allows flexibility in choosing the kind of neural network used in its trunk and/or branch. This is beneficial because different types of problems have repeatedly been shown to require different network architectures for effective learning. In this work, we design an efficient neural operator based on the DeepONet architecture. We introduce the UNet enhanced DeepONet (UDeepONet) for learning the solution operator of highly complex CO_{2}-water two-phase flow in heterogeneous porous media. The UDeepONet is more accurate in predicting gas saturation and pressure buildup than the state-of-the-art UNet based Fourier Neural Operator (UFNO) and the Fourier-enhanced Multiple-Input Operator (FourierMIONet) trained on the same dataset. Moreover, our UDeepONet trains significantly faster than both the UFNO (more than 18 times faster) and the FourierMIONet (more than 5 times faster), while consuming fewer computational resources. We also show that the UDeepONet is more data efficient and generalizes better than both the UFNO and the FourierMIONet.
Introduction
Geological CO_{2} storage (GCS) stands out as a promising solution to address the accumulation of anthropogenic carbon dioxide in our atmosphere^{1,2}. This method involves the direct injection of CO_{2} into suitable deep underground geological formations. Prime reservoirs for storage include deep saline aquifers^{3}, depleted oil and gas fields, and unmineable coal seams, each selected based on their capacity, injectivity, and long-term retention attributes. As time progresses, the stored CO_{2} undergoes a sequence of trapping mechanisms, ranging from structural and residual trapping to solubility and eventual mineral trapping, thus enhancing the security of the storage^{4}. To ensure the integrity and stability of these storage sites, rigorous monitoring and verification protocols are imperative.
Rigorous monitoring and verification protocols play an indispensable role in the efficacy and security of geological CO_{2} storage operations^{5,6}. While direct observation methods, such as seismic surveys^{7} and wellbore monitoring^{8}, provide valuable real-time data on CO_{2} plume dynamics and caprock integrity, simulation techniques offer a forward-looking approach to understanding subsurface behavior^{9}. Advanced numerical simulations, grounded in reservoir engineering, enable the modeling of CO_{2} migration, dissolution, and mineralization over time^{10,11,12}. These simulations, which integrate reservoir characteristics, injection rates, local geology, and many other variables, aid in predicting potential leakage paths, pressure buildups, and interactions between CO_{2} and formation water. By coupling real-world monitoring data with simulation results, stakeholders can achieve a more holistic understanding of the reservoir storage performance^{13}. Such a synergistic approach not only ensures the long-term stability of stored CO_{2} but also helps in promptly addressing unforeseen challenges.
Numerical simulation of geological CO_{2} storage is a resource-intensive task, demanding both robust algorithms and powerful hardware capabilities. A primary challenge arises from geological uncertainty^{14,15,16,17,18}. Given that the subsurface cannot be observed directly in detail, our understanding of it is based on sparse observations, which lead to significant uncertainties in reservoir properties like permeability, porosity, residual saturations, etc. To account for these uncertainties, multiple realizations of the geological model are generated, each representing a possible scenario of the subsurface^{19}. However, every realization entails solving coupled partial differential equations that govern fluid flow and transport, and geochemical reactions in porous media. As the number of realizations increases to adequately sample the uncertainty space, the computational expense escalates, often exponentially^{19}. High-resolution models, necessary for capturing fine-scale geological features and processes, further compound the computational demands. Yet, simulating the CO_{2} injection and migration process for each of these realizations is essential to assess the range of potential outcomes and risks.
Over the past five decades, numerical reservoir simulation has advanced significantly in sophistication. While these simulations are essential for evaluating flow patterns, their increased complexity demands substantial computational resources. High-performance computing can manage these demands^{20,21}, but integrating new subsurface data into models requires continuous updates and recalibration, leading to extended simulation times. This iterative process can challenge timely decision-making, particularly when swift model updates are crucial for operational or safety adjustments.
In addressing these challenges, researchers are exploring various strategies, such as machine-learning-based surrogates and reduced order models^{22,23,24,25,26,27,28,29,30,31}, to achieve a balance between computational feasibility and simulation fidelity. Nonetheless, as the industry advances towards more extensive and deeper CO_{2} storage projects, ongoing efforts to optimize and innovate in the realm of computational methods remain paramount. Surrogate models have gained popularity as a way to reduce the computational burden associated with every realization. In essence, surrogate models need to capture as much of the physics of the problem as possible while maintaining high computational efficiency. The difficulty is that physical fidelity and efficiency are usually achieved at the expense of each other. Neural operator learning algorithms have the potential to solve this problem.
In contrast to traditional deep learning models like Convolutional Neural Networks (CNNs) and Graph Neural Networks (GNNs), which are primarily designed for mappings between finite-dimensional vectors, neural operator learning models such as FNO and DeepONet focus on learning mappings between infinite-dimensional function spaces. This allows operator learning models to efficiently handle complex, variable-dimensional problems, such as those arising in geologic carbon sequestration, where generalization to new and unseen input functions is crucial. In operator learning, we aim to learn (approximate) the nonlinear operator \({\mathcal{G}}_{\theta }: \mathcal{U}\to \mathcal{S}\) subject to some loss function, such that
$$\begin{array}{c}{\mathcal{G}}_{\theta }\left(u\right)\approx \mathcal{G}\left(u\right),\end{array}$$(1)
where \(u\in \mathcal{U}\) denotes a function in the input function space, and \(\mathcal{G}\left(u\right)\in \mathcal{S}\) denotes the corresponding unknown solution in the output space. Both \(\mathcal{S}\) and \(\mathcal{U}\) are separable Banach spaces of functions that take values in \({\mathbb{R}}^{{d}_{s}}\) and \({\mathbb{R}}^{{d}_{u}}\), respectively. For example, \(u\) could be a function and \(s={\mathcal{G}}_{\theta }(u)\) its derivative. In this paper, \(u\) represents the field and scalar variables, and \(s\) represents the spatiotemporal gas saturation and pressure buildup solutions; together they form the input–output pairs required to train the operator \({\mathcal{G}}_{\theta }\). The input \(u\) is evaluated at \(x\), which is a collection of fixed points. A key advantage of operator learning is that, once a model is trained, it can generalize to new input functions without retraining, and at inference a trained operator is orders of magnitude faster than a numerical solver. Another key advantage of operator learning is that it can be trained using simulation data, experimental (real/noisy) data, or both. It can also be informed by the underlying partial differential equation (PDE), if it is known, which helps it generalize even better.
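As a minimal illustration of the derivative example above, the following sketch builds input–output pairs \((u, s)\) for the operator that maps a function to its derivative, with every input function evaluated at the same fixed points \(x\). The function family \(u(x)=\sin(ax)\), the sensor grid, and all names are assumptions chosen for illustration; they are not part of the dataset used in this work.

```python
import math

def make_pair(a, sensors):
    """Return (u, s) sampled at fixed sensor points for u(x) = sin(a*x).

    s is the exact derivative a*cos(a*x), i.e. the target of the
    derivative operator G: u -> du/dx.
    """
    u = [math.sin(a * x) for x in sensors]
    s = [a * math.cos(a * x) for x in sensors]
    return u, s

# Fixed evaluation points shared by every input function (assumed grid).
sensors = [i / 10 for i in range(11)]  # 11 points on [0, 1]

# One pair per value of the parameter a; an operator network would be
# trained on such pairs and then queried with an unseen value of a.
dataset = [make_pair(a, sensors) for a in (1.0, 2.0, 3.0)]
```

Each element of `dataset` is one training sample; the network never sees the formula, only the sampled values.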
The UNet Fourier Neural Operator (UFNO) was recently proposed as an algorithm for operator learning in modeling CO_{2} geological storage^{32}. The UFNO is an extension of the FNO^{33} with an additional UNet inserted in all or some of the Fourier layers, as shown in Fig. 1. The key feature of the FNO is that it formulates the operator learning task by parameterizing the integral kernel directly in Fourier space. This means that the parameters of the network are defined and learned in Fourier space, and the coefficients of the Fourier series representation of the output function are inferred from the dataset. The key component of the FNO architecture is the Fourier layer, which leverages the Fourier transform to capture global patterns by learning in the frequency domain, enabling efficient modeling of complex spatiotemporal relationships. Unfortunately, the Fourier layer is computationally expensive due to its large number of trainable parameters, as there are two weight matrices in every Fourier layer. In addition, the size of the weight tensor scales with the product of the retained Fourier modes in each dimension, so it grows exponentially with the number of dimensions; i.e., Fourier-based neural operators suffer from the curse of dimensionality. These issues are exacerbated in a UFourier layer, where the number of trainable weight matrices increases to three per layer. Moreover, the UNet block in the UFourier layer can be very expensive to compute for 3D input data because it requires 3D convolutions. These sources of inefficiency contribute to the long training times reported for the UFNO architecture. The effectiveness of the UNet stems from its ability to process data on structured grids through local convolution, which enriches the representation power of the architecture at higher frequencies and leads to better accuracy than the FNO and the convolutional FNO (convFNO)^{32}.
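To make the scaling argument concrete, the sketch below counts the complex entries of a single spectral weight tensor as a function of the channel width and the number of retained modes per dimension. The width of 36 and the 10 modes per dimension are illustrative assumptions, and practical FNO implementations typically hold several such tensors per layer, so the numbers indicate relative growth only.

```python
from math import prod

def spectral_weight_params(channels_in, channels_out, modes):
    """Complex entries in one spectral weight tensor.

    `modes` lists the retained Fourier modes per dimension, so the count
    is channels_in * channels_out * prod(modes).
    """
    return channels_in * channels_out * prod(modes)

# Illustrative sizes (assumed, not taken from the paper's tables).
width = 36
params_3d = spectral_weight_params(width, width, (10, 10, 10))
params_2d = spectral_weight_params(width, width, (10, 10))

# Adding one dimension multiplies the count by that dimension's modes.
ratio = params_3d // params_2d
```

Under these assumptions, dropping one transformed dimension shrinks each spectral weight tensor by a factor equal to the number of modes retained along it.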
In our opinion, the proposed UFNO has three main drawbacks: (1) An extremely large number of trainable parameters (more than 30 million in this case), (2) Difficulty in scaling to higher dimensions, and (3) Loss of the resolution invariance of FNO in time, i.e., it cannot make predictions at unseen time steps in the interpolation task.
In a recent work, Jiang et al.^{34} proposed combining the UFNO with the multiple-input deep neural operator (MIONet)^{35}, and termed the resulting architecture the Fourier-enhanced multiple-input neural operator (FourierMIONet). The FourierMIONet is a more computationally efficient and data-efficient alternative to the UFNO alone, albeit with some loss of accuracy. The architecture uses a modified MIONet to process the inputs and then passes the output to a UFNO, creating a hybrid architecture (Fig. 2). This is achieved in two steps. First, the time variable \(t\) is separated from the other input variables and processed through the trunk network. In doing so, 2D convolutions can be used in the UNet block instead of the 3D convolutions of the original UFNO. In addition, one dimension of Fourier modes is removed, which results in significant efficiency improvements: the number of trainable parameters drops from 30+ million to 3+ million with a small loss of accuracy. Second, two branch networks are employed to process the two different kinds of inputs, field and scalar.
Compared to UFNO, the FourierMIONet reduces training time per epoch from 1535 s to 730 s on an NVIDIA GeForce RTX 3090 GPU. Additionally, it regains the capacity to generalize over unseen time steps, though these advantages come at a slight cost to accuracy. The overall training duration was reduced from approximately 42–43 h for the UFNO to a more manageable 16–17 h for the FourierMIONet on the same GPU. Yet we believe this improvement remains suboptimal.
In this article, we introduce a novel UNet Enhanced Deep Operator Network (UDeepONet) shown in Fig. 3. By leveraging the inherent modularity of the DeepONet, we incorporate a UNet within its branch structure. This new approach presents substantial improvements in computational efficiency compared to both UFNO and FourierMIONet. Furthermore, it retains the accuracy standards set by the UFNO, while also inheriting other merits of the FourierMIONet.
Methods
UDeepONet architecture
Inspired by the DeepONet and by the ideas implemented in the FourierMIONet and UFNO architectures, we propose a novel architecture based on the DeepONet and the UNet, which we call the UDeepONet (Fig. 3). The implementation details of the UDeepONet are outlined in Table 1, and those of the UNet block in Table 2. We posit that the Fourier layer and the MIONet are unnecessary and computationally expensive. We build on the idea proposed for the FourierMIONet and process the time variable \(t\) through the trunk of the DeepONet to leverage the computational benefits of 2D convolutions in lieu of 3D convolutions. In addition, we deploy multiple UNet blocks in the branch of the DeepONet to process both field and scalar inputs. This setup allows for multiple-input operator learning without the MIONet architecture. The theoretical analysis is left for future work.
In the UDeepONet, the trunk network \({t}_{k}\left(\mathcal{t}\right)\) is a simple feedforward neural network (FNN) with a nonlinear activation function \(\sigma \) and \(\mathcal{t}\) is the time variable. The branch of the UDeepONet consists of several UNet blocks connected in series:

1. Lift the input observations \(\mathcal{v}\left(x\right)\) to a higher dimensional representation \({b}_{0}\left(x\right)\):
$$\begin{array}{c}{b}_{0}\left(x\right)=P\left(\mathcal{v}\left(x\right)\right)\in {\mathbb{R}}^{{d}_{\mathcal{z}}},\end{array}$$(2)
where \(P\) is a linear layer, and it is a local transformation \(P:{\mathbb{R}}\to {\mathbb{R}}^{{d}_{p}}\); the result of this transformation \(({b}_{0})\) can be viewed as an image with \({d}_{p}\) channels. The purpose of this mapping is to increase the number of channels so that a bigger weight tensor can be constructed in the UNet blocks. In addition, it allows the UDeepONet to process multiple input operators simultaneously.

2. Apply \(L\) UNets to \({b}_{0}\). The output of the last (\(L\)th) UNet block is \({U}_{L}\) with \({d}_{\mathcal{z}}\) channels.
$$\begin{array}{c}{U}_{{l}_{j+1}}=\sigma \left({U}_{{l}_{j}}\right),\end{array}$$(3)
where \(\sigma \) is a nonlinear activation function. In our experiments we did not observe significant performance changes with different types of activation functions; nonetheless, the choice of activation functions is generally problem specific.

3. The operator \({\mathcal{G}}_{\theta }\) is defined as the inner product of the branch and trunk networks:
$$\begin{array}{c}{\mathcal{G}}_{\theta }\left(v\right)\left(t\right)={U}_{L}\left(v\right)\cdot {t}_{k}\left(\mathcal{t}\right).\end{array}$$(4)
4. The output of the product of the trunk and the branch \({\mathcal{G}}_{\theta }\left(v\right)\left(\mathcal{t}\right)\) is projected to the solution space via a linear transformation parametrized by a linear layer or a shallow fully connected neural network layer \(G:{\mathbb{R}}^{104\times 208\times 24\times {d}_{z}}\to {\mathbb{R}}^{104\times 208\times 24\times 1}\):
$$\begin{array}{c}u\left(x,y,t\right)=G\left({\mathcal{G}}_{\theta }\left(v\right)\left(\mathcal{t}\right)\right).\end{array}$$(5)
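The four steps above can be traced with a small shape-checking sketch in NumPy. It is not the authors' implementation: the UNet blocks are replaced by a pointwise linear placeholder, all weights are random and untrained, and the grid and channel sizes are reduced stand-ins. The branch and trunk outputs are combined channel by channel through broadcasting, which is an assumed reading of step 3 chosen because the projection \(G\) in step 4 still receives \({d}_{z}\) channels.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, n_t = 8, 16, 5   # small stand-ins for the 104 x 208 x 24 grid
c_in, d_z = 4, 6       # input channels and lifted width (assumed)

def linear(x, w, b):
    """Apply a linear layer along the last axis."""
    return x @ w + b

def relu(x):
    return np.maximum(x, 0.0)

def placeholder_unet(x, w, b):
    """Stand-in for a UNet block: pointwise linear map, shape preserved."""
    return linear(x, w, b)

# Randomly initialised weights (untrained; for shape checking only).
P_w, P_b = rng.normal(size=(c_in, d_z)), np.zeros(d_z)
U_w = [rng.normal(size=(d_z, d_z)) for _ in range(3)]
U_b = [np.zeros(d_z) for _ in range(3)]
T_w1, T_b1 = rng.normal(size=(1, d_z)), np.zeros(d_z)
T_w2, T_b2 = rng.normal(size=(d_z, d_z)), np.zeros(d_z)
G_w, G_b = rng.normal(size=(d_z, 1)), np.zeros(1)

def forward(v, t):
    b = linear(v, P_w, P_b)                      # step 1: lift -> (H, W, d_z)
    for w, bias in zip(U_w, U_b):                # step 2: three blocks in series
        b = relu(placeholder_unet(b, w, bias))
    trunk = linear(relu(linear(t, T_w1, T_b1)), T_w2, T_b2)  # (n_t, d_z)
    # step 3: combine branch and trunk channel by channel via broadcasting
    combined = b[:, :, None, :] * trunk[None, None, :, :]    # (H, W, n_t, d_z)
    return linear(combined, G_w, G_b)            # step 4: project -> (H, W, n_t, 1)

v = rng.normal(size=(H, W, c_in))
t = np.linspace(0.0, 1.0, n_t).reshape(n_t, 1)
out = forward(v, t)
```

Because time enters only through the trunk, the branch operates on 2D arrays throughout, which is the source of the efficiency gain discussed in the text.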
A close examination of our novel UDeepONet reveals that not only does our architecture no longer use 3D convolutions in the UNet block, it also does not use the Fourier layer, its associated weight tensors, or the FFT. This contributes to significant gains in computational efficiency compared to the UFNO and the FourierMIONet. The basic idea is that the DeepONet on its own is an effective operator learning framework, and combining it with a Fourier-based architecture is redundant, as both are neural operator learning networks.
Training and hyperparameters
In the interest of consistency, we follow the approach presented in Wen et al.^{32} by constructing a mask to account for the varying thicknesses of the realizations in the dataset. The loss is computed only within that mask during training for each realization, while the cells lying outside each reservoir are padded with zeros. Hence, the \({l}_{2}\)-loss is given by:
$$\begin{array}{c}L\left(y,\widehat{y}\right)=\frac{{\Vert y-\widehat{y}\Vert }_{2}}{{\Vert y\Vert }_{2}}+\beta \frac{{\Vert dy/dr-d\widehat{y}/dr\Vert }_{2}}{{\Vert dy/dr\Vert }_{2}},\end{array}$$
where \(y\) and \(dy/dr\) are the ground truth and the first derivative of the ground truth, respectively, \(\widehat{y}\) and \(d\widehat{y}/dr\) are the predicted output and the first derivative of the predicted output, respectively, and \(\beta =0.5\) is a hyperparameter. The first derivative term has been shown to greatly improve the prediction accuracy and to lead to faster convergence^{32}.
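A minimal NumPy sketch of such a masked loss is given below. It assumes relative \({l}_{2}\) norms (as in the UFNO reference), a finite-difference approximation of the derivative along the radial axis, and a small constant `eps` to avoid division by zero; the function name and these choices are illustrative and may differ from the authors' implementation.

```python
import numpy as np

def masked_relative_l2(y_true, y_pred, mask, beta=0.5, eps=1e-12):
    """Relative l2 data term plus beta-weighted relative l2 derivative term.

    Arrays have shape (n_z, n_r); the last axis is taken as the radial
    direction r. Cells with mask == 0 (zero padding) do not contribute.
    """
    yt, yp = y_true * mask, y_pred * mask
    # First derivative along r by finite differences (assumed discretisation);
    # a difference counts only if both neighbouring cells are inside the mask.
    dmask = mask[:, 1:] * mask[:, :-1]
    dt = np.diff(yt, axis=-1) * dmask
    dp = np.diff(yp, axis=-1) * dmask
    data_term = np.linalg.norm(yt - yp) / (np.linalg.norm(yt) + eps)
    grad_term = np.linalg.norm(dt - dp) / (np.linalg.norm(dt) + eps)
    return data_term + beta * grad_term
```

Errors in the padded cells leave the loss unchanged, so the network is not penalized for its output outside the reservoir.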
We train two separate UDeepONet networks for saturation and pressure buildup. The trunk in both models consists of a feedforward neural network with 10 layers, while the branch consists of three UNets connected in series. Moreover, we use two different learning schedules for saturation and pressure buildup, and the training is stopped once the loss stops decreasing. To allow for fair comparisons with the UFNO and FourierMIONet, we avoid changing hyperparameters where possible. Therefore, we maintain the same UNet structure, number of epochs, and batch size. It is worth noting that although the FNO architecture does not require spatial data as input, both UFNO and FourierMIONet use it as an additional input feature. This has been shown to improve accuracy, which is also consistent with other reports in the literature^{36}. In this work, we also use the spatial data as an additional input feature in the branch for consistency. The initial learning rates for the gas saturation and pressure buildup models are \(0.0007\) and \(0.0006\), respectively. The learning rate follows a ‘staircase’ reduction schedule.
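A staircase schedule keeps the learning rate constant for a fixed number of epochs and then multiplies it by a decay factor. The sketch below uses the initial rates quoted above; the decay factor of 0.9 and the step of 2 epochs are placeholder values, since the text does not state them.

```python
def staircase_lr(epoch, initial_lr, decay=0.9, step=2):
    """Piecewise-constant rate: multiply by `decay` every `step` epochs."""
    return initial_lr * decay ** (epoch // step)

# Initial rates from the text; decay and step are placeholder values.
lr_saturation = [staircase_lr(e, 0.0007) for e in range(6)]
lr_pressure = [staircase_lr(e, 0.0006) for e in range(6)]
```

In PyTorch, the same behaviour is available through `torch.optim.lr_scheduler.StepLR`, with `step_size` and `gamma` playing the roles of `step` and `decay`.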
Performance evaluation
We benchmark the performance of the trained operator networks for gas saturation using the plume area coefficient of determination, \({R}_{plume}^{2}\), the plume mean absolute error (MPE), and the mean absolute error (MAE). The plume area is defined as the set of cells with nonzero values in either the ground truth or the prediction. This approach is used only for the gas saturation model because the gas saturation outside the CO_{2} plume is always zero. For the evaluation of the pressure buildup models, we use the field mean relative error (MRE), the \({R}^{2}\) score, and the MAE. The field MRE for pressure buildup is defined as follows^{37}:
$$\begin{array}{c}MRE=\frac{1}{{n}_{e}{n}_{b}{n}_{t}}\sum_{i=1}^{{n}_{e}}\sum_{j=1}^{{n}_{b}}\sum_{t=1}^{{n}_{t}}\frac{\left|{\widehat{p}}_{i,j}^{t}-{p}_{i,j}^{t}\right|}{{p}_{i,max}^{t}-{p}_{i,min}^{t}},\end{array}$$(6)
where \({n}_{t}=24\) is the total number of time steps, \({n}_{e}=500\) is the total number of test samples, \({n}_{b}=96\times 200=\text{19,200}\) is the number of grid blocks, and \({\widehat{p}}_{i,j}^{t}\) and \({p}_{i,j}^{t}\) denote the pressure values provided by the model and the ground truth, respectively, for test sample \(i\), in grid block \(j\), at time step \(t\). The difference between the maximum grid-block pressure \({p}_{i,max}^{t}\) and the minimum grid-block pressure \({p}_{i, min}^{t}\), for sample \(i\) at time step \(t\), is used to normalize the absolute pressure error.
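The two error measures described above can be sketched in NumPy as follows. The array layouts and function names are assumptions made for illustration; the MRE normalizes each absolute error by the pressure range of the same sample and time step, and the plume error averages only over cells where the truth or the prediction is nonzero.

```python
import numpy as np

def field_mre(p_true, p_pred):
    """Field mean relative error for arrays of shape (n_e, n_b, n_t).

    Each absolute error is divided by the pressure range (max - min over
    grid blocks) of the same sample and time step, then averaged over
    samples, grid blocks, and time steps.
    """
    p_range = (p_true.max(axis=1, keepdims=True)
               - p_true.min(axis=1, keepdims=True))
    return float(np.mean(np.abs(p_pred - p_true) / p_range))

def plume_mae(s_true, s_pred):
    """Mean absolute error over cells where truth or prediction is nonzero."""
    plume = (s_true != 0) | (s_pred != 0)
    return float(np.mean(np.abs(s_pred - s_true)[plume]))
```

Note that `field_mre` assumes a nonzero pressure range for every sample and time step, and `plume_mae` assumes the plume is not empty.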
Results
Here, we benchmark our UDeepONet against UFNO and FourierMIONet under three main criteria: accuracy, training efficiency, and inference time. Note that the UFNO was already benchmarked against the vanilla FNO and the convFNO, which uses a CNN in place of the UNet, and was shown to outperform both^{32}. We train the UDeepONet on an NVIDIA Tesla V100 GPU and implement it using the open-source machine learning framework PyTorch^{38}. Details of the UDeepONet implementation are outlined in Table 1 in the “Methods” section. The main difference between the saturation and the pressure buildup architectures is in the number of channels in the UNet block. The code accompanying this manuscript is available upon request and the dataset^{32} is publicly available.
Dataset
The open-source dataset^{32}, which contains 5,500 realizations of CO_{2} geological storage in a radially symmetric reservoir, was generated using the numerical reservoir simulator ECLIPSE (e300)^{39}. The purpose of these realizations is to track the movement of the CO_{2} plume and pressure buildup over time. The numerical model is made up of 200 gradually coarsened grid cells in the radial direction with a 100,000 m radius (2D slice). The simulations are run with an adaptive implicit method and reported at 24 successive time snapshots \(\left\{1\; day, 2 \;days, 4 \;days, \dots , 14.8 \;years, 21.1 \;years, 30 \;years\right\}\). Supercritical CO_{2} is injected at a constant rate (constant per realization but varying from one realization to another) ranging from 0.2 to 2 Mt/year for a period of 30 years. The radius of the injection well is 0.1 m. The realizations account for a variable perforation thickness, up to the entire thickness of the reservoir. The outer boundaries of the reservoir are closed (no-flow boundaries). The reservoir thickness (\(b\)) is randomly sampled between 12.5 and 200 m, with a vertical grid thickness of 2.08 m; this means that the number of vertical grid cells varies between 6 and 96. If the number of vertical grid cells in a specific case is less than 96, the difference is accounted for via zero padding (mask). Consequently, all realizations have a consistent resolution of (96, 200).
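The relation between reservoir thickness, number of active vertical cells, and the padding mask can be sketched as follows. Rounding down to whole cells and placing the active rows first are assumptions made for illustration; the dataset itself ships with its own masks.

```python
import numpy as np

CELL_HEIGHT_M = 2.08   # vertical grid thickness from the text
N_Z, N_R = 96, 200     # padded resolution (vertical, radial)

def active_layers(thickness_m):
    """Number of vertical cells for a reservoir thickness (floor is assumed)."""
    return min(N_Z, int(thickness_m / CELL_HEIGHT_M))

def reservoir_mask(thickness_m):
    """1 inside the reservoir, 0 in the zero-padded rows."""
    mask = np.zeros((N_Z, N_R))
    mask[: active_layers(thickness_m), :] = 1.0
    return mask
```

With these assumptions, the thinnest reservoir (12.5 m) occupies 6 rows and the thickest (200 m) occupies all 96, matching the range stated above.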
The dataset contains 5500 input-to-output mappings for saturation and another 5500 input-to-output mappings for pressure buildup; each dataset is split as 9:1:1, where 4500 realizations are used for training, 500 realizations for validation, and 500 realizations for testing. The outputs are the state variables: gas saturation (\({S}_{g}\)) and pressure buildup (\(dP\)); the inputs consist of nine variables: four field variables (space-dependent inputs) and five scalar variables. In theory, the learned operator should be able to generalize over these nine inputs; in other words, given a new combination of these nine variables that does not exist in the training dataset, the learned operator should accurately predict the state variables. The field variables include a horizontal permeability map (\({k}_{x}\)), a vertical permeability map (\({k}_{y}\)), a porosity map (\(\phi \)), and an injection perforation map (\(perf\)). The scalar variables include the initial reservoir pressure at the top of the reservoir (\({P}_{init}\)), injection rate (\(Q\)), reservoir temperature (\(T\)), capillary pressure scaling factor (\(\lambda \)), and irreducible water saturation (\({S}_{wi}\)). Table 3 summarizes the distributions and parameter ranges used to generate the nine variables. We direct the readers to the original paper^{32} for more details on the generation of the field maps and all other sampling techniques for the inputs. An example of an input-to-output mapping is shown in Fig. 4.
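For readers reproducing the 4500/500/500 partition from a single pool of realizations, a seeded shuffle is one simple option. The helper below is hypothetical; the published dataset may already be distributed with a fixed split, in which case that split should be used instead.

```python
import random

def split_indices(n_total=5500, n_train=4500, n_val=500, n_test=500, seed=0):
    """Shuffle realization indices and cut them into train/val/test lists."""
    assert n_train + n_val + n_test == n_total
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices()
```

Fixing the seed keeps the test realizations disjoint from the training set across repeated runs.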
Gas saturation
To train the UDeepONet, we use a batch size of 4 to remain consistent with the UFNO^{32} and FourierMIONet^{34} performance results. However, we remark that the UDeepONet makes larger batch sizes possible thanks to its light memory footprint: it requires only 4.6 GiB, compared to 15.9 and 12.8 GiB for the UFNO and FourierMIONet, respectively; see Table 4a for performance comparisons. This also gives our UDeepONet an edge over the other models, as it is materially cheaper to train and deploy; many commercial GPUs have less than 8 GiB of memory.
Moreover, the UDeepONet test set prediction error is lower than that of both the UFNO and the FourierMIONet, with a mean plume error (MPE) of only 1.58%. The MPE is the mean absolute error computed over the plume area only. The test set results, which cover 500 realizations, represent the predictive ability of the model on truly unseen data. In Fig. 5, four testing examples for the UDeepONet at two snapshots in time are shown. Using the mean absolute error (MAE) as the benchmark, the UDeepONet is about twice as accurate as the FourierMIONet and about 20% more accurate than the UFNO. The UDeepONet also has a higher R^{2} score than the other two models.
The UDeepONet requires only about 108 s per epoch to train, i.e., it is about 18 times faster to train than the UFNO on the same GPU, and about 5 times faster than the FourierMIONet. We note that the reported performance of the FourierMIONet was obtained on an NVIDIA GeForce RTX 3090 GPU^{34}, which has about twice as many CUDA cores as the Tesla V100 used here. The UDeepONet is faster not only in training but also in testing. For any given testing case, the UDeepONet requires 0.016 s to predict the solution, while the UFNO and the FourierMIONet require 0.018 and 0.041 s, respectively. These results show that the proposed UDeepONet is more accurate and significantly more efficient in training and testing than both the UFNO and FourierMIONet.
Our UDeepONet is similar to the FourierMIONet in terms of flexibility in selecting the batch size and the number of time snapshots in the trunk input. This means that the accuracy and the computational efficiency of the UDeepONet can potentially be further improved.
Pressure buildup
As mentioned earlier, a separate UDeepONet is trained on the pressure buildup data. The architecture of this model is outlined in Table 1 in the “Methods” section. The performance evaluation metrics for pressure buildup are reported in Table 4b. The improvements in accuracy for the UDeepONet over the UFNO are not as apparent as in the gas saturation model, with the UDeepONet having an R^{2} score of 0.994 compared to 0.992 for the UFNO and only 0.986 for the FourierMIONet. The UDeepONet also performs better than the UFNO in terms of MAE (0.64 vs. 0.66), while the MAE of the FourierMIONet is not reported. In terms of the mean relative error (MRE), defined by Eq. (6), the UFNO performs slightly better than the UDeepONet (0.0068 vs. 0.0072). Later, we show that the MRE for the UDeepONet can be reduced further to 0.0069 if a batch size of 6 is used instead of 4.
It is not surprising that the UDeepONet improvements in accuracy for the pressure buildup model are not as pronounced as in the gas saturation case. This is because the UFNO utilizes the Fourier transform, which tends to perform well on smooth PDEs. This is particularly true in the case of the pressure equation, which is diffusive by nature. Nonetheless, just as in the gas saturation model, the computational efficiency improvements are substantial, with the UDeepONet being about 15 times faster to train than the UFNO and using about a third of the GPU memory. Furthermore, the UDeepONet is faster than the UFNO in inference. Figure 6 shows four examples of pressure buildup predictions. Our results for both saturation and pressure buildup models indicate the advantages of the UDeepONet over other models.
Data efficiency
The UDeepONet demonstrates high training efficiency and strong generalization capabilities, with clear improvements in data utilization efficiency compared to the UFNO. Data utilization efficiency is a crucial element as it reduces computational expenses and enhances applicability in data-scarce scenarios. To assess generalizability, we created nine subsets from the main training dataset, varying from 500 to 4500 realizations, and evaluated the model performance using MPE for the saturation dataset and MRE for the pressure buildup dataset. Figure 7 presents these results, including the impact of batch sizes 4 and 6. We ensured robustness by repeating the training with different seeds to confirm consistency.
For the gas saturation dataset, even when trained with just 3000 realizations (two-thirds of the full dataset) and a batch size of 6, the UDeepONet still surpasses the UFNO, achieving an MPE of 1.48%. This indicates that the UDeepONet requires significantly less data to achieve superior testing set performance. Furthermore, in scenarios with limited training data, the UDeepONet proves to be more reliable, maintaining a testing MPE of 3% when trained with only 500 realizations, about 10% of the full dataset. In contrast, the UFNO needs over double that amount of data to reach a similar level of performance. For the pressure buildup dataset, although the improvements in data utilization are less pronounced compared to the saturation case, the UDeepONet still outperforms the UFNO in small data regimes, ranging from 500 to 1500 realizations.
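The nine training subsets used in this study can be generated, for example, as prefixes of one shuffled ordering of the 4500 training realizations. Whether the original subsets were nested in this way is not stated, so the nesting, the seed, and the helper name below are assumptions.

```python
import random

def nested_subsets(train_idx, sizes, seed=0):
    """Return {size: indices}, each subset a prefix of one shuffled ordering."""
    order = list(train_idx)
    random.Random(seed).shuffle(order)
    return {n: order[:n] for n in sizes}

sizes = list(range(500, 4501, 500))         # 500, 1000, ..., 4500
subsets = nested_subsets(range(4500), sizes)
fractions = {n: n / 4500 for n in sizes}    # share of the full training set
```

Here `fractions[3000]` is two-thirds and `fractions[500]` is about 0.11, which correspond to the two cases discussed above.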
Inference at unseen time steps
One of the primary limitations of the UFNO architecture is its loss of grid invariance in time, stemming directly from the use of a UNet block to process all inputs, including the time variable. Unlike the UFNO, both the FourierMIONet and our UDeepONet circumvent this issue by processing the time variable through a separate network. In the UDeepONet, the time variable is passed to the trunk, which allows it to make predictions at unseen time steps within the temporal training horizon. The advantage of this is twofold: it allows for a further reduction in the size of the dataset required to train the neural operator, and it decreases the computational load of generating the dataset with a numerical solver. This reduction is achieved because fewer time steps need to be reported by the numerical scheme, without compromising the accuracy of the machine learning model. Such a feature is particularly valuable in applications where data collection is costly or challenging.
To demonstrate the UDeepONet’s capability to generalize to unseen time steps, we trained our models with only 13 time steps (~ 50% fewer time steps). Building on the consistent performance of the UDeepONet with a reduced dataset, as previously shown, we combine the reduction in time steps with a reduction in the overall size of the training dataset. Accordingly, we used 3500 realizations for the saturation model and 4000 for the pressure buildup model, while maintaining a batch size of 6 for both. To evaluate these models, we present the MPE in Fig. 8a and the MRE in Fig. 8b for the test datasets of the saturation and pressure buildup models, respectively, at each time step. Training the UDeepONet models with only ~ 50% of the time steps leads to a slight decrease in accuracy at the omitted time steps, as depicted by the empty squares in Fig. 8. On the other hand, the performance of the UDeepONet at time steps included in the training does not deteriorate (indicated by the solid squares in the figure). It is important to highlight that the first and last time steps should always be included in the training because data-driven neural operator learning methods usually cannot extrapolate in time.
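One way to pick 13 of the 24 snapshots while always keeping the first and last is to take every other snapshot and then add the final one. The text does not state which snapshots were actually kept, so the selection rule below is an assumption used only to illustrate the setup.

```python
N_STEPS = 24

# Keep every other snapshot and always keep the last one (assumed rule).
train_steps = sorted(set(range(0, N_STEPS, 2)) | {N_STEPS - 1})

# The remaining snapshots are unseen during training and used to test
# temporal interpolation.
held_out_steps = [k for k in range(N_STEPS) if k not in train_steps]
```

All held-out indices lie strictly between the first and last training snapshots, so the test is one of interpolation rather than extrapolation.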
From Fig. 8, it's evident that the UDeepONet's performance (gray curves with squares) using the hyperparameters discussed in the “Methods” section of the paper falls short at unseen time steps. To address this, we focused on the temporal interpolation task by refining the trunk network. The initial trunk network featured 10 layers and 64 neurons per layer for the saturation model and 96 neurons per layer for the pressure model. The updated design comprises 16 layers and 14 neurons per layer for the first 15 layers, while the last layer in each model maintained 64 and 96 neurons for the gas and pressure models, respectively, to enable the dot product with the branch.
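To give a sense of scale for the two trunk designs, the sketch below counts weights and biases of a fully connected network from its layer widths. It assumes a scalar time input and that every listed layer is a standard dense layer with bias, and it uses the saturation model's widths; these are assumptions, so the counts are indicative only.

```python
def fnn_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# Scalar time input (assumed), followed by the hidden/output widths.
initial_saturation_trunk = [1] + [64] * 10
updated_saturation_trunk = [1] + [14] * 15 + [64]

n_initial = fnn_parameter_count(initial_saturation_trunk)   # 37568
n_updated = fnn_parameter_count(updated_saturation_trunk)   # 3928
```

Under these assumptions the deeper, narrower trunk has roughly a tenth of the parameters of the initial one while keeping the 64-wide output needed for the product with the branch.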
Results of the updated design are shown in Fig. 8 (depicted by the orange curve with triangles). We can see from Fig. 8a and b that the UDeepONet regains its stellar performance at unseen time steps due to the finetuning of the trunk network.
Conclusions
This paper presents a novel UNet enhanced deep operator network, termed UDeepONet. In designing the UDeepONet, the best features from both the UFNO and the FourierMIONet were fused into an architecture that outperforms the other two architectures in training and testing performance for multiphase flow and transport problems in porous media. We evaluate the novel UDeepONet using the open-source CO_{2} sequestration dataset^{32} and compare its performance to that of the UFNO and the FourierMIONet. Results show that the UDeepONet is advantageous in predictive accuracy, training efficiency, and data utilization efficiency. Our UDeepONet is more than 18 times faster in training than the UFNO and more than 5 times faster than the FourierMIONet, while being more accurate than both models. We also show that the UDeepONet has a much smaller GPU memory footprint compared to the other operator learning algorithms and is faster in inference. The UDeepONet is data efficient and better at generalization, as it can be trained with less data while maintaining accuracy. Moreover, we show that the UDeepONet performance at unseen time steps is robust without any changes to the architecture. Overall, we show that the UDeepONet is a better framework for neural operator learning compared to other state-of-the-art frameworks, and that the UDeepONet is easier to work with due to the smaller number of hyperparameters. The UDeepONet architecture can be seen as an alternative to the multiple-input DeepONet (MIONet), as we have shown that it can learn multiple operators simultaneously. Finally, like most other neural operators, the UDeepONet does not perform well in temporal extrapolation tasks, especially for time steps far from the temporal training horizon. Our work on forward modeling with neural operators is ongoing, and the topic remains very much an open question.
Data availability
The raw and processed data required to reproduce these findings are available at the link below, courtesy of reference^{32} in the manuscript. https://drive.google.com/drive/folders/1fZQfMn_vsjKUXAfRV0q_gswtl8JEkVGo.
References
Bachu, S. CO_{2} storage in geological media: Role, means, status and barriers to deployment. Prog. Energy Combust. Sci. 34, 254–273. https://doi.org/10.1016/j.pecs.2007.10.001 (2008).
Benson, S. M. & Cole, D. R. CO_{2} sequestration in deep sedimentary formations. Elements 4, 325–331 (2008).
Pruess, K. & García, J. Multiphase flow dynamics during CO_{2} disposal into saline aquifers. Environ. Geol. 42, 282–295 (2002).
Saadatpoor, E., Bryant, S. L. & Sepehrnoori, K. New trapping mechanism in carbon sequestration. Transp. Porous Media 82, 3–17 (2010).
Lengler, U., De Lucia, M. & Kühn, M. The impact of heterogeneity on the distribution of CO_{2}: Numerical simulation of CO_{2} storage at Ketzin. Int. J. Greenhouse Gas Control 4, 1016–1025 (2010).
Strandli, C. W., Mehnert, E. & Benson, S. M. CO_{2} plume tracking and history matching using multilevel pressure monitoring at the Illinois Basin-Decatur Project. In Energy Procedia Vol. 63, 4473–4484 (Elsevier Ltd, 2014).
Yin, Z., Siahkoohi, A., Louboutin, M. & Herrmann, F. J. Learned coupled inversion for carbon sequestration monitoring and forecasting with Fourier neural operators. In SEG Technical Program Expanded Abstracts Vol. 2022-August, 467–472 (Society of Exploration Geophysicists, 2022).
Fawad, M. & Mondol, N. H. Monitoring geological storage of CO_{2}: A new approach. Sci. Rep. https://doi.org/10.1038/s41598-021-85346-8 (2021).
Ajayi, T., Gomes, J. S. & Bera, A. A review of CO_{2} storage in geological formations emphasizing modeling, monitoring and capacity estimation approaches. Pet. Sci. 16, 1028–1063. https://doi.org/10.1007/s12182-019-0340-8 (2019).
Zhao, M., Wang, Y., Gerritsma, M. & Hajibeygi, H. Efficient simulation of CO_{2} migration dynamics in deep saline aquifers using a multitask deep learning technique with consistency. Adv. Water Resour. 178, 104494 (2023).
Flemisch, B. et al. The FluidFlower validation benchmark study for the storage of CO_{2}. Transp. Porous Media https://doi.org/10.1007/s11242-023-01977-7 (2023).
Tariq, Z. et al. Data-driven machine learning modeling of mineral/CO_{2}/brine wettability prediction: implications for CO_{2} geo-storage. In SPE Middle East Oil and Gas Show and Conference, MEOS, Proceedings (Society of Petroleum Engineers (SPE), 2023). https://doi.org/10.2118/213346-MS.
Anyosa, S., Bunting, S., Eidsvik, J., Romdhane, A. & Bergmo, P. Assessing the value of seismic monitoring of CO_{2} storage using simulations and statistical analysis. Int. J. Greenhouse Gas Control 105, 103219 (2021).
Nordbotten, J. M. et al. Uncertainties in practical simulation of CO_{2} storage. Int. J. Greenhouse Gas Control 9, 234–242 (2012).
Jeong, H., Srinivasan, S. & Bryant, S. Uncertainty quantification of CO_{2} plume migration using static connectivity of geologic features. In Energy Procedia Vol. 37, 3771–3779 (Elsevier Ltd, 2013).
Gan, M. et al. Impact of reservoir parameters and wellbore permeability uncertainties on CO_{2} and brine leakage potential at the Shenhua CO_{2} Storage Site, China. Int. J. Greenhouse Gas Control 111, 103443 (2021).
Cao, C. et al. Parametric uncertainty analysis for CO_{2} sequestration based on distance correlation and support vector regression. J. Nat. Gas Sci. Eng. 77, 103237 (2020).
Xiao, C. et al. Deep-learning-generalized data-space inversion and uncertainty quantification framework for accelerating geological CO_{2} plume migration monitoring. Geoenergy Sci. Eng. 224, 211627 (2023).
Mahjour, S. K. & Faroughi, S. A. Selecting representative geological realizations to model subsurface CO_{2} storage under uncertainty. Int. J. Greenhouse Gas Control 127, 103920 (2023).
Zhang, K., Wu, Y.-S. & Pruess, K. User’s Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code. (2008).
Lichtner, P. et al. PFLOTRAN User Manual: A Massively Parallel Reactive Flow and Transport Model for Describing Surface and Subsurface Processes.
Wen, G. et al. Real-time high-resolution CO_{2} geological storage prediction using nested Fourier neural operators. Energy Environ. Sci. 16, 1732–1741 (2023).
Tariq, Z., Yan, B. & Sun, S. Predicting trapping indices in CO_{2} sequestration: A data-driven machine learning approach for coupled chemo-hydro-mechanical models in deep saline aquifers. In ARMA US Rock Mechanics/Geomechanics Symposium (2023). https://doi.org/10.56952/ARMA-2023-0757.
Ju, X. et al. Learning CO_{2} plume migration in faulted reservoirs with Graph Neural Networks. arXiv preprint arXiv:2306.09648 (2023).
Yan, B., Chen, B., Robert Harp, D., Jia, W. & Pawar, R. J. A robust deep learning workflow to predict multiphase flow behavior during geological CO_{2} sequestration injection and post-injection periods. J. Hydrol. 607, 127542 (2022).
Lyu, Y., Zhao, X., Gong, Z., Kang, X. & Yao, W. Multi-fidelity prediction of fluid flow based on transfer learning using Fourier neural operator. Phys. Fluids https://doi.org/10.1063/5.0155555 (2023).
Falola, Y., Misra, S. & Nunez, A. C. Rapid high-fidelity forecasting for geological carbon storage using neural operator and transfer learning. In Abu Dhabi International Petroleum Exhibition and Conference (SPE, 2023). https://doi.org/10.2118/216135-MS.
Stepien, M., Ferreira, C. A. S., Hosseinzadehsadati, S., Kadeethum, T. & Nick, H. M. Continuous conditional generative adversarial networks for data-driven modelling of geologic CO_{2} storage and plume evolution. Gas Sci. Eng. 115, 204982 (2023).
Tang, M., Ju, X. & Durlofsky, L. J. Deep-learning-based coupled flow-geomechanics surrogate model for CO_{2} sequestration. Int. J. Greenhouse Gas Control 118, 103692 (2022).
Cardoso, M. A., Durlofsky, L. J. & Sarma, P. Development and application of reduced-order modeling procedures for subsurface flow simulation. Int. J. Numer. Methods Eng. 77, 1322–1350 (2009).
Zhang, K. et al. Fourier neural operator for solving subsurface oil/water two-phase flow partial differential equation. SPE J. 27, 1815–1830 (2022).
Wen, G., Li, Z., Azizzadenesheli, K., Anandkumar, A. & Benson, S. M. UFNO: An enhanced Fourier neural operator-based deep-learning model for multiphase flow. Adv. Water Resour. 163, 104180 (2022).
Li, Z. et al. Fourier neural operator for parametric partial differential equations. (2021).
Jiang, Z. et al. FourierMIONet: Fourier-enhanced multiple-input neural operators for multiphase modeling of geological carbon sequestration. arXiv preprint arXiv:2303.04778v1 (2023).
Jin, P., Meng, S. & Lu, L. MIONet: Learning multiple-input operators via tensor product. SIAM J. Sci. Comput. 44, A3490–A3514. https://doi.org/10.1137/22M1477751 (2022).
Lu, L. et al. A comprehensive and fair comparison of two neural operators (with practical extensions) based on FAIR data. Comput. Methods Appl. Mech. Eng. 393, 1–35 (2022).
Tang, M., Liu, Y. & Durlofsky, L. J. A deep-learning-based surrogate model for data assimilation in dynamic subsurface flow problems. J. Comput. Phys. 413, 109456 (2020).
Paszke, A. et al. PyTorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 32 (2019).
Schlumberger. ECLIPSE Reservoir Simulation Software Reference Manual. (2014).
Acknowledgements
The authors wish to acknowledge Khalifa University's high-performance computing facilities for providing the computational resources.
Author information
Contributions
M.A.K. and W.D. conceptualized and constructed the neural operator network. W.D. wrote the software, generated the visualization, and conducted the interpretation; M.A.K. provided supervision. Both authors carried out the formal analysis and both have written and reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Diab, W., Al Kobaisi, M. UDeepONet: UNet enhanced deep operator network for geologic carbon sequestration. Sci Rep 14, 21298 (2024). https://doi.org/10.1038/s41598-024-72393-0
DOI: https://doi.org/10.1038/s41598-024-72393-0