Abstract
Learning operators with deep neural networks is an emerging paradigm for scientific computing. The Deep Operator Network (DeepONet) is a modular operator learning framework that allows flexibility in choosing the kind of neural network used in its trunk and/or branch. This flexibility is beneficial because different types of problems have repeatedly been shown to require different network architectures for effective learning. In this work, we design an efficient neural operator based on the DeepONet architecture. We introduce the U-Net enhanced DeepONet (U-DeepONet) for learning the solution operator of highly complex CO2-water two-phase flow in heterogeneous porous media. The U-DeepONet is more accurate in predicting gas saturation and pressure buildup than the state-of-the-art U-Net based Fourier Neural Operator (U-FNO) and the Fourier-enhanced Multiple-Input Operator (Fourier-MIONet) trained on the same dataset. Moreover, our U-DeepONet trains significantly faster than both the U-FNO (more than 18 times faster) and the Fourier-MIONet (more than 5 times faster), while consuming fewer computational resources. We also show that the U-DeepONet is more data efficient and generalizes better than both the U-FNO and the Fourier-MIONet.
Introduction
Geological CO2 storage (GCS) stands out as a promising solution to address the accumulation of anthropogenic carbon dioxide in our atmosphere1,2. This method involves the direct injection of CO2 into suitable deep underground geological formations. Prime reservoirs for storage include deep saline aquifers3, depleted oil and gas fields, and un-mineable coal seams, each selected based on their capacity, injectivity, and long-term retention attributes. As time progresses, the stored CO2 undergoes a sequence of trapping mechanisms, ranging from structural and residual trapping to solubility and eventual mineral trapping, thus enhancing the security of the storage4. To ensure the integrity and stability of these storage sites, rigorous monitoring and verification protocols are imperative.
Rigorous monitoring and verification protocols play an indispensable role in the efficacy and security of geological CO2 storage operations5,6. While direct observation methods, such as seismic surveys7 and wellbore monitoring8, provide valuable real-time data on CO2 plume dynamics and caprock integrity, simulation techniques offer a forward-looking approach to understanding subsurface behavior9. Advanced numerical simulations, grounded in reservoir engineering, enable the modeling of CO2 migration, dissolution, and mineralization over time10,11,12. These simulations, which integrate reservoir characteristics, injection rates, local geology, and many other variables aid in predicting potential leakage paths, pressure build-ups, and interactions between CO2 and formation water. By coupling real-world monitoring data with simulation results, stakeholders can achieve a more holistic understanding of the reservoir storage performance13. Such a synergistic approach not only ensures the long-term stability of stored CO2 but also helps in promptly addressing unforeseen challenges.
Numerical simulation of geological CO2 storage is a resource-intensive task, demanding both robust algorithms and powerful hardware capabilities. A primary challenge arises from geological uncertainty14,15,16,17,18. Given that the subsurface cannot be observed directly in detail, our understanding of it is based on sparse observations, which lead to significant uncertainties in reservoir properties like permeability, porosity, residual saturations, etc. To account for these uncertainties, multiple realizations of the geological model are generated, each representing a possible scenario of the subsurface19. However, every realization entails solving coupled partial differential equations that govern fluid flow and transport, and geochemical reactions in porous media. As the number of realizations increases to adequately sample the uncertainty space, the computational expense escalates, often exponentially19. High-resolution models, necessary for capturing fine-scale geological features and processes, further compound the computational demands. Yet, simulating the CO2 injection and migration process for each of these realizations is essential to assess the range of potential outcomes and risks.
Over the past five decades, numerical reservoir simulation has advanced significantly in sophistication, and while these simulations are essential for evaluating flow patterns, their increased complexity demands substantial computational resources. High-performance computing can manage these demands20,21, but integrating new subsurface data into models requires continuous updates and recalibration, leading to extended simulation times. This iterative process can challenge timely decision-making, particularly when swift model updates are crucial for operational or safety adjustments.
In addressing these challenges, researchers are exploring various strategies, such as machine learning-based surrogates and reduced order models22,23,24,25,26,27,28,29,30,31 to achieve a balance between computational feasibility and simulation fidelity. Nonetheless, as the industry advances towards more extensive and deeper CO2 storage projects, ongoing efforts to optimize and innovate in the realm of computational methods remain paramount. Surrogate models have gained popularity as a way to reduce the computational burden associated with every realization. In essence, surrogate models need to be able to capture as much of the physics of the problem as possible, while maintaining high computational efficiency. The problem is that one is usually promoted at the expense of the other. Neural operator learning algorithms have the potential to solve this problem.
In contrast to traditional deep learning models like Convolutional Neural Networks (CNNs) and Graph Neural Networks (GNNs), which are primarily designed for mappings between finite-dimensional vectors, neural operator learning models such as the FNO and DeepONet learn mappings between infinite-dimensional function spaces. This allows operator learning models to efficiently handle complex, variable-dimensional problems, such as those arising in geologic carbon sequestration, where generalization to new and unseen input functions is crucial. In operator learning, we aim to learn (approximate) the nonlinear operator \({\mathcal{G}}_{\theta }: \mathcal{U}\to \mathcal{S}\) subject to some loss function, such that
$$\begin{array}{c}{\mathcal{G}}_{\theta }\left(u\right)\approx \mathcal{G}\left(u\right)=s,\end{array}$$ (1)
where \(u\in \mathcal{U}\) denotes a function in the input function space, and \(\mathcal{G}\left(u\right)\in \mathcal{S}\) denotes the corresponding unknown solution in the output space. Both \(\mathcal{S}\) and \(\mathcal{U}\) are separable Banach spaces of functions taking values in \({\mathbb{R}}^{{d}_{s}}\) and \({\mathbb{R}}^{{d}_{u}}\), respectively. For example, \(u\) could be a function and \(s=\mathcal{G}\left(u\right)\) its derivative. In this paper, \(u\) represents the field and scalar variables, \(s\) represents the spatiotemporal gas saturation and pressure buildup solutions that complete the input–output pairs required to train the operator \({\mathcal{G}}_{\theta }\), and \(u\) is evaluated at \(x\), a collection of fixed points. A key advantage of operator learning is that once a model is trained, it can generalize to new input functions; thus, at inference, a trained operator is orders of magnitude faster than a numerical solver. Another key advantage is that an operator can be trained using simulation data, experimental (real/noisy) data, or both. It can also be informed of the underlying partial differential equation (PDE) if it is known, which helps it generalize even better.
The U-Net Fourier Neural Operator (U-FNO) was recently proposed as an algorithm for operator learning in modeling CO2 geological storage32. The U-FNO is an extension of the FNO33 with an additional U-Net inserted in all or some of the Fourier layers, as shown in Fig. 1. The key feature of the FNO is that it formulates the operator learning task by parameterizing the integral kernel directly in Fourier space. This means that the parameters of the network are defined and learned in Fourier space, and the coefficients of the Fourier series representation of the output function are inferred from the dataset. The key component of the FNO architecture is the Fourier layer, which leverages the Fourier transform to capture global patterns by learning in the frequency domain, enabling efficient modeling of complex spatiotemporal relationships. Unfortunately, the Fourier layer is computationally expensive due to its large number of trainable parameters, as there are two weight matrices for every Fourier layer. In addition, the size of the weight tensor grows rapidly with the number of retained Fourier modes and exponentially with the spatial dimension, i.e., Fourier-based neural operators suffer from the curse of dimensionality. These issues are only exacerbated with a U-Fourier layer, as the number of trainable weight matrices increases to three per layer. Moreover, the U-Net block in the U-Fourier layer can be very expensive to compute if the input data is in 3D, due to 3D convolutions. These sources of inefficiency contribute to the high training times reported for the U-FNO architecture. The effectiveness of the U-Net stems from its ability to process data on structured grids through local convolution, thereby enriching the representation power of the architecture in higher frequencies and leading to better accuracy compared to the FNO and the convolutional FNO (conv-FNO)32. In our opinion, the proposed U-FNO has three main drawbacks: (1) an extremely large number of trainable parameters (more than 30 million in this case), (2) difficulty in scaling to higher dimensions, and (3) loss of the resolution invariance of the FNO in time, i.e., it cannot make predictions at unseen time steps in the interpolation task.
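To make the cost argument concrete, the following minimal PyTorch sketch of a 2D Fourier (spectral convolution) layer illustrates where the trainable weight tensors live. This is our illustration of the FNO building block described above, not code from any of the cited papers; the class and parameter names are our own.

```python
import torch
import torch.nn as nn

class SpectralConv2d(nn.Module):
    """Minimal sketch of a 2D Fourier layer (after the FNO idea above).
    The learnable weights are complex tensors defined in Fourier space;
    their size scales with in_channels * out_channels * modes1 * modes2,
    which is why parameter counts grow quickly with the retained modes.
    (The full FNO keeps a second such tensor for the negative-frequency
    corner, hence the 'two weight matrices per Fourier layer' noted above.)"""
    def __init__(self, in_channels, out_channels, modes1, modes2):
        super().__init__()
        self.modes1, self.modes2 = modes1, modes2
        scale = 1.0 / (in_channels * out_channels)
        self.weights = nn.Parameter(
            scale * torch.rand(in_channels, out_channels, modes1, modes2,
                               dtype=torch.cfloat))

    def forward(self, x):              # x: (batch, channels, H, W)
        x_ft = torch.fft.rfft2(x)      # transform to the frequency domain
        out_ft = torch.zeros(x.size(0), self.weights.size(1), x.size(2),
                             x.size(3) // 2 + 1, dtype=torch.cfloat,
                             device=x.device)
        # learn only on the lowest retained modes; the rest are truncated
        out_ft[:, :, :self.modes1, :self.modes2] = torch.einsum(
            "bixy,ioxy->boxy",
            x_ft[:, :, :self.modes1, :self.modes2], self.weights)
        return torch.fft.irfft2(out_ft, s=x.shape[-2:])  # back to space
```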
In a recent work, Jiang et al.34 proposed combining the U-FNO with the multiple-input deep neural operator (MIONet)35, terming their architecture the Fourier-enhanced multiple-input neural operator (Fourier-MIONet). The Fourier-MIONet is a more computationally and data-efficient alternative to the U-FNO alone, albeit with some loss of accuracy. The architecture utilizes a modified MIONet to process the inputs and then passes the output to a U-FNO, creating a hybrid architecture (Fig. 2). This is achieved by, first, separating the time variable \(t\) from the other input variables and processing it through the trunk network. In doing so, 2D convolutions can be used in the U-Net block instead of the 3D convolutions of the original U-FNO. In addition, the Fourier transform along the time dimension is no longer needed, which results in significant efficiency improvements: the number of trainable parameters dropped from 30+ million to 3+ million with a small loss of accuracy. Second, two branch networks were employed to process the two different kinds of inputs, field and scalar.
Compared to U-FNO, the Fourier-MIONet reduces training time per epoch from 1535 s to 730 s on an NVIDIA GeForce RTX 3090 GPU. Additionally, it regains the capacity to generalize over unseen time steps, though these advantages come at a slight cost to accuracy. The overall training duration was reduced from approximately 42–43 h for the U-FNO to a more manageable 16–17 h for the Fourier-MIONet on the same GPU. Yet we believe this improvement remains suboptimal.
In this article, we introduce a novel U-Net Enhanced Deep Operator Network (U-DeepONet) shown in Fig. 3. By leveraging the inherent modularity of the DeepONet, we incorporate a U-Net within its branch structure. This new approach presents substantial improvements in computational efficiency compared to both U-FNO and Fourier-MIONet. Furthermore, it retains the accuracy standards set by the U-FNO, while also inheriting other merits of the Fourier-MIONet.
Methods
U-DeepONet architecture
Inspired by the DeepONet and the ideas implemented in the Fourier-MIONet and U-FNO architectures, we propose a novel architecture based on the DeepONet and the U-Net; we call this architecture the U-DeepONet, shown in Fig. 3. The implementation details of the U-DeepONet are outlined in Table 1, and those of the U-Net block in Table 2. We posit that using the Fourier layer and the MIONet is unnecessary and computationally expensive. We build on the idea proposed for the Fourier-MIONet and process the time variable \(t\) through the trunk of the DeepONet to leverage the computational benefits of 2D convolutions in lieu of 3D convolutions. In addition, we deploy multiple U-Net blocks in the branch of the DeepONet to process both field and scalar inputs. This setup allows for multiple-input operator learning without the MIONet architecture; the theoretical analysis is left for future work.
In the U-DeepONet, the trunk network \({t}_{k}\left(t\right)\) is a simple feed-forward neural network (FNN) with a nonlinear activation function \(\sigma \), where \(t\) is the time variable. The branch of the U-DeepONet consists of several U-Net blocks connected in series and operates in the following steps (a code sketch follows the list):
1. Lift the input observations \(v\left(x\right)\) to a higher-dimensional representation \({b}_{0}\left(x\right)\):
$$\begin{array}{c}{b}_{0}\left(x\right)=P\left(v\left(x\right)\right)\in {\mathbb{R}}^{{d}_{z}},\end{array}$$ (2)
where \(P\) is a linear layer and a local transformation \(P:{\mathbb{R}}\to {\mathbb{R}}^{{d}_{p}}\); the result of this transformation, \({b}_{0}\), can be viewed as an image with \({d}_{p}\) channels. The purpose of this mapping is to increase the number of channels so that a bigger weight tensor can be constructed in the U-Net blocks. It also allows the U-DeepONet to process multiple input operators simultaneously.
2. Apply \(L\) U-Nets to \({b}_{0}\). The output of the final U-Net block is \({U}_{L}\) with \({d}_{z}\) channels:
$$\begin{array}{c}{U}_{{l}_{j+1}}=\sigma \left({U}_{{l}_{j}}\right),\end{array}$$ (3)
where \(\sigma \) is a nonlinear activation function. In our experiments we did not observe significant performance changes with different types of activation functions; nonetheless, the choice of activation function is generally problem specific.
3. The operator \({\mathcal{G}}_{\theta }\) is defined as the inner product of the branch and trunk networks:
$$\begin{array}{c}{\mathcal{G}}_{\theta }\left(v\right)\left(t\right)={U}_{L}\left(v\right)\cdot {t}_{k}\left(t\right).\end{array}$$ (4)
4. The output of the product of the trunk and the branch, \({\mathcal{G}}_{\theta }\left(v\right)\left(t\right)\), is projected to the solution space via a linear transformation parametrized by a linear layer or a shallow fully-connected layer \(G:{\mathbb{R}}^{104\times 208\times 24\times {d}_{z}}\to {\mathbb{R}}^{104\times 208\times 24\times 1}\):
$$\begin{array}{c}u\left(x,y,t\right)=G\left({\mathcal{G}}_{\theta }\left(v\right)\left(t\right)\right).\end{array}$$ (5)
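To make the four steps above concrete, the sketch below traces the U-DeepONet forward pass in PyTorch. This is our simplified reading of Eqs. (2)–(5), not the authors' released code: the U-Net blocks are stand-in convolution stacks, the layer sizes are illustrative, and we interpret the branch–trunk combination as a channel-wise product followed by the projection \(G\).

```python
import torch
import torch.nn as nn

class UDeepONetSketch(nn.Module):
    """Simplified U-DeepONet forward pass (illustration only).
    `_unet_block` stands in for a full encoder-decoder U-Net."""
    def __init__(self, in_channels, latent=64, n_unets=3, trunk_layers=10):
        super().__init__()
        self.lift = nn.Conv2d(in_channels, latent, kernel_size=1)  # step 1: P
        # step 2: L U-Net blocks connected in series
        self.unets = nn.ModuleList(
            [self._unet_block(latent) for _ in range(n_unets)])
        # trunk: plain FNN acting on the scalar time variable t
        dims = [1] + [latent] * (trunk_layers - 1) + [latent]
        trunk = []
        for i in range(len(dims) - 1):
            trunk += [nn.Linear(dims[i], dims[i + 1]), nn.GELU()]
        self.trunk = nn.Sequential(*trunk[:-1])   # drop the final activation
        self.project = nn.Conv2d(latent, 1, kernel_size=1)          # step 4: G

    @staticmethod
    def _unet_block(c):
        # placeholder for a real U-Net; keeps the channel count d_z fixed
        return nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.GELU(),
                             nn.Conv2d(c, c, 3, padding=1), nn.GELU())

    def forward(self, v, t):
        # v: (batch, in_channels, H, W) field/scalar inputs broadcast to maps
        # t: (n_t, 1) time steps queried through the trunk
        b = self.lift(v)                          # step 1
        for unet in self.unets:                   # step 2
            b = unet(b)                           # U_L(v): (batch, d_z, H, W)
        tk = self.trunk(t)                        # t_k(t): (n_t, d_z)
        # step 3: combine branch and trunk over the latent channels
        prod = b.unsqueeze(1) * tk.view(1, t.size(0), -1, 1, 1)
        # step 4: linear projection G from d_z channels to the solution
        u = self.project(prod.flatten(0, 1))      # (batch*n_t, 1, H, W)
        return u.view(v.size(0), t.size(0), 1, *v.shape[-2:])
```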
A close examination of our novel U-DeepONet reveals that our architecture not only avoids 3D convolutions in the U-Net block, it also dispenses with the Fourier layer, its associated weight tensors, and the FFT. This contributes to significant gains in computational efficiency compared to the U-FNO and Fourier-MIONet. The basic idea here is that the DeepONet on its own is an effective operator learning framework, and using a combined DeepONet–Fourier architecture is redundant, as both are neural operator learning networks.
Training and hyperparameters
In the interest of consistency, we follow the approach presented in Wen et al.32 by constructing a mask to account for the varying thicknesses of the various realizations in the dataset. As such, the loss is only computed within that mask during training for each realization, while the cells lying outside each reservoir are padded with zeros. Hence, the \({l}_{2}\)-loss is given by:
$$\begin{array}{c}\mathcal{L}=\frac{{\Vert \widehat{y}-y\Vert }_{2}}{{\Vert y\Vert }_{2}}+\beta \frac{{\Vert d\widehat{y}/dr-dy/dr\Vert }_{2}}{{\Vert dy/dr\Vert }_{2}},\end{array}$$ (6)
where \(y\) and \(dy/dr\) are the ground truth and the first derivative of the ground truth, respectively, \(\widehat{y}\) and \(d\widehat{y}/dr\) are the predicted output and the first derivative of the predicted output, respectively, and \(\beta =0.5\) is a hyperparameter. The first derivative term has been shown to greatly improve the prediction accuracy and leads to faster convergence32.
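A minimal sketch of this masked loss is given below, under the assumption (following Wen et al.32) that Eq. (6) uses relative \(l_2\) norms and that the first derivative is taken by finite differences along the radial axis; the construction of the mask tensor itself is assumed given.

```python
import torch

def masked_l2_loss(y_hat, y, mask, beta=0.5, eps=1e-8):
    """Sketch of the relative l2 loss with a first-derivative term (Eq. (6)),
    restricted to the reservoir mask; the exact norm in the paper follows
    Wen et al. and may differ in detail."""
    y_hat, y = y_hat * mask, y * mask            # zero cells outside reservoir
    data_term = torch.norm(y_hat - y) / (torch.norm(y) + eps)
    # finite-difference first derivative along the radial (last) axis
    dy_hat = y_hat[..., 1:] - y_hat[..., :-1]
    dy = y[..., 1:] - y[..., :-1]
    deriv_term = torch.norm(dy_hat - dy) / (torch.norm(dy) + eps)
    return data_term + beta * deriv_term
```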
We train two separate U-DeepONet networks for saturation and pressure buildup. The trunk in both models is a feed-forward neural network with 10 layers, while the branch consists of three U-Nets connected in series. Moreover, we use two different learning schedules for saturation and pressure buildup, and training is stopped once the loss stops decreasing. To allow for fair comparisons with the U-FNO and Fourier-MIONet, we avoid changing hyperparameters where possible; therefore, we maintain the same U-Net structure, number of epochs, and batch size. It is worth noting that although the FNO architecture does not require spatial data as input, both the U-FNO and Fourier-MIONet use it as an additional input feature. This has been shown to improve accuracy, consistent with other reports in the literature36. In this work, we also use the spatial data as an additional input feature in the branch for consistency. The initial learning rates for the gas saturation and pressure buildup models are \(0.0007\) and \(0.0006\), respectively, and the learning rate follows a 'staircase' reduction schedule.
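The 'staircase' schedule corresponds to a step decay of the learning rate. A usage sketch with PyTorch's StepLR follows; the step size and decay factor are assumed values (the text does not report them), and UDeepONetSketch refers to the illustration in the previous section.

```python
import torch

model = UDeepONetSketch(in_channels=9)   # 9 input variables, from the sketch
optimizer = torch.optim.Adam(model.parameters(), lr=7e-4)  # gas saturation
# 'staircase' decay: hold the rate for step_size epochs, then multiply by
# gamma; step_size and gamma here are assumed, not reported in the paper
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.9)

for epoch in range(100):
    # ... one training epoch over the 4500-realization training set ...
    scheduler.step()
```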
Performance evaluation
We benchmark the performance of the trained operator networks for gas saturation using the plume-area coefficient of determination \({R}_{plume}^{2}\), the mean plume error (MPE), and the mean absolute error (MAE). The plume area is defined by the non-zero values in either the ground truth or the prediction. This approach is used only for evaluating the gas saturation model's accuracy, because the gas saturation outside the CO2 plume is always zero. For the evaluation of the pressure buildup models, we use the field mean relative error (MRE), the \({R}^{2}\) score, and the MAE. The field MRE for pressure buildup is defined as follows37:
$$\begin{array}{c}MRE=\frac{1}{{n}_{e}{n}_{t}{n}_{b}}\sum_{i=1}^{{n}_{e}}\sum_{t=1}^{{n}_{t}}\sum_{j=1}^{{n}_{b}}\frac{\left|{\widehat{p}}_{i,j}^{t}-{p}_{i,j}^{t}\right|}{{p}_{i,max}^{t}-{p}_{i,min}^{t}},\end{array}$$ (7)
where \({n}_{t}=24\) is the total number of time steps, \({n}_{e}=500\) is the total number of test samples, \({n}_{b}=96\times 200=\text{19,200}\) is the number of grid blocks, and \({\widehat{p}}_{i,j}^{t}\) and \({p}_{i,j}^{t}\) denote the pressure values provided by the model and the ground truth, respectively, for test sample \(i\), in grid block \(j\), at time step \(t\). The difference between the maximum grid-block pressure \({p}_{i,max}^{t}\) and the minimum grid-block pressure \({p}_{i,min}^{t}\), for sample \(i\) at time step \(t\), is used to normalize the pressure absolute error.
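A direct transcription of this metric into PyTorch is shown below (a sketch; we assume predictions and ground truth are stored as tensors of shape (n_e, n_t, 96, 200)).

```python
import torch

def field_mre(p_hat, p, eps=1e-8):
    """Field mean relative error per Eq. (7).
    p_hat, p: tensors of shape (n_e, n_t, 96, 200); the per-sample,
    per-time-step pressure range normalizes the absolute error."""
    p_max = p.amax(dim=(-2, -1), keepdim=True)   # max grid-block pressure
    p_min = p.amin(dim=(-2, -1), keepdim=True)   # min grid-block pressure
    rel_err = (p_hat - p).abs() / (p_max - p_min + eps)
    return rel_err.mean()  # average over samples, time steps, grid blocks
```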
Results
Here, we benchmark our U-DeepONet against the U-FNO and Fourier-MIONet under three main criteria: accuracy, training efficiency, and inference time. Note that the U-FNO was already benchmarked against the vanilla FNO and the conv-FNO (which uses a CNN in place of the U-Net) and was shown to outperform both32. Moreover, we train the U-DeepONet on an NVIDIA Tesla V100 GPU. We implement our U-DeepONet using the open-source machine learning framework PyTorch38. Details of the U-DeepONet implementation are outlined in Table 1 in the "Methods" section. The main difference between the saturation and pressure buildup architectures is the number of channels in the U-Net block. The code accompanying this manuscript is available upon request, and the dataset32 is publicly available.
Dataset
The open-source dataset32, which contains 5500 realizations of CO2 geological storage in a radial, symmetric reservoir, was generated using the numerical reservoir simulator ECLIPSE (e300)39. The purpose of these realizations is to track the movement of the CO2 plume and the pressure buildup over time. The numerical model consists of two hundred gradually coarsened grid cells in the radial direction with a 100,000 m radius (2D slice), and the simulation results are reported at 24 successive time snapshots \(\left\{1\; day, 2 \;days, 4 \;days, \dots , 14.8 \;years, 21.1 \;years, 30 \;years\right\}\) using an adaptive implicit method. Supercritical CO2 is injected at a constant rate (constant per realization but varying from one realization to another) ranging from 0.2 to 2 Mt/year for a period of 30 years. The radius of the injection well is 0.1 m. The realizations account for a variable perforation thickness, up to the entire thickness of the reservoir. The outer boundaries of the reservoir are closed (no-flow boundaries). Additionally, the reservoir thickness (\(b\)) is randomly sampled between 12.5 and 200 m, with a vertical grid thickness of 2.08 m; the number of vertical grid cells therefore varies between 6 and 96. If the number of vertical grid cells in a specific case is less than 96, the difference is accounted for via zero padding (mask). Consequently, all realizations have a consistent resolution of (96, 200).
The dataset contains 5500 input-to-output mappings for saturation and another 5500 input-to-output mappings for pressure buildup; each dataset is split 9:1:1, where 4500 realizations are used for training, 500 realizations for validation, and 500 realizations for testing. The outputs are the state variables: gas saturation (\({S}_{g}\)) and pressure buildup (\(dP\)); the inputs consist of nine variables: four field variables (space-dependent inputs) and five scalar variables. In theory, the learned operator should be able to generalize over these nine inputs; in other words, given a new combination of these nine variables that does not exist in the training dataset, the learned operator should accurately predict the state variables. The field variables include a horizontal permeability map (\({k}_{x}\)), a vertical permeability map (\({k}_{y}\)), a porosity map (\(\phi \)), and an injection perforation map (\(perf\)). The scalar variables include the initial reservoir pressure at the top of the reservoir (\({P}_{init}\)), injection rate (\(Q\)), reservoir temperature (\(T\)), capillary pressure scaling factor (\(\lambda \)), and irreducible water saturation (\({S}_{wi}\)). Table 3 summarizes the distributions and parameter ranges used to generate the nine variables. We direct the readers to the original paper32 for more details on the generation of the field maps and all other sampling techniques for the inputs. An example of an input-to-output mapping is shown in Fig. 4.
Gas saturation
To train the U-DeepONet, we use a batch size of 4 to remain consistent with the U-FNO32 and Fourier-MIONet34 performance results. However, we remark that using a larger batch size is possible with our U-DeepONet due to its light memory footprint compared to the other two architectures: the U-DeepONet requires only 4.6 GiB, compared to 15.9 and 12.8 GiB for the U-FNO and Fourier-MIONet, respectively; see Table 4a for performance comparisons. This also gives our U-DeepONet an edge over other models, as it is materially cheaper to train and deploy since many commercial GPUs have less than 8 GiB of memory.
Moreover, the U-DeepONet test set prediction error is lower than that of both the U-FNO and the Fourier-MIONet, with an average mean plume error (MPE) of only 1.58%; the MPE is the mean absolute error computed over the plume area only. The testing set results, covering 500 realizations, represent the predictability of the model on truly unseen data. In Fig. 5, four testing examples for the U-DeepONet at two snapshots in time are shown. Using the mean absolute error (MAE) as a benchmark, the U-DeepONet is about twice as accurate as the Fourier-MIONet and about 20% more accurate than the U-FNO. The U-DeepONet also has a higher R2 score than the other two models.
The U-DeepONet requires only about 108 s per epoch to train, i.e., it is about 18 times faster to train than the U-FNO on the same GPU and about 5 times faster than the Fourier-MIONet. We note here that the reported performance of the Fourier-MIONet was on an NVIDIA GeForce RTX 3090 GPU34, which has twice as many CUDA cores. Not only is the U-DeepONet faster in training, it is also faster in testing. For any given testing case, the U-DeepONet requires 0.016 s to predict the solution, while the U-FNO and the Fourier-MIONet require 0.018 and 0.041 s, respectively. These results clearly show that the proposed U-DeepONet is more accurate and significantly more efficient in training and testing than both the U-FNO and the Fourier-MIONet.
Our U-DeepONet is similar to the Fourier-MIONet in terms of flexibility in selecting the batch size and the number of time snapshots in the trunk input. This means that the accuracy and the computational efficiency of the U-DeepONet can potentially be further improved.
Pressure buildup
As mentioned earlier, a separate U-DeepONet is trained on the pressure buildup data. The architecture of this model is outlined in Table 1 in the "Methods" section, and the performance evaluation metrics for pressure buildup are reported in Table 4b. The improvements in accuracy for the U-DeepONet over the U-FNO are not as apparent as in the gas saturation model, with the U-DeepONet having a higher R2 score of 0.994, compared to 0.992 for the U-FNO and only 0.986 for the Fourier-MIONet. The U-DeepONet also performs better than the U-FNO in terms of MAE (0.64 vs. 0.66), while the MAE of the Fourier-MIONet is not reported. In terms of the mean relative error (MRE), defined by Eq. (7), the U-FNO performs slightly better than the U-DeepONet (0.0068 vs. 0.0072). Later, we show that the MRE of the U-DeepONet can be further reduced to 0.0069 if a batch size of 6 is used instead of 4.
It is not surprising that the accuracy improvements of the U-DeepONet for the pressure buildup model are not as pronounced as in the gas saturation case. This is because the U-FNO utilizes the Fourier transform, which tends to perform well on smooth solutions; this is particularly true for the pressure equation, which is diffusive by nature. Nonetheless, just as in the gas saturation model, the computational efficiency improvements are vast, with the U-DeepONet being about 15 times faster to train than the U-FNO while using almost a third of the GPU memory. Furthermore, the U-DeepONet is faster than the U-FNO at inference. Figure 6 shows four examples of pressure buildup predictions. Our results for both the saturation and pressure buildup models clearly indicate the advantages of the U-DeepONet over the other models.
Data efficiency
The U-DeepONet demonstrates exceptional training efficiency and generalization capabilities, with remarkable improvements in data utilization efficiency compared to the U-FNO. Data utilization efficiency is a crucial element as it reduces computational expenses and enhances applicability in data-scarce scenarios. To assess generalizability, we created nine subsets from the main training dataset, varying from 500 to 4500 realizations, and evaluated the model performance using MPE for the saturation dataset and MRE for the pressure buildup dataset. Figure 7 presents these results, including the impact of batch sizes 4 and 6. We ensured robustness by repeating the training with different seeds to confirm consistency.
For the gas saturation dataset, even when trained with just 3000 realizations (two-thirds of the full dataset) and a batch size of 6, the U-DeepONet still surpasses the U-FNO, achieving an MPE of 1.48%. This indicates that the U-DeepONet requires significantly less data to achieve superior testing set performance. Furthermore, in scenarios with limited training data, the U-DeepONet proves to be more reliable, maintaining a testing MPE of 3% when trained with only 500 realizations, about 10% of the full dataset. In contrast, the U-FNO needs over double that amount of data to reach a similar level of performance. For the pressure buildup dataset, although the improvements in data utilization are less pronounced compared to the saturation case, the U-DeepONet still outperforms the U-FNO in small data regimes, ranging from 500 to 1500 realizations.
Inference at unseen time steps
One of the primary limitations of the U-FNO architecture is its loss of resolution invariance in time, stemming directly from the use of a U-Net block to process all inputs, including the time variable. Unlike the U-FNO, both the Fourier-MIONet and our U-DeepONet circumvent this issue by processing the time variable through a separate network. In the U-DeepONet, the time variable is passed to the trunk, which allows the model to make predictions at unseen time steps within the temporal training horizon. The advantage of this is twofold: it allows for a further reduction in the size of the dataset required to train the neural operator, and it decreases the computational load when generating the dataset using a numerical solver. This reduction is achieved because fewer time steps are needed in the numerical schemes, without compromising the accuracy of the machine learning model. Such a feature is particularly valuable in applications where data collection is costly or challenging.
To demonstrate the U-DeepONet's capability to generalize to unseen time steps, we trained our models with only 13 time steps (~ 50% fewer time steps). Building upon the consistent performance of the U-DeepONet with a reduced dataset, as previously shown, and to further demonstrate this capability, we augment the reduction in time steps with a reduction in the overall size of the training dataset. Accordingly, we used 3500 realizations for the saturation model and 4000 for the pressure buildup model, while maintaining a batch size of 6 for both. To evaluate these models, we present the MPE in Fig. 8a and the MRE in Fig. 8b for the test datasets of the saturation and pressure buildup models, respectively, at each time step. Training the U-DeepONet models with only ~ 50% of the time steps leads to a slight decrease in accuracy at the omitted time steps, as depicted by the empty squares in Fig. 8. On the other hand, the performance of the U-DeepONet at time steps included in the training does not deteriorate (indicated by the solid squares in the figure). It is important to highlight that the first and last time steps should always be included in training, because data-driven neural operator learning methods usually cannot extrapolate in time.
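In practice, querying an unseen time step requires no retraining: only the trunk is evaluated at the new time coordinate. A usage sketch follows, where model refers to a trained U-DeepONet such as the earlier illustration, and v_test and the time-normalization convention are our assumptions.

```python
import torch

model.eval()
with torch.no_grad():
    # time steps never shown during training, normalized to the training
    # horizon [0, 1] (the normalization convention is assumed)
    t_query = torch.tensor([[0.31], [0.62]])
    s_pred = model(v_test, t_query)   # v_test: held-out field/scalar inputs
# no retraining is needed: the branch output U_L(v) is reused, and only
# the trunk is re-evaluated at the new time coordinates
```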
From Fig. 8, it is evident that the performance of the U-DeepONet (gray curves with squares), using the hyperparameters discussed in the "Methods" section, falls short at unseen time steps. To address this, we focused on the temporal interpolation task by refining the trunk network. The initial trunk network featured 10 layers, with 64 neurons per layer for the saturation model and 96 neurons per layer for the pressure model. The updated design comprises 16 layers, with 14 neurons per layer in the first 15 layers; the last layer maintains 64 and 96 neurons for the gas and pressure models, respectively, to enable the dot product with the branch.
Results of the updated design are shown in Fig. 8 (depicted by the orange curve with triangles). We can see from Fig. 8a and b that the U-DeepONet regains its stellar performance at unseen time steps due to the fine-tuning of the trunk network.
Conclusions
This paper presents a novel U-Net enhanced deep operator network, termed U-DeepONet. In designing the U-DeepONet, the best features of both the U-FNO and the Fourier-MIONet were fused into an architecture that outperforms the other two in training and testing performance for multiphase flow and transport problems in porous media. We evaluate the novel U-DeepONet using the open-source CO2 sequestration dataset32 and compare its performance to that of the U-FNO and the Fourier-MIONet. Results show that the U-DeepONet is advantageous in performance, predictive accuracy, training efficiency, and data utilization efficiency. Our U-DeepONet is more than 18 times faster to train than the U-FNO and more than 5 times faster than the Fourier-MIONet, while being more accurate than both models. We also show that the U-DeepONet has a much smaller GPU memory footprint than the other operator learning algorithms and is faster at inference. The U-DeepONet is data efficient and better at generalization, as it can be trained with less data while maintaining accuracy. Moreover, we show that the U-DeepONet performs robustly at unseen time steps with only minor adjustments to the trunk network. Overall, we show that the U-DeepONet is a better framework for neural operator learning than other state-of-the-art frameworks, and that it is easier to work with due to its smaller number of hyperparameters. The U-DeepONet architecture can be seen as an alternative to the multiple-input DeepONet (MIONet), as we have shown that it can learn multiple operators simultaneously. Finally, like most other neural operators, the U-DeepONet does not perform well in temporal extrapolation tasks, especially for time steps far beyond the temporal training horizon; extrapolation in forward modeling with neural operators remains very much an open question, and our work on it is ongoing.
Data availability
The raw/processed data required to reproduce these findings are available at the link below, courtesy of reference32 in the manuscript. https://drive.google.com/drive/folders/1fZQfMn_vsjKUXAfRV0q_gswtl8JEkVGo.
References
Bachu, S. CO2 storage in geological media: Role, means, status and barriers to deployment. Prog. Energy Combust. Sci. 34, 254–273. https://doi.org/10.1016/j.pecs.2007.10.001 (2008).
Benson, S. M. & Cole, D. R. CO2 sequestration in deep sedimentary formations. Elements 4, 325–331 (2008).
Pruess, K. & García, J. Multiphase flow dynamics during CO2 disposal into saline aquifers. Environ. Geol. 42, 282–295 (2002).
Saadatpoor, E., Bryant, S. L. & Sepehrnoori, K. New trapping mechanism in carbon sequestration. Transp. Porous Media 82, 3–17 (2010).
Lengler, U., De Lucia, M. & Kühn, M. The impact of heterogeneity on the distribution of CO2: Numerical simulation of CO2 storage at Ketzin. Int. J. Greenhouse Gas Control 4, 1016–1025 (2010).
Strandli, C. W., Mehnert, E. & Benson, S. M. CO2 plume tracking and history matching using multilevel pressure monitoring at the Illinois basin-Decatur project. In Energy Procedia Vol. 63, 4473–4484 (Elsevier Ltd, 2014).
Yin, Z., Siahkoohi, A., Louboutin, M. & Herrmann, F. J. Learned coupled inversion for carbon sequestration monitoring and forecasting with Fourier neural operators. In SEG Technical Program Expanded Abstracts vols 2022-August, 467–472 (Society of Exploration Geophysicists, 2022).
Fawad, M. & Mondol, N. H. Monitoring geological storage of CO2: A new approach. Sci. Rep. https://doi.org/10.1038/s41598-021-85346-8 (2021).
Ajayi, T., Gomes, J. S. & Bera, A. A review of CO2 storage in geological formations emphasizing modeling, monitoring and capacity estimation approaches. Pet. Sci. 16, 1028–1063. https://doi.org/10.1007/s12182-019-0340-8 (2019).
Zhao, M., Wang, Y., Gerritsma, M. & Hajibeygi, H. Efficient simulation of CO2 migration dynamics in deep saline aquifers using a multi-task deep learning technique with consistency. Adv. Water Resour. 178, 104494 (2023).
Flemisch, B. et al. The fluidflower validation benchmark study for the storage of CO2. Transp. Porous Media https://doi.org/10.1007/s11242-023-01977-7 (2023).
Tariq, Z. et al. Data-driven machine learning modeling of mineral/CO2/brine wettability prediction: implications for CO2 geo-storage. In SPE Middle East Oil and Gas Show and Conference, MEOS, Proceedings (Society of Petroleum Engineers (SPE), 2023). https://doi.org/10.2118/213346-MS.
Anyosa, S., Bunting, S., Eidsvik, J., Romdhane, A. & Bergmo, P. Assessing the value of seismic monitoring of CO2 storage using simulations and statistical analysis. Int. J. Greenhouse Gas Control 105, 103219 (2021).
Nordbotten, J. M. et al. Uncertainties in practical simulation of CO2 storage. Int. J. Greenhouse Gas Control 9, 234–242 (2012).
Jeong, H., Srinivasan, S. & Bryant, S. Uncertainty quantification of CO2 plume migration using static connectivity of geologic features. In Energy Procedia Vol. 37, 3771–3779 (Elsevier Ltd, 2013).
Gan, M. et al. Impact of reservoir parameters and wellbore permeability uncertainties on CO2 and brine leakage potential at the Shenhua CO2 Storage Site, China. Int. J. Greenhouse Gas Control 111, 103443 (2021).
Cao, C. et al. Parametric uncertainty analysis for CO2 sequestration based on distance correlation and support vector regression. J. Nat. Gas Sci. Eng. 77, 103237 (2020).
Xiao, C. et al. Deep-learning-generalized data-space inversion and uncertainty quantification framework for accelerating geological CO2 plume migration monitoring. Geoenergy Sci. Eng. 224, 211627 (2023).
Mahjour, S. K. & Faroughi, S. A. Selecting representative geological realizations to model subsurface CO2 storage under uncertainty. Int. J. Greenhouse Gas Control 127, 103920 (2023).
Zhang, K., Wu, Y.-S. & Pruess, K. User's Guide for TOUGH2-MP-A Massively Parallel Version of the TOUGH2 Code (2008).
Lichtner, P. et al. PFLOTRAN User Manual: A Massively Parallel Reactive Flow and Transport Model for Describing Surface and Subsurface Processes.
Wen, G. et al. Real-time high-resolution CO2 geological storage prediction using nested Fourier neural operators. Energy Environ. Sci. 16, 1732–1741 (2023).
Tariq, Z., Yan, B. & Sun, S. Predicting trapping indices in CO2 sequestration—A data-driven machine learning approach for coupled chemo-hydro-mechanical models in deep saline aquifers. In ARMA US Rock Mechanics/Geomechanics Symposium (2023). https://doi.org/10.56952/ARMA-2023-0757.
Ju, X. et al. Learning CO2 plume migration in faulted reservoirs with Graph Neural Networks. arXiv preprint arXiv:2306.09648 (2023).
Yan, B., Chen, B., Robert Harp, D., Jia, W. & Pawar, R. J. A robust deep learning workflow to predict multiphase flow behavior during geological CO2 sequestration injection and Post-Injection periods. J. Hydrol. 607, 127542 (2022).
Lyu, Y., Zhao, X., Gong, Z., Kang, X. & Yao, W. Multi-fidelity prediction of fluid flow based on transfer learning using Fourier neural operator. Phys. Fluids https://doi.org/10.1063/5.0155555 (2023).
Falola, Y., Misra, S. & Nunez, A. C. Rapid high-fidelity forecasting for geological carbon storage using neural operator and transfer learning. In Abu Dhabi International Petroleum Exhibition and Conference (SPE, 2023). https://doi.org/10.2118/216135-MS.
Stepien, M., Ferreira, C. A. S., Hosseinzadehsadati, S., Kadeethum, T. & Nick, H. M. Continuous conditional generative adversarial networks for data-driven modelling of geologic CO2 storage and plume evolution. Gas Sci. Eng. 115, 204982 (2023).
Tang, M., Ju, X. & Durlofsky, L. J. Deep-learning-based coupled flow-geomechanics surrogate model for CO2 sequestration. Int. J. Greenhouse Gas Control 118, 103692 (2022).
Cardoso, M. A., Durlofsky, L. J. & Sarma, P. Development and application of reduced-order modeling procedures for subsurface flow simulation. Int. J. Numer. Methods Eng. 77, 1322–1350 (2009).
Zhang, K. et al. Fourier neural operator for solving subsurface oil/water two-phase flow partial differential equation. SPE J. 27, 1815–1830 (2022).
Wen, G., Li, Z., Azizzadenesheli, K., Anandkumar, A. & Benson, S. M. U-FNO—An enhanced Fourier neural operator-based deep-learning model for multiphase flow. Adv. Water Resour. 163, 104180 (2022).
Li, Z. et al. Fourier neural operator for parametric partial differential equations. In International Conference on Learning Representations (2021).
Jiang, Z. et al. Fourier-MIONet: Fourier-Enhanced Multiple-Input Neural Operators for Multiphase Modeling of Geological Carbon Sequestration. arXiv:2303.04778v1 (2023).
Jin, P., Meng, S. & Lu, L. MIONet: Learning multiple-input operators via tensor product. SIAM J. Sci. Comput. 44, A3490–A3514. https://doi.org/10.1137/22M1477751 (2022).
Lu, L. et al. A comprehensive and fair comparison of two neural operators (with practical extensions) based on FAIR data. Comput. Methods Appl. Mech. Eng. 393, 1–35 (2022).
Tang, M., Liu, Y. & Durlofsky, L. J. A deep-learning-based surrogate model for data assimilation in dynamic subsurface flow problems. J. Comput. Phys. 413, 109456 (2020).
Paszke, A. et al. PyTorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 32 (2019).
Schlumberger. ECLIPSE Reservoir Simulation Software Reference Manual. (2014).
Acknowledgements
The authors wish to acknowledge Khalifa University's high-performance computing facilities for providing the computational resources.
Author information
Contributions
M.A.K. and W.D. conceptualized and constructed the neural operator network. W.D. wrote the software, generated the visualization, and conducted the interpretation; M.A.K. provided supervision. Both authors carried out the formal analysis and both have written and reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.