Introduction

Methane (CH4) is the second-largest greenhouse gas (GHG) contributor to global radiative forcing, so understanding its global budget is a top climate priority1,2. Methane is emitted from a variety of anthropogenic and natural emission sectors, including oil and gas operations, waste management, coal mining, agriculture, wetlands, and fires, among others3. Article 14 of the Paris Agreement4 requires participating countries to report progress towards achieving their climate mitigation goals, or nationally determined contributions. Reporting progress, including any changes in the CH4 budget, necessitates inventorying all possible emission sources. CH4 emission inventories can be constructed “bottom-up” or derived from “top-down” observations. Bottom-up accounting relies on knowledge of activity data and emission factors for anthropogenic sectors and/or detailed process-based models that predict CH4 emissions based on a set of environmental factors for natural emission sectors. By aggregating an ensemble of bottom-up inventories and process models, Saunois et al.5 calculated a global methane budget for 2008–2017 and estimated total emissions of 594–880 TgCH4 a−1, with 113–154 TgCH4 a−1 from fossil fuels, 191–223 TgCH4 a−1 from agriculture and waste, 26–40 TgCH4 a−1 from biomass and biofuel burning, 102–182 TgCH4 a−1 from wetlands, and 143–306 TgCH4 a−1 from other natural sources. Uncertainty and bias in bottom-up CH4 emissions in some geographic regions may be caused by imprecise emission factors and activity data that are not readily available at necessary spatial and temporal scales, or by process-based models that perform poorly due to a host of environmental factors (e.g., wetland models rely on wetland inundation maps, biogeochemical process parameterizations, and knowledge of carbon availability6,7).

Atmospheric observations of CH4, combined with an atmospheric transport model and a regularizing statistical approach, provide a top-down constraint on the global CH4 budget. Typically referred to as “inversions” or top-down inventories, these methods estimate CH4 fluxes by assimilating tower, aircraft, or satellite-based CH4 measurements8,9,10,11,12,13,14. Generally, these top-down methods only estimate total fluxes (i.e., sum of all emission sector contributions) explicitly, and may rely on using prior ratios or relative weights (RWs) between source categories to partition fluxes to specific source sectors. Top-down inverse models may be driven by different regularization or prior conditions, complicating direct comparison between an ensemble of inventories. Therefore, these partitioning approaches are prone to error when comparing with bottom-up inventories if the prior distribution of emissions is biased, or if different sectors have different uncertainties. For example, Saunois et al. compared an ensemble of 22 top-down CH4 global inversions to an ensemble of bottom-up inventories, and found the bottom-up estimate to be 30% higher than the top-down ensemble mean. The study attributes much of this total discrepancy to large differences in non-wetland natural sources (e.g., lakes and rivers, ocean seeps, termites, geologic sources, wild animals, etc.). However, when integrating emissions over the whole globe, they find that other source categories from bottom-up and top-down approaches are consistent within their reported uncertainties (fossil fuel top-down: 81–131 Tg a−1, bottom-up: 113–154 Tg a−1; agriculture+waste top-down: 207–240 Tg a−1, bottom-up: 191–223 Tg a−1; and biomass+biofuel burning top-down: 22–36 Tg a−1, bottom-up: 26–40 Tg a−1). Reported uncertainties reflect the range of estimates among distinct bottom-up and top-down inventories across various emission sectors. Saunois et al. also relied on using RWs to partition top-down fluxes to individual bottom-up sectors. An explicit approach for comparison between top-down and bottom-up inventories, and between independent top-down based inventories, is needed to reduce this uncertainty in the global CH4 budget by sector and by region.

Bayesian estimation of emissions from fluxes

We propose a comprehensive Bayesian framework that derives top-down gridded CH4 emissions (\(\hat{{{{{{\bf{z}}}}}}}\)) and their error covariance (\(\hat{{{{{{\bf{Z}}}}}}}\)) from inverse fluxes (\(\hat{{{{{{\bf{x}}}}}}}\)) and their error covariance (\(\hat{{{{{{\bf{S}}}}}}}\)) without reliance on RWs from an inventory. This framework takes the following form:

$$\hat{{{{{{\bf{z}}}}}}}={{{{{{\bf{z}}}}}}}_{{{{{{\rm{A}}}}}}}+\hat{{{{{{\bf{Z}}}}}}}{{{{{{\bf{M}}}}}}}^{T}{\hat{{{{{{\bf{S}}}}}}}}^{-1}[({{{{{\bf{I}}}}}}-\hat{{{{{{\bf{S}}}}}}}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1})({{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}}-{{{{{\bf{M}}}}}}{{{{{{\bf{z}}}}}}}_{{{{{{\rm{A}}}}}}})+(\hat{{{{{{\bf{x}}}}}}}-{{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}})]$$
(1)

where the vector \(\hat{{\bf{x}}}\) represents inverse CH4 fluxes on a spatial grid with full error characterization given by the covariance matrix \(\hat{{\bf{S}}}\), xA is the vector of prior fluxes, I is the identity matrix, SA is the prior error covariance matrix, and M is an aggregation matrix that sums emissions to fluxes (Methods section: Eqs. 8–9). The posterior emission error covariance matrix \(\hat{{\bf{Z}}}\) is calculated explicitly given M, SA, \(\hat{{\bf{S}}}\), and the prior emissions error covariance matrix ZA:

$$\hat{{{{{{\bf{Z}}}}}}}={({{{{{{\bf{M}}}}}}}^{T}({\hat{{{{{{\bf{S}}}}}}}}^{-1}-{{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1}){{{{{\bf{M}}}}}}+{{{{{{\bf{Z}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1})}^{-1}.$$
(2)

The full derivation is documented in the Methods section and Supporting Information (SI) and has been mechanically verified and tested using simulated emissions, concentrations, and fluxes. This approach has been previously described for atmospheric trace gas retrievals15 but is here modified and applied for flux comparisons.

Applying Eqs. 1 and 2 makes it possible to “swap priors.” This means that no matter what prior was used in an initial flux inversion, we can swap it with a different prior emissions vector \({{\bf{z}}}_{{\rm{A}}}\), which can include sector-based information. This is a critical component for comparison between two different top-down inventories, as it removes error that may arise from the choice of prior. Another advantage of this Bayesian approach is that fluxes are partitioned according to not just the prior emission state zA, but also according to the prior uncertainties on those emissions (i.e., \({{\bf{Z}}}_{{\rm{A}}}\)). For example, if we have evidence that a particular emission sector is well-characterized in bottom-up inventories (i.e., a tight prior uncertainty), Eqs. 1–2 take this knowledge into account when optimizing emissions. In this sense, our approach is similar to updated methods that use prior ratios with prior error variance when computing RWs used for partitioning16. However, any RW-based approach still assumes that correlation exists between emission sectors at the grid level, creating a relationship that may bias results depending on prior construction.
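As a minimal sketch of how Eqs. 1 and 2 might be evaluated with dense matrices, the NumPy snippet below implements the prior swap for a toy problem. The function name `swap_prior`, the single-cell, two-sector setup, and all numerical values are illustrative assumptions and are not taken from the inversions discussed in this study.

```python
import numpy as np


def swap_prior(x_hat, S_hat, x_A, S_A, z_A, Z_A, M):
    """Project inverse fluxes onto an emissions prior (Eqs. 1 and 2)."""
    n = x_hat.size
    S_hat_inv = np.linalg.inv(S_hat)
    S_A_inv = np.linalg.inv(S_A)

    # Eq. 2: posterior emission error covariance.
    Z_hat = np.linalg.inv(M.T @ (S_hat_inv - S_A_inv) @ M + np.linalg.inv(Z_A))

    # Averaging kernel of the original flux inversion, A = I - S_hat S_A^-1.
    A = np.eye(n) - S_hat @ S_A_inv

    # Eq. 1: prior-swap term plus the flux increment from the inversion.
    increment = A @ (x_A - M @ z_A) + (x_hat - x_A)
    z_hat = z_A + Z_hat @ M.T @ S_hat_inv @ increment
    return z_hat, Z_hat


# Toy case: one flux grid cell, two sectors (hypothetical values).
M = np.array([[1.0, 1.0]])                # sums both sectors into the cell
z_A = np.array([2.0, 1.0])                # prior sector emissions
Z_A = np.diag([1.0**2, 0.1**2])           # sector 2 has a tight prior
x_A, S_A = np.array([3.0]), np.array([[1.5**2]])
x_hat, S_hat = np.array([4.0]), np.array([[0.5**2]])

z_hat, Z_hat = swap_prior(x_hat, S_hat, x_A, S_A, z_A, Z_A, M)
print(z_hat - z_A)  # change from the emissions prior, per sector
```

In this toy case the flux increase is assigned mostly to the loosely constrained sector, which is the behavior described above for a sector with a tight prior uncertainty. Explicit matrix inverses are used only for readability; a linear solve would be preferable for large state vectors.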

The main caveat of this Bayesian approach is that an explicit representation of \(\hat{{{{{{\bf{S}}}}}}}\) is needed. For analytical flux inversions, this matrix has a closed-form representation that is computed as part of the inversion. For adjoint-based inversions10,11, a closed-form representation of \(\hat{{{{{{\bf{S}}}}}}}\) is not directly computed. Though the error covariance can still be estimated17,18, this is computationally expensive and is generally not done. Analytical frameworks are best suited for direct comparison between inverse products due to explicit error characterization.

Results

In what follows, we show examples of this approach using a previously performed 2010–2015 GOSAT flux inversion of the Continental United States9 (CONUS). We also apply Eqs. 1 and 2 to a previously performed 2018–2019 TROPOMI flux inversion over the Permian Basin in western Texas and southern New Mexico19. We compare with the partitioned results from the 2010–2015 GOSAT inversion to show how projection to a common prior can be used to assess regional emission trends, since uncertainties from different priors and different spatial resolutions of the flux inversions are removed.

Partitioning CH4 fluxes over CONUS

We apply the partitioning algorithm described by Eqs. 1 and 2 to a 2010–2015 0.5° × 0.625° resolution North American CH4 flux inversion6 performed using Greenhouse Gas Observing Satellite20 (GOSAT) dry air column mixing ratios of CH4. We partition emissions to seven distinct sectors at 0.1° × 0.1° resolution: oil, gas, coal, livestock, waste management, wetlands, and other emission sources (soil, fire, etc.). For oil, gas, and coal prior emissions and uncertainties, we use a global inventory for 2016 based on national reports to the United Nations Framework Convention on Climate Change (Scarpelli et al.21). For wetland prior emissions and uncertainties, we take the ensemble average and standard deviation of wetland models that were found to be the highest performing when compared with a global CH4 flux inversion22. For all other emission sectors, we use the 2012 EPA gridded CH4 inventory as the prior estimate23, and assume a one standard deviation uncertainty equal to 50% of the mean value. For illustration and computational tractability, the prior error covariances are represented as diagonal matrices.

Figure 1 shows the inverse fluxes \(\hat{{\bf{x}}}\) over CONUS, optimized emissions \(\hat{{\bf{z}}}\) for the gas and livestock sectors when applying Eqs. 1–2, and the change in emissions compared to the emission prior \((\hat{{\bf{z}}}-{{\bf{z}}}_{{\rm{A}}})\). Other optimized emission sectors are shown in Fig. S1. Optimized oil and gas emissions show large changes from the prior at the basin scale in several major producing basins and shale plays: Permian (New Mexico/Texas; ΔCH4: gas/oil = 0.40/0.76 TgCH4 a−1), Eagle Ford (southern Texas; ΔCH4 = 0.14/0.11 TgCH4 a−1), Haynesville (Texas/Louisiana; ΔCH4 = 0.18/0.0 TgCH4 a−1), Barnett Shale (Texas; ΔCH4 = 0.21/0.0 TgCH4 a−1), Anadarko (Oklahoma; ΔCH4 = 0.52/0.05 TgCH4 a−1), and the Appalachia Basin (Ohio/Pennsylvania/West Virginia; ΔCH4 = 0.20/0.0 TgCH4 a−1). The Niobrara (Wyoming/Colorado; ΔCH4 = 0.05/0.0 TgCH4 a−1) region shows much less or no change from the prior. For the Bakken (Montana/North Dakota; ΔCH4 = 0.00/−0.06 TgCH4 a−1), the small CH4 flux enhancement observed in Fig. 1a is partitioned entirely to the oil sector, but produces emissions lower than the prior, which contrasts with the increasing production reported in the basin24. This emission reduction may be due to an overestimate in the prior inventory or a difficulty in GOSAT sampling over that region, factors which could be verified with additional study. Posterior livestock emissions show increases from the prior that are distributed across the central and eastern United States, and a 0.07 TgCH4 a−1 decrease in emissions over central California.

Fig. 1: Optimized total CH4 emissions per grid cell over CONUS using 2010–2015 GOSAT inverse fluxes6.
figure 1

Panel a shows CH4 fluxes at 0.5° × 0.625° resolution. Panels b and c show optimized emissions and the change from the prior for the gas sector, respectively, at 0.1° × 0.1° resolution. Circled boxes in Panel b show rough outlines of major U.S. gas producing basins: (i) Permian, (ii) Anadarko, (iii) Haynesville, (iv) Eagle Ford, (v) Barnett Shale, (vi) Appalachia, (vii) Bakken, and (viii) Niobrara. Panels d and e show optimized emissions and the change from the prior for the livestock sector, respectively.

A major advantage of using Eqs. 1–2 to estimate emissions from independent inverse fluxes is that any discrepancies in flux priors can be accounted for when partitioning to a common emission prior. We show this through a sensitivity study whose results are summarized in Fig. 2. Here, we use two inverse flux products: (1) the 2010–2015 GOSAT CONUS flux shown in Fig. 1a, and (2) inverse fluxes from 2010–2015 GOSAT recomputed by using the EDGAR v5.0 emission inventory for oil, gas, waste, and livestock sectors for 201525. Differences in these prior inventories are summarized in Table S1 and Figs. S2–S3. Global inventories like EDGAR use consistent bottom-up aggregation methods across countries, which sometimes leads to discrepancies when compared with national inventories for particular sectors21. We employ two methods for optimizing/partitioning inverse fluxes: (1) using optimal estimation (OE) from this study’s Eqs. 1–2 to optimize the same emissions prior from Fig. 1, and (2) computing RWs of each emission sector in the flux prior and partitioning the inverse fluxes to emissions using these weights. In Fig. 2 we show that by employing OE on each separate inverse flux, we get exactly the same answer for optimized emissions. This is a result of the \(({\bf{I}}-\hat{{\bf{S}}}{{\bf{S}}}_{{\rm{A}}}^{-1})\) term from Eq. 1. Sometimes called the averaging kernel matrix26, this term represents the spatial resolution of the flux estimate, or alternatively, the degree of smoothing of the flux prior to the estimate. Since the averaging kernel is applied to the prior swap term \(({{\bf{x}}}_{{\rm{A}}}-{\bf{M}}{{\bf{z}}}_{{\rm{A}}})\), we account for discrepancies between flux and emission priors as a condition of emission optimization, so the choice of flux prior is immaterial. However, Fig. 2 also shows that the result depends on the flux prior when a relative weighting scheme is used for partitioning. The livestock to gas ratio in EDGAR is higher than in the EPA and Scarpelli et al. inventories. The result is that RW-partitioned livestock emissions are higher and gas emissions are lower when using EDGAR RWs (13.2 TgCH4 a−1 and 5.8 TgCH4 a−1, respectively) than when using EPA-Scarpelli RWs (11.3 TgCH4 a−1 and 8.1 TgCH4 a−1, respectively). Similarly, wetland and waste emissions differ depending on which prior is used for RWs. Using prior ratios assumes total correlation between emission sectors in each grid cell, as each sector’s RW depends on emissions from other sectors. Therefore, we see that different assumptions on the flux prior can complicate comparison even between inverse fluxes that use the same atmospheric observations and transport model.
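The dependence of RW partitioning on the prior ratio can be sketched with a small example. The snippet below splits each grid cell's inverse flux in proportion to the prior sector shares; the function name `partition_by_relative_weights`, the two-cell grid, and all emission values are hypothetical and chosen only to mimic a prior with a higher livestock-to-gas ratio.

```python
import numpy as np


def partition_by_relative_weights(x_hat, prior_by_sector):
    """Split each cell's inverse flux in proportion to prior sector shares.

    x_hat: (n,) inverse fluxes; prior_by_sector: (s, n) prior emissions.
    Assumes every grid cell has a nonzero total prior emission.
    """
    totals = prior_by_sector.sum(axis=0)
    weights = prior_by_sector / totals      # relative weights, columns sum to 1
    return weights * x_hat                  # (s, n) partitioned emissions


# Hypothetical two-cell example with sectors ordered (livestock, gas).
x_hat = np.array([10.0, 6.0])
prior_1 = np.array([[4.0, 1.0],    # livestock
                    [4.0, 2.0]])   # gas
prior_2 = np.array([[6.0, 2.0],    # livestock share is larger in this prior
                    [2.0, 1.0]])   # gas

rw_1 = partition_by_relative_weights(x_hat, prior_1)
rw_2 = partition_by_relative_weights(x_hat, prior_2)
print(rw_1.sum(axis=1), rw_2.sum(axis=1))  # sector totals under each prior
```

Both partitions reproduce the same total flux in every cell, yet the sector totals differ because each sector's weight is tied to the other sector's prior emissions.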

Fig. 2: Emissions partitioned or optimized from 2010–2015 GOSAT flux inversions.
figure 2

Two different flux products are partitioned using two different partitioning approaches. Blue bars represent fluxes that were derived using the same flux prior as Fig. 1 (Table S1). Red bars represent fluxes that were derived using EDGAR v5.0 for oil, gas, livestock, and waste sectors. Solid bars represent the optimal estimation (OE) approach from Eqs. 1–2. Hatched bars represent partitioning done by computing the relative weights (RW) of emission sectors that made up the flux prior.

Comparing inverse fluxes in the Permian Basin

As seen from Figs. 1 and S1, the Permian Basin shows large increases in CH4 emissions for both oil and gas sectors compared to the prior inventory. Production data from the Energy Information Administration24 (EIA) indicate that between 2010 and 2019, the Permian increased oil production by 190% and gas production by 150%. We expect that fugitive emissions from these sectors would increase proportionately to the production increases, but the actual posterior estimates do not differentiate between intentional (planned) and unintentional (unplanned) emissions. The sensitivity study in Fig. 2 shows that Eqs. 1–2 can be used to compare distinct inverse fluxes when a common emissions prior is used. We compare the 2010–2015 GOSAT flux product with a Permian 0.25° × 0.3125° flux product19 based on May 2018–March 2019 TROPOspheric Monitoring Instrument27 (TROPOMI) observations. We partition fluxes to the same sectors as Fig. 1, but at a finer 0.1° × 0.1° grid resolution.

Figure 3a, b shows the 2010–2015 GOSAT flux inversion and the 2018–2019 TROPOMI flux inversion over the Permian Basin, respectively. Within the Permian domain (black line in Fig. 3), the GOSAT inverse flux estimate is 2.01 ± 0.01 TgCH4 a−1 and the TROPOMI inverse flux estimate is 2.68 ± 0.5 TgCH4 a−1, though these inversions were performed over different time periods and used different flux priors and prior error covariance matrices to constrain their solutions. The GOSAT and TROPOMI inverse products show distinct spatial patterns. The TROPOMI inversion shows two main regions of elevated CH4 flux, which correspond to the Delaware Basin on the western side of the Permian (west Texas, southeast New Mexico) and the Midland Basin on the eastern side of the Permian. The GOSAT inversion shows a more distributed region of flux enhancement across the Permian. The difference in spatial distribution of these CH4 flux maps could be due to the more limited GOSAT spatial observing coverage over the Permian and the coarser spatial resolution over which the 2010–2015 GOSAT flux inversion was performed. However, given that the two flux inversions are asynchronous and that oil and gas production increased dramatically between 2015 and 2018, spatial differences in CH4 fluxes could also be the result of changing infrastructure throughout the basin.

Fig. 3: Comparison of two satellite inverse fluxes whose emissions were optimized using the same prior in the Permian Basin (black outline).
figure 3

Panel a shows the 2010–2015 GOSAT inverse flux result over the Permian Basin. Panel b shows the 2018–2019 TROPOMI inverse flux13, with the Delaware and Midland basins circled in red. Panel c is the prior oil and gas inventory used for partitioning15. Panels d and e represent posterior oil and gas emissions, optimized from fluxes in panels a and b, respectively.

Figure 3d, e shows the optimized emissions for the oil, gas, and livestock sectors in the Permian, respectively, compared to the prior inventory (Fig. 3c). Optimized emissions are derived from GOSAT and TROPOMI fluxes (Fig. 3a, b) that were swapped with a consistent prior (Fig. 3c) using Eqs. 1 and 2. The partitioned GOSAT emissions continue to show more distributed oil and gas CH4 emissions across the basin when compared to the partitioned TROPOMI emissions, which are mostly concentrated in the Delaware and Midland Basins. Figure 4 shows the aggregated top-down oil and gas emissions in the Permian compared with reported trends from EIA production reports. The mean 2010–2015 oil and gas production in the Permian was 1.3 million bbl per day and 5.0 million Mcf per day, respectively. Mean production between May 2018 and March 2019 increased to 3.8 million bbl per day and 12.7 million Mcf per day, respectively24. Comparing partitioned 2010–2015 GOSAT oil and gas emissions with 2018–2019 TROPOMI emissions, we see a 0.52 ± 0.29 TgCH4 a−1 change in CH4 emissions, a 29% increase that is not linear with the increase in production. Optimized emissions summed across all sectors increased by 0.42 ± 0.33 TgCH4 a−1, with the increase driven mostly by gas, some contribution by oil, and offset by a small decrease in livestock emissions. Originally reported posterior flux estimates showed a 0.67 ± 0.5 TgCH4 a−1 difference between TROPOMI and GOSAT inversions, larger than the 0.42 ± 0.33 TgCH4 a−1 difference we quantify here after reprojection to a common prior. This discrepancy between top-down estimates corresponds explicitly to differences in the flux priors used in the original inversions, which are now accounted for with this approach. Therefore, we can quantify how much the choice of flux prior directly impacts any quantified changes between top-down inventories.

Fig. 4: Aggregated CH4 emissions and reported production from the Energy Information Administration (EIA) in the Permian Basin.
figure 4

Panel a shows top-down emissions optimized from the independent flux products from Fig. 3 aggregated to the basin. Error bars are an integration of prior and posterior error covariance matrices for the respective emission sectors. Panel b compares mean 2010–2015 and May 2018–March 2019 EIA production data21 for oil and gas sectors to the oil+gas emissions from panel a.

Though consistent with the increase in gas production, the quantified 0.42 TgCH4 a−1 increase in oil and gas emissions from 2010–2015 to 2018–2019 could also be due to observing constraints and/or biases in satellite retrievals. For example, Qu et al.28 performed global 2° × 2.5° CH4 inversions for 2019 with GOSAT and with TROPOMI, using the same prior and atmospheric chemistry-transport model, and derived Permian net fluxes of 2.36 TgCH4 a−1 and 2.43 TgCH4 a−1, respectively, showing consistency between signals observed by these independent remote sensing platforms for this region. Therefore, we conclude that the trends observed in Fig. 3 are likely due to changes in gas operations, and not bias in observing systems. However, in other global regions where the surface is less bright and homogeneous than the Permian, flux results derived from TROPOMI and GOSAT inversions may not agree28.

Discussion

Robust intercomparison methods are needed for interpreting the ever-increasing number of atmospheric observations of CH4, particularly with regard to the expected launches of several CH4-observing satellites in the 2020s29. As described in this study, inverse fluxes derived from these satellite observations will depend on the observations themselves, the chemical transport model, and the prior constraint. The ability to directly quantify how these terms influence emissions is needed for diagnosing uncertainty in the estimated methane budget. Ultimately, this information can be combined to provide better global understanding of methane emissions at finer and more policy-relevant spatial and temporal scales. Likewise, the Bayesian framework we describe in this study can be applied to other atmospheric gas fluxes like carbon dioxide (CO2). Partitioning between biospheric and anthropogenic CO2 emissions remains highly uncertain30, so incorporating this framework that directly optimizes emission sectors could be useful for reconciling the budget. This approach does require a move from traditional ensemble or adjoint-based inversions, created to reduce the cost of this computationally expensive problem, to an analytic or optimal estimation inversion, because an explicit representation of the posterior covariance is required and this covariance is not easily calculated from ensemble or 4D-Var methods.

While our estimates account for the spatial resolution and error associated with the inversion of observations to fluxes, we do not explicitly account for error in model transport and chemistry. These errors could be important when comparing emissions between seasons or in regions where transport is poorly modeled, such as in the tropics. For example, a study of global emissions of carbon monoxide, a trace gas that (like CH4) is affected by reaction with the hydroxyl radical (OH) and transport, found that convective mass flux in the tropics is likely responsible for errors in emissions31. While our approach can account for this error if a corresponding posterior covariance is provided, we emphasize that studies that both characterize and mitigate this part of the flux error budget are needed to better use satellite observations of methane and of other trace gases. Ultimately, improved emission and error characterization from top-down information will allow for better updates and comparisons with bottom-up inventories, which can guide progress towards CH4 mitigation.

Methods

In this section we derive a method to estimate CH4 emissions from atmospheric observations. The SI contains an alternate derivation (Section S1) and conceptual examples to further clarify the mechanics of prior-swapping. Emissions can be represented as a vector \({\bf{z}}\in {\mathbb{R}}^{m}\) that contains both sectoral and spatial information. Atmospheric observations \({\bf{y}}\in {\mathbb{R}}^{p}\) often represent concentrations of CH4 observed by surface, satellite, airborne, or some other observing system. Generally, atmospheric inversions do not directly optimize CH4 sectoral emissions from observations, and instead optimize CH4 fluxes \({\bf{x}}\in {\mathbb{R}}^{n}\), which represent the summation of CH4 emissions within a grid cell. In the flux inversion setup, atmospheric CH4 observations y are used to estimate CH4 fluxes x. We estimate this optimal state by finding the mode of the posterior flux distribution p(x|y), denoted \(\hat{{\bf{x}}}\). A transformation or Jacobian matrix K can be derived from atmospheric transport simulations (e.g., GEOS-Chem), such that we can represent the relationship between fluxes and observations:

$${{{{{\bf{y}}}}}}={{{{{\bf{K}}}}}}{{{{{\bf{x}}}}}}+{{{{{\bf{n}}}}}}$$
(3)

where n represents noise. We apply Bayes' theorem to estimate the posterior distribution p(x|y):

$$p({{{{{\bf{x}}}}}}|{{{{{\bf{y}}}}}})\propto p({{{{{\bf{y}}}}}}|{{{{{\bf{x}}}}}})p({{{{{\bf{x}}}}}})$$
(4)

where p(y|x) is the likelihood given by Eq. 3, and \(p({\bf{x}})\) is the prior distribution. If we assume that p(y|x) and p(x) are Gaussian distributions, and y and xA represent the modes of those respective distributions, then the mode of the posterior distribution \(\hat{{\bf{x}}}\) has a closed-form solution, known as the Maximum A Posteriori (MAP; Rodgers, 2000) solution:

$$\hat{{{{{{\bf{x}}}}}}}={{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}}+\hat{{{{{{\bf{S}}}}}}}{{{{{{\bf{K}}}}}}}^{T}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{y}}}}}}}^{-1}({{{{{\bf{y}}}}}}-{{{{{\bf{K}}}}}}{{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}})$$
(5)
$$\hat{{{{{{\bf{S}}}}}}}={({{{{{{\bf{K}}}}}}}^{T}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{y}}}}}}}^{-1}{{{{{\bf{K}}}}}}+{{{{{{\bf{S}}}}}}}_{A}^{-1})}^{-1}$$
(6)
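Equations 5 and 6 translate directly into a few lines of linear algebra. The following is a minimal sketch under stated assumptions: the Jacobian `K` is a random stand-in for a transport-model Jacobian, the "true" fluxes and noise level are synthetic, and the function name `map_flux_inversion` is introduced here for illustration only.

```python
import numpy as np


def map_flux_inversion(y, K, S_y, x_A, S_A):
    """Closed-form MAP flux estimate and its error covariance."""
    S_y_inv = np.linalg.inv(S_y)
    S_hat = np.linalg.inv(K.T @ S_y_inv @ K + np.linalg.inv(S_A))   # Eq. 6
    x_hat = x_A + S_hat @ K.T @ S_y_inv @ (y - K @ x_A)             # Eq. 5
    return x_hat, S_hat


rng = np.random.default_rng(0)
n, p = 3, 12                                   # flux cells, observations
K = rng.uniform(0.5, 1.5, size=(p, n))         # synthetic Jacobian (assumption)
x_true = np.array([2.0, 5.0, 1.0])             # synthetic "true" fluxes
S_y = 0.05**2 * np.eye(p)
y = K @ x_true + rng.normal(0.0, 0.05, size=p) # Eq. 3 with Gaussian noise

x_A = np.array([1.0, 1.0, 1.0])
S_A = 2.0**2 * np.eye(n)

x_hat, S_hat = map_flux_inversion(y, K, S_y, x_A, S_A)
print(x_hat)
print(np.sqrt(np.diag(S_hat)))  # posterior one-sigma flux uncertainties
```

With informative observations, the posterior variances fall below the prior variances and the estimate moves from the prior toward the synthetic truth.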

For policy relevance and CH4 budget quantification, we really wish to optimize emissions using atmospheric observations, i.e., we want to compute the explicit posterior representation \(p({\bf{z}}\,|\,{\bf{y}})\) without re-simulation of an atmospheric transport model. The relationship between z and x is simple aggregation, and can be represented by a matrix M:

$${{{{{\bf{x}}}}}}={{{{{\bf{M}}}}}}{{{{{\bf{z}}}}}}.$$
(7)

If z and x share the same grid resolution (i.e., if \({{{{{\bf{z}}}}}}\in {{\mathbb{R}}}^{m}\) and \({{{{{\bf{x}}}}}}\in {{\mathbb{R}}}^{n}\) and s is the number of emission sectors, then m = ns), the matrix \({{{{{\bf{M}}}}}}\in {{\mathbb{R}}}^{n\times m}\) is represented with the following terms:

$${m}_{i,j}=\left\{\begin{array}{ll}1 & \text{if }(i,j)\text{ pertain to the same grid cell} \\ 0 & \text{otherwise}\end{array}\right.$$
(8)

If z and x pertain to different grids, the relationship M is defined by the geographic area (Ω) and intersections (\(\cap\)) of grid cells:

$${m}_{i,j}=\frac{{\Omega }_{{x}_{i}\cap {z}_{j}}}{{\Omega }_{{z}_{j}}}.$$
(9)
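The two constructions of M can be sketched as follows. For Eq. 8 we assume, purely for illustration, that z is ordered sector by sector (all cells of sector 1, then all cells of sector 2, and so on). For Eq. 9 we use a one-dimensional grid defined by cell edges as a simplified stand-in for geographic areas. Both function names are hypothetical.

```python
import numpy as np


def aggregation_same_grid(n_cells, n_sectors):
    """Eq. 8 for z ordered as [sector 1 cells, sector 2 cells, ...]."""
    return np.kron(np.ones((1, n_sectors)), np.eye(n_cells))


def aggregation_overlap_1d(x_edges, z_edges):
    """Eq. 9 on a 1-D grid: fraction of emission cell j inside flux cell i."""
    n, m = len(x_edges) - 1, len(z_edges) - 1
    M = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            lo = max(x_edges[i], z_edges[j])
            hi = min(x_edges[i + 1], z_edges[j + 1])
            overlap = max(hi - lo, 0.0)                 # size of the intersection
            M[i, j] = overlap / (z_edges[j + 1] - z_edges[j])
    return M


# Same grid: three cells, two sectors.
M_same = aggregation_same_grid(3, 2)
z = np.array([1.0, 2.0, 3.0, 10.0, 20.0, 30.0])
print(M_same @ z)                                       # per-cell sum over sectors

# Different grids: two coarse flux cells, five finer emission cells.
M_overlap = aggregation_overlap_1d([0.0, 1.0, 2.0], [0.0, 0.4, 0.8, 1.2, 1.6, 2.0])
print(M_overlap)                                        # middle cell is split in half
```

In the second case the middle emission cell straddles the boundary between the two flux cells, so half of its emissions are assigned to each.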

Since M is simply a summation matrix, we assume there is no noise associated with its application. Using M, we can update Eqs. 5 and 6 to find the optimal emission state vector \(\hat{{{{{{\bf{z}}}}}}}\) and its posterior error covariance \(\hat{{{{{{\bf{Z}}}}}}}\):

$$\hat{{{{{{\bf{z}}}}}}}={{{{{{\bf{z}}}}}}}_{{{{{{\rm{A}}}}}}}+\hat{{{{{{\bf{Z}}}}}}}{({{{{{\bf{K}}}}}}{{{{{\bf{M}}}}}})}^{T}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{y}}}}}}}^{-1}({{{{{\bf{y}}}}}}-({{{{{\bf{K}}}}}}{{{{{\bf{M}}}}}}){{{{{{\bf{z}}}}}}}_{{{{{{\rm{A}}}}}}})$$
(10)
$$\hat{{{{{{\bf{Z}}}}}}}={({({{{{{\bf{K}}}}}}{{{{{\bf{M}}}}}})}^{T}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{y}}}}}}}^{-1}({{{{{\bf{K}}}}}}{{{{{\bf{M}}}}}})+{{{{{{\bf{Z}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1})}^{-1}$$
(11)

Equations 10 and 11 provide an explicit closed form solution for \(\hat{{{{{{\bf{z}}}}}}}\), which is sufficient for emission optimization without reliance on RWs. Therefore, application of Eqs. 10 and 11 into existing inverse frameworks would provide posterior emission estimates constrained by atmospheric observations. However, these equations require the computation of the matrix (KM), which can be large in cases where many atmospheric observations exist. An exactly equivalent solution is possible with just the products of the flux inversion, specifically \(\hat{{{{{{\bf{S}}}}}}}\), \(\hat{{{{{{\bf{x}}}}}}}\), \({{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}\), and \({{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}}\). We show one derivation below and provide an alternative derivation in the SI:

Equation 5 can be shown to have an equivalent form that is often used in atmospheric trace gas retrievals26:

$$\hat{{{{{{\bf{x}}}}}}}={{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}}+{{{{{\bf{A}}}}}}({{{{{\bf{x}}}}}}-{{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}})+{{{{{\bf{G}}}}}}{{{{{\bf{n}}}}}}$$
(12)

where A is the averaging kernel matrix \({\bf{A}}=\frac{\partial \hat{{\bf{x}}}}{\partial {\bf{x}}}\):

$${{{{{\bf{A}}}}}}={{{{{\bf{I}}}}}}-\hat{{{{{{\bf{S}}}}}}}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1}$$
(13)

and G is the gain matrix \({\bf{G}}=\frac{\partial \hat{{\bf{x}}}}{\partial {\bf{y}}}\):

$${{{{{\bf{G}}}}}}={{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}{{{{{{\bf{K}}}}}}}^{T}{({{{{{\bf{K}}}}}}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}{{{{{{\bf{K}}}}}}}^{T}+{{{{{{\bf{S}}}}}}}_{{{{{{\rm{y}}}}}}})}^{-1}$$
(14)

Here, x represents the true atmospheric fluxes. Therefore, Eq. 12 shows that the optimal solution \(\hat{{\bf{x}}}\) is the truth smoothed toward the prior by the averaging kernel, plus a noise term. From Eq. 12, we can create a flux-to-posterior-flux operator H, given that our relationship M is known:

$${{{{{\bf{H}}}}}}({{{{{\bf{x}}}}}})={{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}}+{{{{{\bf{A}}}}}}({{{{{\bf{x}}}}}}-{{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}})$$
(15)
$${{{{{\bf{H}}}}}}({{{{{\bf{M}}}}}}{{{{{\bf{z}}}}}})={{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}}+{{{{{\bf{A}}}}}}({{{{{\bf{M}}}}}}{{{{{\bf{z}}}}}}-{{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}})$$
(16)
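Equation 12 and the operator H can be checked on synthetic data. In the sketch below, the Jacobian, covariances, "true" fluxes, and noise are all invented for illustration. The MAP estimate from Eq. 5 is compared with the smoothed truth H(x) plus the propagated noise Gn.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 3, 10
K = rng.uniform(0.5, 1.5, size=(p, n))     # synthetic Jacobian (assumption)
S_A = np.diag([1.0, 4.0, 0.25])
S_y = 0.2**2 * np.eye(p)
x_A = np.array([1.0, 2.0, 0.5])

x_true = np.array([1.5, 4.0, 0.4])         # synthetic "true" fluxes
noise = rng.normal(0.0, 0.2, size=p)
y = K @ x_true + noise                                                     # Eq. 3

S_hat = np.linalg.inv(K.T @ np.linalg.inv(S_y) @ K + np.linalg.inv(S_A))  # Eq. 6
x_hat = x_A + S_hat @ K.T @ np.linalg.inv(S_y) @ (y - K @ x_A)            # Eq. 5

A = np.eye(n) - S_hat @ np.linalg.inv(S_A)                                # Eq. 13
G = S_A @ K.T @ np.linalg.inv(K @ S_A @ K.T + S_y)                        # Eq. 14


def H(x):
    """Eq. 15: smooth a flux vector with the averaging kernel."""
    return x_A + A @ (x - x_A)


# Eq. 12: the MAP estimate is the smoothed truth plus propagated noise.
x_hat_eq12 = H(x_true) + G @ noise
print(np.allclose(x_hat, x_hat_eq12), np.allclose(A, G @ K))
```

The second check uses the standard identity A = GK, which follows from the definitions of A and G as derivatives of the estimate with respect to the true state and the observations.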

The operation H allows for emissions to be smoothed with an averaging kernel, allowing for direct comparison with \(\hat{{{{{{\bf{x}}}}}}}\). With this operator relationship, we treat \(\hat{{{{{{\bf{x}}}}}}}\) as an observable. The error covariance \(\hat{{{{{{\bf{S}}}}}}}\) includes both smoothing (\({{{{{{\bf{S}}}}}}}_{{{{{{\rm{s}}}}}}}\)) and measurement error (Sm):

$$\hat{{{{{{\bf{S}}}}}}}={{{{{{\bf{S}}}}}}}_{{{{{{\rm{s}}}}}}}+{{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}$$
(17)

For flux partitioning, we want to isolate the Sm error component of \(\hat{{\bf{x}}}\), as the H operator already accounts for smoothing via the averaging kernel. The matrix Sm can be represented32 using G and Sy:

$${{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}={{{{{\bf{G}}}}}}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{y}}}}}}}{{{{{{\bf{G}}}}}}}^{T}.$$
(18)

While Ss has the following representation:

$${{{{{{\bf{S}}}}}}}_{{{{{{\rm{s}}}}}}}=({{{{{\bf{I}}}}}}-{{{{{\bf{A}}}}}}){{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}{({{{{{\bf{I}}}}}}-{{{{{\bf{A}}}}}})}^{T}$$
(19)

We can combine Eqs. 17 and 19 to get an alternate form of Sm that does not require Sy and G explicitly:

$${{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}=\hat{{{{{{\bf{S}}}}}}}-({{{{{\bf{I}}}}}}-{{{{{\bf{A}}}}}}){{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}{({{{{{\bf{I}}}}}}-{{{{{\bf{A}}}}}})}^{T}$$
(20)
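The error decomposition in Eqs. 17–20 can be verified numerically. The following sketch, with an invented Jacobian and invented covariances, computes Sm once from the gain matrix (Eq. 18) and once from the flux-inversion products alone (Eq. 20).

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 3, 10
K = rng.uniform(0.5, 1.5, size=(p, n))     # synthetic Jacobian (assumption)
S_A = np.diag([1.0, 4.0, 0.25])
S_y = 0.2**2 * np.eye(p)
I = np.eye(n)

S_hat = np.linalg.inv(K.T @ np.linalg.inv(S_y) @ K + np.linalg.inv(S_A))  # Eq. 6
A = I - S_hat @ np.linalg.inv(S_A)                                        # Eq. 13
G = S_A @ K.T @ np.linalg.inv(K @ S_A @ K.T + S_y)                        # Eq. 14

S_m_from_gain = G @ S_y @ G.T              # Eq. 18, needs G and S_y
S_s = (I - A) @ S_A @ (I - A).T            # Eq. 19, smoothing error
S_m_from_products = S_hat - S_s            # Eq. 20, needs only S_hat and S_A

print(np.allclose(S_m_from_gain, S_m_from_products))
```

The practical point of Eq. 20 is that the second route never touches the observation error covariance or the Jacobian, which are often not distributed with a published flux product.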

Using Sm for observational error covariance, we can apply the MAP solution to derive \(\hat{{{{{{\bf{z}}}}}}}\) and \(\hat{{{{{{\bf{Z}}}}}}}\):

$$\hat{{{{{{\bf{z}}}}}}}={{{{{{\bf{z}}}}}}}_{{{{{{\rm{A}}}}}}}+\hat{{{{{{\bf{Z}}}}}}}{{{{{{\bf{M}}}}}}}^{T}{{{{{{\bf{A}}}}}}}^{T}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}^{-1}(\hat{{{{{{\bf{x}}}}}}}-{{{{{\bf{H}}}}}}({{{{{\bf{M}}}}}}{{{{{{\bf{z}}}}}}}_{{{{{{\rm{A}}}}}}}))$$
(21)
$$\hat{{{{{{\bf{z}}}}}}}={{{{{{\bf{z}}}}}}}_{{{{{{\rm{A}}}}}}}+\hat{{{{{{\bf{Z}}}}}}}{{{{{{\bf{M}}}}}}}^{T}{{{{{{\bf{A}}}}}}}^{T}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}^{-1}(\hat{{{{{{\bf{x}}}}}}}-{{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}}-{{{{{\bf{A}}}}}}({{{{{\bf{M}}}}}}{{{{{{\bf{z}}}}}}}_{{{{{{\rm{A}}}}}}}-{{{{{{\bf{x}}}}}}}_{{{{{{\rm{A}}}}}}}))$$
(22)
$$\hat{{{{{{\bf{Z}}}}}}}={({{{{{{\bf{M}}}}}}}^{T}{{{{{{\bf{A}}}}}}}^{T}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}^{-1}{{{{{\bf{A}}}}}}{{{{{\bf{M}}}}}}+{{{{{{\bf{Z}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1})}^{-1}.$$
(23)

Now we have a direct solution for \(\hat{{\bf{z}}}\) and \(\hat{{\bf{Z}}}\) derived only from the products of a flux inversion (specifically, \({{\bf{x}}}_{{\rm{A}}}\), \({{\bf{S}}}_{{\rm{A}}}\), \(\hat{{\bf{x}}}\), \(\hat{{\bf{S}}}\)), an emissions prior (\({{\bf{z}}}_{{\rm{A}}}\) and \({{\bf{Z}}}_{{\rm{A}}}\)), and an aggregation matrix M. Equations 22 and 23 can be shown to be of the same form as Eqs. 1 and 2 by showing that \({{\bf{A}}}^{T}{{\bf{S}}}_{{\rm{m}}}^{-1}={\hat{{\bf{S}}}}^{-1}\). Decomposing \({{\bf{A}}}^{T}\) and recognizing that \(\hat{{\bf{S}}}\) and \({{\bf{S}}}_{{\rm{A}}}\) are symmetric matrices, we have

$${{{{{{\bf{A}}}}}}}^{T}={({{{{{\bf{I}}}}}}-\hat{{{{{{\bf{S}}}}}}}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1})}^{T}={{{{{\bf{I}}}}}}-{({{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1})}^{T}{\hat{{{{{{\bf{S}}}}}}}}^{T}={{{{{\bf{I}}}}}}-{{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1}\hat{{{{{{\bf{S}}}}}}}.$$
(24)

We first expand Eq. 20 for an alternative expression of Sm:

$${{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}=\hat{{{{{{\bf{S}}}}}}}-({{{{{\bf{I}}}}}}-{{{{{\bf{A}}}}}}){{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}{({{{{{\bf{I}}}}}}-{{{{{\bf{A}}}}}})}^{T}$$
(25)
$$=\hat{{{{{{\bf{S}}}}}}}-({{{{{\bf{I}}}}}}-{{{{{\bf{I}}}}}}+\hat{{{{{{\bf{S}}}}}}}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1}){{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}{({{{{{\bf{I}}}}}}-{{{{{\bf{I}}}}}}+\hat{{{{{{\bf{S}}}}}}}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1})}^{T}$$
(26)
$$=\hat{{{{{{\bf{S}}}}}}}{{{{{{\bf{A}}}}}}}^{T}.$$
(27)
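The step from Eq. 26 to Eq. 27 can be spelled out as follows, using only \({\bf{I}}-{\bf{A}}=\hat{{\bf{S}}}{{\bf{S}}}_{{\rm{A}}}^{-1}\) from Eq. 13, the symmetry of \(\hat{{\bf{S}}}\) and \({{\bf{S}}}_{{\rm{A}}}\), and the expression for \({{\bf{A}}}^{T}\) in Eq. 24. The smoothing term collapses to

$$({\bf{I}}-{\bf{A}}){{\bf{S}}}_{{\rm{A}}}{({\bf{I}}-{\bf{A}})}^{T}=\hat{{\bf{S}}}{{\bf{S}}}_{{\rm{A}}}^{-1}\,{{\bf{S}}}_{{\rm{A}}}\,{{\bf{S}}}_{{\rm{A}}}^{-1}\hat{{\bf{S}}}=\hat{{\bf{S}}}{{\bf{S}}}_{{\rm{A}}}^{-1}\hat{{\bf{S}}},$$

so that

$${{\bf{S}}}_{{\rm{m}}}=\hat{{\bf{S}}}-\hat{{\bf{S}}}{{\bf{S}}}_{{\rm{A}}}^{-1}\hat{{\bf{S}}}=\hat{{\bf{S}}}({\bf{I}}-{{\bf{S}}}_{{\rm{A}}}^{-1}\hat{{\bf{S}}})=\hat{{\bf{S}}}{{\bf{A}}}^{T}.$$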

Taking the inverse of Sm using Eq. 27, we have:

$${{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}^{-1}={(\hat{{{{{{\bf{S}}}}}}}{{{{{{\bf{A}}}}}}}^{T})}^{-1}$$
(28)
$${{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}^{-1}={({{{{{{\bf{A}}}}}}}^{T})}^{-1}{\hat{{{{{{\bf{S}}}}}}}}^{-1}.$$
(29)

The algebra in Eqs. 28 and 29 is only possible if \(({{\bf{A}}}^{T})^{-1}\) exists. For overdetermined systems (i.e., dimension of y ≫ dimension of x), this is generally valid. Multiplying both sides of Eq. 29 by \({{\bf{A}}}^{T}\) finishes the proof:

$${{{{{{\bf{A}}}}}}}^{T}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}^{-1}={\hat{{{{{{\bf{S}}}}}}}}^{-1}.$$
(30)

In a similar fashion, we can show that Eqs. 2 and 23 are equivalent if \({{\bf{A}}}^{T}{{\bf{S}}}_{{\rm{m}}}^{-1}{\bf{A}}={\hat{{\bf{S}}}}^{-1}-{{\bf{S}}}_{{\rm{A}}}^{-1}\). To do this, we right-multiply Eq. 30 by A:

$${{{{{{\bf{A}}}}}}}^{T}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{m}}}}}}}^{-1}{{{{{\bf{A}}}}}}={\hat{{{{{{\bf{S}}}}}}}}^{-1}{{{{{\bf{A}}}}}}$$
(31)
$$={\hat{{{{{{\bf{S}}}}}}}}^{-1}({{{{{\bf{I}}}}}}-\hat{{{{{{\bf{S}}}}}}}{{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1})$$
(32)
$$={\hat{{{{{{\bf{S}}}}}}}}^{-1}-{{{{{{\bf{S}}}}}}}_{{{{{{\rm{A}}}}}}}^{-1}.$$
(33)
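The equivalence between optimizing emissions directly from observations (Eqs. 10 and 11) and projecting flux-inversion products onto an emissions prior (Eqs. 1 and 2) can also be checked on synthetic data. The sketch below is illustrative only: the Jacobian, the "true" emissions, the noise level, and both flux priors are invented, and the helper name `from_flux_products` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, s, p = 4, 2, 30                          # flux cells, sectors, observations
m = n * s

K = rng.uniform(0.5, 1.5, size=(p, n))      # synthetic Jacobian (assumption)
M = np.kron(np.ones((1, s)), np.eye(n))     # same-grid aggregation, Eq. 8
S_y = 0.1**2 * np.eye(p)
S_y_inv = np.linalg.inv(S_y)

z_true = rng.uniform(1.0, 3.0, size=m)      # synthetic "true" emissions
y = K @ M @ z_true + rng.normal(0.0, 0.1, size=p)

z_A = np.full(m, 2.0)                       # common emissions prior
Z_A = np.eye(m)

# Route 1: optimize emissions directly from observations (Eqs. 10 and 11).
KM = K @ M
Z_direct = np.linalg.inv(KM.T @ S_y_inv @ KM + np.linalg.inv(Z_A))
z_direct = z_A + Z_direct @ KM.T @ S_y_inv @ (y - KM @ z_A)


def from_flux_products(x_A, S_A):
    """Route 2: run a flux inversion (Eqs. 5, 6), then apply Eqs. 1 and 2."""
    S_A_inv = np.linalg.inv(S_A)
    S_hat = np.linalg.inv(K.T @ S_y_inv @ K + S_A_inv)
    x_hat = x_A + S_hat @ K.T @ S_y_inv @ (y - K @ x_A)
    S_hat_inv = np.linalg.inv(S_hat)
    Z_hat = np.linalg.inv(M.T @ (S_hat_inv - S_A_inv) @ M + np.linalg.inv(Z_A))
    A = np.eye(n) - S_hat @ S_A_inv
    z_hat = z_A + Z_hat @ M.T @ S_hat_inv @ (A @ (x_A - M @ z_A) + (x_hat - x_A))
    return z_hat, Z_hat


# Two flux inversions that differ only in their flux prior.
z_1, Z_1 = from_flux_products(np.full(n, 3.0), 2.0**2 * np.eye(n))
z_2, Z_2 = from_flux_products(np.full(n, 6.0), 4.0**2 * np.eye(n))
print(np.allclose(z_1, z_direct), np.allclose(z_2, z_direct))
```

Neither flux prior equals the aggregated emissions prior, yet both routes are expected to agree to floating-point tolerance, consistent with the statement that the choice of flux prior is immaterial once the prior-swap term is included.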