Introduction

Ethane (C2H6) is one of the most abundant volatile organic compounds (VOCs) present in the global atmosphere1,2,3. It is a precursor of tropospheric ozone, affects the transport and fate of reactive nitrogen through the formation of peroxyacetyl nitrate (PAN)4, and influences the atmosphere’s oxidizing capacity3,5,6,7. Ethane is also co-emitted with methane from fossil fuel sources and has become an important tracer for fingerprinting and quantifying anthropogenic methane emissions on regional8,9 and national10 scales. Attempts have been made to replicate this technique at the global11,12,13 scale, but success has proven difficult due in part to insufficient observational coverage and uncertainty in the methane:ethane emission ratios11,12,13.

The sources of ethane are primarily anthropogenic: recent estimates of fossil fuel-related ethane emissions range from 7–13 teragrams per year (Tg yr−1), with biofuel and biomass burning contributing an additional 2–8 Tg yr−1 (refs. 3,14,15). Geologic emissions of ethane are uncertain and not always included in budget assessments, though some studies estimate them at 2–6 Tg yr−1 (refs. 1,3,16,17,18). Natural ethane sources, including oceanic and biogenic emissions, are believed to be globally negligible5,7,17,19. Atmospheric ethane is primarily destroyed by reaction with hydroxyl radicals (OH) on a global-mean timescale of ~2 months20, sufficiently long that it provides a valuable airmass and source tracer over long distances21,22,23.

Uncertainty in the fossil fuel ethane source is large, with top-down studies often indicating substantially higher emissions than inventories predict (e.g., a factor of 1.4 to 3 more3,14,24,25). Fossil ethane emission ratios relative to methane also vary temporally and spatially in a manner that is not well constrained10,14. These two issues challenge ethane’s use as a source tracer along with efforts to quantify its atmospheric impacts.

Satellite-based measurements of tropospheric ethane would offer a major advance by providing globally continuous and long-term observations to address the above challenges26,27,28. Ethane is a relatively weak absorber but features distinctive bands in the shortwave infrared (IR; ν7 band near 3000 reciprocal centimeters [cm−1])29 and thermal IR (ν9 band near 815 cm−1)30 that are amenable to remote sensing. The ν7 feature has been used for ground-based solar Fourier Transform Infrared (FTIR) measurements such as those produced by the Network for the Detection of Atmospheric Composition Change (NDACC)31,32,33. Satellite-based ethane measurements have so far been performed by solar occultation (using ν7)34 and by limb sounding (using ν9)35. However, these ethane retrievals are limited to the upper troposphere and stratosphere and are unable to capture spatial and temporal variability linked to surface emissions.

In this work, we present a retrieval of tropospheric ethane using the Cross-Track Infrared Sounder (CrIS; see Methods) onboard the Suomi-National Polar-orbiting Partnership (Suomi-NPP) satellite, an instrument well-suited for the task. CrIS has low noise in the longwave IR (LWIR) relative to similar sensors36, an afternoon overpass that maximizes sensitivity in the troposphere27, and high resolution in both space and time. Compared to observations at shorter wavelengths, the LWIR radiances detected by CrIS are much less impacted by atmospheric and surface scattering but have lower near-surface sensitivity due to atmospheric absorption. This issue is partly counteracted by the CrIS nadir view, which provides far more tropospheric sensitivity than the limb-based observations previously used for space-based ethane detection26,28. Below, we first demonstrate the spectral detection of ethane in the CrIS radiances and highlight the coherence of this signal with known fossil fuel and biomass burning sources. We next employ a neural network to derive ethane column concentrations from these spectral signals and evaluate the results against established ground-based observations. Finally, we combine this dataset with chemical transport model (CTM) simulations to investigate the largest persistent ethane enhancements visible in the global CrIS data record, which occur over the Permian basin in the southwestern United States (US).

Results

Spectral detection of ethane from space

We use the hyperspectral range index (HRI) to characterize the ethane spectral signal detected by CrIS. The HRI is a dimensionless measure of the spectral signature of a target atmospheric gas37,38 that is computed via28:

$$\mathrm{HRI}=\frac{1}{N}\,\frac{\mathbf{K}^{\mathrm{T}}\mathbf{S}_{\mathbf{y}}^{-1}(\mathbf{y}-\bar{\mathbf{y}})}{\sqrt{\mathbf{K}^{\mathrm{T}}\mathbf{S}_{\mathbf{y}}^{-1}\mathbf{K}}},$$
(1)

where \(\mathbf{y}\) is a measured spectrum from CrIS and \(\bar{\mathbf{y}}\) is the mean background spectrum calculated iteratively from background scenes in which ethane is undetectable39. The same background scenes are used to calculate the spectral covariance matrix \(\mathbf{S}_{\mathbf{y}}\), which accounts for correlations between spectral channels due to factors other than ethane40. \(\mathbf{K}\) is the spectral Jacobian reflecting the change in absorption per change in ethane39 (Supplementary Fig. 1). A normalization factor \(N\) is applied based on the HRI standard deviation over a region where no ethane enhancements are expected. HRI values computed in this way specifically quantify the above-background signal strength, and by definition have a mean of 0 and a standard deviation of 1 under background conditions. Higher values imply a stronger signal from the target gas. Initially developed and applied to quantify ammonia and sulfur dioxide from the Infrared Atmospheric Sounding Interferometer (IASI)38,40, the HRI has subsequently been employed to measure a range of VOCs from IASI28 and to quantify isoprene from CrIS39. Here, we use an updated version of the Retrieval of Organics from CrIS Radiances39 (ROCRv2) to iteratively derive an ethane HRI between 800 and 850 cm−1, a range that encompasses the ν9 absorption feature (Supplementary Fig. 1). The subsequent conversion of these spectral signals to ethane column abundances is described below.
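As a minimal numerical sketch of Eq. (1), assuming synthetic radiances rather than CrIS data, the HRI can be evaluated with NumPy as follows. The Gaussian-shaped Jacobian, the channel count, the correlated-noise background, and the function name `hyperspectral_range_index` are illustrative assumptions and are not taken from ROCRv2.

```python
import numpy as np

def hyperspectral_range_index(y, y_bar, S_y, K, N=1.0):
    """Evaluate Eq. (1) for one spectrum (1-D y) or many (2-D y, one row per scene)."""
    # Solve S_y x = K rather than inverting S_y; S_y is symmetric,
    # so K^T S_y^-1 (y - y_bar) equals (y - y_bar) . x.
    Sinv_K = np.linalg.solve(S_y, K)
    numerator = (np.asarray(y, dtype=float) - y_bar) @ Sinv_K
    return numerator / (N * np.sqrt(K @ Sinv_K))

rng = np.random.default_rng(0)
n_chan, n_bg = 40, 5000

# Hypothetical Jacobian: more ethane lowers the radiance near channel 20.
K = -np.exp(-0.5 * ((np.arange(n_chan) - 20) / 3.0) ** 2)

# Synthetic, ethane-free background scenes with correlated channel noise.
mixing = np.eye(n_chan) + 0.3 * rng.normal(size=(n_chan, n_chan))
background = rng.normal(size=(n_bg, n_chan)) @ mixing

y_bar = background.mean(axis=0)
S_y = np.cov(background, rowvar=False)

# N is the standard deviation of the un-normalized HRI over ethane-free scenes.
N = hyperspectral_range_index(background, y_bar, S_y, K).std()
hri_background = hyperspectral_range_index(background, y_bar, S_y, K, N)

# Adding an ethane-like perturbation along K raises the HRI of a scene.
hri_enhanced = hyperspectral_range_index(background[0] + 2.0 * K, y_bar, S_y, K, N)
```

In this toy setup the background HRIs have zero mean and unit standard deviation by construction, mirroring the normalization described in the text, while a perturbation along the Jacobian shifts the HRI to higher values.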

Figure 1 illustrates the spectral detection of ethane from CrIS using the HRI approach. Plotted are data from January 2nd, 2020, over the South Pacific downwind of Australian megafires that occurred during that period. Plumes from these fires provide a useful test-case scenario for ethane detection, given (1) strong expected ethane emissions from the fires, (2) lofting of those emissions high into the troposphere and lower stratosphere41, and (3) over-ocean transport, which minimizes any spectral influences from the surface (as ocean emissivity is less variable than over land). Three independent characterizations of the transported biomass burning plume are shown: (a) the CrIS carbon monoxide (CO) column abundance obtained via optimal estimation42; (b) the CrIS ethane HRI computed as above; and (c) the mean on-peak/off-peak brightness temperature difference at the wavenumbers with highest ethane absorption (Supplementary Fig. 1). For (c), we use CrIS spectral residuals processed with the MUlti-SpEctra, MUltiSpEcies, Multi-SEnsors (MUSES) algorithm from the TRopospheric Ozone and its Precursors from Earth System Sounding (TROPESS) project43,44. Several modifications were made to the MUSES/TROPESS processing strategy for this work, and these are described in Methods. Once the spectral residuals are derived, values at the eight largest ethane peaks in the ν9 band are then each background-corrected by subtracting the mean adjacent off-peak value as indicated in Supplementary Fig. 1 and Supplementary Table 1. These background-corrected ethane residuals are averaged to obtain the single spectral index shown in Fig. 1c. All quantities (a-c) are cloud-screened using the difference between the 900 cm−1 brightness temperature and surface skin temperature27.
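The on-peak/off-peak index in (c) can be sketched as below. This is an illustration only: the channel indices and the toy residual spectrum are hypothetical, three peaks are used instead of the eight listed in Supplementary Table 1, and `peak_minus_background` is an assumed helper name.

```python
import numpy as np

def peak_minus_background(residual, peaks):
    """Mean over peaks of (on-peak value minus mean of adjacent off-peak values)."""
    residual = np.asarray(residual, dtype=float)
    corrected = [residual[on] - residual[list(off)].mean() for on, off in peaks]
    return float(np.mean(corrected))

# Hypothetical channel indices: (on-peak, (off-peak left, off-peak right)).
peaks = [(5, (3, 7)), (12, (10, 14)), (20, (18, 22))]

# Toy residual spectrum: a sloping baseline plus a -0.5 dip at each peak.
residual = 0.01 * np.arange(30, dtype=float)
for on, _ in peaks:
    residual[on] -= 0.5

index = peak_minus_background(residual, peaks)
```

Because each peak is referenced to the mean of its neighbors, a smooth baseline cancels and only the peak depth (here −0.5) remains in the index.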

Fig. 1: Spectral detection of ethane in radiances measured by the Cross-track Infrared Sounder (CrIS).

Plotted are (a) CrIS carbon monoxide (CO) columns42, (b) ethane hyperspectral range indices (HRIs), and (c) ethane brightness temperature differences for a fire plume over the South Pacific on January 2, 2020. All quantities are normalized and screened for clouds using the 900 cm−1 brightness/surface-skin temperature difference27. The CrIS data shown are primarily from granule 13, with additional data from granules 12, 14, 29, 232, 233, and 234.

Figure 1 shows strong consistency between the above measurements, with all three mapping the same large fire plume stretching from (50° S, 180° E) to (30° S, 165° E). This coherence reflects the physical structure of the fire plume and the distribution of co-emitted absorbers within that plume (in this case, CO and ethane). The brightness-temperature difference ethane metric is noisier than the corresponding HRI, reflecting the sensitivity and noise advantages of the latter. While factors such as water vapor and ash can interfere with thermal IR retrievals in wildfire smoke45, the spatial consistency between the two entirely different indices shown in Fig. 1 demonstrates that both are capturing the underlying signal due to ethane rather than that of potential interferences.

Spatial patterns in the ethane signal

Along with fires, the strongest ethane HRI enhancements that emerge in the global CrIS record are associated with major fossil fuel production regions. This is illustrated in Fig. 2, which shows CrIS ethane HRI values over four oil and gas basins: the Permian in Texas and New Mexico, US46, the Hassi Messaoud oil field in Algeria47, the Burun oil field in Turkmenistan47, and the Ordos basin in northern China48. Of these four areas, we see the highest ethane HRI values over the Permian. In fact, this finding extends across the entire CrIS data record—the Permian basin nearly always exhibits the highest ethane HRI of any location on the globe. The Permian also shows a clear spatial correlation between fossil fuel production and ethane signal: Fig. 2b, f both show a clear two-lobed distribution corresponding to the Delaware (western) and Midland (eastern) component basins. Of these, the Delaware basin has both a higher mean ethane HRI and a higher total fossil fuel extraction rate, producing 54% and 61% of Permian oil and natural gas, respectively46,49. The corresponding methane emissions were estimated at 1.7 Tg CH4 yr−1 (Delaware) and 1.0 Tg CH4 yr−1 (Midland) in 2018–2019 using TROPOspheric Monitoring Instrument (TROPOMI) satellite-based observations46.

Fig. 2: Ethane hyperspectral range indices (HRIs) derived from CrIS radiances over fossil fuel production regions.

Panel (a) shows the location and spatial extent of the other panels, which contain data for the Permian basin in Texas and New Mexico, US46 (b, f), the Hassi Messaoud oil field in Algeria47 (c, g), the Burun oil field in Turkmenistan47 (d, h), and the Ordos basin in northern China48 (e, i). Panels (b–e) show the 2013–2021 mean ethane HRI values as detected by CrIS over these areas. Panels (f–i) show bottom-up fossil fuel data for the same locations based on oil production by well49 (f; 2018 data) or based on the estimated methane emissions from oil extraction and production50 (g–i). Blue rectangles highlight high-HRI regions on each set of paired figures, with annual oil-related methane emissions from within those rectangles listed beneath50. Color scale maxima vary between regions.

The other basins highlighted in Fig. 2 show lower ethane HRI values, but nevertheless exhibit a tight spatial coherence between the ethane spectral signal and underlying fossil fuel extraction activities. We highlight this coherence in Fig. 2 with boxes delineating the HRI enhancements for each basin (panels b–e); these boxes are then replicated for reference over the corresponding emission/production maps (panels f–i). This coherence is particularly strong over the Hassi Messaoud oil field. Ethane HRI enhancements over the Burun oil field in Turkmenistan have a distribution that matches that of oil-extraction methane emissions, but with the highest ethane signals at the base of the Cheleken peninsula. The Ordos basin in China features the second-highest ethane HRI values across the regions in Fig. 2 but with much lower methane emissions (50 Gg CH4 yr−1) than those in Burun (690 Gg CH4 yr−1)50.

These observations represent an opportunity to characterize atmospheric ethane sources through time and space and to understand their connections to fossil fuel production and to the co-emission of methane. However, the HRI is a unitless measure that depends not only on ethane abundance but also on spectral and environmental factors as described previously39,45. We therefore employ a retrieval (described below) to convert the CrIS HRI values into ethane column amounts.

Retrieval of Organics from CrIS Radiances (ROCR) algorithm updates

We convert the satellite-derived HRI fields to ethane columns using an artificial neural network (ANN). The retrieval involves two significant updates to version 1 of the Retrieval of Organics from CrIS Radiances (ROCR) algorithm described by Wells et al.39. In ROCRv2, we use the Line-by-Line Radiative Transfer Model51,52 (LBLRTM) rather than the Earth Limb and Nadir Operational Retrieval53 (ELANOR) model to calculate spectral Jacobians and synthetic radiances. LBLRTM computes molecular optical depths line-by-line rather than via pre-computed lookup tables, thus increasing accuracy and species flexibility. The second update involves an explicit and customizable means of accounting for vertical retrieval sensitivity and is described later in this section.

Reactive trace gas fields (VOCs, ozone, ammonia) for ANN training use 2019 output from the GEOS-Chem CTM (Version 13.3, ref. 54; see Methods) for 12:00–15:00 local time (LT), reflecting the ~13:30 LT CrIS observations. The model is sampled between 60° S and 75° N every 16 days throughout the year (total: 23 days), with random positive scaling (σ = 1, μ = 1) applied to the VOC profiles to minimize any effects from correlations with atmospheric conditions39. Profiles for other spectrally relevant species (e.g., chlorofluorocarbons, carbon dioxide, methane, nitrous oxide, carbon tetrachloride) are based on a climatology from the Model for OZone and Related chemical Tracers (MOZART)55. Meteorological fields (temperature, water vapor) use the Modern-Era Retrospective analysis for Research and Applications (MERRA-2).

The above input data are used in LBLRTM to generate synthetic HRIs as would be detected by CrIS. For each model scene, we simulate top-of-atmosphere radiances with and without atmospheric ethane for three view angles selected randomly from each of three bins (0-16°, 17-32°, 33-48°). We then add CrIS-like random noise, with zero mean and a standard deviation derived from values reported in the CrIS L1B product. These CrIS noise values are (1) lowest in the middle of the LWIR band (700–1000 cm–1) and (2) lower at higher brightness temperatures36. We next calculate our set of training HRIs by combining the LBLRTM-derived synthetic radiances with background spectral covariances obtained from the CrIS measurements themselves, as detailed by Franco et al. for IASI28. Calculation of the synthetic HRIs against a background of zero ethane reduces sensitivity to model errors, but (as will be discussed) if the true ethane background is above the CrIS detection threshold it would then need to be added post-hoc to the ANN output prior to data analysis28. Through the above procedure we obtain a training HRI dataset for which the underlying ethane columns are known. The range of synthetic HRIs encompasses that of the CrIS HRIs, so that the ANN predictions described next are in-sample.

A feed-forward ANN is trained on this dataset to predict ethane column amounts from the corresponding HRI values. The employed ANN includes two hidden layers (with 20 and 10 nodes, respectively) and a single output node, and is trained on 10 random extractions of the input dataset. The ANN includes as predictors relevant ancillary variables that affect the spectroscopy. For ethane, these include water vapor (column amount), surface skin temperature, atmospheric temperatures (surface air temperature, ~1 kilometer, ~4 km, ~10 km), surface pressure, and view angle. As with ROCRv1 (ref. 39), latitude and longitude are not included as predictors. As an improvement over ROCRv1, we introduce here an additional predictor to account for the sensitivity of thermal IR satellite measurements to the vertical profile of the absorber: P90, the atmospheric pressure level below which (in altitude) 90% of the ethane column resides. For training purposes these vertical profiles are obtained from the randomized GEOS-Chem simulations referenced above; inclusion of P90 in the ANN then allows the vertical sensitivity of the measurement to be explicitly represented via a single variable. As will be outlined below, it also allows the retrieved ethane column to be adjusted a posteriori to any assumed vertical profile shape (e.g., for an internally consistent comparison with model fields, or where an observational constraint on the profile shape is available).
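To make the P90 predictor concrete, the sketch below computes it from a layered profile. It assumes layer partial columns ordered from the surface upward and linear interpolation in pressure within a layer; both the helper name `p90` and the toy profiles are illustrative and do not reproduce the GEOS-Chem processing used here.

```python
import numpy as np

def p90(p_edges_hpa, partial_columns, fraction=0.9):
    """Pressure (hPa) with `fraction` of the column between the surface and that level.

    p_edges_hpa: layer-edge pressures ordered surface -> top (n_layers + 1 values).
    partial_columns: ethane partial column in each layer (n_layers values).
    """
    p_edges = np.asarray(p_edges_hpa, dtype=float)
    cols = np.asarray(partial_columns, dtype=float)
    # Cumulative column fraction at each layer edge, counted upward from the surface.
    cumulative = np.concatenate(([0.0], np.cumsum(cols))) / cols.sum()
    # Assumption: the column is distributed linearly in pressure within each layer.
    return float(np.interp(fraction, cumulative, p_edges))

edges = [1000.0, 800.0, 600.0, 400.0, 200.0, 0.0]

# Uniform mixing ratio: each equal-pressure layer holds the same partial column.
p90_well_mixed = p90(edges, [200, 200, 200, 200, 200])

# Surface-concentrated profile: most of the column sits in the lowest layer.
p90_near_surface = p90(edges, [800, 100, 50, 30, 20])
```

The well-mixed toy profile gives a P90 of 100 hPa and the surface-weighted profile gives 600 hPa, so lower P90 values correspond to ethane residing higher in the atmosphere.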

Training performance and uncertainty

Figure 3 summarizes the ANN training performance. The network mean can reproduce 88% of the ethane column variance in the training set with a root-mean-square error of 4.1 × 10^15 molecules cm−2. Figure 3b and Supplementary Fig. 2 compare the relative impacts of the input variables on the column predictions, demonstrating that the HRI is most important, particularly when ethane is enhanced. Next in importance is P90, and its influence is likewise most pronounced for higher ethane columns. The leading importance of the HRI and P90 variables shows that the retrieval is predominantly determined by the ethane column amount and its vertical distribution, as expected for a thermal IR measurement. The five temperature variables individually have only modest impacts on the predictions (Supplementary Fig. 2) but are collectively important given that thermal IR retrieval sensitivity depends on the surface-atmosphere thermal contrast.

Fig. 3: Artificial Neural Net (ANN) training set and predictor importance.

Panel (a) displays predicted vs true ethane columns for the full training set, shown as the mean (red dots) and standard deviation (blue error bars) across the 10 individual ANNs. Panel (b) displays the relative importance of selected individual predictors via their impact on the overall 10-ANN root-mean-square error (RMSE). ‘All Predictors’ shows the RMSE when including all predictors; other results show the RMSE when the designated predictor is omitted. A version of this figure including all predictors is included in the supplement (Supplementary Fig. 2).

The relative error in the predicted value increases as column values decrease, from ~10% at a column abundance of 5 × 10^16 molecules cm−2 to ~30% at 1 × 10^16 molecules cm−2. Supplementary Fig. 3 plots the prediction error (the absolute value of the difference from the true value) and bias (the simple difference from the true value) as a function of both thermal contrast and P90. The mean error exceeds 100% for ethane columns below 4 × 10^15 molecules cm−2, defining an approximate detection threshold for the CrIS observations (this threshold does not vary significantly with thermal contrast or P90). The lack of a strong relationship between error and thermal contrast seen in Supplementary Fig. 3 is notably different from the isoprene case39. We attribute this difference to the longer atmospheric lifetime of ethane: isoprene is highly concentrated in the lowermost atmosphere and thus more sensitive to near-surface thermal contrast. Ethane is more dispersed through the vertical column and this dependence is thus distributed amongst the suite of atmospheric temperature inputs. The bias plots in Supplementary Fig. 3 further reveal a modest tendency for the retrieval to overpredict the training values when ethane column amounts are low and to underpredict them at high column amounts and when ethane is lofted vertically (by a mean of 12% for columns >2 × 10^16 molecules cm−2 and P90 < 300 hectopascals [hPa]). This moderate underestimate at high column amounts also manifests in the predicted vs. true regression slope (0.94; Fig. 3).

Ethane column derivation and comparison with observations

We use the 10-ANN ensemble described in the prior section to derive ethane column densities from the CrIS-measured HRI values. Meteorological inputs for the retrieval use MERRA-2 reanalysis. We perform the ANN retrieval on a 0.5° × 0.625° horizontal grid and for a range of P90 values (150, 250, 350, 450, 550, and 650 hPa; chosen based on global P90 statistics from a GEOS-Chem simulation). The CrIS retrieval output for a given location can hence be adapted to any assumed vertical profile by interpolating between the ethane columns derived for the two proximal P90 values.
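The adaptation to an assumed profile can be sketched as a one-dimensional interpolation over the P90 grid. Linear interpolation is an assumption here (the text does not specify the scheme), and the column values and the helper name `column_for_p90` are hypothetical.

```python
import numpy as np

# P90 values (hPa) at which the retrieval is run, as listed in the text.
P90_GRID_HPA = np.array([150.0, 250.0, 350.0, 450.0, 550.0, 650.0])

def column_for_p90(columns_on_grid, p90_hpa):
    """Linearly interpolate retrieved columns to a target P90 (assumed scheme).

    Note: np.interp holds the end values constant outside 150-650 hPa.
    """
    return float(np.interp(p90_hpa, P90_GRID_HPA, columns_on_grid))

# Hypothetical retrieved columns (1e16 molecules cm-2) for one grid cell.
columns = np.array([1.5, 1.6, 1.7, 1.9, 2.1, 2.4])

# A target P90 of 300 hPa falls midway between the 250 and 350 hPa retrievals.
adapted = column_for_p90(columns, 300.0)
```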

We evaluate the resulting CrIS ethane measurements against surface-based solar FTIR observations collected by the Network for the Detection of Atmospheric Composition Change (NDACC; see Methods). A map of NDACC stations used in this analysis is shown in Supplementary Fig. 4. NDACC ethane retrievals are well-characterized, with <6% systematic error and <3% random error33, exhibit good agreement with in-situ observations56, and have a long history of use3,21,32,33,57,58,59,60,61,62,63. In our baseline comparison approach we use the NDACC a posteriori profiles to specify the ethane P90 value used in the CrIS retrieval for each time and location. This provides a CrIS-independent constraint and ensures internally consistent profile assumptions between the two datasets.

Figure 4 displays the CrIS-NDACC comparison results obtained in this way. The two datasets are significantly correlated (R2 = 0.66), showing that CrIS can capture variability in atmospheric ethane across the NDACC locations and sampling times. The CrIS data are lower than the surface column observations, with a mean CrIS/NDACC column ratio of 0.65 and a major axis slope of 0.60 (bootstrapped 95% confidence interval: 0.59–0.61). The intercept of (–1.1 [–1.2 to –0.9]) × 10^15 molecules cm−2 is near-zero and below the CrIS limit of detection, obviating any background addition that would otherwise be required from the HRI formulation. Restricting the analysis to the single CrIS grid cell containing the NDACC station rather than the 3 × 3 matrix mean increases the slope to 0.75 (0.72–0.77), increases the scatter (R2 = 0.48), and shifts the intercept slightly to (–3.2 [–3.4 to –2.9]) × 10^15 molecules cm−2. Despite the greater noise, this latter approach likely provides a more robust comparison near source regions (e.g., for the St. Petersburg, Bremen, Xianghe, and Toronto sites). Relaxing the ±2 h temporal overlap window to ±8 h, or excluding the altitude-dependent scaling for high-elevation sites (see Methods), does not appreciably change these results. Averaging the results in Fig. 4 by station (see Supplementary Fig. 5) shows that the CrIS:NDACC comparisons adhere to a single statistical relationship (R2 = 0.92; CrIS/NDACC column ratio = 0.65; slope = 0.45) across the global distribution of observing stations, arguing against any large regional differences in retrieval performance.
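For readers who wish to reproduce this type of comparison, the following sketch fits a standard major axis line and bootstraps the slope. It assumes the standard (reduced) major axis estimator, slope = sign(r)·σy/σx, and a simple pairs bootstrap; the synthetic data and both function names are illustrative and are not the CrIS or NDACC values.

```python
import numpy as np

def sma_fit(x, y):
    """Standard (reduced) major axis fit; returns (slope, intercept)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * y.std(ddof=1) / x.std(ddof=1)
    return slope, y.mean() - slope * x.mean()

def bootstrap_slope_ci(x, y, n_boot=1000, seed=0):
    """95% percentile interval for the SMA slope from resampled (x, y) pairs."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    resamples = rng.integers(0, x.size, size=(n_boot, x.size))
    slopes = [sma_fit(x[i], y[i])[0] for i in resamples]
    lo, hi = np.percentile(slopes, [2.5, 97.5])
    return float(lo), float(hi)

# Synthetic columns (1e16 molecules cm-2): a low-biased "satellite" vs "ground" set.
rng = np.random.default_rng(1)
ground = rng.uniform(0.5, 3.0, size=500)
satellite = 0.6 * ground - 0.1 + rng.normal(0.0, 0.1, size=500)

slope, intercept = sma_fit(ground, satellite)
ci_low, ci_high = bootstrap_slope_ci(ground, satellite)
```

Unlike ordinary least squares, this estimator treats both variables symmetrically, which is a common choice when both datasets carry measurement error.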

Fig. 4: Evaluation of Suomi-NPP Cross-track Infrared Sounder (CrIS) ethane retrievals against ground-based observations from the Network for the Detection of Atmospheric Composition Change (NDACC).

NDACC observations are averaged over ±2 h of the local satellite overpass time. CrIS observations reflect a 3 × 3 pixel mean (0.5° × 0.625° resolution) centered on the NDACC site. The black line shows a standard major axis fit, with parameters given inset (intercept units: 10^16 molecules cm−2). Vertical scaling corrections (Methods) have been applied to the CrIS columns for the six high-altitude sites (Altzomoni, Maido, Mauna Loa, Izaña, Rikubetsu, and Zugspitze). The 1:1 line is also shown (dashed gray line).

The NDACC degrees of freedom for signal (averaging ~1.5) indicate that their a posteriori profile shapes and associated P90 values are largely determined by the NDACC prior rather than by the observations. For that reason, we performed an additional CrIS-NDACC comparison instead using P90 values predicted by GEOS-Chem for each time and location. Results (slope = 0.66; R2 = 0.55) are consistent with the baseline comparison, supporting the robustness of the conclusions above.

Possible reasons for the non-unity CrIS:NDACC slope shown in Fig. 4 could include inconsistency between the employed thermal and shortwave infrared cross-sections, a misspecification of the CrIS noise in the vicinity of the ν9 ethane feature (as used for deriving the synthetic HRIs), or radiative transfer model uncertainties (different forward models are used for the two datasets). Future refinements of the CrIS ethane product presented here will explore such factors for potential improvement.

Ethane in the Permian Basin

The ~400 km2 Permian basin in New Mexico and west Texas consistently exhibits the largest ethane enhancements over the 2013–2021 CrIS record (Fig. 2). Over the last decade, with advances in drilling technology this region has become the most productive oil-producing basin in the US and one of the largest in the world46. As a result, the Permian has been a focus of methane-related research46,64,65, with a recent top-down analysis concluding that leakage rates 60% higher than the national average led to a methane flux of 2.7 ± 0.5 Tg yr−1 in 2018–2019—the largest reported amount from any US oil/gas production region46.

The accompanying ethane emissions are less well known. Bottom-up inventories used by GEOS-Chem6 suggest an ethane flux of 0.074 Tg yr−1 from the Permian; however, this leads to a dramatic underestimate of the resulting column amounts compared to those seen by CrIS, indicating that the true flux is much higher. Barkley et al.66 obtained a regional ethane flux estimate of 0.28 Tg yr−1 based on airborne sampling downwind of the Permian that mainly took place in 2017, though the authors report low confidence in this finding due to limited regional sampling coverage. A recent bottom-up estimate from NOAA’s Fuel-based Oil and Gas (FOG) inventory67 suggests considerably higher Permian ethane emissions for 2015 (0.56 Tg) than those inferred by Barkley et al.66 for 2017 (0.28 Tg) or predicted by GEOS-Chem.

We performed a series of sensitivity simulations with the GEOS-Chem CTM (configured as described in Methods) to determine the Permian ethane flux required to match the CrIS observations. Simulations were first performed for 2019 over a nested 0.5° × 0.625° domain containing the Permian basin (29°–34° N; 100°–106° W). In addition to the reference run with standard model emissions, simulations were performed with ethane emissions from the subregion encompassing the Delaware and Midland basins (31°-34° N, 101°–105° W) scaled by factors of 2, 5, 6, 7, 8, 10 and 11. The model was sampled from 12:00 to 15:00 LT daily to match the CrIS overpass, with the simulated profile shapes then used to determine the grid-cell and time-specific P90 values for the corresponding CrIS retrievals.

The top panel of Fig. 5 plots the mean CrIS/GEOS-Chem column difference over the full Permian domain as a function of the emission scale factor employed in the model. The x-intercept of 7.4 (95% confidence interval: 7.3–7.6), where the fitted difference crosses zero, reflects the factor by which the bottom-up Permian emissions need to be scaled to be consistent with the space-borne constraints. The above confidence interval was generated by bootstrapping the statistical fit shown in Fig. 5 and thus does not account for any systematic satellite or model errors. The result implies an ethane flux of 0.53 Tg in 2019, with the Permian basin alone then responsible for 4–7% of the total estimated fossil-fuel ethane emissions worldwide3,14.
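The scale-factor derivation amounts to finding where a fitted line crosses zero. The sketch below uses the scale factors listed above but entirely synthetic column differences, constructed to be linear with a zero crossing at 7.4 purely for illustration; the `sensitivity` value is hypothetical.

```python
import numpy as np

# Emission scale factors: the reference run (1) plus the sensitivity runs.
scale_factors = np.array([1, 2, 5, 6, 7, 8, 10, 11], dtype=float)

# Synthetic satellite-minus-model column differences (1e15 molecules cm-2).
# These are NOT the values in Fig. 5; they are built to cross zero at 7.4.
sensitivity = 0.4  # hypothetical column change per unit of scale factor
column_difference = sensitivity * (7.4 - scale_factors)

# Least-squares line, then solve slope * s + intercept = 0 for s.
slope, intercept = np.polyfit(scale_factors, column_difference, 1)
optimal_scale = -intercept / slope
```

With real data the differences would scatter about the line, and resampling them before refitting would give a bootstrapped interval on `optimal_scale` comparable in spirit to the one quoted above.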

Fig. 5: Derivation of 2019 mean ethane columns over the Permian basin.

Panel (a) displays the difference in the 2019-mean ethane columns over the Permian basin between CrIS and GEOS-Chem as a function of the regional emissions multiplier used in the model. Error bars reflect 95% confidence intervals for each mean. The blue line and shaded region show a linear least-squares fit and associated 95% confidence interval, with the x-intercept indicating a true Permian emission scale factor of 7.4 ± 0.1 (vertical dashed line). The bottom panel shows 2019 mean ethane columns over the Permian basin as (b) simulated by the GEOS-Chem base-case (GC a priori), (c) observed from space by CrIS, and (d) simulated by the updated GEOS-Chem run based on a Permian emission scale factor of 7.4 (GC a posteriori). Vertical profile shape information (P90 values, defined as the pressure level above 90% of the ethane column) for the CrIS retrievals is obtained from the corresponding GC a posteriori output. The gray square indicates the subdomain over which the emission scale factors were applied in the model. Total annual ethane emissions from the full Permian domain are indicated in white on panels (b, d).

Figure 5 also spatially compares the CrIS ethane columns over the Permian with those derived by the base-case GEOS-Chem simulation and by an optimized simulation with regional emissions scaled by the derived factor of 7.4. The updated simulation captures the mean ethane column magnitude observed by CrIS over the Permian but not the underlying spatial features, because the bottom-up inventory employed in the model (which is scaled uniformly over the indicated subdomain) inaccurately maps the distribution of emissions across the Delaware and Midland basins. Advancing ethane emission inventories, and improving on the Permian flux estimate presented here, will thus require improved and up-to-date spatial mapping of the underlying activities (e.g., as provided via GFEI and FOG50,67).

We extended the above inversion technique through the 2014–2019 period to characterize the associated ethane emission trend as seen by CrIS. Figure 6 shows the results and compares them to US Energy Information Administration records of regional oil production46,47,66,67,68 (natural gas follows a similar trend) and to other bottom-up and top-down constraints. The Permian fossil fuel production rates rise strongly through this period, with slightly lower rates of increase in 2016–2017. The CrIS-derived ethane emissions likewise show an overall increase, with a 2016 dip followed by a rise through the remainder of the period. The CrIS ethane emission trend arises entirely from changes in the measured ethane columns, as the prior fluxes in GEOS-Chem are static over this time3.

Fig. 6: Trends in oil production and emissions over the Permian Basin.

Annual mean ethane emission estimates are plotted in gold on the left axis, while annual oil production rates for the same region68 are plotted in purple on the right axis. The Barkley et al.66 flux estimate was derived from airborne ethane measurements; the Francoeur et al.67 estimate is a bottom-up estimate of ethane emissions; the GEOS-Chem ethane emissions are from Tzompa-Sosa et al.3; and the Varon et al.65 and Zhang et al.46 values were obtained by applying a basin-average ethane:methane emission ratio of 18%3,66,69 to their satellite-derived methane flux estimates. The Cross-track Infrared Sounder (CrIS) ethane emissions in this plot were derived using the same methodology as shown in Fig. 5.

Figure 6 also shows that the CrIS-derived Permian emissions agree well with the Barkley et al.66 estimate for 2017 (0.31 Tg vs 0.28 Tg, respectively) but are ~2× lower than the FOG bottom-up estimate for 2015, and are also lower than those implied by applying a regional 17–18% mol/mol ethane:methane emission ratio (32–34% on a mass basis)3,66,69 to recent satellite-inferred methane fluxes46,65. Applying a factor-of-two correction to the CrIS values, as would be implied by the NDACC comparison in Fig. 4, yields 2014–2019 emissions that more closely match FOG and fall between the two methane-based estimates, while degrading agreement with Barkley et al.66.
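The conversion between the molar and mass-based emission ratios quoted above follows from the molar masses of ethane (30.07 g mol−1) and methane (16.04 g mol−1). The short sketch below checks that arithmetic; the function names are illustrative, and the example flux simply applies the 18% ratio to the 2.7 Tg yr−1 methane estimate cited earlier, so it is not necessarily the value plotted in Fig. 6.

```python
M_ETHANE = 30.07   # g/mol, C2H6
M_METHANE = 16.04  # g/mol, CH4

def molar_to_mass_ratio(molar_ratio):
    """Convert an ethane:methane mol/mol emission ratio to a mass/mass ratio."""
    return molar_ratio * M_ETHANE / M_METHANE

def implied_ethane_flux(methane_flux_tg, molar_ratio):
    """Ethane flux (Tg) implied by a methane flux (Tg) and a mol/mol ratio."""
    return methane_flux_tg * molar_to_mass_ratio(molar_ratio)

mass_ratio_low = molar_to_mass_ratio(0.17)    # about 0.32
mass_ratio_high = molar_to_mass_ratio(0.18)   # about 0.34

# Illustration: 18% mol/mol applied to a 2.7 Tg/yr methane flux.
example_flux = implied_ethane_flux(2.7, 0.18)  # about 0.91 Tg/yr
```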

Discussion

We present a space-based retrieval of tropospheric ethane based on thermal infrared radiance observations from the CrIS satellite sensor onboard Suomi-NPP. Spectral detection of ethane is first demonstrated using plume observations downwind of the January 2020 Australian wildfires. Other ethane enhancements seen from CrIS correspond with known fossil fuel extraction activities, and we show examples from four major production regions: the Permian basin (in Texas and New Mexico); the Hassi Messaoud oil field (in Algeria); the Burun oil field (in Turkmenistan); and the Ordos basin (in Northern China). The CrIS ethane HRIs are sensitive enough to reveal sub-basin variability that reflects the distribution of oil/gas extraction and processing facilities.

We then employ an artificial neural network to convert the CrIS ethane enhancements to atmospheric ethane column amounts, while accounting for the ancillary variables that affect the spectroscopy. Our ethane retrieval can reproduce 88% of the variance in the training set with low bias and an RMSE of 4.1 × 10¹⁵ molecules cm–2, while comparison with ground-based column observations from the NDACC network reveals a significant correlation (R² = 0.48–0.66) and a mean CrIS/NDACC ratio of 0.61–0.65. We focus subsequent analysis on what the CrIS HRI observations reveal to be the most pronounced ethane hotspot on the globe: the Permian basin in Western Texas and Southern New Mexico. By combining our ROCRv2 ethane retrieval with GEOS-Chem using a mass-balance inversion technique, we obtain ethane emission estimates that increase with Permian fossil fuel production rates. Our analysis suggests that the Permian alone represents at least 4–7% of the total global fossil fuel-related ethane source; considering the CrIS:NDACC comparison results, this estimate may be conservative3,14. The analysis in this paper represents a first step in using the CrIS observations to better understand global ethane emissions from fossil and non-fossil sources.

This Suomi-NPP CrIS dataset spans 12 years (2012–2023) of daily global ethane column observations, opening opportunities for understanding the underlying emission magnitudes, variability, and trends around the world. Suomi-NPP CrIS ceased LWIR operation in August 2023, but successor CrIS instruments are currently onboard the Joint Polar Satellite System JPSS-1/NOAA-20 and JPSS-2/NOAA-21 satellites and slated for launch onboard JPSS-3 and JPSS-4, offering data continuity well into the 2030s. This long-term record is a key community resource for advancing understanding of ozone, reactive nitrogen budgets, and fossil fuel emissions worldwide.

Methods

Instrumentation

CrIS is a Fourier transform spectrometer onboard the Suomi-NPP (launched 2011), JPSS-1/NOAA-20 (2017), and JPSS-2/NOAA-21 (2022) satellites, which are all in sun-synchronous low-Earth orbits with ~01:30/13:30 equator crossing times. Additional CrIS instruments are planned for the JPSS-3 and -4 missions. CrIS measures in three spectral bands covering the LWIR (650–1095 cm−1), midwave IR (1210–1750 cm–1), and shortwave IR (2155–2550 cm–1). We employ here daytime-overpass LWIR data from Suomi-NPP, which is available from 02/2012 to 08/2023 at 0.625 cm–1 spectral resolution. The CrIS field of regard consists of a 3 × 3 pixel array each with 14-km nadir footprint diameter; the 2200 km cross-track scan width provides twice-daily global coverage. Single-pixel results from CrIS are averaged here to the MERRA-2 grid70 prior to analysis using a drop-in-the-box method with no temporal or spatial co-adding.
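
As an illustration of the drop-in-the-box step, the Python sketch below averages single-pixel values into the grid cell that contains each pixel centre. It is a minimal sketch rather than the processing code used here: the function name is hypothetical, and the grid definition (0.5° × 0.625° cells centred on multiples of the grid spacing, 361 × 576 cells) is an assumption about the MERRA-2 layout.

```python
import numpy as np

# Assumed grid: 0.5 deg x 0.625 deg cells centred on
# lat = -90 + 0.5*j and lon = -180 + 0.625*i.
DLAT, DLON = 0.5, 0.625
NLAT, NLON = 361, 576


def drop_in_the_box(lat, lon, values):
    """Average single-pixel values into the grid cell nearest each pixel centre."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    values = np.asarray(values, dtype=float)

    # Index of the nearest cell centre; longitude wraps at the date line.
    j = np.rint((lat + 90.0) / DLAT).astype(int)
    i = np.rint((lon + 180.0) / DLON).astype(int) % NLON

    total = np.zeros((NLAT, NLON))
    count = np.zeros((NLAT, NLON))
    ok = np.isfinite(values)  # skip missing retrievals
    np.add.at(total, (j[ok], i[ok]), values[ok])
    np.add.at(count, (j[ok], i[ok]), 1.0)

    # Cells with no pixels are returned as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)
```

Each pixel contributes to exactly one cell with equal weight, so no footprint-area weighting or smoothing across neighbouring cells is applied.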

CrIS spectral residual processing

We use CrIS spectral residuals processed with the MUlti-SpEctra, MUltiSpEcies, Multi-SEnsors (MUSES) algorithm from the TRopospheric Ozone and its Precursors from Earth System Sounding (TROPESS) project. The MUSES algorithm leverages over 20 years of heritage from the Aura TES optimal estimation algorithm. The strategy for the residuals differs from the standard TROPESS CrIS strategy as the latter aims to estimate the best values for all relevant absorbers, whereas our goal is to remove non-target absorber signals. The residual strategy used here therefore diverges from the standard TROPESS processing after the standard initial guess refinement for clouds and surface temperature71. A joint retrieval for H2O, CO2, O3, NH3, peroxyacetyl nitrate (PAN), HDO, N2O, CH4, cloud properties, surface temperature, and atmospheric temperature is performed. The radiative transfer also accounts for fixed climatological values for SO2, HNO3, OCS, HCN, SF6, HCOOH, C2H4, CH3OH, CFC-11, CFC-12, and CCl4. All spectroscopic parameters come from the AER line file (aer_v_3.8.1: https://github.com/AER-RC/AER_Line_File). These parameters were either obtained from the HITRAN database30,72,73 (this is the case for ethane) or were custom-derived at AER (for certain spectral ranges affected by H2O, CO2, O3 and O2). The windows span 650 to 1750 cm−1 with gaps imposed where radiances do not fit well (e.g., due to spectroscopy). Following the joint step, cloud properties, surface properties, and atmospheric temperature are fixed, and PAN is retrieved between 780 and 790 cm–1 (with a gap between 783.125 and 786.250 cm–1 to window around water interference). Next, ozone and ammonia are updated with windows between 923.125 and 1317.5 cm–1. Finally, the spectral residual is calculated from 650 to 1095 cm–1 and from 1215 to 1750 cm–1, accounting for H2O, HDO, CO2, O3, N2O, CH4, SO2, NH3, HNO3, OCS, HCN, SF6, HCOOH, C2H4, CH3OH, PAN, CFC-11, CFC-12, and CCl4 in the first window and H2O, HDO, CO2, O3, N2O, CH4, SO2, NH3, HNO3, HCN, HCOOH, C2H4, CH3OH, and PAN in the second.
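
To make the windowing concrete, the Python sketch below builds a channel mask for the PAN retrieval window on a 0.625 cm–1 LWIR grid. It is illustrative only and is not part of MUSES: the helper name is hypothetical, the channel grid is assumed to start exactly at 650 cm–1, and the channels at the two gap edges are assumed to remain in the window.

```python
import numpy as np

# LWIR channel centres at 0.625 cm-1 spacing (band limits taken from the text).
STEP = 0.625
wavenumber = np.arange(650.0, 1095.0 + STEP / 2, STEP)


def window_mask(wn, lo, hi, gaps=()):
    """True for channels in [lo, hi] that are not strictly inside any gap."""
    mask = (wn >= lo) & (wn <= hi)
    for g_lo, g_hi in gaps:
        # Assumption: the gap edge channels themselves stay in the window.
        mask &= ~((wn > g_lo) & (wn < g_hi))
    return mask


# PAN window: 780-790 cm-1 with a gap around the water interference.
pan = window_mask(wavenumber, 780.0, 790.0, gaps=[(783.125, 786.250)])

if __name__ == "__main__":
    print(wavenumber.size, int(pan.sum()))
```

Under these assumptions the LWIR band holds 713 channels, of which 13 fall in the PAN window once the gap is removed.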

Modeling the Permian Basin with GEOS-Chem

We use version 14.1 Classic (https://doi.org/10.5281/zenodo.7600579) of the GEOS-Chem CTM for our simulations of the Permian basin. Simulations are for 2019 and employ MERRA-2 meteorology, Community Emissions Data System global emissions74, and US ethane emissions from the Tzompa-Sosa et al. inventory3, which is scaled from the 2011 US National Emissions Inventory75 based on observational comparisons. Model runs apply ethane emission scale factors (1, 2, 5, 6, 7, 8, 10, 11) over the Delaware and Midland basins (31°–34° N, 101°–105° W). A 6-month spinup is used to initialize year-long, scenario-specific global runs at 2° × 2.5° resolution that then provide boundary conditions for matching nested simulations over the Permian (0.5° × 0.625°; 29°–34° N, 100°–106° W). Model columns are sampled daily from 12:00 to 15:00 LT to match the CrIS overpass. A scenario-specific P90 value for each column amount is then calculated from the model output and used for the CrIS-model comparisons.
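
The overpass sampling can be sketched as follows in Python. This is a minimal illustration, not the GEOS-Chem diagnostic itself: the function names are hypothetical, model output is assumed to be hourly and stamped in UTC, and LT is approximated as local solar time (UTC plus longitude/15).

```python
import numpy as np


def local_solar_hour(utc_hour, lon_deg):
    """Approximate local solar time (hours) from UTC hour and longitude (deg E)."""
    utc_hour = np.asarray(utc_hour, dtype=float)
    return (utc_hour + np.asarray(lon_deg, dtype=float) / 15.0) % 24.0


def overpass_mean(columns, utc_hours, lon_deg, start=12.0, end=15.0):
    """Mean of hourly model columns whose local time falls in [start, end)."""
    columns = np.asarray(columns, dtype=float)
    lt = local_solar_hour(utc_hours, lon_deg)
    keep = (lt >= start) & (lt < end)
    # Return NaN when no model time step falls inside the window.
    return columns[keep].mean() if keep.any() else np.nan
```

For a grid cell at 105° W, for example, the 12:00–15:00 LT window selects the 19:00, 20:00, and 21:00 UTC time steps.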

Surface-based ethane column observations

We use 2013–2021 data from 13 sites (12 NDACC stations plus one candidate NDACC station at Xianghe, China) between 50°S and 60°N (Supplementary Fig. 4) and restrict the comparison to observations within ±2 h of the local satellite overpass time. Observation pairs with skin temperatures below 270 K are omitted to avoid retrieval artifacts associated with snow or ice cover. We compare the NDACC columns to (1) the mean CrIS value for a 3 × 3 matrix of 0.5° × 0.625° grid cells centered on the site in question (to reduce noise), and (2) the CrIS value for the individual underlying grid cell (to minimize any signal suppression for near-source stations). Six of the sites used here (Altzomoni, Maido, Mauna Loa, Rikubetsu, Izaña, and Zugspitze) are located at altitudes considerably higher than the average altitude over the corresponding CrIS pixels. We therefore apply a scaling correction to the CrIS values for these sites to account for this altitude difference based on the shape of the modeled GEOS-Chem profile for each date and location.
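
The coincidence criteria and the two spatial pairings described above can be sketched in Python as below. This is an illustrative sketch under stated assumptions rather than the analysis code: the function names and array layout are hypothetical, times are assumed to be expressed as local hours on the same day, and longitude wrap-around at the grid edge is ignored.

```python
import numpy as np


def paired_cris_values(grid, j, i, half_width=1):
    """Return (block mean, centre value) for the cells around index (j, i).

    half_width=1 gives the 3 x 3 block; NaN cells are ignored in the mean.
    """
    block = grid[max(j - half_width, 0): j + half_width + 1,
                 max(i - half_width, 0): i + half_width + 1]
    return float(np.nanmean(block)), float(grid[j, i])


def coincident(obs_hour, overpass_hour, skin_temp_k,
               window_h=2.0, min_temp_k=270.0):
    """True where a ground observation passes the time and skin-temperature screens."""
    obs_hour = np.asarray(obs_hour, dtype=float)
    skin_temp_k = np.asarray(skin_temp_k, dtype=float)
    in_window = np.abs(obs_hour - overpass_hour) <= window_h
    warm_enough = skin_temp_k >= min_temp_k  # drop likely snow/ice scenes
    return in_window & warm_enough
```

Comparing the two values returned by `paired_cris_values` for a near-source station gives a quick indication of how much a local enhancement is diluted by the 3 × 3 averaging.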

Supplementary Fig. 6 plots the mean column averaging kernels for the employed NDACC datasets, showing sensitivity through the tropospheric column that increases with altitude. Also plotted in Supplementary Fig. 6 is the HRI/column ratio for the corresponding CrIS observations plotted as a function of the P90 value employed in the retrieval, reflecting the measurement sensitivity to the vertical location of ethane in the atmosphere. Simulations from GEOS-Chem suggest that the range of ambient ethane profiles encountered at these sites corresponds to atmospheric P90 values of ~200–400 hPa (shaded regions in Supplementary Fig. 6B). Supplementary Fig. 7 shows the number of NDACC observations by site and year used in Fig. 4 and in Supplementary Figs. 5 and 6.