Background & Summary

Fossil fuel use, cement production and land-use change have perturbed the natural carbon cycle and increased the concentration of carbon dioxide (CO2) in the Earth’s atmosphere by almost 50% since 1750, from 277 ppm in 1750 to 407 ppm in 2018 (refs. 1,2,3). Routine assessment of the global carbon cycle is required to monitor the ongoing increases in atmospheric CO2 concentrations, evaluate the causes and drivers of this trend, and quantify the impact of policies that aim to stabilise and reverse it3,4,5. The global carbon budget (GCB) was evaluated on multi-year time scales by each of the foregoing Intergovernmental Panel on Climate Change (IPCC) assessment reports6,7,8,9,10, while the Global Carbon Project (GCP) has published an assessment of the GCB on an annual time scale for over a decade3,11,12,13.

The GCP disaggregates the annual GCB into six components: atmospheric growth (GATM); CO2 emissions due to fossil fuel combustion, non-combustion uses of fossil fuels, and cement production (EFF); CO2 emissions due to land-use change (ELUC); uptake of CO2 by the global ocean (SOCEAN); uptake of CO2 by the terrestrial biosphere (SLAND), the latter two fluxes being derived from ocean and land carbon models, respectively; and a budget imbalance term (BIM). GATM is the most precisely constrained term of the budget (1σ of 4%)3, while EFF, ELUC, SOCEAN and SLAND rely on analysis of national emissions reports14,15,16, satellite observations17,18, and process-based models3,19,20 and are more uncertain. If EFF, ELUC, SOCEAN and SLAND were perfectly constrained then their sum would be equal to the measured change in the atmospheric stock of CO2 (GATM). However, the independent analysis of GATM, EFF, ELUC, SOCEAN and SLAND using different methodologies results in an unconstrained budget, and a small budget imbalance term (BIM) is required to close the budget. The global carbon budget is thus closed as follows (SOCEAN and SLAND hold negative values)3:

$${G}_{ATM}={E}_{FF}+{E}_{LUC}+{S}_{OCEAN}+{S}_{LAND}+{B}_{IM}$$
(1)
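As a minimal illustration of Eq. 1, the sketch below rearranges the budget to recover the imbalance term from the other components. The function name and the input values are hypothetical round numbers chosen only to show the sign convention (sinks are negative); they are not budget estimates.

```python
def budget_imbalance(g_atm, e_ff, e_luc, s_ocean, s_land):
    """Rearrange Eq. 1 to give B_IM; sinks are passed as negative values."""
    return g_atm - (e_ff + e_luc + s_ocean + s_land)


# Hypothetical round numbers in arbitrary but consistent units (illustration only).
b_im = budget_imbalance(g_atm=5.0, e_ff=9.5, e_luc=1.5, s_ocean=-2.5, s_land=-3.0)
print(b_im)
```

A non-zero result indicates the residual that the independently estimated terms leave unexplained.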

Inversion models use an integrated approach to simultaneously quantify all fluxes of the global carbon budget and they are, by design, constrained by observations of atmospheric CO2 mole fraction, or satellite derived products of column CO2. Inversion models prescribe the fossil carbon emissions (EFF) because the current density of the surface network and the sampling of the atmosphere by satellites is too sparse to quantify this flux separately, and then estimate the total land flux (FLAND = SLAND + ELUC) and the ocean sink (SOCEAN) using a modelling framework that minimises data–model mismatch across all fluxes according to a cost function (see examples in refs. 21,22,23,24,25,26,27,28,29,30,31,32 and studies cited therein). By synchronously quantifying EFF, FLAND and SOCEAN, inversion models avoid budget imbalance and hence the global carbon budget equation is closed without a BIM term as follows:

$${G}_{ATM}={E}_{FF}^{inv}+{F}_{LAND}^{inv}+{S}_{OCEAN}^{inv}$$
(2)

Inversion models require prior constraints on the regional distribution of the CO2 fluxes that they seek to disaggregate. Here we describe our development of the Global Carbon Budget Gridded Fossil Emissions Dataset (GCP-GridFED; version 2019.1), a new gridded 0.1° × 0.1° global dataset of monthly CO2 emissions resulting from fossil fuel oxidation and the calcination of limestone during cement production. The gridded nation- and source-specific emissions in GCP-GridFED are consistent with the nation- and source-specific emissions inventories compiled for the GCP’s 2019 GCB assessment3,33 and, for version 2019.1, cover the period 1959–2018. The GCP-GridFED will be updated each year for use by inversion models contributing to the annual updates of the GCB, thus aligning the prior constraints on top-down estimates of fossil CO2 emissions with the bottom-up estimates used by the GCP.

Gridded estimates of uncertainty in CO2 emissions are provided as an additional layer of GCP-GridFED and are based on the relative uncertainties (1σ) in fossil CO2 presented in the uncertainty assessment of the GCB3 and the relative uncertainties amongst emission sectors34. Uncertainties associated with the spatial disaggregation of national emissions are not included (see ‘CO2 Emissions Uncertainty’). Our approach to uncertainty quantification is broadly representative of the sectoral contributions to total emissions in each grid cell, which changes throughout the time series, and of differences in uncertainty across national emission reports. Inversion models may utilise these uncertainty grids but with the freedom to build more complex covariance structures to suit their requirements.

The global cycles of carbon and oxygen are coupled through their dual involvement in carboxylation reactions (photosynthesis), which consume CO2 and emit O2, and oxidation reactions (respiration and combustion), which consume O2 and emit CO2 (refs. 35,36). In addition to CO2 alone, some inversion models are able to constrain surface fluxes of O2 or atmospheric potential oxygen (APO ≈ O2 + 1.1CO2)37,38,39,40. Such models can utilise dual atmospheric measurements of CO2 and O2 and dual priors for CO2 and O2 surface fluxes and synchronously minimise data–model mismatch with respect to CO2 and O2. Alternatively, O2 fluxes can be constrained independently using atmospheric O2 observations and O2 surface flux priors. GCP-GridFED includes dual estimates of atmospheric O2 uptake due to the oxidation of fossil fuels, with the aim of supporting the inverse modelling of O2 or APO and with the view that the data can be used in multi-decadal analyses of the global oxygen budget. Our O2 uptake estimates are based on the oxidative ratios (OR; uptake of O2/emission of CO2)36 applied to the CO2 emission estimates for coal, oil and natural gas oxidation36.

Methods

Overview

GCP-GridFED was produced by scaling monthly gridded emissions for the year 2010, from the Emissions Database for Global Atmospheric Research (EDGAR; version 4.3.2)41, to the national annual emissions estimates compiled as part of the 2019 global carbon budget (GCB-NAE) for the years 1959–2018 (ref. 3). EDGAR data for the year 2010 is used because monthly gridded data was only available for this year at the time of product development (new data for 2015 was published recently and will be adopted in future versions of GCP-GridFED)42. We describe the key features of the EDGAR and GCB-NAE datasets below (see ‘Input Datasets’).

GCB-NAE and EDGAR provide information regarding the global emission of CO2 through the combustion of fossil fuels, industrial processes and cement production, and some other minor sources (e.g. consumption of lubricants and paraffin waxes, solvent use, agricultural liming); nonetheless, their merits differ. GCB-NAE provides a consistent long-term dataset of annual national CO2 emissions (1750–2018); however, this dataset is not spatially explicit below the country level and does not include sub-annual variability in CO2 emissions. EDGAR provides estimates at high spatial resolution for specific fuels and sectors with a representation of the monthly distribution of emissions. However, the EDGARv4.3.2 estimates are only available for 1970–2010 and a constant monthly distribution, matching the year 2010, is used throughout the time series41. Our approach merged these two complementary datasets to create a long-term (1959–2018) and gridded (0.1° × 0.1°) dataset of global monthly CO2 emissions. The start year of 1959 aligns with the period of direct atmospheric measurements of CO2 concentration43. Our approach is to scale EDGAR’s 2010 monthly gridded CO2 emissions to match the annual national CO2 emissions from GCB-NAE on a nation- and fuel-specific basis (see ‘emissions scaling protocol’, Fig. 1; Table 1).

Fig. 1

A conceptual depiction of the emissions scaling protocol used to produce GCP-GridFED as described in section 2.2. Descriptions of the input datasets are provided in section 2.1. This figure does not depict the procedure used to calculate emissions uncertainty or O2 combustion; however, the figure indicates the stages at which these additional outputs are produced within the CO2 emissions scaling protocol (marked as ‘*’ and ‘#’, respectively). Uncertainties in CO2 emissions are calculated for the EDGAR dataset in the year 2010 and scaled using the same factors as the central estimates for CO2 emission. O2 combustion estimates are calculated using oxidative ratios applied to the CO2 emissions from fuel combustion, prior to the final step of the protocol. Full details of the procedure used to produce gridded CO2 uncertainties and O2 combustion estimates can be found in sections 2.3 and 2.4, respectively.

Table 1 The relation of GCP-GridFED source classes to EDGAR activity sectors.

GCP-GridFED includes additional data layers that are beneficial to inversion models. Gridded uncertainty in CO2 emissions from each nation and emissions sector is also propagated to our nation-, year- and fuel- specific emissions estimates (Table 2). Gridded estimates of the uptake of O2 related to oil, coal and natural gas use are also made using the literature-based oxidative ratios presented in the CO2 release and Oxygen uptake from Fossil Fuel Emission Estimate (COFFEE) dataset36.

Table 2 Calculation of uncertainties for each sector in GCP-GridFEDv2019.1.

We provide Figs. 2–12, the descriptive details in Tables 3 and 4, the summary statistics in Tables 5–7 and Online-Only Table 1 to outline the key features of GCP-GridFEDv2019.1 and assist with its technical validation.

Fig. 2

Time series of (left column) annual CO2 emissions (Gt CO2 year−1) and (right column) monthly fossil CO2 emissions (Mt CO2 day−1) as estimated by GCP-GridFED. Uncertainties in CO2 emissions are treated as 5% for Annex I nations and global, following the GCB uncertainty assessment3. (Top row) Total global emissions are disaggregated to (other rows) the top 4 emission regions. Crosses mark input data directly from GCB-NAE3,33.

Fig. 3

Time series of annual CO2 emissions as estimated by GCP-GridFEDv2019.1 for the period 1959–2018 (Gt CO2 year−1). Uncertainties in CO2 emissions are treated as 5% for Annex I nations and global, following the GCB uncertainty assessment3. (Top row, Left column) Total global emissions are disaggregated to the (other columns) top 4 emission regions and (other rows) source classes. Crosses mark input data directly from GCB-NAE3,33.

Fig. 4

Time series of monthly CO2 emissions as estimated by GCP-GridFEDv2019.1 for the period 1990–2018 (Mt CO2 day−1). Uncertainties in CO2 emissions are treated as 5% for Annex I nations and global, following the GCB uncertainty assessment3. (Top row, Left column) Total global emissions are disaggregated to the (other columns) top 4 emission regions and (other rows) source classes.

Fig. 5

Monthly emissions anomaly relative to the annual mean daily emissions rate from GCP-GridFEDv2019.1. Each grey line represents a year of data in the period 1959–2018. The red line marks the mean value across the time series. (Top row, left column) Global seasonality is disaggregated to (other columns) the top 4 emission regions and (other rows) source classes. No seasonality is present in India, based on the EDGARv4.3.2 gridded input data for the year 2010.

Fig. 6

Time series of (left column) annual O2 uptake (Gt O2 year−1) and (right column) monthly O2 uptake (Mt O2 day−1) through fossil fuel oxidation as estimated by GCP-GridFED. (Top row) Total global uptake is disaggregated to (other rows) the top 4 emission regions. The plotted uptake uncertainties are based on CO2 emissions uncertainties of 5% for Annex I nations and global3, and they do not include uncertainties in oxidative ratios.

Fig. 7

Time series of annual O2 uptake as estimated by GCP-GridFEDv2019.1 for the period 1959–2018 (Gt O2 year−1). The plotted uptake uncertainties are based on CO2 emissions uncertainties of 5% for Annex I nations and global3, and they do not include uncertainties in oxidative ratios. (Top row, Left column) Total global uptake is disaggregated to the (other columns) top 4 emission regions and (other rows) source classes.

Fig. 8

Time series of monthly O2 uptake as estimated by GCP-GridFEDv2019.1 for the period 1990–2018 (Mt O2 day−1). (Top row, Left column) Total global uptake is disaggregated to the (other columns) top 4 emission regions and (other rows) source classes. The plotted uptake uncertainties are based on CO2 emissions uncertainties of 5% for Annex I nations and global3, and they do not include uncertainties in oxidative ratios.

Fig. 9

Spatial distribution of CO2 emissions as estimated by GCP-GridFED (kg CO2 year−1). Gridded (0.1° × 0.1°) estimates of total fossil CO2 emissions are shown for four years of the analytical period.

Fig. 10

(Left panels) Spatial distribution of O2 uptake through fossil fuel use as estimated by GCP-GridFEDv2019.1 (kg O2 year−1). Gridded (0.1° × 0.1°) estimates of total O2 uptake are shown for four years of the time series (1959–2018). (Right panels) Spatially-explicit oxidative ratios (OR; kg O2 kg−1 CO2) for total emissions activities as estimated by GridFEDv2019.1.

Fig. 11

Gridded (0.1° × 0.1°) estimates of relative uncertainty in total CO2 emissions for four years of the GCP-GridFEDv2019.1 time series. Uncertainty in total emissions is aggregated from the sector-level estimates (see Table 2). The uncertainty estimates account for uncertainty across national emission reports and spatial differences in the sectoral breakdown to total emissions in each grid cell, which changes throughout the time series, however they exclude uncertainties associated with the spatial or temporal (monthly) disaggregation of national emissions (see ‘CO2 emissions uncertainty’). Aggregation of uncertainties to a coarser resolution should account for the non-independence of gridded emissions uncertainties.

Fig. 12

Monthly emissions anomaly relative to the annual mean daily emissions rate from GCP-GridFEDv2019.1. Each grey line represents a year of data in the period 1959–2018. The red line marks 2010 and is compared with values obtained directly from the EDGARv4.3.2 grids for the year 2010. (Top row, left column) Global seasonality is disaggregated to (other columns) source classes and (other rows) regions.

Table 3 Table of groups, variables, dimensions and units of the output files with naming convention GCP_Global_{YYYY}.nc. Numbers show the length of each dimension for each variable.
Table 4 Table of groups, variables, dimensions and units of the output file GCP_Global_Annual.nc. Numbers show the length of each dimension for each variable.
Table 5 Regional summary statistics relating to total annual CO2 emissions from GridFEDv2019.1.
Table 6 Regional summary statistics relating to seasonality of total CO2 emissions from GridFEDv2019.1.
Table 7 Regional summary statistics relating to annual CO2 emissions from each source, from GridFEDv2019.1.

Input datasets

National annual emissions from the global carbon budget 2019 (GCB-NAE)

The GCB estimates national annual emissions of CO2 due to coal, oil and natural gas combustion, the oxidative use of these fuels in non-combustive industrial processes, and the production of cement clinker3,14,15,16,17,44. National CO2 emissions are preferentially taken from the country submissions to the United Nations Framework Convention on Climate Change (UNFCCC) for 42 “Annex I” countries over the period 1990–201844. These countries were members of the Organisation for Economic Co-operation and Development (OECD) in 1992, plus 16 non-OECD European countries and Russia, and contributed ~60% of total global emissions in 1990. Emissions in other countries and in Annex I countries prior to 1990 derive from the Carbon Dioxide Information Analysis Center (CDIAC)15 and are rooted in energy statistics published by the United Nations (UN)16,45. For recent years not covered by either the UNFCCC or CDIAC datasets, the national emissions are predicted using national or regional energy growth rates from the annual BP Statistical Review of World Energy14. National cement emissions are based on national inventories of cement production and ratios of clinker production from officially reported clinker production data and emission factors, IPCC default emission factors, industry-reported clinker production, and survey-based clinker ratios16.

Gridded monthly emissions from EDGAR

The Emissions Database for Global Atmospheric Research (EDGAR) version 4.3.241 is a dataset of global emissions of gases and particulates, including CO2, based on available national statistics, default emission factors and methods recommended by IPCC46,47. EDGAR uses a bottom-up approach that calculates gridded (0.1° × 0.1°) monthly CO2 emissions for activity sectors based on: statistics that track national levels of each activity; proxy data representing the spatial and temporal distribution of each activity; the mix of technologies used to perform each activity; the fuel mix used by each technology; and emissions factors for the technology and fuel combinations, which are also corrected for the emission control technologies in place. A detailed description of EDGAR’s gridding procedure is available elsewhere (refs. 41,48); here we summarise the key features of its design:

  • 28 EDGAR activity sectors are based on the 48 sectors defined by IPCC guidelines46,47.

  • Activity in each sector is tracked from 1970–2015 using statistics that represent demand and supply of goods and energy, including: fuel-specific energy balances, fuel production, commodity production and cement clinker production and agriculture-related activities.

  • Emission factors are taken from the guidelines issued by the IPCC46,47 and are assigned to each country in the following order of preference: national, regional, country group (Annex I/non-Annex I).

  • National emissions of CO2 from each sector are distributed across months using sector-specific or, preferentially, technology-specific monthly shares.

  • Emissions are distributed in space using spatial proxy data (that vary stepwise over time 1990–2010), such as population density, point source locations and transport routes.

Some of the uncertainties associated with using proxy data to disaggregate emissions in time and space are considered in later sections (see ‘CO2 emissions uncertainty’).

EDGAR sectors included in GCP-GridFED

Of the 28 EDGAR sectors, the 18 relating to fossil fuel combustion, non-combustion use of fossil fuel and cement production were used in GCP-GridFED. These 18 sectors were selected to correspond as closely as possible with the activities included in the GCB-NAE emission estimates. The 18 activity sectors incorporated from EDGAR into GCP-GridFED are shown in Table 1.

Where possible, emissions from each EDGAR sector were further separated into specific fuels using fuel-specific data from an intermediate processing step of the EDGAR gridding protocol41. Where this was not possible, it was necessary to make the assumptions that follow about the fuels that contribute to emissions in each sector. These assumptions are based on the sector descriptions provided in the IPCC guidelines46,47 and the major contributing activities and fuel dependencies in each sector. Specifically, we assume that:

  • All chemical process emissions relate to the non-combustion use of natural gas.

  • All emissions from the non-energy use of fuels sector relate to non-combustion use of oil. This sector chiefly comprises the use of waxes and lubricants.

  • All emissions from the solvents and product use sector relate to non-combustion use of oil. This sector chiefly comprises solvents in paint, degreasing and dry cleaning, chemical products and other product use.

  • All emissions from the production of steel, iron and non-ferrous metals relate to the oxidation of coal and production of cokes.

  • All emissions from fossil fuel fires relate to underground coal fires. This sector also includes oil flaring emissions in Kuwait, however fossil fuel fire emissions were found to be negligible in Kuwait.

  • All emissions from off-road, rail and pipeline transport relate to the combustion of oil.

  • All emissions from the production of non-metallic minerals relate to cement clinker production.

National CO2 emissions data were extracted from the EDGAR datasets for the purpose of national annual emissions scaling. National masks were based on the ‘countries 2016’ dataset of the Geographic Information System of the European Commission (EU-GISCO)49.

The appropriate positioning of power plants is key to distributing total emissions accurately because the power sector accounts for ~45% of global emissions50. Changes in the available datasets of power plant geolocations are common, and hence we note the importance of recording which datasets are used in each release of gridded emissions products. GridFEDv2019.1 adopts point source geolocations from EDGAR v4.3.2, which are scaled as described below (see ‘GCP-GridFED Protocol’). The EDGAR protocol for geolocating power plant emissions is summarised as follows, with full documentation provided by Janssens-Maenhaut et al.41. The location, fuel type and seasonality of power plant emissions derive from the CARMAv3.0 dataset51. The 2010 gridded emissions dataset used here as the scaling basis includes over 60,000 plants mapped globally in CARMAv3.0 in the year 2007. Standard QA/QC screening was applied to the CARMAv3.0 dataset, including gap-filling of missing (0, 0) plant coordinates, correcting inverted (lon, lat) coordinates and adding some additional points for Russia. National power sector emissions for each fuel type are distributed across plants in proportion to their reported capacities. For larger countries (e.g. USA) with a non-uniform distribution of coal power plants, the fuel-specific distribution of emissions is considered a significant improvement over foregoing approaches. Emissions from each power plant reflect the fuel mix of the plant and the respective carbon intensity of emissions from that fuel mix. However, details of the technologies used by each plant, including carbon capture and storage, are not available. Alternative mappings of point sources can be based on night light detections by satellite52 or population data53 but these are less closely aligned with EDGAR’s ‘bottom up’ approach41.

Heating and cooling degree day (HCDD) Correction

The monthly distribution (seasonality) of global CO2 emissions is principally determined by seasonality of climate in the Northern Hemisphere, and thus a peak in emissions occurs in the boreal winter months and a trough occurs in the boreal summer months. Although this seasonality is predictable, inter-annual variability in weather influences the distribution of emissions across the months. Because the monthly emissions distribution in the EDGAR dataset is derived only from 2010 data, we applied a correction to the EDGAR data to account for the impacts of inter-annual variability on emissions. Specifically, we used a heating and cooling degree day (HCDD) correction to implement inter-annual variability in the monthly distribution of CO2 emissions from selected EDGAR sectors (power industry, 1A1a; buildings, 1A4; manufacturing, 1A2; and road transport, 1A3b; see Table 1). The HCDD correction approach was implemented as follows.

First, monthly (m) HCDDs were calculated based on gridded (0.5° × 0.5°) daily mean temperature (T) data for the years 1959–2018 from the Climatic Research Unit time-series version 4.03 (CRU-TSv4.03)54 and following Spinoni et al. (refs. 55,56). For each 0.5° × 0.5° cell (i_r, j_r) of CRU-TSv4.03, HCDD was calculated as the absolute difference between the mean daily temperature of each month and an upper temperature threshold of 22 °C or a lower temperature threshold of 15.5 °C (refs. 55,56), multiplied by the number of days in the month (d).

$$HCD{D}_{m,i\_r,j\_r}=\left\{\begin{array}{ll}\left(15.5-{\bar{T}}_{m,i\_r,j\_r}\right)\cdot d, & {\bar{T}}_{m,i\_r,j\_r} < 15.5\\ \left({\bar{T}}_{m,i\_r,j\_r}-22.0\right)\cdot d, & {\bar{T}}_{m,i\_r,j\_r} > 22.0\end{array}\right.$$
(3)

Second, the monthly fraction of annual HCDDs (HCDDfrac) was calculated for each month in the year 2010 as follows.

$$HCDDfra{c}_{m,i\_r,j\_r}=\frac{HCD{D}_{m,i\_r,j\_r}}{\sum HCD{D}_{i\_r,j\_r}}$$
(4)
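Eqs. 3 and 4 can be sketched for a single grid cell as below. This is a plain-Python illustration rather than the processing code used for GCP-GridFED. The function names are ours, the temperature series is hypothetical, and we assume that months with a mean temperature between the two thresholds contribute zero degree days, which Eq. 3 implies but does not state.

```python
import calendar

HEATING_THRESHOLD_C = 15.5  # lower threshold given in the text
COOLING_THRESHOLD_C = 22.0  # upper threshold given in the text


def monthly_hcdd(mean_temp_c, days_in_month):
    """Eq. 3 for one cell and one month."""
    if mean_temp_c < HEATING_THRESHOLD_C:
        return (HEATING_THRESHOLD_C - mean_temp_c) * days_in_month
    if mean_temp_c > COOLING_THRESHOLD_C:
        return (mean_temp_c - COOLING_THRESHOLD_C) * days_in_month
    return 0.0  # assumption: no heating or cooling demand between thresholds


def hcdd_fractions(monthly_mean_temps_c, year):
    """Eq. 4: each month's share of the annual HCDD total for one cell."""
    hcdd = [
        monthly_hcdd(temp, calendar.monthrange(year, month)[1])
        for month, temp in enumerate(monthly_mean_temps_c, start=1)
    ]
    total = sum(hcdd)
    if total == 0:
        return [0.0] * len(hcdd)  # assumption: leave an all-zero profile
    return [value / total for value in hcdd]


# Hypothetical monthly mean temperatures (deg C) for one mid-latitude cell.
temps = [2.0, 4.0, 8.0, 12.0, 16.0, 20.0, 24.0, 23.0, 18.0, 12.0, 6.0, 3.0]
fractions = hcdd_fractions(temps, 2010)
```

In this example the winter months dominate the annual total, so they receive the largest fractions, while May, June and September receive none.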

Third, the monthly fraction of annual emissions (Efrac) for each of the relevant sectors (s; 1A1a, 1A4, 1A2, 1A3b) was calculated on a reduced-resolution grid (0.5° × 0.5°; i_r, j_r) for each month as follows.

$$Efra{c}_{s,m,i\_r,j\_r}=\frac{EDGA{R}_{s,m,i\_r,j\_r}}{\sum EDGA{R}_{s,i\_r,j\_r}}$$
(5)

Fourth, a simple linear regression equation of the form below was fitted between monthly HCDDfrac and Efrac in the year 2010.

$$Efra{c}_{s,m,i\_r,j\_r}={a}_{s,i\_r,j\_r}+({b}_{s,i\_r,j\_r}\cdot HCDDfra{c}_{m,i\_r,j\_r})+erro{r}_{s,i\_r,j\_r}$$
(6)

The HCDD correction was implemented in each year of the time series by predicting Efrac based on HCDDfrac. The correction was applied only to cells where the R2 value of the linear regression equation exceeded 0.66 (where 66% of variation in Efrac was explained by variation in HCDDfrac). The following conditional approach was applied in all years (1959–2018).

$$EDGA{R}_{s,m,i,j}=\left\{\begin{array}{c}{a}_{s,i,j}+({b}_{s,i,j}\cdot HCDDfra{c}_{m,i,j}),{R}_{s,i\_r,j\_r}^{2}\ge 0.66\\ EDGA{R}_{s,m,i,j},else\end{array}\right.$$
(7)

Here HCDDfrac, a and b were re-gridded by repeating each grid cell in the i_r, j_r dimensions to provide output at the resolution of the EDGAR grid (0.1° × 0.1°; i, j).
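The regression and conditional replacement in Eqs. 6 and 7, and the re-gridding by repetition, can be sketched for one cell and sector as follows. This is an illustrative sketch under our own assumptions: the function names are hypothetical, the fit is an ordinary least-squares line, and the output is the predicted monthly fraction (Efrac); converting fractions back to emissions is not shown.

```python
R2_THRESHOLD = 0.66  # threshold given in the text


def fit_line(x, y):
    """Least-squares fit y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return mean_y, 0.0, 0.0  # degenerate case: no usable relationship
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b, (sxy * sxy) / (sxx * syy)


def corrected_fractions(efrac_2010, hcddfrac_2010, hcddfrac_year):
    """Predict monthly fractions for a target year, else keep the 2010 profile."""
    a, b, r2 = fit_line(hcddfrac_2010, efrac_2010)
    if r2 >= R2_THRESHOLD:
        return [a + b * h for h in hcddfrac_year]
    return list(efrac_2010)


def regrid_by_repetition(coarse, factor=5):
    """Repeat each 0.5-degree cell factor x factor times (0.1-degree output)."""
    return [
        [value for value in row for _ in range(factor)]
        for row in coarse
        for _ in range(factor)
    ]
```

For example, `regrid_by_repetition([[1, 2]], 2)` gives `[[1, 1, 2, 2], [1, 1, 2, 2]]`; with the default factor of 5, each 0.5° cell becomes a 5 × 5 block of 0.1° cells.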

GCP-GridFED Protocol

CO2 Emissions

GCP-GridFED was generated using the six-step emissions scaling protocol set out below and applied sequentially for each year in the period 1959–2018 (see Fig. 1):

1.     Group emissions from EDGAR sectors by source class. The global gridded (i, j) monthly (m) CO2 emissions were summed across the EDGAR activity sectors (s) in each source class used in this study (S; see Table 1). The monthly distribution of annual emissions was adjusted in advance using Eq. 7.

$$EDGA{R}_{S,m,i,j}=\sum EDGA{R}_{s,m,i,j}$$
(8)

2.     Extract gridded emissions data from EDGAR for each country. A subset of gridded monthly CO2 emissions from each GCP-GridFED source class (see Table 1) was extracted for each country (c) using country masks (True/False) from the EU-GISCO dataset49. No subset was extracted for the bunker fuels source class; the entire grid layer was scaled globally. Hence, all grid cells were included in an ‘international’ mask (True in all cells) and treated thereafter in the same way as each country.

$$EDGA{R}_{S,m,i\left[c\right],j[c]}=EDGA{R}_{S,m,i\left[True\right],j[True]}$$
(9)

3.      Sum EDGAR emissions for each GCP-GridFED source class. Monthly CO2 emissions were summed both across the months of the year and across the grid extracted for each nation, for each GCP-GridFED source class. The resulting annual emission sub-totals were stored in a tabular format matching the structure of the GCB-NAE data.

$$EDGA{R}_{S,c}=\sum \sum {EDGAR}_{S,m,i\left[c\right],j[c]}$$
(10)

4.     Calculate scaling factors based on comparison of EDGAR and GCB-NAE emissions estimates. For each country c and for each source class S, the scaling factor (α) required to convert the annual CO2 emissions from EDGAR (step 3) to the annual CO2 emissions estimate from GCB-NAE was derived as follows.

$$GC{B}_{S,c}={\alpha }_{S,c}\cdot EDGA{R}_{S,c}$$
(11)

5.     Apply annual scaling factors to monthly emission grids. The scaling factors for each nation and GCP-GridFED source class were applied to the national monthly CO2 emissions grids generated in step 2. The same scaling factor was used for all months. For the bunker fuels source class, the scaling factor was applied to the equivalent global data.

$$GridFE{D}_{S,m,i[c],j[c]}={\alpha }_{S,c}\cdot EDGA{R}_{S,m,i[c],j[c]}$$
(12)

6.     Collate national data to a global output. Scaled monthly CO2 emissions grids from all nations were merged into a single grid for each GCP-GridFED source class.
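The six steps above can be sketched on a toy grid as follows. This is a simplified illustration, not the production code: grids are held as dictionaries keyed by hypothetical cell indices, a cell-to-country lookup stands in for the EU-GISCO masks, and all names and numbers are ours. The treatment of countries whose EDGAR total is zero (their cells stay at zero) is an assumption, since the text does not describe this case.

```python
def group_by_source_class(sector_grids, sectors_in_class):
    """Step 1 (Eq. 8): sum monthly sector grids within one source class."""
    grouped = {}
    for sector in sectors_in_class:
        for cell, monthly in sector_grids[sector].items():
            previous = grouped.get(cell, [0.0] * len(monthly))
            grouped[cell] = [p + m for p, m in zip(previous, monthly)]
    return grouped


def scale_to_national_totals(edgar, country_of_cell, gcb_totals):
    """Steps 2-6 for one source class and one target year.

    edgar: {(i, j): [monthly emissions]} from the 2010 reference grid
    country_of_cell: {(i, j): country code}, standing in for country masks
    gcb_totals: {country code: annual national emissions for the target year}
    """
    # Steps 2-3 (Eqs. 9-10): annual EDGAR total per country.
    edgar_totals = {}
    for cell, monthly in edgar.items():
        country = country_of_cell[cell]
        edgar_totals[country] = edgar_totals.get(country, 0.0) + sum(monthly)

    # Step 4 (Eq. 11): alpha = GCB / EDGAR for each country.
    alpha = {
        country: gcb_totals[country] / total
        for country, total in edgar_totals.items()
        if total > 0
    }

    # Steps 5-6 (Eq. 12): same alpha for every month, merged into one grid.
    return {
        cell: [alpha.get(country_of_cell[cell], 0.0) * v for v in monthly]
        for cell, monthly in edgar.items()
    }


# Toy example: two cells in country "A", one in country "B".
edgar_2010 = {(0, 0): [1.0] * 12, (0, 1): [3.0] * 12, (1, 0): [2.0] * 12}
countries = {(0, 0): "A", (0, 1): "A", (1, 0): "B"}
gcb_year = {"A": 96.0, "B": 12.0}
gridfed = scale_to_national_totals(edgar_2010, countries, gcb_year)
```

Because a single factor is applied to all months and cells of a country, the scaled grid reproduces the national annual total while retaining the spatial and monthly pattern of the reference year.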

We do not attempt to adjust the EDGARv4.3.2 grids (year 2010) for a range of historical changes to the spatial distribution of emissions, for instance due to the expansion of road networks or flight routes, the commissioning/decommissioning of facilities or large-scale population migration. The resolution of these issues will be prioritised in future developments to the GCP-GridFED protocol. We note that developments introduced in EDGARv5.0 (ref. 42) include refined spatial proxy records and national temporal profiles covering the period 1970–2012, which will support further developments to the GCP-GridFED protocol. Dedicated datasets of fuel-specific monthly CO2 emissions are also emerging for some countries, including India57 and the USA58, and could be used preferentially in the GCP-GridFED protocol. Additional sources such as the diffusive coal mine oxidation CO2, as derived for the dataset CHE-EDGARv4.3.2_FT201541,59,60 will also be considered.

We do not consider emissions of non-CO2 carbon species that later influence atmospheric CO2 (in particular, CO and CH4). Here all fossil carbon is assumed to be emitted as fossil CO2, whereas a fraction is in reality emitted as CO and later represents a diffuse fossil CO2 source after oxidation to CO2 (~1.8 Pg CO2-equivalent year−1)61. In GridFEDv2019.1, the diffuse nature of this CO2 source is not considered, and the source is instead placed at the surface at the time and location of fossil fuel oxidation. Meanwhile, fossil CH4 fugitive emissions represent an additional diffuse source of CO2 emissions (0.4 Pg CO2-equivalent year−1) that is not considered here62. These diffuse CO2 sources will also be considered in future developments to the GCP-GridFED protocol.

CO2 emissions uncertainty

We provide gridded uncertainties to complement all gridded layers of the GCP-GridFED dataset; however, we note here the incomplete nature of our uncertainty assessment. The gridded uncertainties are based on the total fossil CO2 emissions uncertainty assessment from the GCB3, combined with variation in relative uncertainties across emission sectors from the recent TNO assessment (Table 2)34. For uncertainties in national total CO2 emissions, we adopt the values presented in the uncertainty assessment of the GCB: 5% for the 42 Annex I countries that report annually to the UNFCCC44 and 10% for other countries3,63 (1σ). Annex I countries are assigned lower uncertainty because for these countries more detailed energy and activity statistics are available, and they are periodically reviewed externally3. We used data presented in the TNO uncertainty assessment to evaluate the ratio of the uncertainties for each sector (U_TNOs) to the uncertainty in total emissions (U_TNOTot). We then scaled the ratios to the uncertainties in total emissions that are adopted for Annex I and other countries from GCB-NAE.

$${U}_{GridFEDs}=\left\{\begin{array}{ll}\frac{{U}_{TNOs}}{{U}_{TNOTot}}\times 5{\rm{ \% }}, & UNFCCC\,Annex\,I\\ \frac{{U}_{TNOs}}{{U}_{TNOTot}}\times 10{\rm{ \% }}, & other\end{array}\right.$$
(13)

Table 2 shows sectoral TNO uncertainty estimates and the resulting uncertainties adopted in GCP-GridFED for Annex I and other countries, for each sector.
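Eq. 13 amounts to rescaling a sector-to-total uncertainty ratio by the national uncertainty adopted for each country group. A minimal sketch is given below; the function and dictionary names are ours and the TNO input values in the example are hypothetical placeholders, not entries from Table 2.

```python
# 1-sigma relative uncertainties in national totals, as given in the text.
NATIONAL_UNCERTAINTY = {"annex_i": 0.05, "other": 0.10}


def sector_uncertainty(u_tno_sector, u_tno_total, country_group):
    """Eq. 13: scale the TNO sector/total ratio to the GCB national value."""
    return (u_tno_sector / u_tno_total) * NATIONAL_UNCERTAINTY[country_group]


# Hypothetical TNO values: a sector twice as uncertain as the total.
u_annex_i = sector_uncertainty(0.04, 0.02, "annex_i")
u_other = sector_uncertainty(0.04, 0.02, "other")
```

With these placeholder inputs the sector is assigned a relative uncertainty of 10% in Annex I countries and 20% elsewhere.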

The gridded uncertainty estimates presented here do not include the uncertainties associated with the spatial or temporal (monthly) disaggregation of national emissions, nor do we present a formal assessment of those disaggregation uncertainties. We note that spatially-averaged uncertainties resulting from the spatial disaggregation of national emissions estimates to grid cells are on the order of 20–75% (1σ) at spatial resolutions of 1 km to 1° (refs. 53,64,65,66,67). Spatial disaggregation uncertainties occur due to incomplete proxy data coverage (e.g. unmapped or mislocated point sources), poorly constrained nonlinearities (e.g. differences in the emissions intensity between equally dense rural and urban populations), shortcomings in continuous proxy values (e.g. poorly constrained population density) or inappropriate spatial representativeness (e.g. the spatial representativeness of roadmaps for traffic volume). By construction, these uncertainties are larger for years distant from our reference year 2010 and at monthly resolution. Dedicated analyses of regional emissions at high temporal resolution are yielding new data with which to quantify temporal disaggregation uncertainties57,68 and to assess the robustness of the temporal profiles employed here and elsewhere42.

A full quantitative assessment of these issues, to support the development of comprehensive grid-level uncertainties associated with GCP-GridFED, will be the subject of future work. Overall, our approach to uncertainty quantification is broadly representative of the sectoral contributions to total emissions in each grid cell, which changes throughout the time series. Inversion models may utilise these uncertainty grids but with the freedom to build more complex covariance structures to suit their requirements.

O2 Uptake

The relationship between CO2 and O2 fluxes during oxidation reactions can be expressed as an oxidative ratio (OR = flux of O2 from the atmosphere/flux of CO2 to the atmosphere, unitless)36,69. The OR differs detectably between specific fossil fuel sources, with values of −1.17 for coal, −1.44 for oil, and −1.95 for natural gas36,69. Uncertainties in OR are thought to be on the order of 2–3%; however, variations within fuel classes, such as different grades of coal, have not been studied extensively (ref. 35). Cement clinker production involves a calcination reaction rather than an oxidation reaction, and thus no exchange of oxygen occurs (OR = 0).

GCP-GridFED calculates gridded estimates of the uptake of O2 during fossil fuel oxidation by applying OR values to the CO2 emissions estimates for each source.

$$GridFED\_O{2}_{S,i,j}=GridFE{D}_{S,i,j}\cdot O{R}_{S}$$
(14)

We treat the relative uncertainty in O2 uptake as equal to the relative uncertainty in CO2 emissions (U_GridFED).
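As a minimal sketch of Eq. 14 for a single grid cell, the Python below applies the OR values quoted above to a CO2 emission. It assumes the CO2 value is in molar units, so that the unitless OR applies directly. The function names are hypothetical, and bunker fuels are omitted because no OR for them is stated here.

```python
# Oxidative ratios quoted in the text (unitless). Cement is 0 because
# calcination exchanges no oxygen.
OXIDATIVE_RATIO = {"COAL": -1.17, "OIL": -1.44, "GAS": -1.95, "CEMENT": 0.0}


def o2_uptake(co2_emission, source):
    """Eq. 14 for one cell: O2 flux = CO2 emission x OR of the source class."""
    return co2_emission * OXIDATIVE_RATIO[source]


def o2_uptake_abs_uncertainty(co2_emission, source, rel_uncertainty_percent):
    """Absolute O2 uncertainty, assuming the same relative uncertainty as CO2."""
    return abs(o2_uptake(co2_emission, source)) * rel_uncertainty_percent / 100.0


# Illustrative cell value only (arbitrary molar units).
coal_o2 = o2_uptake(100.0, "COAL")
cement_o2 = o2_uptake(100.0, "CEMENT")
```

The negative sign of the result follows the sign convention of the OR, in which O2 is removed from the atmosphere as CO2 is added.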

Data Records

All GCP-GridFEDv2019.1 output grids can be accessed via the Zenodo data repository70 (https://doi.org/10.5281/zenodo.3958283).

The data records include 60 files in Network Common Data Form (NetCDF) format with the naming convention GCP_Global_{YYYY}.nc, where YYYY is the year represented by the contents. Each NetCDF file includes 3 dimensions: time (month of the year expressed as days since the first day of YYYY, n = 12); latitude (Degrees North of the equator [cell centres], n = 1800); longitude (Degrees East of the Prime Meridian [cell centres], n = 3600). Each NetCDF file includes 3 groups representing CO2 emissions, CO2 emissions uncertainty, and O2 uptake (CO2, CO2_uncertainty and O2, respectively). Each group contains 5 variables representing emissions from each source class (COAL, OIL, GAS, CEMENT, BUNKER) with the units shown in Table 3. Each file contains 1,088,640,000 unique data points. All 60 NetCDF files are contained within a .zip archive named “GCP-GridFEDv2019.1_monthly.zip”.

The data records also include 1 file in NetCDF format, “GCP_Global_Annual.nc”. The NetCDF file includes 3 dimensions: time (year expressed as days since 1959-01-01, n = 60); latitude (Degrees North of the equator [cell centres], n = 1800); longitude (Degrees East of the Prime Meridian [cell centres], n = 3600). The NetCDF file includes 4 groups representing CO2 emissions, CO2 emissions uncertainty, O2 uptake, and O2 uptake uncertainty (CO2, CO2_uncertainty, O2, and O2_uncertainty, respectively). Each group contains 5 variables representing emissions from each source class (COAL, OIL, GAS, CEMENT, BUNKER) with the units shown in Table 4. The file contains 6,998,400,000 unique data points. The NetCDF file is contained within a .zip archive named “GCP-GridFEDv2019.1_annual.zip”.

“GCP-GridFEDv2019.1_monthly.zip” and “GCP-GridFEDv2019.1_annual.zip” can be found within a parent .zip file named “GCP-GridFEDv2019.1.zip”. All grids are bottom-left arranged with coordinates referenced to the prime meridian and the equator.
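The naming convention and time encoding described above can be sketched in Python using only the standard library. The helper names are hypothetical. Whether each monthly time value marks the start or the middle of the month is not stated here, so the decoder simply converts an offset in days to a calendar date.

```python
from datetime import date, timedelta

FIRST_YEAR, LAST_YEAR = 1959, 2018  # years covered by GCP-GridFEDv2019.1


def monthly_file_name(year):
    """File name for one year, following GCP_Global_{YYYY}.nc."""
    return f"GCP_Global_{year}.nc"


def decode_monthly_time(year, days_since_start):
    """Convert 'days since the first day of YYYY' to a calendar date."""
    return date(year, 1, 1) + timedelta(days=days_since_start)


def decode_annual_time(days_since_1959):
    """Convert 'days since 1959-01-01' (annual file) to a calendar date."""
    return date(1959, 1, 1) + timedelta(days=days_since_1959)


monthly_files = [monthly_file_name(y) for y in range(FIRST_YEAR, LAST_YEAR + 1)]

# Number of values in one monthly variable: time x latitude x longitude.
values_per_monthly_variable = 12 * 1800 * 3600
```

Reading the grids themselves requires a NetCDF library that supports groups. That step is omitted here to keep the sketch free of external dependencies.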

Technical Validation

We provide Figs. 2–12, the summary statistics in Tables 4–7 and Online-Only Table 1 to outline the key features of GCP-GridFEDv2019.1 and assist with its technical validation.

GCP-GridFED is designed to distribute national annual emissions from GCB-NAE over a spatio-temporal grid based on EDGARv4.3.2. We validated the outputs from GCP-GridFED by comparing the global annual emissions from the output grids (the sum of emissions across the global grid) with the input data supplied to the gridding protocol from GCB-NAE. Across all source classes and all years of the time series, the global annual emissions totals from GCP-GridFED were within 0.0077% of the GCB-NAE input data (Figs. 2 and 3). The discrepancies were caused by unscalable (zero or NoData) values in sectors of the EDGAR dataset at the national level in 13 countries (EDGAR data summed within the national masks as per Eq. 10). These 13 countries make a small contribution to total global emissions (0.047% in 2018). Online-Only Table 1 provides national-level comparisons of the emissions estimates from GCP-GridFED and GCB-NAE. For the 13 countries where maximum absolute discrepancies exceeded 1% of GCB-NAE emissions, we provide a brief description of the cause of the discrepancy. The GCP-GridFED outputs match GCB-NAE values to within 0.0001% in 195 countries, plus bunker fuels, comprising 99.9% of global emissions in 2018. Hence, we conclude that GCP-GridFEDv2019.1 is consistent with GCB-NAE emissions estimates for the years 1959–2018.
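A closure check of this kind can be sketched as follows. The sketch assumes each grid cell already holds an emitted quantity rather than a flux per unit area; if the cells hold fluxes, each would first need weighting by cell area. The function names and the toy grid are hypothetical.

```python
def grid_sum(grid):
    """Sum all cells of a 2-D grid given as a list of rows."""
    return sum(sum(row) for row in grid)


def relative_discrepancy_percent(gridded_total, input_total):
    """Signed discrepancy of the gridded total relative to the input total (%)."""
    if input_total == 0:
        raise ValueError("input_total must be non-zero")
    return 100.0 * (gridded_total - input_total) / input_total


def within_tolerance(gridded_total, input_total, tolerance_percent=0.0077):
    """True if the absolute discrepancy does not exceed the tolerance (%)."""
    return abs(relative_discrepancy_percent(gridded_total, input_total)) <= tolerance_percent


# Toy 2 x 2 grid for illustration only; the real grids are 1800 x 3600.
toy_grid = [[1.0, 2.0], [3.0, 4.0]]
toy_total = grid_sum(toy_grid)
toy_discrepancy = relative_discrepancy_percent(toy_total, 10.0)
```

The default tolerance is the 0.0077% bound reported above for global annual totals.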

We also observed a close match between the seasonality seen in the year 2010 in the GCP-GridFED dataset and that seen in the same year of the EDGAR input data, both at the global scale and in large austral and boreal extratropical nations (Fig. 12). This coherence indicates that the seasonality seen in the EDGAR dataset was preserved by the GCP-GridFED protocol. Inter-annual variability in the monthly distribution of emissions can be seen most prominently in the EU27 + UK. Note that EDGARv4.3.2 does not feature monthly variability in emissions for tropical countries41, and so GCP-GridFED also shows no seasonality in these countries.

Usage Notes

The data is intended for use as a prior in inversion model studies, which may wish to incorporate individual priors for each source class or to use total gridded emissions. The data records contain a layer for each source class. Global total emissions can be calculated as the sum of emissions across the 5 source classes. National total emissions estimates should be calculated as the sum of coal, oil, gas and cement emissions (bunker fuel emissions should not be included in national emissions totals)71.
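The two aggregation rules above can be written as a short Python sketch. The dictionary layout and function names are hypothetical, and the numbers are illustrative only.

```python
SOURCE_CLASSES = ("COAL", "OIL", "GAS", "CEMENT", "BUNKER")
NATIONAL_CLASSES = ("COAL", "OIL", "GAS", "CEMENT")  # bunker fuels excluded


def global_total(emissions_by_source):
    """Global total: sum over all 5 source classes."""
    return sum(emissions_by_source[s] for s in SOURCE_CLASSES)


def national_total(emissions_by_source):
    """National total: sum over coal, oil, gas and cement only."""
    return sum(emissions_by_source[s] for s in NATIONAL_CLASSES)


# Illustrative values only, in arbitrary units.
example = {"COAL": 4.0, "OIL": 3.0, "GAS": 2.0, "CEMENT": 1.0, "BUNKER": 0.5}
example_global = global_total(example)
example_national = national_total(example)
```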

GCP-GridFED will be updated annually and made available for the inversion model runs conducted annually as part of the GCP assessment of the GCB. An updated version of GCP-GridFED (GCP-GridFEDv2020.1) was already made available upon request to support the inversion model runs of the GCP’s 2020 GCB assessment72,73 and is now publicly available74. GCP-GridFEDv2020.1 is based on the emissions estimates from a preliminary release of GCB-NAE covering the years 1959–2019 with input data available to June 2020. Further updates will be issued as the GCB-NAE data is updated74.

When using GCP-GridFED as a prior in inversion models operating at a coarser resolution, aggregation to the required resolution should account for the non-independence of gridded emissions uncertainties. See ‘CO2 Emissions Uncertainty’ for further information regarding our treatment of spatial and temporal aggregation in GCP-GridFED.
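To illustrate why non-independence matters, the sketch below contrasts two limiting assumptions when absolute (1σ) uncertainties of fine cells are combined into one coarse cell. It is not the treatment used in GCP-GridFED. The function names are hypothetical and the inputs are illustrative.

```python
import math


def aggregate_fully_correlated(abs_uncertainties):
    """Upper limit: errors in all fine cells move together, so they add linearly."""
    return sum(abs_uncertainties)


def aggregate_independent(abs_uncertainties):
    """Lower limit: errors are independent, so they add in quadrature."""
    return math.sqrt(sum(u * u for u in abs_uncertainties))


# Illustrative absolute uncertainties of two fine cells (arbitrary units).
fine_cells = [3.0, 4.0]
correlated = aggregate_fully_correlated(fine_cells)
independent = aggregate_independent(fine_cells)
```

Treating non-independent cells as independent understates the aggregated uncertainty, so the quadrature sum is a lower bound when errors are positively correlated.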