Gridded fossil CO2 emissions and related O2 combustion consistent with national inventories 1959–2018

Quantification of CO2 fluxes at the Earth’s surface is required to evaluate the causes and drivers of observed increases in atmospheric CO2 concentrations. Atmospheric inversion models disaggregate observed variations in atmospheric CO2 concentration to variability in CO2 emissions and sinks. They require prior constraints on fossil CO2 emissions. Here we describe GCP-GridFED (version 2019.1), a gridded fossil emissions dataset that is consistent with the national CO2 emissions reported by the Global Carbon Project (GCP). GCP-GridFEDv2019.1 provides monthly fossil CO2 emissions estimates for the period 1959–2018 at a spatial resolution of 0.1°. Estimates are provided separately for oil, coal and natural gas, for mixed international bunker fuels, and for the calcination of limestone during cement production. GCP-GridFED also includes gridded estimates of O2 uptake based on oxidative ratios for oil, coal and natural gas. It will be updated annually and made available for atmospheric inversions contributing to GCP global carbon budget assessments, thus aligning the prior constraints on top-down fossil CO2 emissions with the bottom-up estimates compiled by the GCP.

Measurement(s): carbon dioxide emission • oxygen combustion
Technology Type(s): digital curation • computational modeling technique
Factor Type(s): annual and monthly fossil carbon dioxide emissions estimates • annual and monthly oxygen combustion estimates
Sample Characteristic - Environment: climate system
Sample Characteristic - Location: global
Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.13333643


$$G_{ATM} = E_{FF} + E_{LUC} - S_{OCEAN} - S_{LAND} - B_{IM}$$
Inversion models use an integrated approach to simultaneously quantify all fluxes of the global carbon budget and they are, by design, constrained by observations of atmospheric CO2 mole fraction or satellite-derived products of column CO2. Inversion models prescribe the fossil carbon emissions (E_FF) because the current density of the surface network and the sampling of the atmosphere by satellites are too sparse to quantify this flux separately. They then estimate the total land flux (F_LAND = S_LAND + E_LUC) and the ocean sink (S_OCEAN) using a modelling framework that minimises data-model mismatch across all fluxes according to a cost function (see examples in refs. 21-32 and studies cited therein). By synchronously quantifying E_FF, F_LAND and S_OCEAN, inversion models avoid budget imbalance and hence the global carbon budget equation is closed without a B_IM term.

Inversion models require prior constraints on the regional distribution of the CO2 fluxes that they seek to disaggregate. Here we describe our development of the Global Carbon Budget Gridded Fossil Emissions Dataset (GCP-GridFED; version 2019.1), a new gridded 0.1° × 0.1° global dataset of monthly CO2 emissions resulting from fossil fuel oxidation and the calcination of limestone during cement production. The gridded nation- and source-specific emissions in GCP-GridFED are consistent with the nation- and source-specific emissions inventories compiled for the GCP's 2019 GCB assessment 3,33 and, for version 2019.1, cover the period 1959-2018. GCP-GridFED will be updated each year for use by inversion models contributing to the annual updates of the GCB, thus aligning the prior constraints on top-down estimates of fossil CO2 emissions with the bottom-up estimates used by the GCP.
Gridded estimates of uncertainty in CO2 emissions are provided as an additional layer of GCP-GridFED and are based on the relative uncertainties (1σ) in fossil CO2 presented in the uncertainty assessment of the GCB 3 and the relative uncertainties amongst emission sectors 34. Uncertainties associated with the spatial disaggregation of national emissions are not included (see 'CO2 Emissions Uncertainty'). Our approach to uncertainty quantification is broadly representative of the sectoral contributions to total emissions in each grid cell, which change throughout the time series, and of differences in uncertainty across national emission reports. Inversion models may utilise these uncertainty grids but with the freedom to build more complex covariance structures to suit their requirements.
The global cycles of carbon and oxygen are coupled through their dual involvement in carboxylation reactions (photosynthesis), which consume CO2 and emit O2, and oxidation reactions (respiration and combustion), which consume O2 and emit CO2 (refs. 35,36). In addition to CO2 alone, some inversion models are able to constrain surface fluxes of O2 or atmospheric potential oxygen (APO ≈ O2 + 1.1CO2) 37-40. Such models can utilise dual atmospheric measurements of CO2 and O2 and dual priors for CO2 and O2 surface fluxes and synchronously minimise data-model mismatch with respect to CO2 and O2. Alternatively, O2 fluxes can be constrained independently using atmospheric O2 observations and O2 surface flux priors. GCP-GridFED includes dual estimates of atmospheric O2 uptake due to the oxidation of fossil fuels, with the aim of supporting the inverse modelling of O2 or APO and with the view that the data can be used in multi-decadal analyses of the global oxygen budget. Our O2 uptake estimates are based on the oxidative ratios (OR; uptake of O2/emission of CO2) 36 applied to the CO2 emission estimates for coal, oil and natural gas oxidation 36.

Methods
Overview. GCP-GridFED was produced by scaling monthly gridded emissions for the year 2010, from the Emissions Database for Global Atmospheric Research (EDGAR; version 4.3.2) 41, to the national annual emissions estimates compiled as part of the 2019 global carbon budget (GCB-NAE) for the years 1959-2018 (ref. 3). EDGAR data for the year 2010 are used because monthly gridded data were only available for this year at the time of product development (new data for 2015 were published recently and will be adopted in future versions of GCP-GridFED) 42. We describe the key features of the EDGAR and GCB-NAE datasets below (see 'Input Datasets').
GCB-NAE and EDGAR provide information regarding the global emission of CO2 through the combustion of fossil fuels, industrial processes and cement production, and some other minor sources (e.g. consumption of lubricants and paraffin waxes, solvent use, agricultural liming); nonetheless, their merits differ. GCB-NAE provides a consistent long-term dataset of annual national CO2 emissions (1750-2018); however, this dataset is not spatially explicit below the country level and does not include sub-annual variability in CO2 emissions. EDGAR provides estimates at high spatial resolution for specific fuels and sectors with a representation of the monthly distribution of emissions. However, the EDGARv4.3.2 estimates are only available for 1970-2010 and a constant monthly distribution, matching the year 2010, is used throughout the time series 41. Our approach merged these two complementary datasets to create a long-term (1959-2018) and gridded (0.1° × 0.1°) dataset of global monthly CO2 emissions. The start year of 1959 aligns with the period of direct atmospheric measurements of CO2 concentration 43. Our approach is to scale EDGAR's 2010 monthly gridded CO2 emissions to match the national annual CO2 emissions from GCB-NAE on a nation- and fuel-specific basis (see 'emissions scaling protocol', Fig. 1; Table 1).
GCP-GridFED includes additional data layers that are beneficial to inversion models. Gridded uncertainty in CO2 emissions from each nation and emissions sector is also propagated to our nation-, year- and fuel-specific emissions estimates (Table 2). Gridded estimates of the uptake of O2 related to oil, coal and natural gas use are also made using literature-based oxidative ratios (see 'O2 Uptake'). Tables 3 and 4, the summary statistics in Tables 5-7 and Online-Only Table 1 outline the key features of GCP-GridFEDv2019.1 and assist with its technical validation.

Input datasets. National annual emissions from the global carbon budget 2019 (GCB-NAE). The GCB estimates national annual emissions of CO2 due to coal, oil and natural gas combustion, the oxidative use of these fuels in non-combustive industrial processes, and the production of cement clinker 3,14-17,44. National CO2 emissions are preferentially taken from the country submissions to the United Nations Framework Convention on Climate Change (UNFCCC) for 42 "Annex I" countries over the period 1990-2018 44. These countries comprise the members of the Organisation for Economic Co-operation and Development (OECD) in 1992, plus 16 non-OECD European countries and Russia, and contributed ~60% of total global emissions in 1990. Emissions in other countries and in Annex I countries prior to 1990 derive from the Carbon Dioxide Information Analysis Center (CDIAC) 15 and are rooted in energy statistics published by the United Nations (UN) 16,45. For recent years not covered by either the UNFCCC or CDIAC datasets, the national emissions are predicted using national or regional energy growth rates from the annual BP Statistical Review of World Energy 14.
National cement emissions are based on national inventories of cement production and ratios of clinker production from officially reported clinker production data and emission factors, IPCC default emission factors, industry-reported clinker production, and survey-based clinker ratios 16 .
Gridded monthly emissions from EDGAR. The Emissions Database for Global Atmospheric Research (EDGAR) version 4.3.2 41 is a dataset of global emissions of gases and particulates, including CO2, based on available national statistics, default emission factors and methods recommended by IPCC 46,47. EDGAR uses a bottom-up approach that calculates gridded (0.1° × 0.1°) monthly CO2 emissions for activity sectors based on: statistics that track national levels of each activity; proxy data representing the spatial and temporal distribution of each activity; the mix of technologies used to perform each activity; the fuel mix used by each technology; and emissions factors for the technology and fuel combinations, which are also corrected for the emission control technologies in place. A detailed description of EDGAR's gridding procedure is available elsewhere (refs. 41,48); however, we summarise below the key features of its design:
• 28 EDGAR activity sectors are based on the 48 sectors defined by IPCC guidelines 46,47.
• Activity in each sector is tracked from 1970-2015 using statistics that represent demand and supply of goods and energy, including: fuel-specific energy balances, fuel production, commodity production, cement clinker production and agriculture-related activities.
• Emission factors are taken from the guidelines issued by the IPCC 46,47 and are assigned to each country in the following order of preference: national, regional, country group (Annex I/non-Annex I).
• National emissions of CO2 from each sector are distributed across months using sector-specific or, preferentially, technology-specific monthly shares.
• Emissions are distributed in space using spatial proxy data (that vary stepwise over time 1990-2010), such as population density, point source locations and transport routes.
Some of the uncertainties associated with using proxy data to disaggregate emissions in time and space are considered in later sections (see 'CO2 emissions uncertainty').
EDGAR sectors included in GCP-GridFED. Of the 28 EDGAR sectors, the 18 relating to fossil fuel combustion, non-combustion use of fossil fuel and cement production were used in GCP-GridFED. These 18 sectors were selected to correspond as closely as possible with the activities included in the GCB-NAE emission estimates. The 18 activity sectors incorporated from EDGAR into GCP-GridFED are shown in Table 1.
Where possible, emissions from each EDGAR sector were further separated into specific fuels using fuel-specific data from an intermediate processing step of the EDGAR gridding protocol 41. Where this was not possible, it was necessary to make the assumptions that follow about the fuels that contribute to emissions in each sector. These assumptions are based on the sector descriptions provided in the IPCC guidelines 46,47 and the major contributing activities and fuel dependencies in each sector. Specifically, we assume that:
• All chemical process emissions relate to the non-combustion use of natural gas.
• All emissions from the non-energy use of fuels sector relate to non-combustion use of oil. This sector chiefly comprises the use of waxes and lubricants.
• All emissions from the solvents and product use sector relate to non-combustion use of oil. This sector chiefly comprises solvents in paint, degreasing and dry cleaning, chemical products and other product use.
• All emissions from the production of steel, iron and non-ferrous metals relate to the oxidation of coal and production of cokes.
• All emissions from fossil fuel fires relate to underground coal fires. This sector also includes oil flaring emissions in Kuwait; however, fossil fuel fire emissions were found to be negligible in Kuwait.
• All emissions from off-road, rail and pipeline transport relate to the combustion of oil.
• All emissions from the production of non-metallic minerals relate to cement clinker production.
National CO2 emissions data were extracted from the EDGAR datasets for the purpose of national annual emissions scaling. National masks were based on the 'countries 2016' dataset of the Geographic Information System of the European Commission (EU-GISCO) 49.
Table 2. Calculation of uncertainties for each sector in GCP-GridFEDv2019.1. Calculations are based on (i) the ratio of the uncertainty in emissions from each sector (U_TNO_s) to the uncertainty in total emissions (U_TNO_Tot) and (ii) the relative uncertainty in emissions for Annex I countries (5%) and other countries (10%) from GCB-NAE 3. (a) GCP-GridFED source class codes are adopted from EDGAR. See Table 1.

The appropriate positioning of power plants is key to distributing total emissions accurately because the power sector accounts for ~45% of global emissions 50. Changes in the available datasets of power plant geolocations are common, and hence we note the importance of recording which datasets are used in each release of gridded emissions products. GridFEDv2019.1 adopts point source geolocations from EDGAR v4.3.2, which are scaled as described below (see 'GCP-GridFED Protocol'). The EDGAR protocol for geolocating power plant emissions is summarised as follows, with full documentation provided by Janssens-Maenhout et al. 41 [...] 0) plant coordinates, correcting inverted (lon, lat) coordinates and adding some additional points for Russia. National power sector emissions for each fuel type are distributed across plants in proportion to their reported capacities. For larger countries (e.g. USA) with a non-uniform distribution of coal power plants, the fuel-specific distribution of emissions is considered a significant improvement over foregoing approaches. Emissions from each power plant reflect the fuel mix of the plant and the respective carbon intensity of emissions from that fuel mix. However, details of the technologies used by each plant, including carbon capture and storage, are not available.
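The capacity-proportional allocation of national power sector emissions described above can be illustrated with a minimal sketch. This is not EDGAR's code: the function name `allocate_to_plants` and the example numbers are hypothetical, and the sketch assumes a single fuel type with positive reported capacities.

```python
import numpy as np

def allocate_to_plants(national_emissions, capacities):
    """Split a national, fuel-specific emissions total across plants
    in proportion to their reported capacities (illustrative only)."""
    capacities = np.asarray(capacities, dtype=float)
    total_capacity = capacities.sum()
    if not total_capacity > 0:
        raise ValueError("total plant capacity must be positive")
    return national_emissions * capacities / total_capacity

# Hypothetical example: 100 units of coal power emissions, three plants.
plant_emissions = allocate_to_plants(100.0, [500.0, 300.0, 200.0])
```

By construction the plant-level values sum back to the national total, so the allocation changes only where emissions are placed, not how much is emitted.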
Alternative mappings of point sources can be based on night light detections by satellite 52 or population data 53, but these are less closely aligned with EDGAR's 'bottom up' approach 41.
Heating and cooling degree day (HCDD) Correction. The monthly distribution (seasonality) of global CO2 emissions is principally determined by seasonality of climate in the Northern Hemisphere, and thus a peak in emissions occurs in the boreal winter months and a trough occurs in the boreal summer months. Although this seasonality is predictable, inter-annual variability in weather influences the distribution of emissions across the months. Because the monthly emissions distribution in the EDGAR dataset is derived only from 2010 data, we applied a correction to the EDGAR data to account for the impacts of inter-annual variability on emissions. Specifically, we used a heating and cooling degree day (HCDD) correction to implement inter-annual variability in the monthly distribution of CO2 emissions from selected EDGAR sectors (power industry, 1A1a; buildings, 1A4; manufacturing, 1A2; and road transport, 1A3b; see Table 1). The HCDD correction approach was implemented as follows.
Here HCDDfrac, a and b were re-gridded by repeating each grid cell in the i_r, j_r dimensions to provide output at the resolution of the EDGAR grid (0.1° × 0.1°; i, j).

GCP-GridFED Protocol. CO2 Emissions. GCP-GridFED was generated using the six-step emissions scaling protocol set out below and applied sequentially for each year in the period 1959-2018 (see Fig. 1):

1. Group emissions from EDGAR sectors by source class. The global gridded (i, j) monthly (m) CO2 emissions were summed across the EDGAR activity sectors (s) in each source class used in this study (S; see Table 1). The monthly distribution of annual emissions was adjusted in advance using Eq. 7.

$$E^{\mathrm{EDGAR}}_{S,m,i,j} = \sum_{s \in S} E^{\mathrm{EDGAR}}_{s,m,i,j}$$

Here E^EDGAR denotes the EDGAR CO2 emissions.

2. Extract gridded emissions data from EDGAR for each country. A subset of gridded monthly CO2 emissions from each GCP-GridFED source class (see Table 1) was extracted for each country (c) using country masks (True/False) from the EU-GISCO dataset 49. No subset was extracted for the bunker fuels source class; the entire grid layer was scaled globally. Hence, all grid cells were included in an 'international' mask (True in all cells) and treated thereafter in the same way as each country.

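3. Calculate national annual emissions from EDGAR. The monthly gridded emissions extracted in step 2 were summed within each national mask to give the annual EDGAR emissions for each country and source class (Eq. 10).

4. Calculate scaling factors based on comparison of EDGAR and GCB-NAE emissions estimates.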
For each country c and for each source class S, the scaling factor (α) required to convert the annual CO2 emissions from EDGAR (step 3) to the annual CO2 emissions estimate from GCB-NAE was derived as follows.

$$\alpha_{S,c} = \frac{E^{\mathrm{GCB\text{-}NAE}}_{S,c}}{E^{\mathrm{EDGAR}}_{S,c}}$$

Here E^GCB-NAE_{S,c} and E^EDGAR_{S,c} denote the annual national CO2 emissions for source class S and country c from GCB-NAE and EDGAR, respectively.

5. Apply annual scaling factors to monthly emission grids. The scaling factors for each nation and GCP-GridFED source class were applied to the national monthly CO2 emissions grids generated in step 2. The same scaling factor was used for all months. For the bunker fuels source class, the scaling factor was applied to the equivalent global data.
6. Collate national data to a global output. Scaled monthly CO2 emissions grids from all nations were merged into a single grid for each GCP-GridFED source class.
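The masking, summing, scaling and merging steps of the protocol can be sketched for a single source class and target year as follows. This is a minimal illustration under stated assumptions, not the released GCP-GridFED code: the function name `scale_to_national_totals`, the dictionary inputs and the toy 2 × 2 grid are all hypothetical, sector grouping (step 1) is assumed to be done already, and country masks are assumed not to overlap.

```python
import numpy as np

def scale_to_national_totals(edgar_monthly, country_masks, gcb_annual):
    """Scale a (12, nlat, nlon) reference-year grid for one source class
    so that each country's annual total matches its inventory value.

    country_masks: dict of country -> boolean (nlat, nlon) array
    gcb_annual:    dict of country -> national annual total
    Returns the merged global grid and the list of unscalable countries.
    """
    scaled = np.zeros_like(edgar_monthly, dtype=float)
    unscalable = []
    for country, mask in country_masks.items():
        # Step 2: extract the national subset of the monthly grids.
        national = np.where(mask, edgar_monthly, 0.0)
        # Step 3: national annual total in the reference grids.
        edgar_total = national.sum()
        if not edgar_total > 0:
            # Zero or NaN totals cannot be scaled to a target value.
            unscalable.append(country)
            continue
        # Step 4: one annual scaling factor per country and source class.
        alpha = gcb_annual[country] / edgar_total
        # Steps 5-6: apply the same factor to all months and merge.
        scaled += alpha * national
    return scaled, unscalable

# Toy example: two countries on a 2 x 2 grid, uniform monthly emissions.
edgar = np.ones((12, 2, 2))
masks = {
    "A": np.array([[True, True], [False, False]]),
    "B": np.array([[False, False], [True, True]]),
}
targets = {"A": 48.0, "B": 12.0}
scaled, skipped = scale_to_national_totals(edgar, masks, targets)
```

Because a single annual factor is applied to every month, the sketch preserves the monthly shape of the reference grids while forcing the annual national totals to the inventory values. Countries whose reference total is zero or missing are returned separately rather than scaled.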
We do not attempt to adjust the EDGARv4.3.2 grids (year 2010) for a range of historical changes to the spatial distribution of emissions, for instance due to the expansion of road networks or flight routes, the commissioning/decommissioning of facilities or large-scale population migration. The resolution of these issues will be prioritised in future developments to the GCP-GridFED protocol. We note that developments introduced in EDGARv5.0 (ref. 42) include refined spatial proxy records and national temporal profiles covering the period 1970-2012, which will support further developments to the GCP-GridFED protocol. Dedicated datasets of fuel-specific monthly CO2 emissions are also emerging for some countries, including India 57 and the USA 58, and could be used preferentially in the GCP-GridFED protocol. Additional sources such as the diffusive coal mine oxidation CO2, as derived for the dataset CHE-EDGARv4.3.2_FT2015 41,59,60, will also be considered.
We do not consider emissions of non-CO2 carbon species that later influence atmospheric CO2 (in particular, CO and CH4). Here all fossil carbon is assumed to be emitted as fossil CO2, whereas a fraction is in reality emitted as CO and later represents a diffuse fossil CO2 source after oxidation to CO2 (~1.8 Pg CO2-equivalent year⁻¹) 61. In GridFEDv2019.1, the diffuse nature of this CO2 source is not considered, and the source is instead placed at the surface at the time and location of oxidation. Meanwhile, fossil CH4 fugitive emissions represent an additional diffuse source of CO2 emissions (0.4 Pg CO2-equivalent year⁻¹) that is not considered here 62. These diffuse CO2 sources will also be considered in future developments to the GCP-GridFED protocol.

CO2 emissions uncertainty. We provide gridded uncertainties to complement all gridded layers of the GCP-GridFED dataset; however, we note here the incomplete nature of our uncertainty assessment. The gridded uncertainties are based on the total fossil CO2 emissions uncertainty assessment from the GCB 3, combined with variation in relative uncertainties across emission sectors from the recent TNO assessment (Table 2) 34. For uncertainties in national total CO2 emissions, we adopt the values presented in the uncertainty assessment of the GCB: 5% for the 42 Annex I countries that report annually to the UNFCCC 44 and 10% for other countries 3,63 (1σ). Annex I countries are assigned lower uncertainty because more detailed energy and activity statistics are available for these countries, and they are periodically reviewed externally 3. We used data presented in the TNO uncertainty assessment to evaluate the ratio of the uncertainties for each sector (U_TNO_s) to the uncertainty in total emissions (U_TNO_Tot). We then scaled the ratios to the uncertainties in total emissions that are adopted for Annex I and other countries from GCB-NAE.
Table 2 shows sectoral TNO uncertainty estimates and the resulting uncertainties adopted in GCP-GridFED for Annex I and other countries, for each sector.
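The sector-level scaling described above amounts to multiplying the ratio U_TNO_s / U_TNO_Tot by the relative uncertainty adopted for national totals (5% for Annex I countries, 10% for others). A minimal sketch follows; the function name `sector_uncertainty` and the example TNO values are hypothetical and are not taken from Table 2.

```python
import math

# Relative 1-sigma uncertainties in national totals adopted from GCB-NAE.
U_GCB_ANNEX_I = 0.05
U_GCB_OTHER = 0.10

def sector_uncertainty(u_tno_sector, u_tno_total, annex_i):
    """Relative uncertainty for one sector: the ratio of sectoral to
    total TNO uncertainty, scaled to the GCB national uncertainty."""
    u_gcb = U_GCB_ANNEX_I if annex_i else U_GCB_OTHER
    return (u_tno_sector / u_tno_total) * u_gcb

# Hypothetical sector whose TNO uncertainty is twice the TNO total.
u_annex_i = sector_uncertainty(0.06, 0.03, annex_i=True)
u_other = sector_uncertainty(0.06, 0.03, annex_i=False)
```

In this illustrative case the sector receives twice the national uncertainty: 10% in an Annex I country and 20% elsewhere.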
The gridded uncertainty estimates presented here do not include the uncertainties associated with the spatial or temporal (monthly) disaggregation of national emissions, nor do we present a formal assessment of those disaggregation uncertainties. We note that spatially-averaged uncertainties resulting from the spatial disaggregation of national emissions estimates to grid cells are on the order of 20-75% (1σ) at spatial resolutions of 1 km to 1° (refs. 53,64-67). Spatial disaggregation uncertainties occur due to incomplete proxy data coverage (e.g. unmapped or mislocated point sources), poorly constrained nonlinearities (e.g. differences in the emissions intensity between equally dense rural and urban populations), shortcomings in continuous proxy values (e.g. poorly constrained population density) or inappropriate spatial representativeness (e.g. the spatial representativeness of roadmaps for traffic volume). By construction, these uncertainties are larger for years distant from our reference year 2010 and at monthly resolution. Dedicated analyses of regional emissions at high temporal resolution are yielding new data with which to quantify temporal disaggregation uncertainties 57,68 and to assess the robustness of the temporal profiles employed here and elsewhere 42.

Table 7. Regional summary statistics relating to annual CO2 emissions from each source, from GridFEDv2019.1.
A full quantitative assessment of these issues, to support the development of comprehensive grid-level uncertainties associated with GCP-GridFED, will be the subject of future work. Overall, our approach to uncertainty quantification is broadly representative of the sectoral contributions to total emissions in each grid cell, which change throughout the time series. Inversion models may utilise these uncertainty grids but with the freedom to build more complex covariance structures to suit their requirements.
O2 Uptake. The relationship between CO2 and O2 fluxes during oxidation reactions can be expressed as an oxidative ratio (OR = flux of O2 from the atmosphere/flux of CO2 to the atmosphere, unitless) 36,69. The OR differs detectably between specific fossil fuel sources, holding a value of −1.17 for coal, −1.44 for oil, and −1.95 for natural gas 36,69. Uncertainties in OR are thought to be on the order of 2-3%; however, variations within fuel classes, such as different grades of coal, have not been studied extensively (ref. 35). Cement clinker production involves a calcination reaction rather than an oxidation reaction, and thus no exchange of oxygen occurs (OR = 0). GCP-GridFED calculates gridded estimates of the uptake of O2 during fossil fuel oxidation by applying OR values to the CO2 emissions estimates for each source.

We treat relative uncertainty in O2 uptake as equal to the relative uncertainty in CO2 emissions (U_GridFED).
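Applying the oxidative ratios to a CO2 emissions grid can be sketched as below. The OR values are those quoted in the text; the function names, the example grid and the assumption that fluxes are expressed in molar units (so that the unitless molar ratio applies directly) are illustrative choices, not details confirmed by the dataset documentation. Bunker fuels are omitted because no OR is quoted for them here.

```python
import numpy as np

# Oxidative ratios quoted in the text (O2 flux / CO2 flux, unitless).
OXIDATIVE_RATIO = {"COAL": -1.17, "OIL": -1.44, "GAS": -1.95, "CEMENT": 0.0}

def o2_uptake(co2_emissions, source):
    """O2 flux implied by a CO2 emissions grid for one source class.
    Negative values denote uptake of O2 from the atmosphere."""
    return OXIDATIVE_RATIO[source] * np.asarray(co2_emissions, dtype=float)

def o2_uncertainty(co2_emissions, relative_uncertainty, source):
    """Absolute O2 uncertainty, taking the relative uncertainty in O2
    to equal the relative uncertainty in CO2 emissions."""
    return np.abs(o2_uptake(co2_emissions, source)) * relative_uncertainty

# Hypothetical natural gas grid (molar units) with 5% relative uncertainty.
gas_co2 = np.array([[10.0, 0.0], [2.0, 4.0]])
gas_o2 = o2_uptake(gas_co2, "GAS")
gas_o2_unc = o2_uncertainty(gas_co2, 0.05, "GAS")
```

If the CO2 grids are stored in mass units, a conversion to moles would be needed before applying the ratio and again when reporting O2 in mass units.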

Data Records
All GCP-GridFEDv2019.1 output grids can be accessed via the Zenodo data repository 70 [...] emissions from each source class (COAL, OIL, GAS, CEMENT, BUNKER) with the units shown in Table 3. Each file contains 1,088,640,000 unique data points. All 60 NetCDF files are contained within a .zip archive named "GCP-GridFEDv2019.1_monthly.zip".
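The stated size of each monthly file can be checked with simple arithmetic. The sketch below assumes that a global 0.1° grid has 1800 × 3600 cells, that each monthly file holds 12 time steps, and that every variable spans the full grid; the implied number of variables per file is an inference from these assumptions rather than a documented figure.

```python
# Assumed geometry of one monthly file on a global 0.1 degree grid.
N_MONTHS = 12
N_LAT = 1800   # 180 degrees / 0.1 degrees
N_LON = 3600   # 360 degrees / 0.1 degrees
STATED_POINTS_PER_FILE = 1_088_640_000

points_per_variable = N_MONTHS * N_LAT * N_LON
implied_variables, remainder = divmod(STATED_POINTS_PER_FILE, points_per_variable)
```

Under these assumptions the stated total divides exactly into whole variables, which is a useful sanity check after downloading and opening a file.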
The data records also include 1 file in NetCDF format, "GCP_Global_Annual.nc". The NetCDF file includes 3 dimensions: time (year expressed as days since 1959-01-01, n = 60); latitude (Degrees North of the equator [cell centres], n = 1800); longitude (Degrees East of the Prime Meridian [cell centres], n = 3600). The NetCDF file includes 4 groups representing CO2 emissions, CO2 emissions uncertainty, O2 uptake and O2 uptake uncertainty (CO2, CO2_uncertainty, O2 and O2_uncertainty, respectively). Each group contains 5 variables representing emissions from each source class (COAL, OIL, GAS, CEMENT, BUNKER) with the units shown in Table 4. The file contains 6,998,400,000 unique data points. The NetCDF file is contained within a .zip archive named "GCP-GridFEDv2019.1_annual.zip".
"GCP-GridFEDv2019.1_monthly.zip" and "GCP-GridFEDv2019.1_annual.zip" can be found within a parent .zip file named "GCP-GridFEDv2019.1.zip". All grids are bottom-left arranged with coordinates referenced to the prime meridian and the equator.

Technical Validation
We provide Figs. 2-12, the summary statistics in Tables 4-7 and Online-Only Table 1 to outline the key features of GCP-GridFEDv2019.1 and assist with its technical validation.
GCP-GridFED is designed to distribute national annual emissions from GCB-NAE over a spatio-temporal grid based on EDGARv4.3.2. We validated the outputs from GCP-GridFED by comparing the global annual emissions from the output grids (the sum of emissions across the global grid) with the input data supplied to the gridding protocol from GCB-NAE. Across all source classes, the global annual emissions totals from GCP-GridFED were within 0.0077% of the GCB-NAE input data throughout the time series (Figs. 2 and 3). The discrepancies were caused by unscalable (zero or NoData) values in sectors of the EDGAR dataset at the national level in 13 countries (EDGAR data summed within the national masks as per Eq. 10). These 13 countries make a small contribution to total global emissions (0.047% in 2018). Online-Only Table 1 provides national-level comparisons of the emissions estimates from GCP-GridFED and GCB-NAE. For the 13 countries where maximum absolute discrepancies exceeded 1% of GCB-NAE emissions, we provide a brief description of the cause of the discrepancy. The GCP-GridFED outputs are robust to within 0.0001% of GCB-NAE values in 195 countries, plus bunker fuels, comprising 99.9% of global emissions in 2018. Hence, we conclude that GCP-GridFEDv2019.1 is consistent with GCB-NAE emissions estimates for the years 1959-2018.
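The consistency check described above compares a gridded total with the corresponding inventory total. A minimal sketch of such a check is given below; the function name `percent_discrepancy` and the toy numbers are hypothetical, and the 0.0077% figure is used only as the reference value reported in the text.

```python
import numpy as np

def percent_discrepancy(grid, inventory_total):
    """Absolute difference between the summed grid and the inventory
    total, expressed as a percentage of the inventory total."""
    gridded_total = float(np.nansum(grid))
    return 100.0 * abs(gridded_total - inventory_total) / inventory_total

# Toy example: a (12, 2, 2) grid summing to 48 against a total of 48.002.
toy_grid = np.full((12, 2, 2), 1.0)
discrepancy = percent_discrepancy(toy_grid, 48.002)
within_reported_bound = discrepancy <= 0.0077
```

The same function can be applied per country by masking the grid first, which mirrors the national-level comparison reported in Online-Only Table 1.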
We also observed a close match between the seasonality seen in the year 2010 in the GCP-GridFED dataset and that seen in the same year of the EDGAR input data, both at the global scale and in large austral and boreal extratropical nations (Fig. 12). This coherence indicates that the seasonality seen in the EDGAR dataset was preserved by the GCP-GridFED protocol. Inter-annual variability in the monthly distribution of emissions can be seen most prominently in the EU27 + UK. Note that EDGARv4.3.2 does not feature monthly variability in emissions for tropical countries 41, and so GCP-GridFED also shows no seasonality in these countries.

Usage Notes
The data are intended for use as a prior in inversion model studies, which may wish to incorporate individual priors for each source class or to use total gridded emissions. The data records contain a layer for each source class. Global total emissions can be calculated as the sum of emissions across the 5 source classes. National total emissions estimates should be calculated as the sum of coal, oil, gas and cement emissions (bunker fuel emissions should not be included in national emissions totals) 71.

Fig. 11 Gridded (0.1° × 0.1°) estimates of relative uncertainty in total CO2 emissions for four years of the GCP-GridFEDv2019.1 time series. Uncertainty in total emissions is aggregated from the sector-level estimates (see Table 2). The uncertainty estimates account for uncertainty across national emission reports and spatial differences in the sectoral breakdown of total emissions in each grid cell, which changes throughout the time series; however, they exclude uncertainties associated with the spatial or temporal (monthly) disaggregation of national emissions (see 'CO2 emissions uncertainty'). Aggregation of uncertainties to a coarser resolution should account for the non-independence of gridded emissions uncertainties.
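The summation rules for global and national totals given in these notes can be sketched as follows. The source class names match the variable names listed in the data records; the `sum_layers` helper and the toy arrays are hypothetical, and reading the layers from the NetCDF files is left out.

```python
import numpy as np

NATIONAL_CLASSES = ("COAL", "OIL", "GAS", "CEMENT")
ALL_CLASSES = NATIONAL_CLASSES + ("BUNKER",)

def sum_layers(layers, classes):
    """Sum the gridded layers for the requested source classes."""
    return sum(np.asarray(layers[name], dtype=float) for name in classes)

# Toy 1 x 2 grids standing in for the five source class layers.
layers = {
    "COAL": [[1.0, 2.0]],
    "OIL": [[3.0, 4.0]],
    "GAS": [[5.0, 6.0]],
    "CEMENT": [[0.5, 0.5]],
    "BUNKER": [[1.0, 0.0]],
}
global_total = sum_layers(layers, ALL_CLASSES)
national_basis = sum_layers(layers, NATIONAL_CLASSES)  # excludes bunker fuels
```

Keeping the two tuples separate makes it harder to include bunker fuels in a national total by accident.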
GCP-GridFED will be updated annually and made available for the inversion model runs conducted annually as part of the GCP assessment of the GCB. An updated version of GCP-GridFED (GCP-GridFEDv2020.1) was already made available upon request to support the inversion model runs of the GCP's 2020 GCB assessment 72,73 and is now publicly available 74. GCP-GridFEDv2020.1 is based on the emissions estimates from a preliminary release of GCB-NAE covering the years 1959-2019 with input data available to June 2020. Further updates will be issued as the GCB-NAE data are updated 74.
When using GCP-GridFED as a prior in inversion models operating at a coarser resolution, aggregation to the required resolution should account for the non-independence of gridded emissions uncertainties. See 'CO2 Emissions Uncertainty' for further information regarding our treatment of spatial and temporal aggregation in GCP-GridFED.
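One way to aggregate a 0.1° grid to a coarser resolution is a block sum, sketched below with NumPy. The treatment of uncertainty is an illustrative bracketing rather than the authors' method: summing absolute uncertainties linearly assumes fully correlated errors, while summing in quadrature assumes independent errors and will understate the aggregated uncertainty when grid cells are not independent. Function names and the toy grid are hypothetical.

```python
import numpy as np

def block_sum(grid, factor):
    """Aggregate a 2-D grid by summing non-overlapping factor x factor blocks."""
    nlat, nlon = grid.shape
    if nlat % factor or nlon % factor:
        raise ValueError("grid shape must be divisible by the factor")
    return grid.reshape(nlat // factor, factor, nlon // factor, factor).sum(axis=(1, 3))

def aggregate_uncertainty_bounds(abs_uncertainty, factor):
    """Return (independent, fully correlated) aggregated absolute uncertainties."""
    independent = np.sqrt(block_sum(abs_uncertainty ** 2, factor))
    correlated = block_sum(abs_uncertainty, factor)
    return independent, correlated

# Toy 4 x 4 grid aggregated by a factor of 2, with 10% relative uncertainty.
emissions = np.full((4, 4), 2.0)
abs_unc = 0.1 * emissions
coarse_emissions = block_sum(emissions, 2)
unc_independent, unc_correlated = aggregate_uncertainty_bounds(abs_unc, 2)
```

For a 0.1° to 1° aggregation the factor would be 10. Because cells within a country share a common national scaling, the fully correlated bound is likely the more cautious choice for cells within the same country.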

Code availability
The code used to perform all steps described here and shown in Fig. 1 can be accessed via the Zenodo dataset repository entry for GCP-GridFEDv2020.1 (https://doi.org/10.5281/zenodo.4277267) 74 . GCP-GridFEDv2020.1 uses the same code and methodology as GCP-GridFEDv2019.1 but includes updated estimates of national annual emissions through to 2019 from the GCP, as discussed in the Usage Notes and also detailed at ref. 74 .