Carbon Monitor, a near-real-time daily dataset of global CO2 emission from fossil fuel and cement production

We constructed a near-real-time daily CO2 emission dataset, the Carbon Monitor, to monitor the variations in CO2 emissions from fossil fuel combustion and cement production since January 1, 2019, at the national level, with near-global coverage on a daily basis and the potential to be frequently updated. Daily CO2 emissions are estimated from a diverse range of activity data, including the hourly to daily electrical power generation data of 31 countries, monthly production data and production indices of industry processes of 62 countries/regions, and daily mobility data and mobility indices for the ground transportation of 416 cities worldwide. Individual flight location data and monthly data were utilized for aviation and maritime transportation sector estimates. In addition, monthly fuel consumption data corrected for the daily air temperature of 206 countries were used to estimate the emissions from commercial and residential buildings. This Carbon Monitor dataset manifests the dynamic nature of CO2 emissions through daily, weekly and seasonal variations as influenced by workdays and holidays, as well as by the unfolding impacts of the COVID-19 pandemic. The Carbon Monitor near-real-time CO2 emission dataset shows an 8.8% decline in CO2 emissions globally from January 1st to June 30th in 2020 when compared with the same period in 2019 and detects a regrowth of CO2 emissions by late April, which is mainly attributed to the recovery of economic activities in China and a partial easing of lockdowns in other countries. This daily updated CO2 emission dataset could offer a range of opportunities for related scientific research and policy making.
Measurement(s): carbon dioxide emission
Technology Type(s): computational modeling technique
Factor Type(s): geographic location • sector • temporal interval
Sample Characteristic - Environment: climate system
Sample Characteristic - Location: global

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.12994058

The Carbon Monitor near-real-time CO2 emission dataset shows an 8.8% decline in CO2 emissions globally from January 1st to June 30th in 2020 when compared with the same period in 2019 (Fig. 2), and detects a regrowth of CO2 emissions by late April, which is mainly attributed to the recovery of economic activities in China and a partial easing of lockdowns in other countries (for a more in-depth analysis of this topic, please see our recent related paper 28).

Methods
Annual total and sectoral emissions per country in the baseline year 2019. According to the IPCC guidelines for emissions reporting 4, the CO2 emissions Emis should be calculated by multiplying the activity data AD by the corresponding emission factors EF:

$$Emis = \sum_i \sum_j \sum_k AD_{i,j,k} \cdot EF_{i,j,k} \tag{1}$$

where i, j, k are indices for regions, sectors and fuel types, respectively. EF can be further separated into the net heating value v for each fuel type (the energy obtained per unit of fuel), the carbon content c per energy output (t C/TJ) and the oxidization rate o (the fraction (in %) of fuel oxidized during combustion):

$$Emis = \sum_i \sum_j \sum_k AD_{i,j,k} \cdot \left(v_{i,j,k} \cdot c_{i,j,k} \cdot o_{i,j,k}\right) \tag{2}$$

Due to the lag of more than two years in the publication of governmental energy statistics, we started from the most recent CO2 emissions estimates up to 2018 from current CO2 databases 1,9-11. For 2019, we completed this information by obtaining annual total emissions based on data in the literature and disaggregated the annual total into daily emissions (see below). For 2020, we estimated daily CO2 emissions by using daily changes in activity data in 2020 compared to 2019.

The CO2 emissions and sectoral structure in 2018 for countries and regions were extracted from EDGAR V4.3.2 1 and V5.0 29 for each country, and national emissions were scaled to 2019 based on our own estimate (for China) and data from the Global Carbon Budget 2019 24 (for other countries) (Table 1):

$$Emis_{r,2019} = \theta_r \cdot Emis_{r,2018} \tag{3}$$

where $\theta_r$ is the ratio of 2019 to 2018 emissions (one plus the annual growth rate) for country/region r.

For China, we first calculated CO2 emissions in 2018 based on the energy consumption by fuel type and on cement production in 2018 from the China Energy Statistical Yearbook 30 and the National Bureau of Statistics 31, following Eq. 1. We projected the energy consumption in 2019 from the annual growth rates of coal, oil and gas reported by the Statistical Communiqué 31 and applied China-specific emission factors 17 to obtain the annual growth rate of emissions in 2019.
We projected China's CO2 emissions based on our previous studies on the country-specific emission factor data in China 17 and past trends of China's CO2 emissions 17,32. Our projection (2.6%) is similar to the GCP revised projection 33 (2.0%), and the slight difference falls within the uncertainty range of both estimates. For the US and EU27&UK, we used updated emissions growth rates in 2019 reported by CarbonBrief 33. For countries with no estimates for emissions growth rates in 2019, such as Russia, Japan and Brazil, we assumed that their growth rate was 0.5% based on the emission growth rate of the rest of the world 24. As a result, our daily average CO2 estimate in 2019 (98.2 Mt CO2 per day) is lower than the corresponding estimate from GCP (around 100 Mt CO2 per day). The discrepancy mainly comes from the difference between the EDGAR and GCP databases, which differ by 2.1% globally in 2018.
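As a minimal sketch of the baseline calculation described above, the following Python snippet evaluates the activity-data formula for a list of fuel records and then scales a 2018 total to 2019 with a growth rate. The record values, the class and function names, and the conversion from carbon mass to CO2 mass (factor 44/12) are illustrative assumptions and are not taken from the dataset code.

```python
from dataclasses import dataclass

# Assumption: the product AD * v * c * o gives a carbon mass (t C),
# which is converted to CO2 mass with the molar mass ratio 44/12.
C_TO_CO2 = 44.0 / 12.0


@dataclass
class FuelUse:
    """One (region, sector, fuel) record; all field values are hypothetical."""
    activity_t: float      # AD: fuel consumed (t)
    heating_value: float   # v: net heating value (TJ per t of fuel)
    carbon_content: float  # c: carbon content (t C per TJ)
    oxidation: float       # o: oxidized fraction, expressed as 0-1


def co2_emissions(records):
    """Sum AD * v * c * o over all records and convert t C to t CO2."""
    carbon = sum(
        r.activity_t * r.heating_value * r.carbon_content * r.oxidation
        for r in records
    )
    return carbon * C_TO_CO2


def scale_to_2019(emis_2018, growth_rate):
    """Scale a 2018 national total to 2019 with an annual growth rate."""
    return emis_2018 * (1.0 + growth_rate)


if __name__ == "__main__":
    demo = [FuelUse(1000.0, 0.02, 25.0, 1.0)]
    print(co2_emissions(demo))
    print(scale_to_2019(100.0, 0.026))
```

The growth rate of 0.026 in the usage lines mirrors the 2.6% projection quoted for China; any other country would use its own rate.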
In this study, the EDGAR detailed sectors were aggregated into several larger sectors (s): the power sector, industrial sector, transport sector (ground transport, aviation and shipping), and residential sector; the percentage estimates for the various sectors are derived from the EDGAR database. This aggregation is consistent with the new activity data we used below to compute daily variations. We used the sectoral distribution in 2018 from EDGAR to infer the sectoral emissions in 2019 for each country/region (Eq. 4), assuming that the sectoral distribution remained unchanged in these two years.

Data acquisition and processing of Carbon Monitor daily CO2 emissions. According to the IPCC Guidelines 4, the CO2 emissions for each sector can be calculated by multiplying sectoral activity data by their corresponding emission factors following Eq. 5:

$$Emis_s = AD_s \times EF_s \tag{5}$$

The emissions were calculated following this equation separately for the power sector, the industrial sector, the transport sector, and the residential sector.
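The assumption that the sectoral distribution stays unchanged between 2018 and 2019 can be illustrated with a short Python sketch. The function name and the sector values are hypothetical; only the proportional split itself follows the text.

```python
def sector_emissions_2019(total_2019, sector_2018):
    """Split a 2019 national total using the 2018 sectoral shares.

    sector_2018 maps sector name -> 2018 emissions (same unit as total_2019).
    """
    total_2018 = sum(sector_2018.values())
    if total_2018 <= 0:
        raise ValueError("2018 sectoral emissions must sum to a positive value")
    return {s: total_2019 * e / total_2018 for s, e in sector_2018.items()}


if __name__ == "__main__":
    shares_2018 = {"power": 40.0, "industry": 30.0,
                   "transport": 20.0, "residential": 10.0}
    print(sector_emissions_2019(102.6, shares_2018))
```

By construction the sectoral values sum to the 2019 national total, so any change in sectoral structure between the two years is not captured by this step.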
Power sector. The CO2 emissions from the power sector can be calculated by adapting Eq. 5 with sector-specific activity data (i.e., electricity production in Russia and thermal electricity production in other countries) and the corresponding emission factors (Eq. 6):

$$Emis_{power} = AD_{power} \times EF_{power} \tag{6}$$

Normally, the emission factors change slightly over time, but these changes are small compared to the large changes in activity data over the two-year period considered in this study. We present uncertainties due to changes in the fuel mix of thermal production (see Technical Validation). Thus, we assumed that emission factors remained unchanged in 2019 and 2020 and calculated the daily emissions as follows:

$$Emis_{daily} = Emis_{yearly} \times \frac{AD_{daily}}{AD_{yearly}} \tag{7}$$

For China, we estimated the daily thermal production $AD_{daily}$ by disaggregating the monthly thermal production $AD_{monthly}$ using the daily coal consumption of six power companies in China, $C_{daily}$, as follows:

$$AD_{daily} = AD_{monthly} \times \frac{C_{daily}}{C_{monthly}} \tag{8}$$

where $C_{monthly}$ is the sum of $C_{daily}$ over the corresponding month.

The data sources of daily activity data in the power sector are described in Table 2. The countries/regions listed in Table 2 account for more than 74% of the total CO2 emissions in the power sector. For emissions from other countries (rest of the world, ROW), which are not listed in Table 2, we estimated the power sector emission changes in 2020 based on the periods of national lockdown. For daily emission changes for the ROW in 2019, we first assumed a linear relationship between daily global emissions and daily total emissions for the countries listed in Table 2. Then, we classified each country according to whether it adopted lockdown measures based on official reports. Based on daily emissions data for the power sector of the countries listed in Table 2, we calculated the average percent change α of power sector emissions across those countries during their lockdown periods, and used it to estimate the emission reduction of each country in the rest of the world during its specific national lockdown (Eq. 9, where c denotes a country in the ROW and d denotes the day of 2020), and aggregated the results into daily emissions for each ROW country.

Industrial sector: Industrial and cement production. While daily production data are not directly available for industrial and cement production, the monthly CO2 emissions from the industrial and cement production sectors can be calculated by using monthly statistics of industrial production, and daily data of electricity generation can then be used to disaggregate the monthly CO2 emissions into daily values. This calculation assumes a linear relationship between daily electricity generation for industry and daily industry production data to compute daily industry production. The emissions from fossil fuel combustion in industrial production were calculated by multiplying the activity data (i.e., fossil fuel consumption data in the industrial sector) by the corresponding emission factors by type of fuel. Due to limited data availability, we assumed a linear relationship between daily industrial production and industrial fossil fuel use, and that the emission factors remained unchanged. Therefore, the monthly emissions in 2019 for a country/region can be calculated by the following equation:

$$Emis_{monthly,2019} = Emis_{yearly,2019} \times \frac{P_{monthly,2019}}{P_{yearly,2019}} \tag{10}$$

where $P_{monthly,2019}$ and $P_{yearly,2019}$ are the monthly and annual industrial production in 2019. The emissions from cement production during the chemical process of calcination of calcite were also calculated with Eq. (10).
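Both the power and industrial sectors rely on the same idea: a period total is spread over days in proportion to a daily activity proxy. The Python sketch below shows that proportional disaggregation, together with one possible way to apply an average lockdown change α to a baseline series for a ROW country. The function names, the input values and the exact form of the lockdown adjustment are assumptions made for illustration.

```python
def disaggregate(period_total, daily_activity):
    """Spread a yearly or monthly total over days by each day's activity share."""
    activity_sum = sum(daily_activity)
    if activity_sum <= 0:
        raise ValueError("daily activity must sum to a positive value")
    return [period_total * a / activity_sum for a in daily_activity]


def apply_lockdown_change(baseline_daily, lockdown_days, alpha):
    """Scale baseline daily emissions by (1 + alpha) on lockdown days only.

    baseline_daily: daily emissions without lockdown (hypothetical values)
    lockdown_days:  set of zero-based day indices under national lockdown
    alpha:          average relative change during lockdown, e.g. -0.2
    """
    return [
        e * (1.0 + alpha) if d in lockdown_days else e
        for d, e in enumerate(baseline_daily)
    ]


if __name__ == "__main__":
    # Monthly thermal production spread with a daily coal-consumption proxy.
    print(disaggregate(100.0, [1.0, 1.0, 2.0]))
    print(apply_lockdown_change([10.0, 10.0, 10.0], {1}, -0.2))
```

Because the weights are normalized by their own sum, the daily values always add up to the period total, whatever the unit of the proxy.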
Specifically, for China, the emissions from the industrial sector were further divided into those for the steel, cement, chemical, and other industries (indicated by index i). Here, P is the industrial production for the different industrial sectors (in China) or the total industrial production index (in other countries), as listed in Table 3. In China's case, the January and February estimates were combined, as individual monthly data were not reported by the sources listed in Table 3 for these two months. The monthly industrial emissions were disaggregated to daily emissions using daily electricity data, as explained above. Lacking the latest Industrial Production Index for June 2020 for the EU27 & UK and India, we adopted monthly growth rates of industrial output from Trading Economics (https://tradingeconomics.com) based on preliminary survey data. CO2 emissions from the countries listed in Table 3 account for more than 71% of the global industrial emissions. For other countries not listed in Table 3, we used the same method described for the power sector to calculate the daily industry emissions from the ROW.
To allocate monthly emissions into daily emissions, we assume a linear relationship between daily industry activity and daily electricity production, and use the ratio of daily to monthly electricity production as the weight:

$$Emis_{daily} = Emis_{monthly} \times \frac{Elec_{daily}}{Elec_{monthly}} \tag{13}$$

where $Elec$ denotes electricity production.

Transport sector. Ground transportation. We collected hourly congestion level data from the TomTom website (https://www.tomtom.com/en_gb/traffic-index/). The congestion level (hereafter called X) represents the extra time spent on a trip, as a percentage, compared to uncongested conditions. TomTom congestion level data were obtained for 416 cities across 57 countries (Online-only Table 1) at a temporal resolution of one hour. Of note, a zero congestion level means that the traffic is fluid or 'normal' but does not mean that there are no vehicles and zero emissions. It is thus important to identify the lower threshold of emissions when the congestion level is zero. To do so, we compared the time series of the daily mean TomTom congestion level X with the daily mean car flux Q (in vehicles per day) from publicly available real-time data from an average of 60 roads in the megacity area of Paris. The daily mean car counts were reported by the city's service (https://opendata.paris.fr/pages/home/). We used a sigmoid function to fit the relationship between X and Q (Eq. 14), where a, b, c and d are the regression parameters (Table 4). We verified that the empirical fit from Eq. 14 can reproduce the observed large drop in Q due to the lockdown in Paris and the recovery afterwards. We assume that relative changes in daily emissions were proportional to the relative change in the function Q(X) from Eq. 14.
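To make the role of the fitted function concrete, the Python sketch below evaluates a four-parameter sigmoid and derives the relative change in car flux between two congestion levels. The functional form used here (a Hill-type curve) and the parameter values are assumptions chosen for illustration; the fitted expression and the regression parameters of the paper are those of Eq. 14 and Table 4.

```python
def car_flux(x, a, b, c, d):
    """Hypothetical sigmoid Q(X) = a + b * X**c / (d**c + X**c).

    x is the congestion level (%). With this form Q(0) = a, so a plays the
    role of the lower flux threshold at zero congestion.
    """
    if x < 0:
        raise ValueError("congestion level must be non-negative")
    return a + b * x ** c / (d ** c + x ** c)


def relative_emission_change(x_now, x_ref, params):
    """Relative change in emissions, taken as proportional to Q(X)."""
    q_now = car_flux(x_now, *params)
    q_ref = car_flux(x_ref, *params)
    return q_now / q_ref - 1.0


if __name__ == "__main__":
    params = (100.0, 200.0, 2.0, 30.0)  # illustrative a, b, c, d
    print(car_flux(0.0, *params))
    print(relative_emission_change(0.0, 30.0, params))
```

The key property illustrated is that a congestion level of zero still maps to a positive flux, which is why a lower threshold has to be identified before congestion can be converted into emission changes.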
Then, we applied the function Q(X) established for Paris to the other cities included in the TomTom dataset, assuming that the relative magnitude of car counts (and thus emissions) follows a similar relationship with the TomTom congestion level. The emission changes were first calculated for individual cities and then weighted by city emissions for aggregation to national changes. For a specific country i with n cities reported by TomTom, the national daily vehicle flux for day j was given by this emission-weighted aggregate over its n cities. The TomTom GPS products include devices for cars (https://www.tomtom.com/en_us/drive/car/), motorcycles (https://www.tomtom.com/en_us/drive/motorcycle/) and large vehicles (https://www.tomtom.com/en_us/drive/truck/). Although we did not find more information about how the congestion index is calculated, we believe that the calculation of the congestion level includes data from private and commercial cars and from light and heavy vehicles. In this study, we did not compute the emissions separately for light and heavy vehicles, because 1) the EDGAR emission product for "road transport" (sector 1A3b), which we used as a reference emission product, did not separate these two types of vehicles; and 2) the TomTom congestion level is not reported for light and heavy vehicles separately. We therefore implicitly assume that both scale similarly with the TomTom congestion level.
For countries not included in the TomTom dataset, we assumed that the emission changes follow the mean changes of other countries. For example, Cyprus, an EU member country, had no city reported in the TomTom dataset, so its relative emission change was assumed to follow the same pattern as the total emissions from the other EU countries included in the TomTom dataset (which cover 98% of total EU emissions). Similarly, the relative changes in emissions for countries in the ROW not reported by TomTom were assumed to follow the same pattern as the total emissions from all TomTom-reported countries (which cover 85% of global total emissions).
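The aggregation from cities to a national value, weighted by city emissions, can be sketched in Python as follows. The city names, emission weights and change values are hypothetical, and the function is only one straightforward reading of the weighting described above.

```python
def national_change(city_changes, city_emissions):
    """Emission-weighted mean of city-level relative changes for one day.

    city_changes:   city -> relative change in flux/emissions (e.g. -0.4)
    city_emissions: city -> reference emissions used as weights
    """
    total_weight = sum(city_emissions[c] for c in city_changes)
    if total_weight <= 0:
        raise ValueError("city emission weights must sum to a positive value")
    weighted = sum(city_changes[c] * city_emissions[c] for c in city_changes)
    return weighted / total_weight


if __name__ == "__main__":
    changes = {"city_a": -0.4, "city_b": -0.1}
    weights = {"city_a": 3.0, "city_b": 1.0}
    print(national_change(changes, weights))
```

A country without any TomTom city would simply inherit the value computed for its reference group, for example the other EU countries in the case of Cyprus.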
Aviation. CO2 emissions from commercial aviation are usually reconstructed from bottom-up emission inventories based on knowledge of the parameters of individual flights 34,35. We also calculated the CO2 emissions from commercial aviation following this approach. Individual commercial flights are tracked by Flightradar24 (https://www.flightradar24.com) based on ADS-B signals emitted by aircraft and received by its network of ADS-B receivers. As we do not yet have the capability to convert the Flightradar24 database into CO2 emissions on a flight-by-flight basis, we compute CO2 emissions by assuming a constant $EF_{aviation}$ (CO2 emission factor per km flown) across the whole fleet of aircraft (regional, narrowbody passenger, widebody passenger and freight operations). This assumption is reasonable if the flight mix between these categories has not changed significantly between 2019 and 2020. The International Council on Clean Transportation (ICCT) reported that commercial freight and passenger aviation emitted 918 Mt CO2 in 2018 36, based on the OAG flight database and emission factors from the PIANO database. IATA estimated a 3.4% increase between 2018 and 2019 in terms of available seat kilometers 37. In the absence of further information, we consider this increase to be representative of freight aviation as well and use a slightly smaller growth rate of 3% for CO2 emissions between 2018 and 2019 to account for a small increase in fuel efficiency. The kilometers flown are computed assuming great circle distances between the take-off, cruising, descent and landing points for each flight and are cumulated over all flights.
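The distance computation can be illustrated with a short Python sketch that sums great-circle (haversine) distances between successive waypoints of a flight and multiplies the total by a constant emission factor. The waypoint coordinates, the emission factor value passed in the usage lines, the mean Earth radius of 6371 km and the function names are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def flight_distance_km(waypoints):
    """Cumulated distance along (lat, lon) waypoints of one flight."""
    return sum(
        great_circle_km(p[0], p[1], q[0], q[1])
        for p, q in zip(waypoints, waypoints[1:])
    )


def daily_aviation_co2(flights, ef_per_km):
    """Distance flown over all flights of a day times a constant factor."""
    return ef_per_km * sum(flight_distance_km(w) for w in flights)


if __name__ == "__main__":
    flight = [(0.0, 0.0), (0.0, 45.0), (0.0, 90.0)]  # hypothetical track
    print(flight_distance_km(flight))
    print(daily_aviation_co2([flight], ef_per_km=2.0))  # placeholder factor
```

Using a single factor for all flights is exactly the simplification stated in the text, so changes in the fleet mix between years would not be reflected.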
The Flightradar24 database has incomplete data for some flights and may completely miss a small fraction of actual flights, so we scale the ICCT estimate of CO2 emissions (inflated by 3% for 2019) with the total estimated number of kilometers flown for 2019 (67.91 million km) and apply this scaling factor to the 2020 data. Again, this assumes that the fraction of missed flights is the same in 2019 and 2020. As the departure and landing airports are known for each flight, we can classify the km flown (and hence the CO2 emissions) per country and, for each country, between domestic and international traffic. The daily CO2 emissions were computed as the product of the distance flown and a CO2 emission factor per km flown:

$$Emis_{aviation} = D_{flown} \times EF_{aviation}$$

where $D_{flown}$ is the total distance flown on a given day (km).

[...] the whole year. Given this, we estimated the shipping emissions for the first half year of 2019 using $R_{month}$ equal to 181/365.
We assumed that the change in shipping emissions was linearly related to the change in ship traffic volume. The change in international shipping emissions for the first half year of 2020 was calculated according to the following equation:

$$Emis_{shipping,2020} = Emis_{shipping,2019} \times \left(1 + c_{index}\right)$$

where $c_{index}$ represents the relative change in shipping emissions, estimated to the end of April as −25% compared to the same period of the previous year according to a news report 40.

Residential sector: residential and commercial buildings. Daily fuel consumption data for this sector are not available. Several studies 41,42 showed that the main source of daily and monthly variability in this sector is climate; namely, heating emissions increase when the temperature falls below a threshold that depends on the region, building type and people's habits. We calculated emissions by assuming annual totals unchanged from 2019 and using daily climate information in three steps: 1) estimate population-weighted heating degree days for each country and for each day based on the ERA5 43 reanalysis of 2-meter air temperature, 2) split residential emissions into two parts: cooking emissions and heating emissions based on the EDGAR database 29 and using the EDGAR estimates of 2018 residential emissions as the baseline. Emissions from cooking were assumed to remain independent of temperature, and those from heating were assumed to be a function of the heating demand. The main assumption in this approach is that residential emissions did not change based on factors other than heating degree day variations in 2020, although people's time at home dramatically increased during the lockdown period. To test the validity of this assumption, we compiled daily natural gas consumption data for residential and commercial buildings in France (https://www.smart.grtgaz.com/fr/consommation) during 2019 and 2020 (Fig. 3); unfortunately, such data could not be collected in many countries.
Natural gas consumption in kWh per day was transformed to CO2 emissions using an energy content of 10.55 kWh per m3 and a molar volume of 22.4 × 10−3 m3 per mole.
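The unit conversion can be written out as a small Python sketch. The two constants quoted in the text are used as given; the additional assumptions, which are not stated in the source, are that each mole of gas yields one mole of CO2 on combustion (as for pure methane) and that the molar mass of CO2 is 44.01 g per mole.

```python
KWH_PER_M3 = 10.55         # energy content of natural gas (from the text)
MOLAR_VOLUME_M3 = 22.4e-3  # m3 per mole (from the text)
CO2_MOLAR_MASS_G = 44.01   # g per mole of CO2 (assumption, standard value)


def gas_kwh_to_co2_kg(kwh):
    """Convert natural gas consumption (kWh) to CO2 emissions (kg).

    Assumes one mole of CO2 per mole of gas burned, i.e. pure methane.
    """
    volume_m3 = kwh / KWH_PER_M3
    moles = volume_m3 / MOLAR_VOLUME_M3
    return moles * CO2_MOLAR_MASS_G / 1000.0


if __name__ == "__main__":
    print(gas_kwh_to_co2_kg(10.55))  # CO2 from one cubic metre of gas
```

Since the conversion is a fixed multiplicative factor, it changes the unit of the French consumption series but not its temporal pattern.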
First, we verified that the temporal variation in those 'true' residential CO2 emissions was similar to that given by Eqs. 20-22. Second, after fitting a piecewise model to those natural gas residential emission data using ERA5 air temperature data, we removed the effect of temperature to obtain emissions corrected for temperature effects. Even though the lockdown was very strict in France, we found no significant emissions anomaly, meaning that although nearly the entire population was confined at home, this neither increased nor decreased emissions. This complementary analysis tentatively suggests that residential emissions in other countries can be well approximated by Eqs. 20-22 based only on temperature during the lockdown period.
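The temperature-driven allocation of residential emissions can be sketched in Python under explicit assumptions. The snippet computes heating degree days with a hypothetical reference temperature of 18 °C, keeps cooking emissions constant across days and distributes heating emissions in proportion to the daily heating degree days. This is an illustrative reading of the approach described above, not a transcription of Eqs. 20-22, and population weighting is omitted for brevity.

```python
def heating_degree_days(temps_c, t_ref=18.0):
    """Daily heating degree days for a list of mean temperatures (deg C).

    t_ref is a hypothetical threshold; the text notes that it depends on
    region, building type and habits.
    """
    return [max(t_ref - t, 0.0) for t in temps_c]


def daily_residential(annual_total, cooking_share, temps_c, t_ref=18.0):
    """Split a period total into daily values.

    Cooking emissions are spread evenly; heating emissions follow the
    share of each day's heating degree days.
    """
    n = len(temps_c)
    hdd = heating_degree_days(temps_c, t_ref)
    hdd_total = sum(hdd)
    cooking = annual_total * cooking_share
    heating = annual_total - cooking
    daily = []
    for h in hdd:
        # Fall back to an even split if no day requires heating.
        heat = heating * h / hdd_total if hdd_total > 0 else heating / n
        daily.append(cooking / n + heat)
    return daily


if __name__ == "__main__":
    print(daily_residential(100.0, 0.2, [8.0, 18.0, 13.0]))
```

In this form the period total is conserved, so only the timing of emissions within the period responds to temperature.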

Data Records
Currently, there are 36,177 data records provided in this dataset, which can be downloaded from our website (https://carbonmonitor.org) and Figshare 45:
• A total of 270 records are daily mean CO2 emissions (from fossil fuel combustion and cement production processes) 1751-2020.

Technical Validation
Uncertainty estimates. We calculate daily emissions directly from daily activity data in most sectors; thus, the daily uncertainties are explained by the specific uncertainty of the daily sectoral activity data itself and by the uncertainties in the (empirical) models used to convert activity to emissions. Here we should distinguish between error and uncertainty. Errors, defined as the difference from the truth, cannot be estimated because we do not know the true values. Uncertainty could be estimated if we had different activity datasets, by looking at the spread of these datasets, but this is also not possible as most of our activity data are unique. The main issue is that we do not know whether the uncertainty of the activity data is a systematic uncertainty (e.g., all days are biased low using our activity dataset compared to another activity dataset) or a random error (in which case a positive bias on one day could be compensated by a negative bias on another day, and the uncertainty would be small, e.g., on monthly values derived from daily values). We do acknowledge that we cannot estimate daily uncertainties. In the future, this could be done by collecting different activity data. This uncertainty analysis was also presented in our related paper recently published in Nature Communications 28. We therefore followed the 2006 IPCC Guidelines for National Greenhouse Gas Inventories to conduct an uncertainty analysis of the data.
First, the uncertainties were calculated for each sector (see Table 6 for the uncertainty ranges of each sector):
• Power sector: The uncertainty mainly arises from the inter-annual variability of coal emission factors and from changes in the mix of generation fuel in thermal production. The uncertainty of power emissions from fossil fuel is within ±14%, considering both the inter-annual variability of fossil fuel based on the UN statistics and the variability of the mix of generation fuel (the ratio of electricity produced by coal to thermal production).
• Industrial sector: The uncertainty of CO2 from industry and cement production comes from the monthly production data. CO2 from industry and cement production in China accounts for more than 60% of the world total industrial CO2, and the uncertainty of emissions in China is 20%. The uncertainty from monthly statistics was derived from 10,000 Monte Carlo simulations to estimate a 68% confidence interval (1 sigma) for China. We calculated the 68% prediction interval of the linear regression models between emissions estimated from monthly statistics and official emissions obtained from annual statistics at the end of each year to deduce the one-sigma uncertainty involved when using monthly data to represent the change for the whole year. The squared correlation coefficients range from 0.88 (e.g., coal production) to 0.98 (e.g., energy import and export data), which indicates that using only the monthly data can explain 88% to 98% of the whole year's variation 32; the remaining variation is not covered but reflects the uncertainty caused by the frequent revisions of China's statistical data after they are first published.
• Ground transportation: The emissions from the ground transportation sector are estimated by assuming that the relative magnitude of car counts (and thus emissions) follows a relationship with the TomTom congestion index similar to that in Paris. This model, calibrated in Paris, was cross-checked against the traffic fluxes of two other cities. Future work will focus on additional cross-validation and calibration with more traffic data from other cities.
• Aviation: The uncertainty in the aviation sector comes from the difference between the daily emission data estimated with the two methods. We calculate the average difference between the daily emission results estimated from the flight route distance and from the number of flights, and then divide the average difference by the average daily emissions estimated by the two methods to obtain the uncertainty in CO2 from the aviation sector.
• Shipping: We used the uncertainty analysis from the IMO as our uncertainty estimate for shipping emissions. According to the Third IMO Greenhouse Gas Study 2014 38, the uncertainty in shipping emissions was 13% based on bottom-up estimates.

• Residential: The 2-sigma uncertainty in daily emissions is estimated as 40%, which is calculated based on a comparison with daily residential emissions derived from real fuel consumption in several European countries, including France, Great Britain, Italy, Belgium, and Spain.

Shipping emissions                              Source
Global shipping emissions (2007-2012)           IMO 38
Global shipping emissions (2013-2015)           ICCT 39
International shipping emissions (2016-2018)    EDGAR v5.0 13
The uncertainty in the emission projection for 2019 is estimated as 2.2% by combining the reported uncertainty of the projected growth rates and the EDGAR estimates in 2018.
Then, we combine all the uncertainties by following the error propagation equations from the IPCC. Equation 23 is used to derive the uncertainty of a sum and is used to combine the uncertainties of all sectors:

$$U_{total} = \frac{\sqrt{\sum_s \left(U_s \cdot \mu_s\right)^2}}{\left|\sum_s \mu_s\right|} \tag{23}$$

where $U_s$ and $\mu_s$ are the percentage uncertainty and the quantity (daily mean emissions) of sector s, respectively. Equation 24 is used to derive the uncertainty of a product, which in turn is used to combine the uncertainties of all sectors and of the projected emissions in 2019:

$$U_{overall} = \sqrt{\sum_i U_i^2} \tag{24}$$
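The two IPCC error propagation rules can be expressed as a short Python sketch. The sector uncertainties and quantities in the usage lines are hypothetical numbers chosen for illustration, not the values of Table 6, and the function names are not taken from the dataset code.

```python
import math


def combine_sum(uncertainties_pct, quantities):
    """Percentage uncertainty of a sum of quantities (IPCC rule for addition).

    uncertainties_pct: percentage uncertainty of each term (U_s)
    quantities:        value of each term, e.g. daily mean emissions (mu_s)
    """
    total = sum(quantities)
    if total == 0:
        raise ValueError("sum of quantities must be non-zero")
    spread = math.sqrt(sum((u * q) ** 2 for u, q in zip(uncertainties_pct, quantities)))
    return spread / abs(total)


def combine_product(uncertainties_pct):
    """Percentage uncertainty of a product (IPCC rule for multiplication)."""
    return math.sqrt(sum(u ** 2 for u in uncertainties_pct))


if __name__ == "__main__":
    sector_u = [10.0, 20.0]   # hypothetical sector uncertainties (%)
    sector_mu = [30.0, 40.0]  # hypothetical daily mean emissions
    u_sectors = combine_sum(sector_u, sector_mu)
    # Combine the sectoral result with the 2.2% projection uncertainty.
    print(combine_product([u_sectors, 2.2]))
```

Both rules assume that the individual uncertainties are uncorrelated, which is the usual condition under which the IPCC recommends this simple propagation.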

Code availability
The generated datasets are available from https://doi.org/10.6084/m9.figshare.12685937.v4 and https://github.com/zhudeng94/dailyCO2. Code for the industrial emission calculation and summary table generation is available in the GitHub repository in the form of worksheets. The raw data of power generation in the U.S., EU27 & UK, India, Russia, Japan and Brazil are also available in the GitHub repository. Other raw data and code for emission estimation in other sectors are available only upon reasonable request.

Fig. 3
Residential and commercial building daily natural gas consumption (linearly related to CO2 emissions from this sector) in France for the last 5 years. Temperature effects have been removed from emissions using a linear piecewise model fitted to daily data. When the effect of variable winter temperature is removed, no significant change is seen in 2020 during the very strict lockdown period, except for a small dip by the end of March.