China CO2 emission accounts 2016–2017

Despite China’s emissions having plateaued in 2013, it is still the world’s leading energy consumer and CO2 emitter, accounting for approximately 30% of global emissions. Detailed CO2 emission inventories by energy and sector have great significance to China’s carbon policies as well as to achieving global climate change mitigation targets. This study constructs the most up-to-date CO2 emission inventories for China and its 30 provinces, as well as their energy inventories for the years 2016 and 2017. The newly compiled inventories provide key updates and supplements to our previous emission dataset for 1997–2015. Emissions are calculated based on IPCC (Intergovernmental Panel on Climate Change) administrative territorial scope that covers all anthropogenic emissions generated within an administrative boundary due to energy consumption (i.e. energy-related emissions from 17 fossil fuel types) and industrial production (i.e. process-related emissions from cement production). The inventories are constructed for 47 economic sectors consistent with the national economic accounting system. The data can be used as inputs to climate and integrated assessment models and for analysis of emission patterns of China and its regions.

www.nature.com/scientificdata

to incomparable results and discrepancies frequently exceeding 20% (ref. 17). Second, these global datasets only provide estimates for China's overall emissions, or at most for a few sectors and fuels. They do not provide detailed emission inventories by sector and fuel for subnational administrative units in China. Third, these datasets do not provide the underlying raw data, which makes the emission estimates non-transparent and unverifiable. As a result, scholars have had to repeat much of the emission accounting work when analysing China's emission patterns 18-20.
To address these gaps, this study follows a uniform accounting framework to construct emission inventories for China and its 30 provinces, together with their energy inventories, for the years 2016 and 2017. The inventories are internally consistent and comparable with each other, and are compiled using the same accounting scope (IPCC administrative territorial scope; energy- plus process-related emissions), methods (sectoral approach and reference approach), data sources (official statistical data), and format (17 fossil fuels and 47 economic sectors).
The study provides the most up-to-date emission and energy accounts of China and its 30 provinces. It is a key update of our previous emission dataset for 1997–2015 (ref. 17), as well as an important supplement to official emission estimates. We also publish the activity data, emission factors, and calculation code with the inventories to keep our data transparent and verifiable. All data have been uploaded to our open-access dataset, China Emission Accounts and Datasets (www.ceads.net), for free download.
Methods
Accounting scope. Three scopes are widely used in emission accounts 21. Scope 1 emissions, also called territorial emissions, refer to emissions 'taking place within national (including administered) territories and offshore areas over which the country has jurisdiction (page overview.5)' 8. In other words, scope 1 accounts for all CO2 emissions generated within a country/region boundary, such as those from energy consumption, production of goods and services, and household consumption, as well as emissions from agriculture, forestry, and waste 22,23. Scope 2 accounts for indirect emissions that relate to electricity/heat consumed within the boundary of a country or region but produced outside of that boundary. Scope 3 emissions include all other indirect emissions associated with the production of final consumption of a country/region.
Compared with the other two emission scopes, scope 1 emissions describe the physical CO2 emissions emitted within a country/region's boundary and can be used by local governments as a benchmark to design emission reduction policies for their jurisdiction. Therefore, this study follows the IPCC's administrative territorial scope to account for CO2 emissions for China and its 30 provinces.
Accounting method. We include CO2 emissions from both fossil fuel combustion (i.e. energy-related emissions) and cement production (process-related emissions) in the emission accounts. Energy-related CO2 emissions are converted from the carbon content of fossil fuels, such as raw coal and gasoline, during combustion. We use mass balances to calculate emissions according to the IPCC guidelines 9, as shown in Eq. 1.

$$CE_i = AD_i \times NCV_i \times CC_i \times O \tag{1}$$
In the equation, $CE_i$ refers to CO2 emissions from fossil fuel $i$. While China's energy statistical system has 26 types of fossil fuels, we merge them into 17 types because certain fuels are consumed in small amounts and are of similar quality, as shown in Table 1. For example, Naphtha, Lubricants, Paraffin, White spirit, Bitumen asphalt and Petroleum coke are all products of petroleum refineries and account for only 2.8% of total final energy consumption in 2017. $AD_i$ is the "activity data" used for emission estimation. In the case of energy-related emission accounting, $AD_i$ refers to the combustion volume of fossil fuel $i$. $NCV_i$ represents the "net calorific value", which is the heat value per physical unit from the combustion of fossil fuel $i$. $CC_i$ is the "carbon content" of fuel $i$, which quantifies carbon emissions per unit of net calorific value produced. $O$ refers to "oxygenation efficiency", which represents the oxidation ratio during fossil fuel combustion. The emission factors are taken from our previous studies 17,24 (shown in Table 1) and were estimated from a broad survey of 4,243 coal mine samples.
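As a minimal sketch of Eq. 1, the Python snippet below multiplies the activity data by the three factors for each fuel. The function name, the fuel labels, and every numeric value are hypothetical placeholders rather than the Table 1 factors, and the units are simply assumed to be mutually consistent. The oxidation ratio is passed per fuel here, which also covers the case of a single shared value.

```python
def energy_emissions(activity, ncv, cc, ox):
    """Return emissions per fuel as AD * NCV * CC * O (Eq. 1).

    All arguments are dicts keyed by fuel name. Units are assumed to be
    mutually consistent, so the product is already a mass of CO2.
    """
    return {fuel: activity[fuel] * ncv[fuel] * cc[fuel] * ox[fuel]
            for fuel in activity}


# Hypothetical inputs for two fuels (NOT the Table 1 factors).
activity = {"raw_coal": 100.0, "gasoline": 10.0}   # physical quantity combusted
ncv = {"raw_coal": 0.2, "gasoline": 0.4}           # heat per physical unit
cc = {"raw_coal": 0.1, "gasoline": 0.07}           # emissions per unit of heat
ox = {"raw_coal": 0.9, "gasoline": 1.0}            # oxidation ratio (0 to 1)

emissions = energy_emissions(activity, ncv, cc, ox)
total = sum(emissions.values())
```

Keeping the factors in separate dictionaries mirrors the way the text treats activity data and emission factors as distinct inputs with their own uncertainties.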
Please note that this study only accounts for direct emissions from the consumption of 17 fossil fuels. We do not consider emissions related to electricity/heat consumption, in order to avoid double counting: those electricity/heat-related emissions have already been calculated from the production side and allocated to the respective power plants.
Process-related emissions refer to CO2 emitted during the chemical reactions of industrial production; these emissions are converted from industrial raw materials rather than from fossil fuels. For example, in the production of cement, calcium carbonate (CaCO3) is calcined to obtain calcium oxide (CaO). Process-related emissions are converted from the carbon content of CaCO3, while the emissions from fuel combustion are accounted for in energy-related emissions. In this study, we only include cement production, which accounts for over 70% of China's process-related emissions 15,25. According to the IPCC guidelines 9, process-related emissions can be calculated as follows:

$$CE_t = AD_t \times EF_t \tag{2}$$

$CE_t$ refers to process-related emissions from cement production. The activity data, $AD_t$, refers to cement production in the estimation of process-related emissions. $EF_t$ refers to the emission factor, which is 0.2906 tonne of CO2 per tonne of cement produced.
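The process-related calculation reduces to a single multiplication, sketched below in Python. The emission factor is the value stated in the text; the function name and the cement output figure are hypothetical and chosen only for illustration.

```python
CEMENT_EF = 0.2906  # tonne CO2 per tonne of cement, as stated in the text


def process_emissions(cement_output_tonnes, ef=CEMENT_EF):
    """Return process-related CO2 (tonnes) as activity data times EF."""
    if cement_output_tonnes < 0:
        raise ValueError("cement output cannot be negative")
    return cement_output_tonnes * ef


# Hypothetical output of 1,000,000 tonnes of cement.
co2 = process_emissions(1_000_000)
```

With these placeholder inputs the result is roughly 290,600 tonnes of CO2.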
Emissions by sectoral approach and sectoral emission inventories. Energy-related emissions for a country/region can be calculated by two approaches. One is based on sectoral energy consumption data (known as sectoral emissions); the other is based on energy production and supply data (known as reference emissions) 9.
Sectoral energy consumption accounts for both the energy used by final consumers, such as agriculture, mining, industry, services and households, and the energy used as inputs in energy transformation sectors to produce secondary energy. Most final energy consumption is combustion of fossil fuels (excluding a relatively small proportion that is used as raw materials in industrial processes, so-called non-energy use), while some of the energy used for energy transformation is not. For example, raw coal is consumed during coal washing to obtain cleaned coal, while cleaned coal is transformed to coke during the coking process. Almost no CO2 is emitted during such transformations, because the carbon content of raw coal/cleaned coal is transferred to cleaned coal/coke, respectively. Therefore, we exclude such non-combustion inputs in energy transformation when calculating the emissions. Only the fuels combusted in power plants for electricity or heat generation to fuel these processes are included. In this way, the calculation of sectoral energy-related emissions includes final energy consumption and energy used for electricity and heat generation, and excludes non-energy use and energy loss.
The current Chinese energy statistical system distinguishes final energy consumption in 47 sectors, which are consistent with the Chinese National Economic System, as shown in Table 2 (ref. 26). We follow this sector definition to construct our emission inventories, which makes it easy to link the inventories to national socioeconomic data. The emissions from electricity and heat generation are allocated to the sector "Production and Supply of Electric Power, Steam and Hot Water", and process-related emissions from cement production are allocated to the "Non-metal mineral products" sector as an additional column.
In this way, we construct the sectoral emission inventories of China and its 30 provinces by 47 economic sectors (rows), 17 fossil fuels (columns), and the cement production process (the 18th column).
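The layout described above (47 sector rows, 17 fuel columns, and an 18th column for cement) can be sketched as a plain nested list in Python. This is an illustrative assumption about the data layout, not the published calculation code: the function name, the unit inputs, and the row index chosen for the cement sector are all hypothetical.

```python
N_SECTORS = 47      # economic sectors (rows)
N_FUELS = 17        # fossil fuels (columns 1 to 17)
CEMENT_COL = 17     # zero-based index of the 18th column


def build_inventory(combusted, factors, cement_sector, cement_emissions):
    """Build a 47 x 18 emission table.

    combusted[j][i]  : combusted quantity of fuel i in sector j
    factors[i]       : emission per unit of fuel i (NCV * CC * O)
    cement_sector    : zero-based row index that receives process emissions
    """
    table = [[0.0] * (N_FUELS + 1) for _ in range(N_SECTORS)]
    for j in range(N_SECTORS):
        for i in range(N_FUELS):
            table[j][i] = combusted[j][i] * factors[i]
    table[cement_sector][CEMENT_COL] = cement_emissions
    return table


# Hypothetical inputs: one unit of every fuel in every sector, unit factors.
# The row index 30 for the cement sector is a placeholder, not from Table 2.
combusted = [[1.0] * N_FUELS for _ in range(N_SECTORS)]
factors = [1.0] * N_FUELS
inventory = build_inventory(combusted, factors, cement_sector=30,
                            cement_emissions=5.0)
sector_totals = [sum(row) for row in inventory]
```

Row sums then give sectoral totals, and column sums give totals by fuel or for the cement process.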

Emissions by reference approach and reference emission inventories.
According to the IPCC guidance, the reference method "is a top-down approach, using a country's energy supply data to calculate the emissions of CO2 from combustion of primary fossil fuels. The reference approach is a straightforward method that can be applied on the basis of relatively easily available energy supply statistics (Volume 2, Chapter 6, Page 5)" 9. The reference emissions are an important supplement to sectoral emissions and can be used for verification. The reference emissions can be estimated with reference energy consumption ($AD_{ref-i}$), which is shown in Eq. 3. We exclude energy loss and non-energy use in the calculation of reference energy consumption and emissions for the same reason as in the sectoral approach.

$$AD_{ref-i} = \text{Indigenous production} + \text{Imports} - \text{Exports} + \text{Moving in from other provinces} - \text{Sending out to other provinces} \pm \text{Stock changes} - \text{Non-energy use} - \text{Loss} \tag{3}$$

We only consider three primary fossil fuels (raw coal, crude oil, and natural gas) when calculating reference emissions and energy consumption. The basic assumption is that all secondary fuels are transformed from primary fuels, with their carbon content coming from the primary fuels. If we consider the country/region as a black box, the reference consumption of the three primary fuels contains the overall carbon content consumed within the black box, no matter how the fuels are transformed or circulated among energy types or sectors.
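The reference energy consumption of one primary fuel can be sketched as a simple balance in Python. The function and argument names are hypothetical, the figures are invented for illustration, and the sign convention for stock changes (positive when stocks are drawn down) is an assumption made here rather than something stated in the text.

```python
def reference_consumption(production, imports, exports, moved_in, sent_out,
                          stock_change, non_energy_use, loss):
    """Apparent consumption of one primary fuel.

    stock_change is assumed to be signed: positive when stocks are drawn
    down (adding to supply), negative when stocks are built up.
    """
    return (production + imports - exports + moved_in - sent_out
            + stock_change - non_energy_use - loss)


# Hypothetical provincial balance for raw coal, arbitrary units.
ad_ref = reference_consumption(production=500.0, imports=20.0, exports=5.0,
                               moved_in=100.0, sent_out=150.0,
                               stock_change=-10.0, non_energy_use=15.0,
                               loss=8.0)
```

For a national balance, the two inter-provincial terms would simply be set to zero.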

Table 2 columns: No. (j); Economic sectors.

The reference emission inventories are constructed for the three primary energy types (raw coal, crude oil, and natural gas), with six sub-items at the country level (indigenous production, import, export, stock change, loss, and non-energy use) and eight sub-items at the provincial level (the same six plus moving in from and sending out to other provinces).
Activity data. Energy activity data are collected from the Energy Balance Tables of China and its provinces, which are published in the China Energy Statistical Yearbook 2017 and 2018. The Energy Balance Table presents comprehensive energy flows and utilization including production, transformation, final consumption, loss and others.
Final energy consumption in the Energy Balance Table includes only eight sectors: "Farming, Forestry, Animal Husbandry, Fishery and Water Conservancy", "Industry", "Construction", "Transportation, Storage, Post and Telecommunication Services", "Wholesale, Retail Trade and Catering Services", "Other Service Sectors", "Urban Resident Energy Usage", and "Rural Resident Energy Usage". We then expand the "Industry" sector to 40 sub-sectors (j ∈ [2,41] in Table 2) according to the "Industry Sectoral Energy Consumption Table (

Data records
The dataset "China CO2

Technical validation
Uncertainties. The uncertainty of emission accounts may come from various sources. According to the IPCC 9, activity data, emission factors, lack of completeness, lack of data, and measurement errors may lead to different levels of uncertainties. However, due to technical issues, some of the uncertainties are not quantifiable. For example, the measurement error "is random or systematic, resulting from errors in measuring, recording and transmitting information; inexact values of constants and other parameters obtained from external sources (Volume 1, Chapter 3, Page 11)" 9, and such uncertainty exists in every step when developing emission accounts. In this study, we discuss the uncertainties from activity data and emission factors, which are the major quantifiable uncertainty sources in emission accounts. The uncertainty analysis acknowledges the limitations and potential inaccuracy of our emission inventories and provides a more accurate illustration of our emission estimates.
Activity data is one of the most important sources of uncertainty in emissions. Due to the dual system and poor quality of the energy statistics 28, China's energy consumption data may have a Coefficient of Variation (CV, the standard deviation divided by the mean) ranging from 5% to 30% (ref. 29) depending on the sector: for electricity generation the CV is 5% (refs. 30, 31); industry and construction, 10% (refs. 9, 32); transportation, 16% (ref. 33); residential, 20% (ref. 9); and primary industry, 30% (ref. 34). Despite China having modified its national energy consumption data three times since 2000, there is still a 5% difference between the national and aggregated provincial energy data 35.
The uncertainties in China's emission factors have been widely discussed in the research community 36-39. Although default emission factors for China have been published based on the IPCC guidelines, the government and some research groups have published their own factors, such as: National Bureau of Statistics (NBS), NDRC, Initial National Communication on Climate Change (NC1994), Second National Communication on Climate Change (NC2005), Multi-resolution emission inventory for China (MEIC), UN-China, and UN-average. Our previous comparisons of eight different factor sources found that the coefficient of variation of fuels' emission factors ranged from 1.1% (crude oil) to 33.4% (other gas). The emission factors from Liu et al. 24 used in this study are relatively low compared to the other sources, but higher than the MEIC and NC1994 values. The coefficients of variation for different fossil fuels are presented in Table 1.
We applied Monte Carlo simulation, a technique recommended by the IPCC 9, to propagate the uncertainties from activity data and emission factors and to calculate the integrated uncertainty of the entire emission inventory. The technique first assumes distributions (probability density functions) for both activity data and emission factors. In this study, we assume that both variables follow a normal distribution 24, with the standard deviations discussed above. Then, the technique generates a large number of random samples of the two variables (100,000 in this study), so that 100,000 independent estimates of emissions can be calculated. The 97.5% uncertainty range is calculated as the 97.5% confidence interval of the 100,000 estimates. Compared with energy-related emissions, process-related emissions have relatively lower uncertainties due to fewer parameters and simpler calculation methods 24. We only quantify the uncertainties from energy-related emissions in this study.
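The Monte Carlo procedure can be illustrated with a small Python sketch using only the standard library. It is a simplified, assumed setup rather than the study's code: it handles a single pair of activity data and emission factor, the means and CVs are hypothetical, and the 97.5% interval is taken as the central interval between the 1.25th and 98.75th percentiles, which is one possible reading of the text.

```python
import random


def monte_carlo_range(ad_mean, ad_cv, ef_mean, ef_cv, n=100_000,
                      coverage=0.975, seed=0):
    """Propagate normal uncertainties in AD and EF through emissions = AD * EF.

    Returns the lower and upper bounds of the central interval, expressed
    as fractional deviations from the deterministic estimate.
    """
    rng = random.Random(seed)
    samples = sorted(
        rng.gauss(ad_mean, ad_cv * ad_mean) * rng.gauss(ef_mean, ef_cv * ef_mean)
        for _ in range(n)
    )
    tail = (1.0 - coverage) / 2.0
    lower = samples[int(tail * (n - 1))]
    upper = samples[int((1.0 - tail) * (n - 1))]
    base = ad_mean * ef_mean
    return lower / base - 1.0, upper / base - 1.0


# Hypothetical single fuel-sector pair: CV of 10% for AD and 3% for EF.
low, high = monte_carlo_range(ad_mean=100.0, ad_cv=0.10,
                              ef_mean=2.0, ef_cv=0.03, n=20_000)
```

A full inventory would repeat the sampling for every sector and fuel with their own CVs and sum the sampled emissions before taking percentiles.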
The results show that the uncertainty range for energy-related emissions falls within (−15.1%, 30.8%) in 2016 and (−15.0%, 30.3%) in 2017 at a 97.5% confidence level. The 97.5% confidence interval of our estimates is shown in Fig. 2 (the grey area). We simulate the uncertainty range of a variable by keeping the others constant.
Comparisons with existing emission datasets. In order to verify our emission accounts, we compared our estimates with other emission datasets, as shown in Fig. 2. While we found that our estimates produce relatively lower emissions, they are very close to China's official emissions, with gaps ranging from −5.81% to 1.98%. Our national sectoral emissions were 5.81%, 3.49%, 2.79%, and 2.
As these existing emission datasets only provide the total emissions of China, we cannot make a further comparison of the estimates at the provincial or sector level. Our estimates therefore provide the most up-to-date and comprehensive emission inventories of China and its provinces, and are an important supplement to the existing emission estimates as well as the official emission inventories.
Limitations and future work. Our datasets have several limitations, which we will address in future work to improve the accuracy of China's emission accounts.
(1) We only include the emissions from fossil fuel combustion and cement production. There are many other components in scope 1 direct emissions, such as emissions from waste treatment and landfills and from other industrial processes. Although these components account for only a small proportion of overall emissions (less than 4% in 2012, ref. 15), missing them makes our inventories incomplete. In the future, we will expand our accounting scope to achieve more comprehensive emission inventories for China and its regions.
(2) We adopt the national average emission factors to calculate each province's fossil fuel-related emissions. The provinces may differ in energy quality and utilization efficiency, and using the national average factors for all provinces may lead to additional uncertainties at the provincial level. Similar uncertainties also exist in the process-related emission accounts. The emission factor for cement production is estimated based on the national average clinker-cement ratio and clinker emission factors 24. Because production technologies vary across regions, the clinker emission factors and clinker-cement ratios would differ as well 25. Our future work will investigate region-specific emission factors for China to achieve more accurate emission data for China's provinces.
(3) Due to data accessibility, nine provinces do not have sub-sectoral energy consumption data for their industry. For these provinces, the current version uses historical data collected from the national economic census of 2008 (ref. 40). We thus assume that the economic structure of these provinces remained the same over the past 10 years. In the future, we will investigate these provinces to obtain more up-to-date sectoral energy data for them.

Code availability
The MATLAB code used to generate the emission inventories with energy inventories is published below for transparency and verifiability. We take Anhui 2017 as an example.