Data Descriptor: An emissions-socioeconomic inventory of Chinese cities

As the centre of human activity and being under the threat of climate change, cities are considered to be major components in the implementation of climate change mitigation and CO2 emission reduction strategies. Inventories of cities' emissions serve as the foundation for the analysis of emissions characteristics and policymaking. China is the world's top energy consumer and CO2 emitter, and it is facing great potential harm from climate change. Consequently, China is taking increasing responsibility in the fight against global climate change. Many energy/emissions control policies have been implemented in China, most of which are designed at the national level. However, cities are at different stages of industrialization and have distinct development pathways; they need specific control policies designed based on their current emissions characteristics. This study is the first to construct emissions inventories for 182 Chinese cities. The inventories are constructed using 17 fossil fuels and 47 socioeconomic sectors. These city-level emissions inventories have a scope and format consistent with China's national/provincial inventories. Some socioeconomic data of the cities, such as GDP, population, industrial structures, are included in the datasets as well. The dataset provides transparent, accurate, complete, comparable, and verifiable data support for further city-level emissions studies and low-carbon/sustainable development policy design. The dataset also offers insights for other countries by providing an emissions accounting method with limited data.

The term 'city' here refers to administrative prefecture-level city rather than to a built-up city.
Accordingly, the CO2 emissions calculated in this dataset are Intergovernmental Panel on Climate Change (IPCC) administrative territorial CO2 emissions, referring to emissions "taking place within national (including administered) territories and offshore areas over which the country has jurisdiction (page overview.5)" 37 . We exclude the emissions induced by international aviation and shipping 38 . Unlike production- or consumption-based emissions 17 , the administrative territorial scope quantifies the direct emissions induced by human activities within a regional boundary. Territorial emissions therefore provide the data baseline for emission-related studies and regional carbon control.
The emission inventories include two components: CO2 emitted from fossil fuel combustion (energy-related emissions) and CO2 emitted from industrial production (process-related emissions). Process-related emissions refer to CO2 emitted from industrial raw materials during chemical reactions, such as the CO2 that escapes during calcium carbonate (CaCO3) calcination in cement production.
The cities' emissions inventories are consistent with China's national and provincial emission inventories in scope, format, and data sources 39 , which makes them comparable.

Emissions calculation and inventory construction
The energy-related emissions are calculated based on 17 fuels (shown in Table 1) and 47 socioeconomic sectors (shown in Table 2). The 17 types of fossil fuels are selected based on China's official energy statistical system 36 . There are 29 energy types used in the system: 26 are fossil fuels, one is electricity, one is heat, and one is other energy. As our study only accounts for the direct emissions from fossil fuel burning within a city's boundary (the IPCC administrative territorial scope), the inventories exclude the indirect emissions induced by electricity and heat use. The CO2 emissions related to electricity and heat generation are therefore calculated based on fuel inputs and allocated to the power plants. We also assume that little or no CO2 is emitted from other energy uses. Some of the 26 fossil fuels share similar carbon content and have very low consumption volumes; we merge them in the emission accounts 39 , which gives the 17 fuels used here. The 47 socioeconomic sectors are set according to the System of National Accounts 40 .
Energy-related CO2 emissions are calculated based on the mass balance theory 41 ; see Equation 1:

$$CE_{ij} = AD_{ij} \times NCV_i \times CC_i \times O_{ij} \qquad (1)$$
where $CE_{ij}$ represents the CO2 emissions induced by the combustion of fuel i in sector j, and $AD_{ij}$ (activity data) represents fossil fuel combustion by fuel and sector. The emission factor (ton CO2/ton) is composed of a specific heat value factor, $NCV_i$ (J/ton), multiplied by the carbon content per unit heat value, $CC_i$ (ton […] Table 3 (available online only)). The seven sets of emission factors are collected from IPCC, NBS, NDRC, NC1994, NC2005, MEIC, UN-China, and UN-average. Generally, coal-related fuels have a larger range than oil- and gas-related fuels. The emission factors re-evaluated by Liu et al. 44 have already been widely used by many studies and institutions to calculate China's emission inventory, including China's third official emission inventory 2012 45 . Thus, this study uses these updated emission factors. Table 1 gives the net calorific value ($NCV_i$) and carbon content ($CC_i$). Table 4 (available online only) shows the sector-specific oxygenation efficiency ($O_{ij}$), which reflects differences in technical level between sectors 39 .
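As a minimal sketch of the energy-related calculation, the Python snippet below multiplies activity data by the net calorific value, the carbon content, and the oxygenation efficiency for a toy case of two fuels and three sectors. The function name, the units, and all numbers are illustrative assumptions rather than the values in Tables 1 and 4; in particular, the carbon content is assumed to be expressed as tonnes of CO2 per unit heat. The authors' own implementation is the MATLAB code in the Supplementary Information, which this sketch does not reproduce.

```python
import numpy as np

def energy_related_emissions(ad, ncv, cc, o):
    """CE_ij = AD_ij * NCV_i * CC_i * O_ij for every fuel i and sector j.

    ad  : (n_fuels, n_sectors) fuel combusted, in tonnes
    ncv : (n_fuels,) net calorific value, heat per tonne of fuel
    cc  : (n_fuels,) carbon content, assumed here as tonnes CO2 per unit heat
    o   : (n_fuels, n_sectors) oxygenation efficiency, a fraction in [0, 1]
    """
    ad = np.asarray(ad, dtype=float)
    o = np.asarray(o, dtype=float)
    # Emission factor per fuel, in tonnes CO2 per tonne of fuel
    ef = np.asarray(ncv, dtype=float) * np.asarray(cc, dtype=float)
    # Broadcast the per-fuel factor across the sector axis
    return ad * ef[:, np.newaxis] * o

# Placeholder inputs (NOT values from the dataset): 2 fuels x 3 sectors
ad = [[100.0, 50.0, 0.0],
      [10.0, 0.0, 20.0]]
ncv = [0.02, 0.04]   # hypothetical, TJ per tonne
cc = [100.0, 70.0]   # hypothetical, tonnes CO2 per TJ
o = [[0.9, 0.8, 0.9],
     [1.0, 1.0, 0.95]]

ce = energy_related_emissions(ad, ncv, cc, o)
```

Keeping the emission factor as a per-fuel vector and the oxygenation efficiency as a fuel-by-sector matrix mirrors the indices used in the text: only the oxygenation efficiency varies by sector.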
The process-related CO2 emissions ($CE_t$) are calculated using Equation 2 41 :

$$CE_t = AD_t \times EF_t \qquad (2)$$

where $AD_t$ and $EF_t$ refer to industrial production (activity data) and emission factors, respectively. The emission factors of industrial processes are collected from IPCC 41 and NDRC 42 , as shown in Table 5. We include seven industrial processes: cement production (accounting for approximately 70% of the total process-related emissions in China 45,46 ), lime production (the second-largest emissions source 47 ), ammonia production, soda ash production, ferrochromium production, silicon metal production, and unclassified ferro-production. The process-related emissions are allocated to the corresponding sectors in the emission inventory. Cement- and lime-related emissions are allocated to the sector "Non-metal Mineral Products"; ammonia- and soda ash-related emissions are allocated to the sector "Raw Chemical Materials and Chemical Products"; ferrochromium-, silicon metal-, and unclassified ferro-related emissions are allocated to the sector "Smelting and Pressing of Ferrous Metals".
The cities' CO2 emissions matrices (namely, inventories) are created with 19 columns and 47 rows. The 19 columns represent the emissions from the 17 fossil fuels, the process-related emissions, and the total emissions, while the 47 rows correspond to the 47 socioeconomic sectors. Each element of a matrix is the CO2 emitted from fossil fuel combustion or industrial production in the corresponding sector. An inventory of Beijing is given in Table 6 (available online only) as an example.
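The layout of an inventory matrix can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: the function names, the row index used for "Non-metal Mineral Products", the random energy-related values, and the production and emission-factor numbers are all hypothetical placeholders. Rows are sectors and columns are the 17 fuels, followed by one process column and one total column.

```python
import numpy as np

N_SECTORS, N_FUELS = 47, 17

def process_emissions(production, emission_factor):
    """Process-related emissions: production (activity data) times emission factor."""
    return production * emission_factor

def build_inventory(energy_ce, process_ce):
    """Assemble a 47 x 19 matrix: 17 fuel columns, 1 process column, 1 total column.

    energy_ce  : (47, 17) energy-related emissions by sector (rows) and fuel (columns)
    process_ce : (47,) process-related emissions already allocated to sectors
    """
    energy_ce = np.asarray(energy_ce, dtype=float)
    process_ce = np.asarray(process_ce, dtype=float)
    if energy_ce.shape != (N_SECTORS, N_FUELS):
        raise ValueError("energy_ce must have shape (47, 17)")
    if process_ce.shape != (N_SECTORS,):
        raise ValueError("process_ce must have shape (47,)")
    total = energy_ce.sum(axis=1) + process_ce
    return np.column_stack([energy_ce, process_ce, total])

# Dummy energy-related emissions, only to exercise the layout
rng = np.random.default_rng(0)
energy_ce = rng.uniform(0.0, 10.0, size=(N_SECTORS, N_FUELS))

# Hypothetical row index for the sector that receives cement emissions
CEMENT_SECTOR_ROW = 30
process_ce = np.zeros(N_SECTORS)
process_ce[CEMENT_SECTOR_ROW] = process_emissions(1000.0, 0.3)  # placeholder values

inventory = build_inventory(energy_ce, process_ce)
```

Storing the total as the last column keeps each row self-checking: the final entry should equal the sum of the 18 entries before it.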
These methods for emission inventory construction are an expanded version of the descriptions in our related work 39 . MATLAB R2014a is used to construct the cities' emission inventories. We provide the MATLAB code in the Supplementary Information. We also provide the activity data of the cities for additional data transparency and verifiability (see "China city-level Energy inventory, 2010", Data Citation 1). Researchers will be able to use the MATLAB code and energy inventories to recalculate the emission inventories for the cities or to apply the method to other cities.

Activity data collection
Fossil fuel combustion, i.e., the activity data for energy-related emission accounts, includes two parts: the energy inputs for electricity/heat generation and the total final consumption. Other inputs for energy transformation, such as coal cleaning or petroleum refining, transfer the carbon element from one fuel to another; these processes emit little CO2. Following our previous emissions inventories constructed for China and its provinces 39 , fossil fuel combustion can be collected from a region's energy balance table (EBT) together with the final energy consumption by industrial sector ($Energy_{ij}$). The EBT provides each fossil fuel's transformation and final consumption in farming, industry, construction, three service sectors, and households (rural and urban). As the entire industry sector consists of 40 sub-sectors, $Energy_{ij}$ presents the sectoral consumption of fossil fuel within the industry sector.
Generally, the EBT and $Energy_{ij}$ can be found in a city's statistical yearbook. However, due to the poor data quality of city-level statistics, not all cities' yearbooks publish the EBT or $Energy_{ij}$. We developed a series of methods in our previous study to estimate missing data 48 :
1. EBT: Very few cities publish an EBT in their statistical yearbooks. We scale down the corresponding provincial EBT to obtain the city table. We use each sector's GDP to estimate the energy consumption of farming, construction, and the three service sectors, assuming that the city has the same farming/construction/service energy intensity as its province. We also use the urban/rural population to estimate urban/rural household energy consumption, on the premise that the city has the same per capita residential energy consumption as its province. The GDP and population data are collected from statistical yearbooks for the cities and their corresponding provinces.
2. $Energy_{ij}$: Some cities only provide $Energy_{ij}$ for enterprises of above-designated-size (ADS). ADS enterprises are defined as enterprises with prime operating revenue over 20 or 5 million yuan, depending on the city. ADS enterprises account for roughly 50 to 90% of a city's total industrial output. We use the ADS industrial output ratio (calculated as the whole-industry output divided by the ADS enterprises' output) to scale up ADS $Energy_{ij}$ and obtain sectoral fossil fuel consumption at the whole-industry scale.
As for cement production, the cities' statistical yearbooks provide either total cement production or production from ADS enterprises only. In the latter case, we scale up the ADS cement production by the ADS industrial output ratio to obtain the total cement production.
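The estimation rules for missing activity data reduce to simple proportional scaling, sketched below in Python. The function names and all numbers are hypothetical and serve only to show the arithmetic; the sketch assumes that each quantity is scaled independently for one fuel and one sector at a time.

```python
import math

def downscale_by_gdp(provincial_energy, city_sector_gdp, provincial_sector_gdp):
    """City sectoral energy use, assuming the province's energy intensity
    (energy per unit of sectoral GDP) also holds for the city."""
    return provincial_energy * city_sector_gdp / provincial_sector_gdp

def downscale_by_population(provincial_energy, city_population, provincial_population):
    """City household energy use, assuming the province's per capita
    residential consumption also holds for the city (urban or rural)."""
    return provincial_energy * city_population / provincial_population

def scale_up_ads(ads_value, whole_industry_output, ads_output):
    """Scale an ADS-only figure (energy use or cement production) to the whole
    industry with the ratio whole-industry output / ADS output."""
    ratio = whole_industry_output / ads_output
    return ads_value * ratio

# Placeholder numbers, not data from any yearbook
city_construction_coal = downscale_by_gdp(200.0, 50.0, 500.0)
city_urban_household_gas = downscale_by_population(120.0, 2.0, 40.0)
city_total_cement = scale_up_ads(80.0, 1000.0, 800.0)
```

Because ADS enterprises cover only part of industrial output, the ratio is at least 1, so the scaled figure is never smaller than the reported ADS figure.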
The raw activity data are collected through a "crowd-sourcing" working mode implemented in the Applied Energy Summer School 2017 and 2018. Over 100 students joined the summer school and participated in data collection. The summer school will be held annually in the future, and more researchers will contribute to and update city-level data collection. These methods for city-level data estimation and collection are an expanded version of the descriptions in our related work 48 .

Socioeconomic indexes
This study collects several socioeconomic indexes for the 182 cities from the "China City Statistical Yearbook" 49 , including:
1. population, in 10 thousand;
2. employed population, in 10 thousand;
3. employed population in sectors (primary industry; mining; manufacturing; electric power, gas and water production and supply; construction; transport, storage and post; information transmission, computer services and software industry; wholesale and retail trade; hotel and catering services; financial intermediation; real estate; leasing and business services; scientific research, technical services and geological exploration; water, environmental and public facilities management; resident services and other services; education; health, social security and social welfare; culture, sports and entertainment; public administration and social organization), in 10 thousand;
4. area, in square kilometres;
5. built-up area, in square kilometres;
6. gross domestic product (GDP), in 10 thousand yuan;
7. the shares of primary, secondary, and tertiary industry in GDP, in %;
8. industrial output, in 10 thousand yuan.
The socioeconomic indexes (as shown in Table 7 (available online only) and "China city-level socioeconomic inventory, 2010", Data Citation 1) can be used to explore the drivers and characteristics of cities' emissions.
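As one hedged example of combining the two inventories, the Python sketch below derives per capita emissions and emission intensity for a single city. The function names and numbers are hypothetical, and the sketch assumes total emissions are expressed in tonnes of CO2; the population and GDP units follow the list above (10 thousand persons and 10 thousand yuan).

```python
import math

def per_capita_emissions(emissions_t, population_10k):
    """Tonnes of CO2 per person; population is given in units of 10 thousand persons."""
    return emissions_t / (population_10k * 1e4)

def emission_intensity(emissions_t, gdp_10k_yuan):
    """Tonnes of CO2 per 10 thousand yuan of GDP."""
    return emissions_t / gdp_10k_yuan

# Placeholder city: 50 million tonnes CO2, 5 million people, GDP of 250 billion yuan
emissions_t = 5.0e7
population_10k = 500.0
gdp_10k_yuan = 2.5e7

per_capita = per_capita_emissions(emissions_t, population_10k)
intensity = emission_intensity(emissions_t, gdp_10k_yuan)
```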

Data Records
A total of 365 data records (emissions-socioeconomic inventories) are contained in the datasets. Of these,

The cities' CO2 emissions inventories are constructed at the IPCC administrative territorial scope, including both energy-related emissions (from fossil fuel combustion) and process-related emissions (from cement production). The socioeconomic inventory presents the population, employed population (with structure), GDP (with structure), and area of the 182 cities.

Technical Validation
Uncertainties
CO2 emissions inventories gather the contributions of economic activity to total CO2 emissions for a given time period and area. Inventories are critical to many environmental decision-making processes and scientific goals. Policymaking and scientific research require reliable inventories to ensure the effectiveness of the policy process. In both types of applications, it is important to understand the uncertainty in emissions inventories. Additionally, uncertainty analysis can improve the accuracy of emissions accounts. For the city-level CO2 emissions inventories in this article, the literature shows that the uncertainty of the process-related emissions from cement production is low; the inventories' uncertainty depends mainly on the energy-related emissions 44,50 . The sources of uncertainty in energy-related emissions accounting are associated with emission factors, activity data and other estimation parameters (Volume 1, Chapter 3, Page 6) 41 . The uncertainties induced by emission factors and by energy activity data are both quantified for the cities' emission inventories.
Uncertainties in activity data and emission factors. China's energy data, especially at the city level, are of relatively poor quality compared with those of developed countries. The literature also shows that the uncertainties range widely from sector to sector. The coefficient of variation (CV; the standard deviation divided by the mean) is used to quantify the uncertainty. According to field surveys reported in previous studies, the fossil fuel consumed in China's power generation sector has the lowest CV (5%) 51,52 , compared with primary industry (30%) 53 , other manufacturing sectors (10%), construction (10%) 41,54 , the transportation sector (16%) 55 , and residential energy use (20%) 41 . The sources of uncertainty could lie in the opacity of China's statistical systems, especially regarding the "statistical approach on data collection, reporting and validation (Page 673)" 56 , and in the dependence of China's statistics departments on other government departments. Such uncertainties result in a large gap between China's national fossil fuel consumption data and the aggregated provincial data. To close the gap, China has adjusted its energy data three times since 2004, leaving a gap of 5% between the latest national fossil fuel consumption data and the aggregated provincial data 57 . The gap between aggregated city-level energy consumption and the national data could be even larger.
Previous studies have debated China's emission factors 58-61 . The range of emission factors across different sources is as high as 40%. This study takes its emission factors from Liu et al. 44 , who measured them based on a broad investigation of China's fuel quality. Based on the statistical analysis of surveyed fuel quality, the CVs of coal-, oil-, and gas-related fuels are estimated as 3, 1, and 2%, respectively.
Monte Carlo simulations. Monte Carlo methods are used to simulate the uncertainties resulting from both fossil fuel combustion and emission factors and so to estimate the overall uncertainty of the emissions 41 . Monte Carlo simulations select random values for the emission factor and activity data (fossil fuel consumption) from within their individual normal probability (density) functions and calculate the corresponding emission values (Chapter 6, IPCC 41 ). To perform the simulations, we first set up probability density functions for each input variable (emission factor and activity data). Both variables are assumed to follow a normal distribution 44 . Then, we randomly sample both the activity data and the emission factors 20,000 times and obtain 20,000 CO2 emission estimates. The uncertainties are obtained at a 97.5% confidence level and are calculated as the 97.5% confidence intervals of the estimates. This article finds that the average uncertainties in the cities' total CO2 emissions range from −3.65 to 3.67% at a 97.5% confidence level (±47.5% confidence interval around the estimate). Hegang in Heilongjiang has the highest uncertainties in emissions, (−5.83, 5.86%), while Huizhou in Guangdong has the lowest, (−0.91, 0.91%).
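The sampling procedure can be sketched in Python for a single fuel-sector pair. This is a simplified illustration, not the authors' implementation: the function name, the seed, and the input values are assumptions, and the sketch reads the "±47.5% around the estimate" range as the span between the 2.5th and 97.5th percentiles of the simulated emissions. The example CVs of 5% and 3% are the figures quoted above for power-sector activity data and coal-related emission factors.

```python
import numpy as np

def monte_carlo_uncertainty(ad, ef, cv_ad, cv_ef, n_draws=20_000, seed=0):
    """Return (lower %, upper %) deviations of simulated emissions from AD * EF.

    ad, ef       : central values of the activity data and the emission factor
    cv_ad, cv_ef : coefficients of variation (standard deviation / mean)
    """
    rng = np.random.default_rng(seed)
    # Both inputs are drawn from normal distributions centred on their means
    ad_draws = rng.normal(loc=ad, scale=cv_ad * ad, size=n_draws)
    ef_draws = rng.normal(loc=ef, scale=cv_ef * ef, size=n_draws)
    emissions = ad_draws * ef_draws
    central = ad * ef
    # Assumption: bounds are the 2.5th and 97.5th percentiles of the draws
    lower, upper = np.percentile(emissions, [2.5, 97.5])
    return 100.0 * (lower / central - 1.0), 100.0 * (upper / central - 1.0)

# Placeholder central values; CVs as quoted in the text for power-sector coal use
low_pct, high_pct = monte_carlo_uncertainty(ad=1000.0, ef=2.0, cv_ad=0.05, cv_ef=0.03)
```

A city total sums many fuel-sector pairs. If the draws for each pair are independent, the relative range of the total tends to be narrower than the range of any single large component.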

Limitations and future work
The cities' emission inventories have some limitations that could lead to more uncertainty. Although these uncertainties may be too small to quantify, they remain part of the inventories' overall uncertainty. First, this study only takes energy-related emissions and process-related emissions from seven industrial production processes into account, and emissions from other sources, such as "agriculture", "land-use change and forestry", "waste", and other industrial processes, are missing. Thus, the analysis is incomplete. In the future, we will expand the emission scope to achieve more complete inventories for cities. Second, national average emission factors are substituted for the cities' own emission factors for fossil fuels and industrial processes when accounting for the cities' CO2 emissions, which introduces inaccuracy. We hope that specific city-level emission factors will become available in the future to increase the accuracy of our results. If not, our future research could employ provincial emission factors to obtain more accurate emission inventories for the cities. Third, due to the poor data quality for the cities, the EBTs of most cities are downscaled versions of the provincial tables, assuming that the cities have the same sectoral energy intensity and per capita residential energy consumption as their provinces. Such assumptions bring additional uncertainties to the cities' emission inventories. In the future, a consistent time-series emission inventory dataset for Chinese cities will be completed. We will integrate bottom-up estimations (calculated based on survey data from enterprises) 14 and satellite observations to refine the emission accounts for these cities. More specifically, the high-resolution bottom-up emissions and satellite images can confirm some of the cities' emission sources (i.e. some super-emitting points). The night-light data will also be used to verify our top-down emissions inventories 16,62 .