CO2 emission accounts of Russia’s constituent entities 2005–2019

Russia's constituent entities have wide-ranging powers and are important policymakers and implementers of climate change mitigation. Compiling CO2 emission inventories for these entities is a priority step towards emission reduction. Russia is the world’s largest exporter of oil and gas combined and the fourth biggest CO2 emitter, so its efforts to mitigate CO2 emissions are globally significant in curbing climate change. However, the existing emission inventories only present national CO2 emissions; subnational emission details are missing. In addition, the emission factors are not country-specific, and energy activity data by fossil energy type and sector are not sufficiently detailed. In this study, the CO2 emission inventories of Russia and its 82 constituent entities from 2005 to 2019 are constructed. The inventories cover energy-related emissions, broken down by 89 socio-economic sectors and 17 energy types, as well as process-related emissions. The uniformly formatted emission inventories can serve as a reference for in-depth analysis of emission characteristics and for emission-related studies of Russia.

As for the emission factors used to calculate CO2 emissions, Russia's emission accounting is generally based on the default emission factors recommended by the Intergovernmental Panel on Climate Change (IPCC) 13, which are not country-specific and not sufficiently representative. In addition, CO2 emissions by fossil energy type and sector are not sufficiently detailed. Some existing inventories provide only Russia's total emissions or, at best, emissions for some key sectors and fossil fuel types. For example, BP only provides Russia's total CO2 emissions, and the IEA provides emissions for only four energy types (coal, oil, natural gas and other) and nine sectors 4,5.
Considering the large emission data gap at the subnational level and the coarse national data, our dataset includes the CO2 emission inventories of Russia and its 82 constituent entities between 2005 and 2019. The emission database is constructed in a uniform format according to detailed socio-economic sectors and energy types, and presents emissions from 17 energy types and 89 socio-economic sectors. The method used for the 82 constituent entities is consistent with the method used for the national estimate, which enables multi-scale emission studies and increases comparability. The emission inventories will be updated and published yearly. Our emission inventory is constructed based on country-specific emission factors provided by the Ministry of Natural Resources and Environment (MNRE) of Russia 14. These emission datasets can provide robust data support for follow-up studies of Russia's emission-related issues and for the formulation of decarbonization strategies. The emissions dataset can be accessed freely from the China Emission Accounts and Datasets (CEADs, www.ceads.net).

Methods
In general, CO2 emissions accounting includes three scopes 15. Scope 1 indicates direct CO2 emissions generated within a territory, also known as territorial-based emissions. It accounts for all CO2 emissions produced within a regional boundary, such as emissions from local energy production enterprises 16,17. Scope 2 indicates indirect CO2 emissions embodied in electricity, steam and heat imported from another territory 15,18. Scope 3 indicates indirect CO2 emissions embodied in products and services imported from another territory 15,18. The CO2 emission inventories were compiled according to the IPCC administrative territorial-based accounting scope, that is, Scope 1 13. The impact of international aviation and shipping is not included in our estimation 19. The CO2 emission inventories consist of two components, as shown in Fig. 1: energy-related and process-related (cement) CO2 emissions 20-22. Energy-related emissions are the CO2 emissions generated when fossil fuels are burned 23-25. Process-related emissions are the CO2 emissions produced during the chemical reactions of industrial processes, with the CO2 converted from industrial raw materials.

Fig. 1 The framework of CO2 emissions inventory construction. (Flowchart: energy-related emissions from fuel combustion are added to process-related emissions from cement production; a decision node asks whether cement production data are available.)
Energy-related CO2 emissions are calculated in Eq. (1):

$$CE_{ij} = AD_{ij} \times NCV_{i} \times CC_{i} \times O_{ij} \tag{1}$$

In Eq. (1), i and j indicate energy types and socio-economic sectors, respectively. CE_ij indicates the CO2 emissions from fossil fuel i combusted in sector j. NCV_i is the net calorific value, that is, the heat produced per physical unit of fossil fuel i during combustion. CC_i is the carbon content per unit of calorific value of fossil fuel i. O_ij is the carbon oxidation ratio, that is, the percentage of the carbon in the fuel that is converted to CO2 during combustion. AD_ij indicates activity data; for energy-related emission accounting, AD_ij refers to the amount of fossil fuel i used for combustion in sector j.
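The definitions above describe a product of activity data, net calorific value, carbon content and oxidation ratio. The following is a minimal sketch of that product under stated assumptions, not the authors' code: the function name, the fuel and sector labels and every number are hypothetical, and the carbon content is assumed to be expressed in tonnes of CO2 per TJ. If it were expressed in tonnes of carbon per TJ, an additional factor of 44/12 would be needed.

```python
# Sketch of a fuel-by-sector CO2 calculation: CE_ij = AD_ij * NCV_i * CC_i * O_ij.
# All numbers below are hypothetical placeholders, not MNRE or UISIS values.

def energy_emissions(ad, ncv, cc, o):
    """Return CO2 emissions (tonnes) for one fuel i burned in one sector j.

    ad  : activity data, thousand tonnes of fuel combusted
    ncv : net calorific value, TJ per thousand tonnes
    cc  : carbon content, ASSUMED here to be expressed as tonnes CO2 per TJ
    o   : oxidation ratio as a fraction between 0 and 1
    """
    return ad * ncv * cc * o

# Hypothetical activity data keyed by (fuel, sector).
activity = {("coal", "power"): 1000.0, ("coal", "steel"): 500.0}
# Hypothetical fuel properties keyed by fuel.
factors = {"coal": {"ncv": 20.0, "cc": 95.0, "o": 0.98}}

# Build the fuel-by-sector inventory, then sum it to a total in million tonnes.
inventory = {
    (fuel, sector): energy_emissions(ad, **factors[fuel])
    for (fuel, sector), ad in activity.items()
}
total_mt = sum(inventory.values()) / 1e6
print(f"total: {total_mt:.3f} million tonnes CO2")
```

Keeping the result keyed by (fuel, sector) mirrors the matrix layout of the inventories, where each cell holds the emissions of one energy type in one sector.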
Most studies and international institutes adopt the default emission factors provided by the IPCC. This study adopts the emission factors from the MNRE of Russia 14. Compared with the IPCC emission factors, the country-specific emission factors measured by the MNRE are more representative of the fossil fuels used in Russia. For example, the MNRE released emission factors for 29 types of coal based on their mining areas, as shown in Table 1. Because of Russia's large territory, coal quality differs significantly among regions, such as the Kuznetskiy, Donetskiy and Kansk-Achinskiy basins, and the corresponding emission factors range from 0.73 to 2.72 tonne CO2/tonne (shown in Table 1). By contrast, the IPCC default value for coal is around 2.61 tonne CO2/tonne. The differences between the emission factors provided by the IPCC and the MNRE of Russia are illustrated in Table 1. Among all the fossil fuels, blast furnace gas shows the largest gap between the MNRE value (3.28 tonne CO2/tonne) and the IPCC value (0.76 tonne CO2/tonne). We also compared the CO2 emissions estimated with emission factors from the MNRE, the IPCC and two other sources, and explain the differences in the Technical Validation section ('Comparisons with different emission factors').
The study collects the energy activity data from the Unified Interdepartmental Statistical Information System of Russia (UISIS) 32. UISIS is the state integrated statistical resource and the largest provider of statistical data in Russia at the national and subnational levels. The raw energy data are sourced from the 4-TER form (information on the use of fuel and energy sources), which is filled out by legal entities that consume or supply energy in Russia (except small enterprises). The completed form is then submitted to the territorial body of the Federal State Statistics Service (Rosstat) where the separate subdivision is located, or where the legal entity is located if it does not have a separate subdivision. If a legal entity does not carry out its activities at its location, the form should be submitted at the place where the activities are carried out. Energy activity data include the energy used for combustion in final consumption and the energy used for process and transformation (e.g., electricity and heat generation) within the nation/constituent entity boundaries. Emissions generated from imported electricity and heat are not included in this study since we focus on emissions produced within the nation/constituent entity boundary (Scope 1).

The energy activity data provided by UISIS include total energy consumption, energy used for feedstocks, and energy used for non-fuel needs. The relatively small proportion of energy used for feedstocks and non-fuel needs has been excluded from the calculation of energy-related emissions. Energy used for feedstocks includes, for example, energy used in the production of chemical, petrochemical or other non-fuel products. Non-fuel needs include chemical reagents for drilling oil wells, gas injection to maintain reservoir pressure, lubricants, and insulating materials. Based on the categorization method of the UISIS 32, there are 45 types of fossil fuels, which include 29 different types of coal based on their mining areas, as shown in Table 1. In the emission inventory, we merged the CO2 emissions from these 29 types of coal into CO2 emissions from one energy type, coal, because of their similar energy quality and for clearer presentation. In other words, this study shows CO2 emissions from 17 energy types (45 − 29 + 1 = 17). Since UISIS reports fuelwood in cubic meters 32, the emission factor of fuelwood provided by the MNRE cannot be applied directly. Therefore, we first converted fuelwood to tonnes using the density provided by the Self-regulatory Organization (SRO) of Russia 33, 0.6 tonne/m³.

Footnote 2: Other solid fuel includes industrial waste, residential waste, and other types of natural fuel (e.g., straw, brushwood, and waste from logging and woodworking). Thus, its emission factor is based on the average emission factor of its components.

www.nature.com/scientificdata
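The preprocessing of activity data described in this part of the Methods (excluding feedstock and non-fuel use, converting fuelwood from cubic meters to tonnes, and merging the coal types) can be sketched as follows. This is an illustrative sketch only: the function names, fuel labels and all quantities are hypothetical, and only the 0.6 tonne/m³ density comes from the text.

```python
# Sketch of activity-data preprocessing; labels and quantities are hypothetical.

FUELWOOD_DENSITY_T_PER_M3 = 0.6  # density quoted in the text (SRO)

def combustion_activity(total, feedstock, non_fuel):
    """Fuel actually burned = total consumption minus feedstock and non-fuel use."""
    return total - feedstock - non_fuel

def fuelwood_tonnes(volume_m3):
    """Convert a fuelwood volume in cubic meters to tonnes."""
    return volume_m3 * FUELWOOD_DENSITY_T_PER_M3

def merge_coal(emissions_by_fuel, coal_types):
    """Collapse the emissions of several coal types into a single 'coal' entry."""
    merged = {f: e for f, e in emissions_by_fuel.items() if f not in coal_types}
    merged["coal"] = sum(e for f, e in emissions_by_fuel.items() if f in coal_types)
    return merged

burned = combustion_activity(total=100.0, feedstock=5.0, non_fuel=2.0)
wood_t = fuelwood_tonnes(1000.0)
merged = merge_coal(
    {"kuznetskiy coal": 3.0, "donetskiy coal": 1.5, "natural gas": 4.0},
    coal_types={"kuznetskiy coal", "donetskiy coal"},
)
print(burned, wood_t, merged)
```

Merging after the emissions are computed, rather than before, lets each coal type keep its own emission factor, which matches the description of merging the emissions of the 29 coal types rather than their tonnages.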
The sector classification follows the Russian Classification of Economic Activities, code ОK 029-2014 (OKVED 2 NACE Rev. 2), provided by the Federal Agency for Technical Regulation and Metrology 34. This is a hierarchical classification with four levels: sections (alphabetical code), divisions (two-digit numerical code), groups (three-digit numerical code) and classes (four-digit numerical code), as shown in Online-only Table 1. To save space, we do not always show the lower hierarchical levels, since not all sectors generate CO2 emissions. In other words, all sections are contained in the emission inventory, while the division, group, and class levels are included only when the sector generates CO2 emissions. Since this study accounts for CO2 emissions produced within a regional boundary, we excluded the one section that does not consume energy within the boundary, namely 'Section U: Activities of extraterritorial organizations and bodies'.
For some subsectors, UISIS does not provide energy activity data, which leads to a gap between a main sector and the sum of its lower-level sectors. Since this gap cannot be attributed to a specific subsector, we allocated it to a newly constructed sector that combines the subsectors without data. For example, energy activity data are available only for 'Section Q: Human health and social work activities' and 'No. 86: Health service activities', while the data for 'No. 87: Residential care activities' and 'No. 88: Social work activities without accommodation' are not available (Section Q = No. 86 + No. 87 + No. 88) (shown in Online-only Table 1). There is therefore an emission gap between Section Q and sector No. 86, so we combined No. 87 and No. 88 into one sector, named 'Social service activities', and allocated the CO2 emissions gap to this newly constructed sector (shown in Online-only Table 1). In total, there are 11 newly constructed sectors: 'Crop production, hunting and related services', 'Raising of other animals', 'Transmission, distribution and trade of electricity', 'Gas distribution and trade', 'Transmission, distribution and trade of steam and hot water; Maintenance of thermal network and boiler room', 'Construction of other civil projects', 'Demolition and site preparation', 'Other construction works', 'Non-specialized wholesale trade', 'Wholesale trade of other specialized products' and 'Social service activities' (shown in Online-only Table 1). After these steps and after excluding double-counted sectors, 89 sectors are included in the CO2 emission inventories of this study, as shown in Online-only Table 1. For completeness, apart from these 89 sectors, we also present the CO2 emissions of the higher-level sectors in the emission inventory.
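The Section Q example can be written as a one-line residual. The sketch below is only an illustration: the function name and the emission values are hypothetical, and it assumes the constructed sector simply receives whatever the parent section emits beyond its reported subsectors.

```python
# Sketch: assign the parent-minus-children gap to a newly constructed sector.
# Emission values are hypothetical (million tonnes CO2).

def constructed_sector_gap(parent_total, reported_children):
    """Emissions of the constructed sector = parent minus reported subsectors."""
    return parent_total - sum(reported_children.values())

section_q = 10.0                                        # Section Q as a whole
reported = {"No. 86: Health service activities": 7.5}   # only subsector with data

# No. 87 and No. 88 have no data, so their combined emissions are the residual.
social_service = constructed_sector_gap(section_q, reported)
print(f"Social service activities: {social_service} Mt CO2")
```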
There may still be a small gap between the aggregated emissions of subsectors and the emissions of their main sector due to measurement errors. To eliminate this gap, we further allocated it to the subsectors in proportion to their shares of CO2 emissions.

Process-related (cement) CO2 emissions. The process-related CO2 emissions are calculated in Eq. (2).

$$CE_{cement} = AD_{cement} \times EF \tag{2}$$

In Eq. (2), EF and AD_cement denote the emission factor for cement production released by the MNRE 14 and the activity data (cement production level), respectively. Depending on the availability of production data, we adopted two approaches to collect the cement production (AD_cement) of the 82 constituent entities: direct activity data (AD_cement-d) and indirect activity data (AD_cement-ind). AD_cement-d is collected from the yearbooks of the 82 constituent entities; however, only five constituent entities released their cement production data: Sverdlovsk Region, Chelyabinsk Region, Bryansk Region, Karachayevo-Chircassian Republic, and Krasnodar Territory. For the other constituent entities, the activity data are obtained indirectly (AD_cement-ind) by multiplying the production capacity (PC) of each cement plant by the utilization rate (UR). As shown in Fig. 1, we use the point-source database of Russian cement plants from RuCEM 35, which includes the production capacity of all cement plants in Russia. Then, according to the constituent entity in which each cement plant is located, we collected the UR of production capacity in that constituent entity, which is available from the yearbooks. The cement production of these constituent entities is therefore obtained by multiplying the PC of the cement plants located in each constituent entity by the UR in the corresponding year. The CO2 emissions from cement production belong to the 'Manufacture of other non-metallic mineral products' sector, as shown in Online-only Table 1.
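The choice between direct and indirect cement activity data can be sketched as below. This is a sketch under assumptions rather than the authors' implementation: the function names are hypothetical, the utilization rate is treated as a single entity-level value applied to the summed plant capacities, and the emission factor of 0.5 is a placeholder, not the MNRE value.

```python
# Sketch of process-related (cement) emissions; all numbers are hypothetical.

def cement_activity(direct_output, plant_capacities, utilization_rate):
    """Use yearbook output when available, otherwise capacity times utilization."""
    if direct_output is not None:
        return direct_output
    return sum(plant_capacities) * utilization_rate

def cement_emissions(ad_cement, ef):
    """Process emissions = cement production times emission factor."""
    return ad_cement * ef

EF_HYPOTHETICAL = 0.5  # tonne CO2 per tonne cement, placeholder only

# An entity with yearbook data: the capacities are ignored.
ad_direct = cement_activity(3.0, [2.0, 2.0], 0.5)
# An entity without yearbook data: two plants of 2.0 Mt capacity at 50% utilization.
ad_indirect = cement_activity(None, [2.0, 2.0], 0.5)

print(cement_emissions(ad_direct, EF_HYPOTHETICAL))
print(cement_emissions(ad_indirect, EF_HYPOTHETICAL))
```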
Because yearbooks have not been officially published since 2020, only Russia's national cement production data for 2019 could be collected, from CMPRO 36. We therefore estimate the 2019 CO2 emissions of the constituent entities by downscaling from the national level. The downscaling factor is each constituent entity's share of Russia's CO2 emissions from cement production in 2018. We will update the 2019 process-related emissions of the 82 constituent entities once the related data become available.
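The downscaling step amounts to distributing a national total in proportion to base-year shares. A minimal sketch, with a hypothetical function name, hypothetical entity labels and invented numbers, might look like this:

```python
# Sketch: downscale a national total using each entity's share in a base year.
# Entity names and values are hypothetical (million tonnes CO2).

def downscale(national_total, base_year_emissions):
    """Split national_total across entities in proportion to base-year emissions."""
    base_total = sum(base_year_emissions.values())
    return {
        entity: national_total * value / base_total
        for entity, value in base_year_emissions.items()
    }

cement_2018 = {"Entity A": 6.0, "Entity B": 2.0}   # base-year entity emissions
national_2019 = 10.0                               # national total to distribute

cement_2019 = downscale(national_2019, cement_2018)
print(cement_2019)
```

By construction the downscaled values sum to the national total, so the subnational and national inventories stay consistent for that year.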
The 120 rows include the 89 sectors and the remaining 31 higher-level sectors (shown in Fig. 2). For example, 'Section Q: Health and social service activities' is a higher-level sector that includes two subsectors ('health service activities' and 'social service activities'), and we show the data of both the main sector and its subsectors. Each element of the matrices indicates the CO2 emissions from the combustion of a certain energy type in the corresponding sector (shown in Fig. 2). Energy-related and process-related emissions are both given in million tonnes.

As shown in Fig. 3, the stacked area chart represents CO2 emissions from the combustion of 17 fossil fuels and from cement production. Russia's CO2 emissions fluctuated upwards from 2005 to 2019 and reached 1549.52 million tonnes in 2019. Natural gas was the primary source of CO2 emissions throughout 2005-2019, accounting for about 37.11% of the total. The share of CO2 emitted from coal combustion gradually decreased, from 22.66% in 2005 to 15.57% in 2019, while the share from the combustion of petroleum products increased from 17.45% in 2005 to 21.12% in 2019. After 2014, petroleum products overtook coal as the second-largest source of CO2 emissions. Overall, Russia's energy structure was relatively stable from 2005 to 2019 (shown in Fig. 3).

Figure 4 presents the CO2 emissions of the 82 constituent entities by sector in 2019. The 89 sectors are grouped into 16 main sectors for clearer presentation; the categorization details can be found in Online-only Table 1. There was vast regional heterogeneity in CO2 emissions among the 82 constituent entities. Fig. 4 shows that the Tyumen region was the top emitter among the 82 constituent entities in 2019, contributing around 137.41 million tonnes of CO2 emissions. This is mainly because the Tyumen region accounts for more than half of Russia's production of oil, natural gas and associated gas 41. The Chelyabinsk region was the second-largest emitter in 2019, generating about 119.96 million tonnes of CO2 emissions, primarily because it is one of the oldest mining bases, with abundant mineral resources. Moscow city, the capital of Russia, also produced a relatively large amount of CO2 emissions, around 79.05 million tonnes in 2019 (shown in Fig. 4).

Data Records
The dynamic changes in the CO2 emissions of the 82 constituent entities from 2005 to 2019 can be found in Online-only Table 3. The Tyumen region saw the largest rise in emissions, increasing by 24.26 million tonnes, followed by the Lipetsk region (19.53 million tonnes), St. Petersburg city (19.02 million tonnes), the Leningrad region (13.64 million tonnes), and Moscow city (13.30 million tonnes) (shown in Online-only Table 3). In contrast, the Sverdlovsk region, the Krasnoyarsk territory, and the Moscow region witnessed the largest decreases during the study period, dropping by 15.50 million tonnes, 13.61 million tonnes, and 11.43 million tonnes, respectively (shown in Online-only Table 3). In terms of the average growth rate, the CO2 emissions of the Chukotka Autonomous Okrug fell fastest between 2005 and 2019, at 8.01% annually (shown in Fig. 5).
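The text does not state how the average growth rate is computed. One common reading is a compound annual growth rate between the first and last year; the sketch below uses that reading as an explicit assumption, with a hypothetical function name and invented values, and is not confirmed to be the authors' definition.

```python
# Sketch: compound annual growth rate, ASSUMED to be the "average growth rate".
# Emission values are hypothetical.

def average_annual_growth(first, last, years):
    """Constant yearly rate that turns `first` into `last` over `years` years."""
    return (last / first) ** (1.0 / years) - 1.0

# Emissions rising from 100 to 121 over two years correspond to 10% per year.
rate = average_annual_growth(100.0, 121.0, 2)
print(f"{rate:.2%} per year")

# A falling series yields a negative rate.
decline = average_annual_growth(100.0, 81.0, 2)
print(f"{decline:.2%} per year")
```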

Technical Validation
Comparisons with existing emission datasets. Emission inventories are indispensable for many environmental decisions and for setting scientific mitigation targets. Policy design and emission-related studies require reliable and accurate emission inventories. Since our estimate is based on the 4-TER form, which covers only large and medium companies, it is important to understand the robustness and accuracy of our emission inventories. Figure 6 compares the energy-related CO2 emissions of our estimate with the emissions estimated using the reference approach and by five international institutions (EDGAR, IEA, BP, EIA, and CDIAC). Our estimate uses the sectoral approach, while the reference approach can also be used to calculate energy-related emissions 24. The sectoral emissions are calculated from the energy consumption side, while the reference emissions are evaluated from the production side using the energy balance tables (energy consumption = production + import − export − international shipping and aviation − non-energy use, reductants, and feedstocks ± stock change) 24. Theoretically, the energy data from the consumption side and the production side should be equal. However, there can be differences for many reasons, such as different statistical scopes and statistical errors. The reference approach is considered to be more accurate for two reasons 42. First, it is based on fuel production and trade statistics, which are more reliable. Second, it avoids accounting errors arising during energy processing and conversion. Therefore, we further compare our estimates with the emission inventories based on the reference approach, which are derived from Russia's national inventory reports (shown in Fig. 6).
The results show that the difference between our estimates and the reference approach is relatively small over the study period, at 2.24% on average. This indicates that, although our estimate does not cover small companies, the potential underestimation is not significant.
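The energy balance identity behind the reference approach, and the relative difference used to compare two estimates, can be sketched as follows. The function names and all numbers are hypothetical, and the sign convention for stock change (a positive value adds to consumption) is an assumption made for this sketch.

```python
# Sketch of the reference-approach energy balance and a relative difference.
# All quantities are hypothetical and share one energy unit.

def apparent_consumption(production, imports, exports, bunkers,
                         non_energy, stock_change):
    """production + import - export - international bunkers - non-energy use
    +/- stock change (positive stock_change ASSUMED to add to consumption)."""
    return production + imports - exports - bunkers - non_energy + stock_change

def relative_difference(estimate, reference):
    """Absolute difference as a percentage of the reference value."""
    return abs(estimate - reference) / reference * 100.0

consumption = apparent_consumption(
    production=100.0, imports=10.0, exports=40.0,
    bunkers=2.0, non_energy=8.0, stock_change=-1.0,
)
gap_pct = relative_difference(estimate=102.0, reference=100.0)
print(consumption, f"{gap_pct:.2f}%")
```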
Some differences can also be found when comparing with the emissions presented by the five international institutions. The time-series trend of our estimate is consistent overall with those of the international institutions. For example, there was a sudden decrease in CO2 emissions in 2009, followed by a rebound (shown in Fig. 6), which can be attributed to the negative impact of the 2008 financial crisis. Our estimate is closest to those of BP and the IEA (shown in Fig. 6). Compared with BP, our estimate shows gaps ranging between 0.82% and 4.01%. Compared with the IEA, our estimate shows differences ranging between 0.48% and 7.01%. Since existing emission inventories of Russia do not provide detailed emissions by energy type and socio-economic sector, a further comparison at that level cannot be made. In other words, our emission dataset provides the most up-to-date and comprehensive emission inventories of Russia and its 82 constituent entities and is an important supplement to, and improvement on, the current emission inventories.

Comparisons with different emission factors.
We first compare the national CO2 emissions (shown in Fig. 7, National data, MNRE_EF) with the aggregate of the 82 constituent entities (shown in Fig. 7, Aggregate data, MNRE_EF). The gap between the two is relatively small, ranging between -1.18 million tonnes and 36.47 million tonnes, which corresponds to roughly 0% to 2% of national CO2 emissions. This small gap can be regarded as mutual verification of the quality of the energy activity data of both Russia and its constituent entities, which supports the robustness of our estimate. As mentioned in the Methods section, we adopt the country-specific emission factors from the MNRE of Russia 14. However, the emission factors estimated by different institutions vary, which may lead to different results.
To quantitatively characterize the range of emission factors, this study summarized the emission factors from four sources: the MNRE, the IPCC, the Energy auditor self-regulatory organization (SRO) and United Nations-Russia (UN-Russia), as can be found in Online-only Table 4. Among the four sources, the IPCC has the highest emission factors for diesel, artificial coke gas, combustible natural gas, associated petroleum gas, fuelwood, and coal. In terms of the components of the emission factors, the net calorific values (NCV) behind many of the IPCC emission factors are higher than those of the other three sources, especially for coal, while the oxidation ratio and carbon content are relatively similar. Specifically, the NCV of coal released by the IPCC is 8.83, 8.15 and 10.58 TJ/thousand tonnes higher than that of the MNRE, UN-Russia and the SRO, respectively, and the CO2 emissions from coal combustion calculated using the IPCC emission factor are 105.83 million tonnes (43.88%), 132.04 million tonnes (61.42%) and 100.32 million tonnes (40.67%) higher than those calculated with the MNRE, SRO and UN-Russia factors, respectively (shown in Fig. 7). Additionally, the main types of coal consumed in the Russian Federation come from the Kuznetskiy Basin, the Kansk-Achinskiy Basin and East Siberia 32, and the NCVs of coal from these three areas are lower than the IPCC value (shown in Online-only Table 4). For example, the NCV of Kansk-Achinskiy coal is only 15.10 TJ/thousand tonnes, about half of the IPCC value, so the IPCC emission factor for coal is not sufficiently representative. Although the IPCC emission factors are the largest for many fossil fuels, the total CO2 emissions calculated with the IPCC emission factors are lower than those calculated with the UN-Russia and MNRE factors, and higher than those calculated with the SRO factors (shown in Fig. 7).
This is mainly because the IPCC's NCV for blast furnace gas is only about a quarter of that from the other two sources, so the CO2 emissions from blast furnace gas combustion calculated with the IPCC factor are the lowest, accounting for only about 23.33% of those calculated with the MNRE and UN-Russia factors (shown in Fig. 7).
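As a rough consistency check, the blast furnace gas emission factors quoted in the Methods (3.28 tonne CO2/tonne from the MNRE and 0.76 tonne CO2/tonne from the IPCC) can be compared directly. With identical activity data, emissions scale linearly with the emission factor, so the ratio of the factors should be close to the reported ratio of emissions. The variable names are hypothetical, and the small remaining difference from 23.33% may simply reflect rounding of the published factors, which is an assumption rather than something the text confirms.

```python
# Sketch: ratio of IPCC to MNRE emission factors for blast furnace gas,
# using the two values quoted in the Methods (tonne CO2 per tonne of fuel).

ef_mnre = 3.28
ef_ipcc = 0.76

# With the same activity data, emissions_ipcc / emissions_mnre equals this ratio.
ratio_pct = ef_ipcc / ef_mnre * 100.0
print(f"IPCC-based emissions are about {ratio_pct:.1f}% of MNRE-based emissions")
```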
Limitations and future work. Our emission dataset has several limitations. First, the activity data used to calculate energy-related emissions cover only large and medium companies, so the inventories are incomplete. In the future, we will explore data for all companies to construct more comprehensive emission inventories of Russia and its constituent entities. Second, the process-related emissions only consider emissions generated from cement production. In the future, other process-related emissions will be included, such as those from iron and steel production, glass production and ammonia production, which can further improve the accuracy of the datasets. Third, due to data unavailability, the inventories for 2005 to 2016 only show emissions by energy type; emissions by sector are missing. In future work, we will further explore the sectoral energy data for this period or downscale to the sectoral level based on economic and demographic indicators.

Usage Notes
This emission dataset can facilitate academic studies of Russia's emission patterns and mitigation strategies. The detailed emission inventories can be used to analyse CO2 emissions by sector and energy type, for example the driving factors of CO2 emissions, emission reduction potential, emission efficiency, the shadow price of CO2 emissions, emission reduction costs, and emission prediction. Apart from energy-related emission analysis, the process-related emissions can be used to investigate the emission characteristics and reduction strategies of the cement industry.
These emission inventories form a long time-series dataset covering both Russia and its 82 constituent entities, which can be used to study emission characteristics over time and space. Emission-related studies at the global, national, and subnational levels can therefore be carried out, and comparisons can be made to gain more insights.

Code availability
The Python code used to draw Fig. 3 and Fig. 4 is published as Supplementary File 1 to show how the data can be loaded and visualized.