Monthly direct and indirect greenhouse gases emissions from household consumption in the major Japanese cities

Urban household consumption contributes substantially to global greenhouse gas (GHG) emissions. Urban household emissions encompass both direct and indirect emissions, with the former associated with the direct use of fossil fuels and the latter with the emissions embodied in the consumed goods and services. However, consistent and comprehensive datasets that describe emissions from urban household consumption in detail are lacking. To bridge this data gap, we construct an emission inventory of urban household emissions for 52 major cities in Japan that covers around 500 emission categories. The dataset spans January 2011 to December 2015 and contains 12,384 data records for direct emissions and 1,543,128 records for indirect emissions. Direct emission intensity is provided in g-CO2/JPY, both to facilitate future studies of household emissions in Japan and to act as a reference for the development of detailed household emission inventories in other countries.

www.nature.com/scientificdata

Direct emissions. The fossil fuels associated with direct household emissions are gasoline, kerosene, liquefied petroleum gas (LPG), and city gas. As outlined in more detail below, we convert the monthly household expenditures for these fuels, taken from the FIES, to g-CO2e. In summary, for each fuel, the household expenditures extracted from the FIES are first converted into the corresponding mass or volume using retail fuel prices (see below). Fuel volume or mass is then converted into g-CO2e by multiplying it by the respective emission coefficient (see below).
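The two-step conversion described above (expenditure to fuel quantity, then quantity to emissions) can be illustrated with a minimal Python sketch. The function names and all numeric inputs are hypothetical placeholders chosen for illustration; they are not values or code from the dataset. The sketch also shows how an intensity in g-CO2e per JPY follows from the same two quantities.

```python
def fuel_quantity(expenditure_jpy: float, unit_price_jpy: float) -> float:
    """Fuel purchased, in the unit the price refers to (e.g. litres or m3)."""
    if unit_price_jpy <= 0:
        raise ValueError("unit price must be positive")
    return expenditure_jpy / unit_price_jpy


def direct_emissions_g(expenditure_jpy: float, unit_price_jpy: float,
                       coefficient_g_per_unit: float) -> float:
    """Emissions in g-CO2e: quantity purchased times the emission coefficient."""
    return fuel_quantity(expenditure_jpy, unit_price_jpy) * coefficient_g_per_unit


def direct_intensity_g_per_jpy(unit_price_jpy: float,
                               coefficient_g_per_unit: float) -> float:
    """Emission intensity in g-CO2e per JPY spent on the fuel."""
    if unit_price_jpy <= 0:
        raise ValueError("unit price must be positive")
    return coefficient_g_per_unit / unit_price_jpy


# Placeholder inputs for illustration only (assumed, not taken from the dataset):
# 6,000 JPY spent in a month, 150 JPY per litre, 2,320 g-CO2e per litre.
example_emissions = direct_emissions_g(6000.0, 150.0, 2320.0)
example_intensity = direct_intensity_g_per_jpy(150.0, 2320.0)
```

Multiplying the intensity by the expenditure gives the same result as the two-step conversion, which is why an intensity per JPY is convenient for reuse with other expenditure data.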
Weekly retail prices for gasoline and kerosene come from a weekly survey of retail prices at filling stations conducted by the Ministry of Economy, Trade and Industry of Japan 47. Monthly prices for gasoline and kerosene are obtained by averaging the prices of the weeks within each month. For kerosene, we adopt prices for on-site purchase, as the 2006 survey of kerosene and LPG consumption 48 indicates that on-site purchases are the principal means through which households purchase kerosene in Japan. For LPG, prices come from the monthly survey of retail prices conducted by the Oil Information Center at the Institute of Energy Economics, Japan 49. The LPG prices for the cities across the 47 prefectures are recorded as the overall prices of the corresponding geographical regions. As LPG retail prices are recorded in a stepwise manner by volume (at 5 m³, 10 m³, 20 m³ and 50 m³), the prefectural unit price of LPG is set as the per-unit price at whichever of the four recorded volumes is just greater than the average volume purchased per month. Due to the lack of actual consumption data to distinguish the stepwise price window, we define this price window spatially using the most recent data on LPG consumption in each of the 47 prefectures 50. For instance, the average monthly consumption of LPG is 12.6 m³ per household in Chiba Prefecture. Therefore, the per-cubic-meter price of LPG (JPY/m³) purchased at 20 m³ is taken as the unit price for Chiba, as 20 m³ is the volume gradient that is just greater than this average.

Indirect emissions. The estimation of the indirect emissions embodied in the goods and services consumed by the household sector requires a cross-mapping and matching of the emission categories (and their intensities) from the 3EID dataset with the consumption categories of the FIES dataset.
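For the fuel prices used in the direct emissions, the two pricing rules described above (averaging weekly prices into a monthly price, and choosing the LPG price tier just greater than the average monthly volume) can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the sketch assumes that each recorded LPG price is the total bill at that volume, and the fallback to the largest tier for very high consumption is an assumption not stated in the text.

```python
from statistics import mean
from typing import Iterable, Mapping, Sequence

LPG_TIERS_M3 = (5, 10, 20, 50)  # recorded volume steps named in the text


def monthly_price(weekly_prices: Sequence[float]) -> float:
    """Monthly price as the average of the weekly prices within the month."""
    return mean(weekly_prices)


def select_lpg_tier(avg_monthly_volume_m3: float,
                    tiers: Iterable[int] = LPG_TIERS_M3) -> int:
    """Smallest recorded volume strictly greater than the average volume."""
    ordered = sorted(tiers)
    for tier in ordered:
        if tier > avg_monthly_volume_m3:
            return tier
    # Assumption: above the largest step, fall back to the largest tier.
    return ordered[-1]


def lpg_unit_price(tier_prices_jpy: Mapping[int, float],
                   avg_monthly_volume_m3: float) -> float:
    """Per-m3 price at the selected tier.

    Assumption: tier_prices_jpy maps each volume step to the total price
    charged at that volume, so the unit price is total price / volume.
    """
    tier = select_lpg_tier(avg_monthly_volume_m3, tier_prices_jpy.keys())
    return tier_prices_jpy[tier] / tier


# Chiba example from the text: 12.6 m3 per month falls in the 20 m3 tier.
chiba_tier = select_lpg_tier(12.6)
```

The tier prices passed to `lpg_unit_price` would come from the regional price survey; any values used to try the function are placeholders.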
The indirect carbon emission intensity ($E_i$) in the 3EID model is calculated as follows:

$$E_i = D\,\left[I - (I - M)A\right]^{-1}$$

where $D$ is the direct emission matrix, $I$ is the unit matrix, $A$ is the direct requirement coefficient matrix whose element $A_{mn} = x_{mn}/x_{m}$ represents the output of industry $m$ required to produce one unit of output from industry $n$, and $M$ is a diagonal matrix representing the import portion of the direct requirement coefficient. Due to its structure, the 3EID considers only domestic production (see "Limitations and Acknowledgements"). Further details of the input-output table and applications can be found elsewhere 10,37,55-60.
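To make the roles of D, I, A and M concrete, the following Python sketch computes embodied intensities for a toy two-sector economy. It assumes the import-adjusted Leontief form E = D (I - (I - M) A)^(-1), treats D as a single row of direct intensities, and represents M by its diagonal. Instead of forming the matrix inverse explicitly, the sketch uses a plain fixed-point (power-series) iteration, which is a substitute technique chosen to keep the example dependency-free. All numbers and names are hypothetical.

```python
def embodied_intensity(d, A, m, tol=1e-12, max_iter=10_000):
    """Solve E = d + E (I - M) A by fixed-point iteration.

    d: direct emission intensity per sector (row vector, as a list)
    A: direct requirement coefficients, A[i][j] = input from i per unit output of j
    m: import share per sector (the diagonal of M)
    The fixed point satisfies E (I - (I - M) A) = d.
    """
    n = len(d)
    # C = (I - M) A: row i of A scaled by the domestic share of sector i.
    C = [[(1.0 - m[i]) * A[i][j] for j in range(n)] for i in range(n)]
    e = list(d)
    for _ in range(max_iter):
        nxt = [d[j] + sum(e[i] * C[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, e)) < tol:
            return nxt
        e = nxt
    raise RuntimeError("iteration did not converge; check the coefficients")


# Toy two-sector inputs (placeholders, not 3EID values).
toy_d = [1.0, 2.0]
toy_A = [[0.1, 0.2], [0.3, 0.1]]
toy_m = [0.1, 0.2]
toy_E = embodied_intensity(toy_d, toy_A, toy_m)
```

The iteration converges when the domestic requirement coefficients are small enough (spectral radius below one), which is the usual situation for input-output tables. With several hundred sectors, a linear solver from a numerical library would normally be preferred.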
As the classification of industries in the 3EID database differs from the classification of consumption elements in the FIES expenditure data, we matched the data following the general approach outlined elsewhere 33.

Even though the base data for the goods and services consumed by households that fall under indirect emissions are generated for each month between January 2011 and December 2015 under the FIES data (see "Dataset scope"), the indirect emission intensities relevant for each of these indirect emission categories after cross-mapping (see above) are generated only from the 2011 and 2015 input-output tables. This is because the 3EID databases that are used for the emission intensities are released every five years, and are thus only available for the years 2011 and 2015. To estimate the indirect emission intensities of each study item for the years 2012, 2013, and 2014, we use linear interpolation between the 2011 and 2015 values:

$$E_i^{t} = E_i^{2011} + \frac{t - 2011}{2015 - 2011}\left(E_i^{2015} - E_i^{2011}\right), \qquad t \in \{2012, 2013, 2014\}$$

where $E_i^{t}$ is the indirect emission intensity of a given item in year $t$.

It should be noted that electricity is treated as an indirect emission, as there is no direct emission upon its consumption (in contrast to the fossil fuels discussed in the previous section). Although the different cities contained in this dataset are served by different electricity companies, we employ the national standard electricity intensity generated through the input-output tables.
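The linear interpolation of emission intensities between the 2011 and 2015 benchmark years can be sketched in a few lines of Python. The function name is hypothetical; the example uses the rice intensities quoted elsewhere in this Data Descriptor (8.1 and 7.9 t-CO2e/M-JPY) purely as an illustration.

```python
def interpolate_intensity(e_2011: float, e_2015: float, year: int) -> float:
    """Linearly interpolate an emission intensity between 2011 and 2015."""
    if not 2011 <= year <= 2015:
        raise ValueError("year must lie between 2011 and 2015")
    weight = (year - 2011) / (2015 - 2011)  # 0 in 2011, 1 in 2015
    return e_2011 + weight * (e_2015 - e_2011)


# Rice intensities in t-CO2e/M-JPY for the two benchmark years.
rice_by_year = {y: interpolate_intensity(8.1, 7.9, y) for y in range(2011, 2016)}
```

Each intermediate year moves one quarter of the way from the 2011 value to the 2015 value, so the 2013 value sits at the midpoint.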
It is worth noting that there is not always a perfect match (a) between FIES and 3EID categories, and (b) between the 2011 and 2015 3EID categories. Such items are matched based on similarities in their properties. For (a), some 3EID categories such as waste management are not distinct household components in the FIES. For this reason, they have been linked to relevant service items such as municipal services. Conversely, some of the FIES miscellaneous expenses that are not expected to have indirect emissions, such as allowances, grants for religious services, and donations, have been omitted. For (b), examples include categories in the 2011 3EID such as "small dried sardines", "sewing machines" and "cloth tailoring" that do not have a perfect match in the 2015 3EID, and have thus been matched to similar categories in the 2015 3EID, namely "other salted food", "consumer electrical equipment" and "other personal services", respectively. This logic is also used when interpolating between years.
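The cross-mapping logic above can be pictured as a lookup table from FIES items to 3EID sectors, with omitted items mapped to no sector. The Python sketch below is an illustration under stated assumptions: it assumes that indirect emissions for an item are obtained by multiplying the expenditure by the intensity of the matched sector, and the item names, sector names and expenditures are hypothetical placeholders rather than entries from Mapping.xlsx. Note that an intensity in t-CO2e/M-JPY is numerically equal to g-CO2e/JPY.

```python
from typing import Mapping, Optional

# Hypothetical mapping: FIES item -> 3EID sector, or None for omitted items.
FIES_TO_3EID: Mapping[str, Optional[str]] = {
    "rice": "rice",
    "donations": None,  # not expected to carry indirect emissions
}

# Hypothetical intensities in g-CO2e per JPY (numerically equal to t-CO2e/M-JPY).
INTENSITY_G_PER_JPY: Mapping[str, float] = {"rice": 8.1}


def indirect_emissions_g(expenditures_jpy: Mapping[str, float],
                         mapping: Mapping[str, Optional[str]],
                         intensity: Mapping[str, float]) -> dict:
    """Emissions in g-CO2e per FIES item; omitted items are skipped."""
    result = {}
    for item, spend in expenditures_jpy.items():
        if item not in mapping:
            raise KeyError(f"no cross-mapping defined for FIES item {item!r}")
        sector = mapping[item]
        if sector is None:
            continue  # item deliberately excluded from indirect emissions
        result[item] = spend * intensity[sector]
    return result


# Placeholder monthly expenditures in JPY.
example = indirect_emissions_g({"rice": 2000.0, "donations": 500.0},
                               FIES_TO_3EID, INTENSITY_G_PER_JPY)
```

Raising an error for unmapped items, rather than skipping them silently, keeps deliberate omissions distinguishable from gaps in the mapping table.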

Limitations and acknowledgements. First, the FIES that is the base of the dataset used in this Data Descriptor does not cover single-person households. Single-person households are very prevalent in Japan 42 and have very distinct consumption patterns, which often lead to higher per capita emissions 61. This means that the findings outlined in the "Technical Validation" section are possibly underestimated. Considering the comparatively high prevalence of single-person households in Japan 62, this limitation should be acknowledged and some caution exercised when using this dataset. It should be noted that, similar to the FIES, the National Survey of Family Income and Expenditure (NSFIE) 63 records the monthly consumption expenditures per household. Although the NSFIE covers a larger household sample than the FIES (including single-person households), it does not record consumption expenditures continuously. In particular, the expenditures are based on a survey covering September to November for two-or-more-person households, and one in November for single-person households. Furthermore, it is conducted not every year but every five years. Although this dataset has been utilized for household emissions at the city level 64,65, the fact remains that it has a lower periodicity and is not publicly available.
Second, the 3EID is an emission inventory generated through the Japanese single-regional input-output (SRIO) table. This means that the emission intensities used in this study reflect only domestic goods and services. Applying domestic emission intensities to imported goods introduces uncertainty to the dataset, which is to a large degree unavoidable considering the lack of options to create a fine-grained dataset that also includes international emission intensities (see below). In particular, we select the 3EID rather than a multi-regional input-output (MRIO) table because of its much higher sectoral resolution (390 sectors), which brings it closer to the structure of the FIES that contains 500 consumption categories. This is a much more extensive coverage compared with MRIOs such as WIOD (56 sectors) and EXIOBASE (200 sectors). This inability to properly consider emission intensities for imported goods and services might lead to an underestimation of the actual GHG emissions for some consumption categories, as imported goods tend to have longer value chains, and thus higher GHG emissions, than similar domestic goods 24. Regarding regional heterogeneity, a domestic subnational MRIO 66 would be preferable to the 3EID, as it is cognizant of domestic regional heterogeneity 65.

Data records
The dataset contains monthly direct and indirect GHG emissions for 51-52 Japanese cities from 2011 to 2015. The direct emissions are recorded as Natural Gas, Gasoline, LPG, and Kerosene. The emissions are expressed in "per capita" terms. Overall, there is a total of 1,555,512 items, which include 1,543,128 items for indirect emissions and 12,384 items for direct emissions. Table 1 offers a summary of the data items for each study year.
The entire dataset is publicly available on Figshare under the name "Monthly direct and indirect greenhouse gases emissions from household consumption in Japanese cities" 67. It consists of 17 Excel files (Table 2), which are explained in more detail below. For each study year, the data are included in two Excel files, one for direct emissions (labeled "direct_20XX", depending on the year) and one for indirect emissions (labeled "indirect_20XX", depending on the year). Thus, the emission data span a total of 10 Excel files (Files 1-10; unit: g-CO2). In addition to the emission data themselves, the direct emission component includes the unit prices for direct energy consumption for each city and each month, provided separately under the name Direct Emission Intensity.xlsx (Files 11-14). Here, direct emission intensity is provided in g-CO2/JPY to facilitate future calculations of direct emissions at the city scale. Next, for reference (Files 15-16), we provide two Excel files highlighting the cross-mapping of FIES and 3EID for the year 2015 (named Mapping.xlsx) and the sector details (named FIES_items_Eng_2011-15.xlsx). Last, the household size information (File 17), which includes total household size, children under 18 years old, and people aged over 65, is given in a separate file named Household size information.xlsx.

Figure 2 shows the monthly average emissions of the cities included in this dataset for the period between 2011 and 2015. The results show that indirect emissions per capita are much higher than direct emissions per capita, accounting for 81.2% of total emissions. Naha and Sapporo are the lowest and highest per capita emitting cities, respectively. Lower per capita emitting cities are generally located in western Japan and Kyushu (e.g. Osaka, Fukuoka, Kumamoto, Nagasaki, Miyazaki and Naha). In contrast, the higher per capita emitting cities are located in northeastern Japan (e.g. Sapporo and Akita). The cities in northeastern Japan have consistently higher direct and indirect emissions per capita compared to other cities in the country.

Figure 3 shows the monthly emission variation, averaged across cities, between 2011 and 2015. In particular, Fig. 3(a) shows the total emissions averaged for all cities across years, and suggests higher total emissions for colder months such as December, January, and February. This trend is visible for both direct and indirect emissions (Fig. 3(b,c)). However, two emission peaks are also visible in March and August, which are mostly generated by indirect emissions.

Figure 4 provides a simple breakdown of indirect emissions averaged across 512 consumption elements for the year 2015. Approximately 50.2% of total indirect emissions are due to electricity and other utilities, followed by food (19.8%). When disaggregating the food-related indirect emissions, emissions from eating account for 16% of the total food-related emissions, followed by processed food (15%), meat (14%), and cereals (13%) (Fig. 4).

Technical Validation
Although there is no directly comparable dataset for validation purposes, we see a good correspondence between the direct emissions estimated through our dataset and the relevant constituents reported by the Greenhouse Gas Inventory Office of Japan (GIO) at the National Institute for Environmental Studies (NIES) (Fig. 5). In Fig. 5, the pink shaded area shows the range between the maximum and minimum values of our dataset (i.e. the highest- and lowest-emitting cities); the average national estimates of GIO for direct emissions fall within this range. Note that for this validation we use only the direct emissions component of our dataset, since indirect emissions are not available in any official statistics or other studies that could be used for validation. However, the indirect emission intensities in 3EID vary little over short timescales such as the one covered in this Data Descriptor (2011 to 2015). For example, the emission intensities of rice in 2011 and 2015 are 8.1 t-CO2e/M-JPY and 7.9 t-CO2e/M-JPY, respectively. Therefore, the variation is only 2.65% over a five-year period. Furthermore, the emission intensities of some durable goods change even less, as production processes do not change dramatically. Thus we expect that the interpolation will not introduce major

Usage Notes
As cities have emerged as major actors in climate mitigation efforts in the past decades, there have been multiple initiatives both to estimate and quantify the contribution of cities to national and/or global emissions (see Background and Summary) and to develop city-level climate change mitigation strategies. For example, the Intergovernmental Panel on Climate Change (IPCC) 5th Assessment Report pointed out that progressive cities across the world have demonstrated significant political leadership by initiating meaningful strategies and actions to tackle climate change 69. In this context, the Urban Carbon Footprint (UCF) has been recognized as one of the more useful methodological options to inform decision-makers about environmental sustainability, both within and beyond city limits 70.
That said, and mindful of its limitations (see Methods), this dataset can provide a very useful resource to urban researchers interested in analyzing different aspects of UCFs in a temporally and spatially differentiated manner. The data structure (i.e. direct and indirect emissions) and the method used to develop the emission factors in this Data Descriptor have been discussed (and to some degree used) in previous UCF studies 33,68. In this sense, the method is rather universal in its approach.
However, what sets this Data Descriptor apart is the quality and comprehensiveness of the underlying data, both in relation to urban consumption (FIES data), as well as the emission intensity factors (3EID data). Thus, this dataset can allow the exploration of consumption patterns in a very disaggregated manner (>500 consumption items) and over different periods of time (i.e. monthly, annually). Due to data limitations at the urban scale, few studies have managed to calculate city-level household emission inventories in such a comprehensive manner. As outlined below, this dataset can appeal to researchers globally, as well as practitioners and policy-makers in the covered cities, and Japan more broadly.
In terms of research, some possible applications could be to identify, among others, (a) differences in the UCFs of cities 24,71, (b) differences in consumption structure 21, (c) differences in drivers of emissions by month, year, or city 33,72, or (d) differences in emissions by income 26,73, education 74,75, or age 11,31,76. Beyond city-level patterns, the dataset can be used to understand broader phenomena related to the environmental impacts of urban consumption. For example, the indirect emission component could be connected with other datasets focusing on specific demand to better understand emerging topics in urban studies such as urban tele-connections 77,78, the transboundary environmental impacts of cities 79, or inequalities in emissions 26,80. Ideally, this dataset can become an input to ongoing and future global reports on urban carbon mitigation, such as the reports prepared by the IPCC's Working Group III.
In terms of policy and practice, the granular and location-specific data for various constituents of consumption can be used to identify potential priority areas for GHG emission reduction and facilitate better-informed and evidence-based mitigation actions by policy-makers in the covered cities. For example, in some cities in northern Japan such as Sapporo and Aomori, kerosene consumption accounts for a large percentage of the direct GHG emissions. This suggests that the electrification of heating could be an important GHG mitigation measure 33,68. Providing the emission inventories for each month facilitates the understanding of seasonal emission patterns, offers an even finer picture of household emissions, and supports the development of decarbonization measures through behavioral change. Furthermore, this dataset can be used to explore differentiated emission profiles across households with different characteristics such as income, age or education. By better identifying the different emission profiles of such groups, it can help city governments create more nuanced and targeted measures to affect consumption and emission behavior across different types of households, or investigate decarbonization scenarios in a more nuanced way. The above could inform the generation of good practices for how to use such high-resolution data to track household carbon footprints according to daily consumption behaviors, which can possibly be applied in other urban contexts around the world.

Code availability
No code was used in the generation of the data.