China material stocks and flows account for 1978–2018

As the world's top material consumer, China places intense pressure on national and global demand for natural resources. Building an accurate account of China's material stocks and flows is a prerequisite for promoting sustainable resource management. However, China has no officially published annual data on material stocks and flows, and existing estimates by scholars show great discrepancies. In this study, we create the Provincial Material Stocks and Flows Database (PMSFD) for China and its 31 provinces. The dataset describes the stocks, demand, and scrap supply of 13 materials in five end-use sectors in each province during 1978–2018. PMSFD is the first version of material stocks and flows inventories in China, and its uniform estimation structure and formatted inventories offer a comprehensive foundation for future accumulation, modification, and enhancement. PMSFD provides insight into material metabolism and is an important database for sustainable development and circular economy policy-making in China. The dataset will be updated annually.
Measurement(s): Material stocks • Material flows
Technology Type(s): digital curation • Mathematical Model
Sample Characteristic - Location: China
Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.16601249

www.nature.com/scientificdata
[...] exponentially in China: steel stocks increased 2.5-fold in a decade 23; aluminium stocks rose 100-fold during 1950-2009 24; copper stocks increased 50-fold during 1952-2015 25,26; cement stocks increased 40-fold during 1990-2010 21. China's accelerating materialization process creates serious environmental problems 27,28, which makes understanding material metabolism crucial for implementing effective resource management policies and for promoting the circular economy 9. However, material stocks and flows accounts for China and its provinces have not been well reported or published. Although previous studies have estimated China's material stocks 20,21,29-31, there is considerable disagreement among them on the level of stocks. For example, Krausmann et al. 4 estimated that the material stocks in China amounted to 136 t/cap in 2010 using the top-down approach, while Han and Xiang 19 estimated only 33 t/cap in 2008 using the bottom-up approach. The difference may be explained by inconsistent system boundaries, end-use sectors, data sources, and estimation approaches, which makes it difficult to clarify the pattern of material use and to identify the mechanism of material metabolism.
Considering the great discrepancies among material stocks accounts and the lack of provincial material stocks and flows accounts in China, this paper creates the first version of the Provincial Material Stocks and Flows Database (PMSFD), which contains the stocks and flows of 13 materials in five end-use sectors in 31 provinces of mainland China. The spatial scale of PMSFD ranges from province to nation, and the timescale ranges from 1978 to 2018. More than 200,000 data records are stored in the PMSFD, and we also provide the material intensities used in the estimation for transparency and verifiability. By integrating data records into a unified format, PMSFD takes a step towards overcoming the limited accessibility caused by incomplete data availability.
The comprehensive and consistent data records make the PMSFD an important database for sustainability analyses and assessments. For example, PMSFD may be used to facilitate retrospective analyses and prospective forecasts of stocks and flows to assist in identifying and achieving sustainable development. We will advance this goal through this initial version of PMSFD and update the PMSFD annually. The rest of this paper introduces the estimation approach used to create the PMSFD, the file format used to record the PMSFD, and the properties of the data in the PMSFD.

Methods
System boundary. A PMSF (Provincial Material Stocks and Flows) model, which combines a bottom-up, stocks-drive-flows, and mass-balanced dynamic model, is constructed for material stocks and flows estimates at the provincial level. This combined model can evaluate material inputs, stocks accumulation, and end-of-life outflows. Considering the availability of intensity data and the wide application in society, 13 types of materials, including steel (Fe), aluminium (Al), copper (Cu), rubber, plastic, glass, lime, asphalt, sand, gravel, brick, cement, and wood, are considered in this study. Drawing on a comprehensive provincial product database, annual provincial material uses and flows are estimated. The time horizon is from 1978 to 2018 and all computations are performed in time-discrete steps of one year. The spatial scope covers 31 provinces in mainland China. The workflow is shown in Fig. 1 and is described in detail in the following sections.
Material stocks. The bottom-up accounting method starts by counting every material-containing product, then determines the material intensity of each product, and finally adds up the material contained in all product categories. It can complement the top-down method by revealing greater detail on technologies and identifying the geographical locations of material stocks 23. Since physical data on product stocks are becoming available at the provincial scale, the bottom-up approach is adopted to estimate the material stocks $S(t)$. A list of 103 commodities, divided into five end-use sectors (buildings, infrastructure, transportation facilities, machinery, and domestic appliances), is identified and revised based on our earlier study 19 (Online-only Table 1). Unlike our previous research, we have primarily adjusted the commodities identified in the buildings and infrastructure sectors. For example, we have added different types of highways, power stations, and cable users to infrastructure, while removing the different structures of residential buildings. The material stocks $S_C$ in each end-use sector $C$ at time $t$ are then calculated as the sum of the material stocks of the related products (Eq. 1).
$$S_C(t) = \sum_{i \in C} N_i(t)\, I_i(t) \qquad (1)$$
where $N_i(t)$ is the amount of product $i$ in active use in end-use sector $C$ at time $t$ and $I_i(t)$ is the material intensity of product $i$.
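The bottom-up sum in Eq. 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB code: the product list, counts, and intensities below are hypothetical values chosen for the example, and the function names `sector_stocks` and `total_stocks` are assumptions.

```python
from collections import defaultdict

# Hypothetical illustration values, NOT taken from PMSFD:
# (end-use sector, product, count in use, {material: intensity in t per unit})
products = [
    ("transportation", "passenger car", 1000, {"steel": 0.9, "aluminium": 0.1}),
    ("transportation", "truck", 200, {"steel": 4.0, "rubber": 0.2}),
    ("domestic appliances", "refrigerator", 5000, {"steel": 0.04, "plastic": 0.01}),
]

def sector_stocks(records):
    """Bottom-up sum per sector: count times intensity, kept per material."""
    stocks = defaultdict(lambda: defaultdict(float))
    for sector, _name, count, intensity in records:
        for material, per_unit in intensity.items():
            stocks[sector][material] += count * per_unit
    return stocks

def total_stocks(stocks):
    """Add the sector results together, again per material."""
    total = defaultdict(float)
    for by_material in stocks.values():
        for material, amount in by_material.items():
            total[material] += amount
    return dict(total)

by_sector = sector_stocks(products)
total = total_stocks(by_sector)
for material, tonnes in sorted(total.items()):
    print(f"{material}: {tonnes:.1f} t")
```

Keeping the result keyed by sector and material mirrors the way the database reports stocks by end-use sector and by material.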
The number of each product $N_i$, quantified on a per-household, absolute, or per-capita basis (Online-only Table 1), is extracted from various official statistical reports, yearbooks, and socio-economic databases (including https://data.stats.gov.cn and http://www.data.ac.cn) for each province during 1979-2019 (see ref. 19 for details). Owing to limited data availability for individual years, linear interpolation is used to estimate missing data between consecutive recorded values 32.
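As a rough illustration of the gap filling described above, the following sketch linearly interpolates missing years between two recorded values. The function name `interpolate_missing` and the example series are hypothetical, and the choice to leave leading and trailing gaps unfilled is an assumption of this sketch rather than a documented rule of the dataset.

```python
def interpolate_missing(series):
    """Fill None gaps that lie between two recorded values.

    series maps year -> value or None. Gaps before the first or after
    the last recorded value are left as None (assumption of this sketch).
    """
    years = sorted(series)
    known = [y for y in years if series[y] is not None]
    filled = dict(series)
    for left, right in zip(known, known[1:]):
        span = right - left
        for year in years:
            if left < year < right:
                weight = (year - left) / span
                filled[year] = series[left] + weight * (series[right] - series[left])
    return filled

# Hypothetical product counts with two missing years and one trailing gap.
counts = {2000: 100.0, 2001: None, 2002: None, 2003: 130.0, 2004: None}
filled = interpolate_missing(counts)
print(filled)
```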
The material intensities are drawn from various references and estimated based on expert judgement 10,20,23,24,30,33; they are also provided in our datasets.
The total material stocks $S(t)$ are calculated as the sum over the five end-use sectors at time $t$ (Eq. 2).
$$S(t) = \sum_{C} S_C(t) \qquad (2)$$
Residential and non-residential buildings (the latter comprising public and industrial buildings) are considered when calculating the material stocks in buildings. Unlike that of residential buildings, the floor space of non-residential buildings is not officially reported for each province. Instead, the floor space of public ($f_p$) and industrial ($f_i$) buildings in towns and townships in each province is recorded in the China Urban-Rural Construction Statistical Yearbook (http://www.mohurd.gov.cn/xytj/tjzljsxytjgb/jstjnj/index.html) 34. Hence, we estimate the material stocks in non-residential buildings by assuming that the ratio between non-residential and residential buildings in towns (or townships) applies to the urban (or rural) areas of each province (Eqs. 3-4).
where $R(t)$ is the ratio of the floor space of non-residential buildings to that of residential buildings in urban (or rural) areas at time $t$ [...]
In addition, because officially reported data for (large, medium, small, and micro) passenger cars and (heavy, medium, light, and micro) trucks are limited for each province before 2001, we use the proportions of the different sizes of passenger cars and trucks in 2002 to extrapolate their numbers during 1978-2001. Meanwhile, it is difficult to identify the number of machines used in diverse industries. We evaluate the metal stocks in industrial machinery by assuming that the number of industrial machines is directly proportional to power consumption, as recommended by Zhang et al. 29 and Liu et al. 32.
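The ratio assumption for non-residential buildings can be illustrated with a short sketch. Because Eqs. 3-4 are not reproduced here, the exact form below is an assumption: the ratio is taken as public plus industrial floor space divided by residential floor space in towns, and it is then applied to urban residential floor space. All numbers, the cement intensity, and the function names are hypothetical.

```python
def nonresidential_ratio(public_floor, industrial_floor, residential_floor):
    """Ratio of non-residential to residential floor space (assumed form)."""
    return (public_floor + industrial_floor) / residential_floor

def nonresidential_floor(ratio, residential_floor):
    """Apply a town-level ratio to residential floor space elsewhere."""
    return ratio * residential_floor

# Hypothetical town-level floor space in million m2.
r_town = nonresidential_ratio(public_floor=2.0, industrial_floor=3.0,
                              residential_floor=20.0)

# Hypothetical urban residential floor space in million m2.
urban_nonres_floor = nonresidential_floor(r_town, residential_floor=400.0)

# Hypothetical cement intensity in t per m2, used only for illustration.
cement_intensity = 0.2
urban_nonres_cement = urban_nonres_floor * cement_intensity  # million t

print(r_town, urban_nonres_floor, urban_nonres_cement)
```

The same pattern would apply to townships and rural areas, with the township ratio replacing the town ratio.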
Material flows. The material outflow at time $t$ is defined as the scrap of material $m$ generated from stocks in end-use sector $C$, and the material inflow at time $t$ is defined as the input of material $m$ to stocks. The dynamic stocks-drive-flows model is applied to estimate the material inflows and outflows during 1978-2018 at the sector scale. The annual outflows are determined from stocks using the lifetime model, and the annual inflows are determined from the mass balance of outflows and stock change (Eq. 5):
$$S(t) - S(t-1) = F_{in}(t) - F_{out}(t) \qquad (5)$$
The lifetime distribution expresses the probability that products in each end-use sector reach their end-of-life at time $t$.
According to previous studies 15,31, we assume a normally distributed lifetime $\lambda(t, t', L, \sigma)$ ($t' = 1949$) with an end-use-sector-dependent mean $L$ and standard deviation $\sigma$ (Eq. 6), which determines the outflow $F_{out}$ from stocks and the inflow $F_{in}$ (Eqs. 6-8). Since material quality requirements differ between buildings, cars, machines, laptops, and other products, lifetimes can vary greatly. However, because it is hard to obtain a specific lifetime for each product in different regions, we assume that the lifetime of products in the same end-use sector remains unchanged. The mean lifetime and standard deviation for each end-use sector are given in Table 1. According to Eqs. 6-8, when the net stock change $S(t) - S(t-1)$ is less than zero, $F_{in}(t)$ can be negative. To handle such negative inflows, we set $F_{in}(t)$ to zero, and $F_{out}(t)$ is then set equal to the magnitude of the net stock change.
When estimating the inflows and outflows from 1978 onwards, the stocks and the resulting inflows and outflows before 1978 must be taken into account. These approximations of initial stocks and flows are necessary to make the estimates for 1978-2018 more accurate 4. Hence, a spin-up period is implemented, beginning in 1949, when the People's Republic of China was founded. Because limited official data exist for each province before 1978, we use the power function revealed by previous studies 35 to extrapolate the material stocks during 1949-1977 (Eq. 9).
$$S_p(t_0) = K_{p1}\, t_0^{K_{p2}} \qquad (9)$$
where $S_p(t_0)$ is the material stock at time $t_0$ in province $p$ (during 1949-1977), and $K_{p1}$ and $K_{p2}$ are the coefficients of the fitted model for each province.
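The stocks-drive-flows procedure with a normal lifetime distribution, described earlier in this section, can be sketched as follows. This is a generic stock-driven formulation written for illustration, not the authors' code: the discrete use of the normal density at integer ages, the treatment of the first-year stock as an inflow, the example stock series, and the lifetime parameters are all assumptions. The handling of negative inflows follows the rule stated in the text.

```python
import math

def normal_lifetime_pdf(age, mean, sd):
    """Normal probability density of reaching end-of-life at a given age."""
    return math.exp(-((age - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

def stock_driven_flows(stocks, mean, sd):
    """Return (inflows, outflows) for an annual stock series.

    stocks[k] is the stock in year k, with k = 0 the first spin-up year.
    """
    n = len(stocks)
    inflows = [0.0] * n
    outflows = [0.0] * n
    inflows[0] = stocks[0]  # assumption: the initial stock enters in year 0
    for k in range(1, n):
        # Outflow: earlier inflow cohorts that retire in year k.
        out = sum(inflows[j] * normal_lifetime_pdf(k - j, mean, sd) for j in range(k))
        net_change = stocks[k] - stocks[k - 1]
        inflow = net_change + out  # mass balance
        if inflow < 0:
            # Rule from the text: negative inflow is set to zero and the
            # outflow equals the magnitude of the net stock change.
            inflow = 0.0
            out = -net_change
        inflows[k] = inflow
        outflows[k] = out
    return inflows, outflows

# Hypothetical stock series with one sharp decline to trigger the rule.
stocks = [10.0, 12.0, 15.0, 19.0, 18.0, 2.0, 25.0]
inflows, outflows = stock_driven_flows(stocks, mean=5.0, sd=1.5)
for year, (f_in, f_out) in enumerate(zip(inflows, outflows)):
    print(year, round(f_in, 3), round(f_out, 3))
```

Whichever branch is taken, the stock change in each year equals inflow minus outflow, so the series stays mass-balanced.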

Data records
The PMSFD datasets contain five core tables, each provided as an xlsx file, and are freely available through Figshare 36. [...] Table S1 of Supplementary Information). Average per-capita stocks amounted to 130.5 t in 2018, and the stocks of gravel were the highest (47.2 t/cap) (Fig. 2b,c; detailed results are available in Table S2 of Supplementary Information). The composition of material stocks has changed over the past 40 years: the proportions of sand and brick are decreasing, while the proportions of cement and steel are increasing (Fig. 2d,e). Provincially, a wide disparity exists, and five groups can be identified based on their stocks. A high level of stocks is found in Guangdong, Jiangsu, Shandong, Henan, Zhejiang, and Sichuan provinces (detailed results are available in Table S3 [...]
[...] scrap in buildings will reach 1.1 Gt/year in 2018, accounting for ~95% of the national total scrap (Fig. 5) (detailed results for provinces and end-use sectors are available in Table S7 of Supplementary Information).
Comparison with existing estimates. Our results, estimated with the bottom-up accounting method, can complement previous estimates based on the top-down accounting method and contribute important knowledge to the sustainable management of bulk materials at an unprecedented level of accuracy and resolution. We compare our stocks estimates with previous case studies at the national and provincial scales (Table 3). We find that our results are close to those of previous studies. Specifically, our results are approximately 30% lower than the top-down estimates 4,21,37. The difference can be largely explained by the fact that the bottom-up accounting method cannot capture all the products in use in society. Data limitations, especially at the provincial level, prevent us from collecting data for commercial buildings, high-speed trains, water and environmental infrastructure, machinery owned by small businesses, and appliances for commercial use, which may contain large amounts of materials. In addition, our results are generally higher than previous bottom-up estimates. The number of products identified in the different end-use sectors is the main reason for the differences. For example, previous studies identified fewer than 50 products in buildings and infrastructure 20,32, while we considered 103 products in five end-use sectors.
In terms of format, existing stock estimates present only the total material stocks of the whole country or the stocks of an individual material (e.g., steel, cement, or copper). Our datasets provide product-based stocks and flows for five end-use sectors and 13 materials, giving a detailed picture of material use and disposal for the nation as well as its 31 provinces. Our datasets can therefore serve as a more detailed supplement to existing stocks and flows estimates.
Limitations and future work. Unavoidably, the datasets bear uncertainties from data selection, handling, and operation.
• The first uncertainty is the material intensity. For each product, we estimate only the average material intensity for a specific period. However, material intensity appears to vary widely in most product categories 32. The large variations in material intensity among technological products of different sizes, weights, and functions should be considered in future work.
• The second uncertainty is the lifetime of products. Because product lifetime reflects lifestyle, its impact on material flows is significant 15,38. Owing to the lack of validated estimates of lifetime changes, especially at the provincial level, our study did not consider lifetime changes. Future work can focus on regional differences in product lifetime to reduce the uncertainties in estimation.
• The third uncertainty lies in the assumptions made when estimating the material stocks in non-residential buildings and industrial machinery. Although these assumptions are based on reliable official data and academic research, the accuracy of the results should be improved by cross-checking or enriching the input datasets in our future work.
Table 2. Sectoral level of material stocks inventory in Beijing in 2018 (in million tonnes).
Our future work will also focus on continuously updating the PMSFD by collecting newly published input data (the quantities of the 103 products in the five end-use sectors) based on the constructed data structures and the PMSF model. Meanwhile, based on the existing input datasets, we will estimate the stocks and flows of more materials (e.g., strategic and minor metals) to expand the PMSFD and provide robust support for the circular economy and sustainable development.

Usage Notes
Since the dataset format is clear and easy to understand, the continuous 41-year records of material stocks and flows can be used to track and analyze the metabolism of different materials at the provincial and national scales. The dataset can be used to assess material efficiency/productivity and criticality, and to evaluate environmental impacts in conjunction with information on production and recycling and with life cycle analysis models.
[...] Table S3 of the Supplementary Information for provincial material stocks in five end-use sectors. The five lower line charts show the total material stocks of provinces in groups I to V. See Table S4 of the Supplementary Information for provincial material stocks in the past 40 years.
The data files are provided as xlsx files, which can be readily read and processed by many software packages, such as Matlab, R, and Python. The stocks of the 13 materials can be divided into four types based on their properties: biomass (wood), fossil materials (plastics and asphalt), metals (steel, aluminium, and copper), and non-metallic minerals (gravel, bricks, sand, cement, lime, glass, and rubber). The stocks of biomass, fossil materials, and non-metallic minerals are concentrated in buildings, highways, passenger cars, and trucks. Owing to the dispersion of the data, we recommend the ARIMA (Autoregressive Integrated Moving Average) methodology for analyzing the dynamic evolution of material inflows and outflows. Furthermore, dataset users can evaluate the stocks of other materials, or the stocks of the same materials in additional end-use sectors, using the calculation models of this study.
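The four-type grouping described above lends itself to a simple aggregation step. The sketch below is illustrative only: the mapping follows the grouping stated in the text, while the function name `stocks_by_type` and the example stock values are hypothetical and are not taken from PMSFD.

```python
# Grouping of the 13 materials as stated in the text.
MATERIAL_TYPE = {
    "wood": "biomass",
    "plastic": "fossil materials",
    "asphalt": "fossil materials",
    "steel": "metals",
    "aluminium": "metals",
    "copper": "metals",
    "gravel": "non-metallic minerals",
    "brick": "non-metallic minerals",
    "sand": "non-metallic minerals",
    "cement": "non-metallic minerals",
    "lime": "non-metallic minerals",
    "glass": "non-metallic minerals",
    "rubber": "non-metallic minerals",
}

def stocks_by_type(stocks_by_material):
    """Sum per-material stocks into the four material types."""
    totals = {}
    for material, amount in stocks_by_material.items():
        group = MATERIAL_TYPE[material]
        totals[group] = totals.get(group, 0.0) + amount
    return totals

# Hypothetical per-material stocks (any consistent mass unit).
example = {"wood": 1.0, "steel": 4.0, "copper": 0.5,
           "cement": 6.0, "sand": 10.0, "asphalt": 2.0}
grouped = stocks_by_type(example)
print(grouped)
```

In practice the per-material values would come from the xlsx tables after loading them with a spreadsheet reader of the user's choice.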

Code availability
The original input data, namely the quantities of products in each province on a per-household, per-capita, or absolute basis, are stored as xlsx files and shared on Figshare (Input data.xlsx) 36. Data processing is performed in MATLAB (MatlabR2019), and the code for creating the provincial and national stocks and flows datasets is also stored on Figshare (code_calculation.m) 36. We share these datasets and scripts for data transparency and computational reproducibility, and to assist users in further exploration and development.
Table 3. Comparison of our stocks estimates with previous estimates. a Material stocks in buildings without considering glass, rubber, and cement. b Sand stocks in residential buildings and infrastructure. c Material stocks in residential buildings and infrastructure. d Stocks of steel, copper, and aluminium in the urban area of Beijing.