Abstract
Global production fragmentation generates indirect socioeconomic and environmental impacts throughout its expanded supply chains. The multiregional input-output (MRIO) model is a tool commonly used to trace supply chains and understand spillover effects across regions, but it often cannot be applied because of data unavailability, especially at the subnational level. Here, we present MRIO tables for 2012, 2015, and 2017 for 31 provinces of mainland China in 42 economic sectors. We employ hybrid methods to construct the MRIO tables according to the data available for each year. The dataset is a consistent collection of China MRIO tables that reveals the evolution of regional supply chains in China’s recent economic transition. It illustrates the evolution of China’s regional supply chains and economic structure before the 2018 US-Sino trade war. The dataset can be further applied as a benchmark in a wide range of in-depth studies of production and consumption structures across industries and regions.
Measurement(s)  multiregional input-output 
Technology Type(s)  partial survey 
Factor Type(s)  province • year 
Sample Characteristic  Location  China 
Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.15362565
Background & Summary
Following the 2008 financial recession, China entered an economic transition period (“New Normal”), during which growth shifted from the old investment-driven pattern to a new paradigm characterised by “high quality but lower growth”. The paradigm aimed to lift China’s position in global value chains by prioritising the development of high value-added manufacturing and services^{1,2,3,4,5}. The changes brought about by the economic transition were substantial, not only in the national economic structure but even more so in the interprovincial supply chains. Given China’s vast territory and the variations in socioeconomic development, tracing the evolution of cross-province supply chains is of great importance for regional development policymaking as well as for understanding the environmental spillover effects along the supply chains^{6,7}. From a global perspective, China’s economic transition from 2012–2017 is historically significant in that it occurred between the aftermath of the financial crisis and the 2018 US-Sino trade war. The period provides a benchmark for further studies of how the US-Sino trade war reshaped China’s economic structure and regional supply chains.
The multiregion input-output (MRIO) model is the dominant model used to quantify spillover effects through supply chains and regional heterogeneity^{8,9,10,11}. Many scholars have constructed MRIO tables in the past decade, especially at the international level. The most commonly used MRIO tables in the international scientific community are GTAP^{12,13}, WIOD^{8}, EORA^{14}, OECD-ICIO^{15} and EXIOBASE^{16}. In China, several provincial MRIO tables are already available (Table 1). Li et al. of the DRC (Development Research Centre) constructed consistent provincial MRIO tables every five years from 1997 to 2012^{17,18}; Zhang (2012) of the SIC (State Information Centre) built MRIO tables for eight regions based on the entropy model for 1997, 2002 and 2007^{19}; Liu of the IGSNRR (Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences) constructed provincial MRIO tables for 2007, 2010, and 2012 adjusted by geographic indicators^{20,21,22,23}; Mi of the CEADs (China Carbon Emissions Databases) group adopted the same method to construct an MRIO table for 30 provinces with 30 sectors^{24}; Zhang and colleagues of the RCFEDS (Research Centre on Fictitious Economy and Data Science) compiled an MRIO table for 30 provinces with the highest sector resolution of 60 sectors, but only for 2002^{25}. Despite these collective efforts, the datasets were compiled using different methods, assumptions, sector and region resolutions, and time coverage, resulting in a lack of coherent and consistent data and making them difficult to cross-reference^{26}. For example, IGSNRR adjusts the gravity model using parameters such as spatial weights and competitive coefficients. The spatial weights account for the effects of trade between neighbouring places on focal trade, while the competitive coefficients reduce the trade of the same product between two places; DRC adjusts the trade data of the original SRIO table by using Chinese customs datasets.
Such inconsistencies make time-series or trend analyses very difficult. Most importantly, all available datasets extend at most to 2012 and so cannot trace the evolution of interregional supply chains during the economic transition. For the post-2012 period, to our knowledge, only our previous work on carbon flows in 2015 has illustrated the interregional supply chains for 2015^{4}, while the MRIO table for 2017 has yet to be constructed. The lack of an updated dataset that reflects the evolving supply chains deeply undermines our understanding of the heterogeneous effects of China’s economic transition at the regional level. It is worth noting that the tables compiled in this study are intended only to bridge the gap left by the lack of recent MRIO tables, and that our study contributes to the understanding of China’s regional heterogeneity along with the work of other institutes.
To bridge the data gap, we present a collection of provincial MRIO tables for 2012, 2015, and 2017 that cover China’s economic transition period and were constructed using a consistent approach. We compiled MRIO tables for the three years, covering 42 sectors in all 31 provinces of mainland China (excluding Hong Kong, Macao and Taiwan), using the maximum entropy model. The MRIO table construction was based on official data, including provincial single-region IO tables (SRIO), economic data from provincial statistical yearbooks and China’s customs database. The unified format of the MRIO tables is compatible with China’s national SRIO table, with identical sector classification and five categories of final demand: rural household consumption, urban household consumption, government consumption, capital formation, and inventory changes. This dataset can be widely used both in economic analyses that identify the driving forces of regional growth in China’s economic transition and in environmental impact studies that focus on the interregional spillover effects along the supply chains.
Methods
China’s provincial MRIO tables for 2012, 2015, and 2017 were compiled using a partial survey approach, which combines official survey data and modelled outcomes^{6,27}. The partial survey approach allows MRIO table construction to be regarded as linking provincial SRIO tables with the trade matrix for each sector. Provincial SRIO tables are often available from surveyed data, while the sectoral trade matrices are unavailable and have to be modelled. However, two challenges must be addressed before compilation. First, because of the high cost of the ad hoc survey for input-output table construction, China’s official provincial SRIO tables are only released every five years, for years ending in 2 or 7 (e.g. 2012 and 2017 in this case). The SRIO table for years ending in 5 (e.g. 2015) is not compulsory. Hence, SRIO tables for 2012 and 2017 were available for all provinces whereas SRIO tables for 2015 were not. Thus, before SRIO tables could be linked to the trade matrix, provincial SRIO tables for 2015 had to be built. Second, provincial SRIO tables released by the National Statistics Bureau cannot be used directly because of inconsistent trade flows between provinces, which is the case for 2012 and 2017. For a given product, total domestic exports should be equal to total domestic imports within an economy, but officially released SRIO tables often fail to meet this condition. In this study, a cross-entropy model is thus employed to address these problems. The model follows the minimal cross-entropy principle (or Kullback-Leibler divergence) to minimise the entropy distance between the target and prior distributions^{28,29}. The outcome of the cross-entropy model ensures maximum similarity between the target and the known distribution.
Figure 1 illustrates the 5 steps involved in constructing provincial MRIO tables: (1) Estimation of domestic demand and supply; (2) Disaggregating demand and supply; (3) Adjustment of the provincial SRIO table; (4) Estimation of the interregional trade matrix; (5) Linking adjusted provincial SRIO tables with the trade matrix. Table 2 lists the raw data required for MRIO table construction. Because data availability differs between years, we distinguish two cases according to the data treatment process. Case 1 applies when provincial SRIO tables are available for all 31 provinces, as in 2012 and 2017, while Case 2 applies when the set of provincial SRIO tables is incomplete, as in 2015. All of China’s available provincial SRIO tables can be accessed on the websites of the 31 provincial statistics bureaus, including all 31 SRIO tables for 2012 and 2017, and a few SRIO tables for 2015. Output and value-added data by sector can be derived from provincial SRIO tables (for 2012 and 2017) or provincial statistical yearbooks (for 2015), but the yearbooks might not provide output for tertiary sectors or value-added for industrial sectors. In this case, we estimate the missing data by assuming that value-added and output share the same distribution. For example, we estimate value-added for industrial sectors by multiplying the distribution of their output by the total provincial value-added of industrial sectors. To be consistent with the national SRIO table, provincial output and value-added by sector, aggregated over all 31 provinces, are scaled to the national values from the national SRIO table. Specifically, output for tertiary sectors is not available in the yearbooks, but value-added for tertiary sectors is, so we use the structure of value-added across provinces to disaggregate the national output of the tertiary sectors (derived from the national IOT).
Similarly, value-added for industrial sectors is not available in the yearbooks, but output is. We therefore use the output structure across provinces to disaggregate the national value-added of industrial sectors (derived from the national IOT).
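The share-based disaggregation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `disaggregate_by_share`, the province labels and all numbers are hypothetical, and the only assumption carried over from the text is that the missing variable follows the same provincial distribution as the available one.

```python
def disaggregate_by_share(national_total, provincial_proxy):
    """Split a national total across provinces in proportion to a proxy variable."""
    proxy_sum = sum(provincial_proxy.values())
    if proxy_sum <= 0:
        raise ValueError("proxy values must sum to a positive number")
    return {
        province: national_total * value / proxy_sum
        for province, value in provincial_proxy.items()
    }


# Hypothetical value-added of one tertiary sector in three provinces.
value_added = {"province_a": 30.0, "province_b": 50.0, "province_c": 20.0}
# Hypothetical national output of the same sector from the national IO table.
national_output = 400.0

# Estimated provincial output; by construction it sums to the national output.
estimated_output = disaggregate_by_share(national_output, value_added)
```

The same helper covers the reverse situation (value-added missing, output known) by swapping the roles of the two variables.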
Provincial trade flows (domestic imports and domestic exports) are derived from the China customs database for 2015 or from the official provincial SRIO tables for 2012 and 2017. To estimate the trade matrix, the observed transport data and electricity transmission data were also obtained from national railway statistics and China’s electricity yearbook respectively.
Estimation of domestic demand and supply
The compilation starts with the estimation of supply and demand (Fig. 2). From the supply perspective, the supply of a given province can be classified by destination into self-supply, supply to other provinces and supply to other countries (exports). We estimate the domestic supply \({s}_{r}^{i}\) (sector i in province r supplied to all provinces, including itself) as total output (\({x}_{r}^{i}\)) minus exports (\(e{x}_{r}^{i}\)), as shown in Eq. 1:
\({s}_{r}^{i}={x}_{r}^{i}-e{x}_{r}^{i}\)  (1)
Where \({x}_{r}^{i}\) refers to the output of commodity i in province r; \(e{x}_{r}^{i}\) refers to the export of commodity i in province r. \({s}_{r}^{i}\) represents the domestic supply of commodity i for province r. The demand of a specific province can be classified by source into self-demand, demand from other provinces, and demand from other countries (imports). Similarly, we estimate the domestic demand \({d}_{r}^{i}\) of a province as its total demand minus imports, where the total demand is a function of intermediate demand (\({z}_{r}^{i}\)) and final demand (\({f}_{r}^{i}\)). In case 1, the total demand and imports are available from provincial SRIO tables, as shown in Eq. 2. In case 2, the total demand is not available because provincial SRIO tables are lacking for 2015. We estimated the total demand under two assumptions: (1) technical coefficients are identical between 2012 and 2015 (using the 2012 SRIO table as a proxy), which gives the intermediate demand (\({z}_{r}^{i\ast }\)); (2) the proportion of intermediate demand in total demand is identical between 2012 and 2015 (\({p}_{r}^{i\ast }\)). Where provincial SRIO tables were available for 2015, we used them instead of the 2012 proxy; this applies to Gansu, Anhui, Guangdong, Hunan and Chongqing. See below for case 1 Eq. (2) and case 2 Eq. (3):
If case 1:
If case 2:
Where the domestic demand \({d}_{r}^{i}\) refers to the demand for sector i in province r; \({a}_{2012,r}^{i}\) denotes the 2012 technical coefficients for sector i in province r; \({x}_{2015,r}^{i}\) is the 2015 output of sector i in province r. \({z}_{2012,r}^{i}\) and \(t{d}_{2012,r}^{i}\) are the intermediate demand and total demand for sector i in province r in 2012. \({z}_{r}^{i\ast }\) and \(t{d}_{r}^{i\ast }\) are the preliminary intermediate demand and total demand, where the asterisk (*) indicates a preliminary estimate. \(n{d}_{2015}^{i}\) is the national demand for sector i in 2015; \(i{m}_{r}^{i}\) is the import of sector i in province r. It is worth noting that the technical coefficients for 2012 were used because the sector classification in the 2012 tables is the same as that used in the 2015 tables, while a different classification is used in the 2017 tables (discussed in Table S1). However, choosing different technical coefficients can generate different estimates of total demand, which lead to different MRIO tables. Therefore, more investigation is needed into how total demand for each province can be estimated.
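The estimation logic for supply and demand can be sketched as follows. This is one possible reading of the verbal description, not the published Eqs. 2 and 3: the function names are hypothetical, the preliminary intermediate demand is assumed to be the 2012 coefficient row multiplied by 2015 outputs, the preliminary total demand is assumed to be that value divided by the 2012 intermediate share, and the scaling of provincial totals to the national demand is an assumption suggested by the presence of the national demand term in the definitions. All numbers are invented.

```python
def domestic_supply(output, foreign_export):
    """Domestic supply as described for Eq. 1: output minus exports."""
    return output - foreign_export


def domestic_demand_case1(total_demand, foreign_import):
    """Case 1: total demand read from the provincial SRIO table, minus imports."""
    return total_demand - foreign_import


def preliminary_total_demand(a_2012_row, x_2015, z_2012, td_2012):
    """Case 2 (assumed reading): z* = sum_j a_ij * x_j, then td* = z* / p*."""
    z_star = sum(a * x for a, x in zip(a_2012_row, x_2015))
    p_star = z_2012 / td_2012  # 2012 share of intermediate demand in total demand
    return z_star / p_star


def domestic_demand_case2(td_star, national_demand, foreign_import):
    """Scale preliminary totals to the national demand, then remove imports."""
    scale = national_demand / sum(td_star.values())
    return {r: td_star[r] * scale - foreign_import[r] for r in td_star}


# Hypothetical two-province example for a single product.
td_star = {
    "province_a": preliminary_total_demand([0.1, 0.2], [100.0, 200.0], 40.0, 80.0),
    "province_b": preliminary_total_demand([0.3, 0.3], [200.0, 300.0], 60.0, 120.0),
}
demand_2015 = domestic_demand_case2(
    td_star,
    national_demand=440.0,
    foreign_import={"province_a": 10.0, "province_b": 30.0},
)
```

Separating the preliminary estimate from the scaling step makes it easy to test how a different choice of proxy coefficients changes the resulting demand, which is the sensitivity the text flags.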
Disaggregating demand and supply
Once domestic supply and demand are established through the above step, we disaggregate them using the cross-entropy (CE) model, as shown in Fig. 3. The CE model, as mentioned above, is used to obtain the distribution that is closest to the prior information while satisfying the given constraints. For a given product or sector, several equations reflect the supply-demand balance and serve as constraints in the CE model: (1) the self-supply should be equal to the self-demand for the same province; (2) the row sum of SD and SO should be identical to the domestic supply S, and correspondingly the row sum of DD and DO should conform with the domestic demand D; (3) the column sum of SO for all provinces should be equal to the column sum of DO for all provinces, as all products shipped out are equal to all products received within a given boundary. Mathematically, this can be shown as:
Subject to:
(the distribution of all supply is equal to 1)
\(\sum _{i}\sum _{r}({{\rm{p}}}_{ir}^{{\rm{DD}}}+{{\rm{p}}}_{ir}^{{\rm{DO}}})=1;\)
(the distribution of all demand is equal to 1)
(the column sum of SO is equal to the sum of DO)
(the row sum of domestic supply is equal to the domestic supply by province)
(the column sum of domestic demand is equal to the domestic demand by province)
Where \({p}_{ir}\) is the distribution of supply and demand for sector i in province r; \({q}_{ir}\) is the prior distribution of supply and demand for sector i in province r. \({s}_{i}\) and \({d}_{i}\) are the aggregated domestic supply and demand for sector i. \({s}_{ir}\) and \({d}_{ir}\) indicate the domestic supply and demand for sector i in province r.
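For orientation, the cross-entropy problem described in words above can be written compactly as below. This is a sketch reconstructed from the verbal constraints and the symbol definitions, not the published equation: the superscripts SD, SO, DD and DO follow the block names used in the text, and both the summation over the four blocks and the normalisation by the total domestic supply and demand are assumptions.

```latex
\begin{aligned}
\min_{p}\quad & \sum_{k\in\{SD,SO,DD,DO\}}\sum_{i}\sum_{r} p_{ir}^{k}\,\ln\frac{p_{ir}^{k}}{q_{ir}^{k}} \\
\text{s.t.}\quad
& \sum_{i}\sum_{r}\left(p_{ir}^{SD}+p_{ir}^{SO}\right)=1, \qquad
  \sum_{i}\sum_{r}\left(p_{ir}^{DD}+p_{ir}^{DO}\right)=1, \\
& p_{ir}^{SD}=p_{ir}^{DD}, \qquad
  \sum_{r}p_{ir}^{SO}=\sum_{r}p_{ir}^{DO}, \\
& p_{ir}^{SD}+p_{ir}^{SO}=\frac{s_{ir}}{\sum_{i}s_{i}}, \qquad
  p_{ir}^{DD}+p_{ir}^{DO}=\frac{d_{ir}}{\sum_{i}d_{i}}.
\end{aligned}
```

The objective is the Kullback-Leibler divergence between the target distribution p and the prior q, so the solution departs from the prior only as far as the balance constraints require.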
Adjusting provincial single regional input-output table
The above steps readjust the domestic supply and demand to make sure that total domestic exports \(\left({\sum }_{r}\,s{o}_{r}\right)\) are equal to total domestic imports \(\left({\sum }_{r}\,d{o}_{r}\right)\) for any product. We then updated the intermediate demand (Z) and final demand (F) from the previous provincial SRIO tables, calibrated to the adjusted domestic exports and imports. We employed the generalised RAS (GRAS) model, a variant that allows for non-positive elements in the iterative matrix balancing^{30}. For a given SRIO table of province r, two conditions need to be met for the table to balance. By row, the row sum of intermediate and final demand should be equal to total output minus net export. By column, the column sum of intermediate demand should be equal to total output minus value-added. Mathematically:
Subject to:
(column constraint)
(row constraint)
Where \({q}_{r}^{ij}\) is the prior distribution containing the matrix of intermediate demand \({z}_{r}^{ij}\) and final demand \({f}_{r}^{i}\), which can be derived directly from the provincial SRIO table, or from a proxy if the SRIO table is not available, as in 2015. In that case, we assume identical technical coefficients between 2012 and 2015 and multiply them by the 2015 input to obtain a preliminary intermediate demand. For the final demand, we first calculate the aggregated final demand as GDP minus NE, and then multiply it by the final demand distribution of 2012 as the proxy estimate. \(n{e}_{r}^{i}\) represents the net export of product i for province r, which is equal to foreign export + domestic export − foreign import − domestic import by product. Foreign export and import are directly available from the provincial SRIO tables (for 2012 and 2017) or the customs dataset (for 2015). For 2012 and 2017, we used the trade data directly from the provincial SRIO tables, while the customs dataset was used to estimate provincial exports and imports by sector for 2015, as there are no provincial SRIO tables. \({p}_{r}^{ij}\) represents the ratio of the unknown distribution to the known prior distribution, which is the result of the GRAS; e is the base of the natural logarithm. \({X}_{r}^{j}\) represents the total input of product j for province r, while \({X}_{r}^{i}\) represents the total output of product i for province r.
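As a reminder of the problem being solved, the GRAS objective in its commonly cited form, together with the two balance conditions stated in words above, can be sketched as follows. This is written from the verbal description and the standard GRAS formulation rather than copied from the published equation; the value-added symbol \(v_{r}^{j}\) is a hypothetical name introduced here because the text does not define one.

```latex
\begin{aligned}
\min_{p}\quad & \sum_{i}\sum_{j}\left|q_{r}^{ij}\right|\,p_{r}^{ij}\,\ln\frac{p_{r}^{ij}}{e} \\
\text{s.t.}\quad
& \sum_{i} q_{r}^{ij}\,p_{r}^{ij}=X_{r}^{j}-v_{r}^{j}
  && \text{(column constraint, intermediate columns)} \\
& \sum_{j} q_{r}^{ij}\,p_{r}^{ij}=X_{r}^{i}-ne_{r}^{i}
  && \text{(row constraint, intermediate and final demand)}
\end{aligned}
```

The balanced entry is the product \(q_{r}^{ij}p_{r}^{ij}\), so a negative prior entry (for example, a negative inventory change) keeps its sign while its magnitude is adjusted.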
Intraregional matrix estimate
Equation 5 yields the updated provincial SRIO tables, which are compatible with the adjusted trade (e.g. domestic export and import) and national accounts (e.g. VA). However, the intermediate and final demands in the updated provincial SRIO table mix domestic and imported demands, a layout known as the competitive type, while the MRIO table requires separate matrices for domestic and imported goods. Thus, we convert the competitive table into a non-competitive table by assuming that the proportion of imports is identical in the intermediate and final demands of the SRIO table^{24,31}. Specifically, we introduce an indicator, the regional purchase coefficient (RPC), to measure the proportion of total demand supplied locally. The domestic intermediate and final demands can then be derived by multiplying the RPC by the intermediate and final demands in the updated SRIO table. Mathematically:
Similarly, we can apply the import purchase coefficient (IPC), analogous to the purchase coefficient, to derive the demands supplied from other provinces. Mathematically:
Where \({x}_{r}^{i}\) refers to the output of product i in province r; \(e{x}_{r}^{i}\) indicates the foreign export of product i in province r; \(s{o}_{r}^{i}\) refers to the supply of product i from province r to other provinces; \(i{m}_{r}^{i}\) represents product i imported from other countries to province r; \(d{o}_{r}^{i}\) represents product i required in province r from other provinces. \({z}_{rr}^{ij}\) and \({f}_{rr}^{i}\) are the intermediate and final demand matrices for domestic product i in province r. These matrices form the diagonal blocks of China’s provincial MRIO table for domestic intermediate and final demands, respectively. \(z{n}_{r}^{ij}\) and \(f{n}_{r}^{i}\) are the intermediate and final demand matrices for product i imported to province r.
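The split of a competitive-type row into locally supplied and imported parts can be illustrated with a short Python sketch. It assumes one common definition of the coefficients, in which total demand for a product equals local supply (output minus foreign export minus outflow to other provinces) plus foreign imports plus inflow from other provinces; the function names and numbers are hypothetical and the published equations may differ in detail.

```python
def purchase_coefficients(x, ex, so, im, do):
    """Return (RPC, IPC) for one product in one province.

    x: output, ex: foreign export, so: supply to other provinces,
    im: foreign import, do: demand met by other provinces.
    """
    local_supply = x - ex - so
    total_demand = local_supply + im + do
    if total_demand <= 0:
        raise ValueError("total demand must be positive")
    return local_supply / total_demand, im / total_demand


def split_row(row, rpc, ipc):
    """Apply the same ratios to every cell of a product row (identical-share assumption)."""
    domestic = [rpc * value for value in row]
    imported = [ipc * value for value in row]
    return domestic, imported


# Hypothetical product: 70 units supplied locally, 5 imported, 25 from other provinces.
rpc, ipc = purchase_coefficients(x=100.0, ex=10.0, so=20.0, im=5.0, do=25.0)
# Hypothetical row of the competitive table (demand for this product by two users).
z_local, z_imported = split_row([40.0, 60.0], rpc, ipc)
```

Because the ratios are applied row by row, every user of a product is assumed to draw on local, interprovincial and foreign sources in the same proportions.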
Interregional trade matrix estimate
To obtain the trade matrix, we apply the gravity model with the observable trade data between provinces, which improves the accuracy and reliability of the interregional estimates^{27,32,33}. The gravity model has been widely adopted in the construction of previous Chinese MRIO tables^{17,21}. It is worth noting that the standard gravity model requires sample trade data to estimate the parameters. When the sample data are unavailable, the doubly constrained gravity model can be chosen as a reliable alternative^{34}. The doubly constrained gravity model has also been used by IMPLAN to build a subnational trade matrix for the US^{35,36}. The model assumes that the trade between two regions is a function of supply, demand and the impedance of costs. The standard gravity model is as follows:
Where \({t}_{rs}^{i}\) is the trade flow for commodity i between province r and province s; \({e}_{{\rm{ro}}}^{i}\) and \({m}_{{\rm{os}}}^{i}\) are the supply (or domestic export) from province r and the demand (or domestic import) of province s, respectively. d_{rs} is the distance between the two provinces, which is the proxy for transportation costs. β_{1} and β_{2} represent the weights of the origin and destination provinces. γ refers to the friction parameter. With sample trade data, the unknown coefficients for each sector (β_{1}, β_{2}, γ) can be estimated using regression. For shippable commodities, we use the interregional commodity flows from the National Railway Statistical Data as the sample trade flows (\({t}_{rs}^{i}\)). As the railway statistics provide trade data for only 11 commodities, some sectors in the gravity model have to share the same coefficients (see Table S2); the mapping relationship is shown in the appendix. For non-shippable commodities (e.g. services and construction), we do not set transport costs and simply assume that they are evenly distributed based on supply and demand, as data are unavailable. For electricity transmission, we obtained an interregional electricity transmission matrix from the China Electricity Power Yearbook as electricity sample data^{37}. With the estimated coefficients, we can derive the initial trade matrix directly by Eq. 12.
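To make the role of the three coefficients concrete, the sketch below evaluates a gravity function of the usual multiplicative form, supply to the power β_1 times demand to the power β_2 divided by distance to the power γ. The functional form is the textbook one and is an assumption here, since the equation itself is not reproduced in this text; the function names, the optional `constant` term and all numbers are hypothetical.

```python
def gravity_flow(supply_r, demand_s, distance_rs, beta1, beta2, gamma, constant=1.0):
    """Unbalanced gravity estimate of the flow from province r to province s."""
    return constant * supply_r ** beta1 * demand_s ** beta2 / distance_rs ** gamma


def initial_trade_matrix(supply, demand, distance, beta1, beta2, gamma):
    """Build the initial (unbalanced) interprovincial trade matrix for one sector.

    Diagonal entries are set to zero because intra-provincial flows are
    handled separately by the regional purchase coefficient.
    """
    n = len(supply)
    return [
        [
            0.0 if r == s else gravity_flow(
                supply[r], demand[s], distance[r][s], beta1, beta2, gamma
            )
            for s in range(n)
        ]
        for r in range(n)
    ]


# Hypothetical three-province example with made-up coefficients.
supply = [100.0, 60.0, 40.0]
demand = [50.0, 25.0, 125.0]
distance = [[0.0, 4.0, 10.0], [4.0, 0.0, 5.0], [10.0, 5.0, 0.0]]
trade0 = initial_trade_matrix(supply, demand, distance, beta1=1.0, beta2=1.0, gamma=2.0)
```

Taking logarithms of the gravity function gives a linear relationship in β_1, β_2 and γ, which is why the coefficients can be estimated by ordinary regression on sample flows.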
However, the initial trade matrix does not satisfy the row and column constraints, namely the domestic exports and imports from the updated provincial SRIO tables. We therefore apply the RAS model to balance the trade matrix and make it consistent with the provincial SRIO tables.
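The RAS step can be illustrated with a minimal pure-Python sketch of biproportional scaling: rows are scaled to the domestic export targets, columns to the domestic import targets, and the two steps are repeated until both sets of margins are met. The function name `ras_balance`, the tolerance and the example numbers are hypothetical; a production implementation would typically use array libraries and handle infeasible zero patterns more carefully.

```python
def ras_balance(matrix, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Iteratively scale rows and columns of a non-negative matrix to target sums."""
    if abs(sum(row_targets) - sum(col_targets)) > 1e-6:
        raise ValueError("row and column targets must have the same total")
    balanced = [row[:] for row in matrix]
    n_cols = len(balanced[0])
    for _ in range(max_iter):
        # Row step: match domestic exports of each origin province.
        for i, row in enumerate(balanced):
            row_sum = sum(row)
            if row_sum > 0:
                balanced[i] = [v * row_targets[i] / row_sum for v in row]
        # Column step: match domestic imports of each destination province.
        for j in range(n_cols):
            col_sum = sum(row[j] for row in balanced)
            if col_sum > 0:
                factor = col_targets[j] / col_sum
                for row in balanced:
                    row[j] *= factor
        # Columns are exact after the column step, so only rows need checking.
        error = max(abs(sum(row) - t) for row, t in zip(balanced, row_targets))
        if error < tol:
            return balanced
    raise RuntimeError("RAS did not converge")


# Hypothetical initial trade matrix with an empty diagonal.
initial = [[0.0, 2.0, 1.0], [3.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
exports = [30.0, 40.0, 30.0]   # row targets (domestic exports)
imports = [35.0, 25.0, 40.0]   # column targets (domestic imports)
balanced_trade = ras_balance(initial, exports, imports)
```

Zero cells stay zero under RAS, so the empty diagonal of the initial matrix is preserved in the balanced result.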
Based on the balanced trade matrix, we calculate the proportion of total domestically imported products supplied from each province, defined as purchase proportion (RP), shown as:
Where \(r{p}_{rs}^{i}\) represents the share of province s’s domestic imports of product i that is supplied by province r; \({t}_{{\rm{jr}}}^{{\rm{i}}}\) refers to the trade from province j to province r for sector i. Therefore, the non-diagonal matrix in the MRIO table can be presented as:
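The purchase proportion can be sketched in Python as a column-wise normalisation of the balanced trade matrix. This follows the verbal definition (the share of a province's domestic imports that comes from each origin province); the function name, the indexing convention and the example numbers are assumptions, and the final allocation line is only an illustration of how an off-diagonal cell might be filled.

```python
def purchase_proportions(trade):
    """Share of destination s's domestic imports that originates in province r."""
    n = len(trade)
    inflow = [sum(trade[r][s] for r in range(n) if r != s) for s in range(n)]
    return [
        [
            trade[r][s] / inflow[s] if r != s and inflow[s] > 0 else 0.0
            for s in range(n)
        ]
        for r in range(n)
    ]


# Hypothetical balanced trade matrix for one product (rows: origin, columns: destination).
trade = [[0.0, 30.0, 10.0], [20.0, 0.0, 30.0], [20.0, 10.0, 0.0]]
rp = purchase_proportions(trade)

# Hypothetical demand of 8 units in province 0 met by other provinces;
# the part attributed to province 1 is its purchase proportion times that demand.
from_province_1 = rp[1][0] * 8.0
```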
Together, the diagonal matrix, which contains intermediate demand (\({z}_{rr}^{ij}\)) and final demand (\({f}_{rr}^{i}\)), and the non-diagonal matrix, which contains intermediate demand (\({z}_{rs}^{ij}\)) and final demand (\({f}_{rs}^{i}\)), make up the provincial MRIO table.
Data Records
Provincial MRIO tables illustrate the regional economic structure and interregional supply chains for 31 provinces with 42 sectors and cover China’s economic transition period for 2012, 2015 and 2017. The layout follows the standard MRIO table (Fig. 4). For each year, the MRIO table contains an intermediate matrix (1302*1302) for the 42 sectors in 31 provinces. The final demand of each province consists of 5 categories: rural household consumption, urban household consumption, government consumption, fixed capital formation, and changes in inventories. The final demand matrix therefore has dimensions 1302*155 for each year. In addition, foreign export is a 1302*1 vector measuring the exports of all 42 sectors in 31 provinces, while import is a 1*1302 vector indicating the imports from other countries used by all 42 sectors in 31 provinces. Value-added includes compensation of employees, net taxes on production, depreciation of fixed capital and operating surplus, giving a 4*1302 matrix that represents the four categories of value-added for 31 provinces and 42 sectors. Notably, the national sector classification changed slightly in 2017. Table S1 compares the sector classifications for 2012, 2015 and 2017. “Other manufacturing” and “Comprehensive use of waste resources” in the 2012 & 2015 classifications are combined as “Other manufacturing and waste resources” in the 2017 classification (highlighted in bold). “Scientific research and polytechnic services” in the 2012 & 2015 classifications is separated into “scientific research” and “polytechnic services” in the 2017 classification (highlighted in bold). The MRIO tables can be freely downloaded from the China Emission Accounts and Datasets (CEADs, www.ceads.net) and Zenodo^{38}. The MRIO table constructed in this paper is only at the province level, but it can be nested into global MRIO tables for global-scale analysis (for technical details, see Jiang et al.^{39}).
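The dimensions quoted above follow directly from the number of provinces, sectors and final demand categories. The Python helper below shows one way to address cells in such a layout; it assumes zero-based indices and a province-major ordering (all sectors of the first province, then all sectors of the second, and so on), which should be checked against the downloaded files. The function names are hypothetical.

```python
N_PROVINCES = 31
N_SECTORS = 42
N_FINAL_DEMAND_CATEGORIES = 5


def flat_index(province, sector):
    """Row or column position of (province, sector) in the 1302*1302 matrix."""
    if not (0 <= province < N_PROVINCES and 0 <= sector < N_SECTORS):
        raise IndexError("province or sector out of range")
    return province * N_SECTORS + sector


def final_demand_column(province, category):
    """Column position of (province, category) in the 1302*155 final demand block."""
    if not (0 <= province < N_PROVINCES and 0 <= category < N_FINAL_DEMAND_CATEGORIES):
        raise IndexError("province or category out of range")
    return province * N_FINAL_DEMAND_CATEGORIES + category


matrix_size = N_PROVINCES * N_SECTORS                        # 1302
final_demand_width = N_PROVINCES * N_FINAL_DEMAND_CATEGORIES  # 155
```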
Technical Validation
Comparison with other MRIO datasets for the 2012 table
Because the most recent provincial MRIO datasets available are for 2012, we compared our MRIO table (MRIO-CEADs) for 2012 with the two other most widely adopted MRIO tables: the provincial MRIO table compiled by DRC (MRIO-DRC) and the provincial MRIO table compiled by CAS (MRIO-CAS). Our previously constructed table (see Table 1) is not included in this comparison because it comprises only 30 sectors for 30 provinces. The format of MRIO-DRC and MRIO-CAS (42 sectors for 31 provinces) is compatible with our MRIO table (MRIO-CEADs).
Following previous work on MRIO table comparison^{40}, three indicators are employed in the comparison. Specifically, we calculate the mean absolute deviation (MAD), the Isard-Romanoff similarity index (DSIM) and the absolute entropy distance (AED). These indicators measure the similarity between matrices. MAD measures the absolute distance between each element in the two matrices; DSIM uses the relative distance instead of the absolute distance in MAD; AED is based on information theory and refers to the entropy loss between two matrices. It calculates the absolute entropy difference between two matrices: the more similar the two matrices, the closer AED is to zero. Here, we compare the intermediate demand matrix, which represents how each sector’s production requires the production of other sectors. Mathematically:
where
In the above, \({z}_{ij}^{CEADs}\) refers to the intermediate demand for sector j from sector i of MRIO-CEADs; \({z}_{ij}^{Counterpart}\) denotes the intermediate demand for sector j from sector i of the counterpart MRIO tables: MRIO-CAS or MRIO-DRC.
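The three indicators can be sketched in Python as follows. The definitions used here are assumptions based on the verbal descriptions: MAD is taken as the mean absolute cell difference, DSIM as the mean of the absolute difference divided by the sum of absolute values (cells that are zero in both matrices contribute zero), and AED as the absolute difference between the Shannon entropies of the two matrices after normalising each to sum to one. The published formulas, including any scaling such as percentages, may differ.

```python
import math


def _cells(a, b):
    """Pair up corresponding cells of two equally shaped matrices."""
    return [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]


def mad(a, b):
    """Mean absolute deviation between two matrices."""
    cells = _cells(a, b)
    return sum(abs(x - y) for x, y in cells) / len(cells)


def dsim(a, b):
    """Relative-distance index; cells that are zero in both matrices add nothing."""
    cells = _cells(a, b)
    total = 0.0
    for x, y in cells:
        denominator = abs(x) + abs(y)
        if denominator > 0:
            total += abs(x - y) / denominator
    return total / len(cells)


def shannon_entropy(matrix):
    """Entropy of a non-negative matrix normalised to sum to one."""
    grand_total = sum(v for row in matrix for v in row)
    shares = [v / grand_total for row in matrix for v in row if v > 0]
    return -sum(p * math.log(p) for p in shares)


def aed(a, b):
    """Absolute entropy distance (assumed definition); zero for identical matrices."""
    return abs(shannon_entropy(a) - shannon_entropy(b))


# Hypothetical 2*2 intermediate demand matrices.
z_ceads = [[1.0, 2.0], [3.0, 4.0]]
z_counterpart = [[2.0, 2.0], [3.0, 3.0]]
```

All three functions return zero when a matrix is compared with itself, which matches the statement that a smaller value indicates greater similarity.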
Table 3 shows the results for the three indicators in the comparison between MRIO-CEADs and MRIO-DRC, and between MRIO-CEADs and MRIO-CAS. In terms of MAD, MRIO-CEADs is more similar to MRIO-CAS, with a slightly smaller absolute distance on average (0.8 versus 0.9). In our MRIO table, 17 provinces are more similar to their counterparts in MRIO-CAS, while 14 provinces are more similar to their counterparts in MRIO-DRC. Given that MAD gives more weight to large numbers, it might be more sensitive for rich regions, which have larger transactions. Our results accordingly show that indicators for rich regions such as Beijing, Shanghai, Jiangsu, Zhejiang and Guangdong are more similar to CAS. However, DSIM, which measures the relative distance, indicates higher similarity with MRIO-DRC, where 22 provinces are closer to MRIO-DRC, despite the small gap between the two counterparts on average (27.4 with CAS versus 25.4 with DRC). AED compares the matrices in terms of information (or entropy) loss. It indicates slightly greater similarity with MRIO-DRC, with an average entropy loss of 13% versus 15% with CAS. Overall, the three indicators suggest that our MRIO table lies between the two counterpart MRIO tables, although for some provinces all three indicators point in the same direction. For example, Chongqing in MRIO-CEADs is similar to MRIO-CAS, while Hubei in MRIO-CEADs is similar to MRIO-DRC.
We then compare the province-wise proportion of domestic intermediate input to total input, the proportion of domestic final demand to total output, and the value-added embodied in final demand (Fig. 5). The results show that MRIO-CEADs is generally similar to the other two tables, although for some provinces the differences are more pronounced. For the proportion of domestic intermediate demand to total input, the biggest gap is found in Shanghai, where the proportion is 6% lower than in MRIO-DRC but 4% higher than in MRIO-CAS. The comparison of standard deviations (SD) between our MRIO table and the other two tables shows that our MRIO table is more similar to MRIO-DRC in intermediate demand, by a tiny margin. For domestic final demand in total output, MRIO-CEADs is more similar to MRIO-DRC as a general trend. The biggest gap is found in Tibet, where the figure is 13% higher than in MRIO-CAS but only 4% higher than in MRIO-DRC. In terms of SD, our MRIO table deviates more from MRIO-CAS. As for the value-added embodied in final demand, all MRIO tables produce similar outcomes, which might indicate that the main deviation occurs in sectors with smaller value-added in final demand, such as agriculture and mining.
Comparison with the national SRIO table
Given that no MRIO tables for 2015 and 2017 have been compiled by other institutes, we validate our MRIO tables for 2015 and 2017 by comparing them with the official national SRIO tables (Table 4). The MRIO table and SRIO table cannot be compared directly, as the national SRIO table is compiled in a competitive-import style, where imports are included in the intermediate and final demands. In contrast, the MRIO table is in a non-competitive style, in that intermediate and final demands only show domestically supplied demand. In the MRIO table compilation, sectoral output, value-added, export, and import in the national IO table are used to calibrate the provincial raw data. Therefore, these national accounting data in the MRIO table are identical to those in the national SRIO table, and thus the total intermediate input (total input minus value-added) is identical between the MRIO and SRIO tables. We then transform the competitive-import national SRIO table into a non-competitive type, assuming an identical import ratio in both intermediate and final demands (Eq. 14). After removing the imported parts, we compare the domestic intermediate input of the national SRIO table with that of the MRIO table. The results show that most sectors are within ±5% in all three years, except for a few sectors. In 2012, Processing of petroleum, coking, processing of nuclear fuel (S11), Comprehensive use of waste resources (S23), and production and distribution of gas (S26) are outliers, being 30% higher than in the SRIO table. In 2015 and 2017, the most significant deviation is found in production and distribution of gas (S26 in 2015 and S25 in 2017). It is worth noting that these sectors in China are highly related to imports. The reason behind the uncertainty is the assumption that the import ratio is identical when transforming the competitive table into the non-competitive table. The accuracy of the ratio therefore matters more for sectors with higher imports.
The ratio can be adjusted if more data become available to improve the model. For 2012, other datasets can also be compared with the domestic input from the SRIO table, but the deviation is far higher. For example, Mining and washing of coal (S2) is 57% higher in MRIO-DRC and 62% higher in MRIO-CAS than in the SRIO table. The main reason for the deviation is that MRIO-DRC and MRIO-CAS are compiled based on the provincial SRIO tables without being calibrated to the national one. The aggregation of provincial data is not entirely equal to the national data, as provincial data are compiled by provincial statistics agencies while national data are compiled by the national statistics bureau^{41}.
Code availability
The programs used in the data generation are based on MATLAB and GAMS. The associated code can be found in the Zenodo repository^{38}.
References
Mi, Z. et al. Chinese CO_{2} emission flows have reversed since the global financial crisis. Nat. Commun. 8, 1712 (2017).
Meng, J. et al. The role of intermediate trade in the change of carbon flows within China. Energy Econ. 76, 303–312 (2018).
Hilton, I. & Kerr, O. The Paris Agreement: China’s ‘New Normal’ role in international climate negotiations. Clim. Policy 17, 48–58 (2017).
Zheng, H. et al. Regional determinants of China’s consumption-based emissions in the economic transition. Environ. Res. Lett. 15, 074001 (2020).
Zhang, Z. et al. Production Globalization Makes China’s Exports Cleaner. One Earth 2, 468–478 (2020).
Zheng, H. et al. Linking city-level input–output table to urban energy footprint: Construction framework and application. J. Ind. Ecol. 23, 781–795 (2019).
Zheng, H. et al. Entropy-based Chinese City-level MRIO table Framework. Econ. Syst. Res. https://doi.org/10.1080/09535314.2021.1932764 (2021).
Dietzenbacher, E., Los, B., Stehrer, R., Timmer, M. & de Vries, G. The Construction of World Input–Output Tables in the WIOD Project. Econ. Syst. Res. 25, 71–98 (2013).
Miller, R. E. & Blair, P. D. Input–Output Analysis Foundations and Extensions 2nd edn. (Cambridge University Press, 2009).
Wang, Y. An industrial ecology virtual framework for policy making in China. Econ. Syst. Res. 29, 252–274 (2017).
Huang, Q. et al. Heterogeneity of consumption-based carbon emissions and driving forces in Indian states. Adv. Appl. Energy 100039 (2021).
Aguiar, A., Chepeliev, M., Corong, E. L., McDougall, R. & van der Mensbrugghe, D. The GTAP Data Base: Version 10. J. Glob. Econ. Anal. 4 (2019).
Andrew, R. M. & Peters, G. P. A Multiregion Input-output Table Based on the Global Trade Analysis Project Database (GTAP-MRIO). Econ. Syst. Res. 25, 99–121 (2013).
Lenzen, M., Moran, D., Kanemoto, K. & Geschke, A. Building Eora: A Multiregion Input–Output Database at High Country and Sector Resolution. Econ. Syst. Res. 25, 20–49 (2013).
Yamano, N. & Webb, C. Future Development of the Inter-Country Input-Output (ICIO) Database for Global Value Chain (GVC) and Environmental Analyses. J. Ind. Ecol. 22, 487–488 (2018).
Stadler, K. et al. EXIOBASE 3: Developing a Time Series of Detailed Environmentally Extended Multi-Regional Input-Output Tables. J. Ind. Ecol. 22, 502–515 (2018).
Pan, C. et al. Structural Changes in Provincial Emission Transfers within China. Environ. Sci. Technol. 52, 12958–12967 (2018).
Li, S., He, J. & Pan, C. Extended Chinese Regional Input-Output Table for 2012: Construction and Application. (Economic Science Press, 2018).
Zhang, Y. The Methodology and Compilation of China Multiregional Input-output Model. Stat. Res. 29, 3–9 (2012).
Liu, W., Tang, Z. & Han, M. The 2012 China Multi-Regional Input-Output Table of 31 Provincial Units. (China Statistical Press, 2018).
Liu, W., Li, X., Liu, H., Tang, Z. & Guan, D. Estimating interregional trade flows in China: A sector-specific statistical model. J. Geogr. Sci. 25, 1247–1263 (2015).
Liu, W., Tang, Z., Chen, J. & Yang, B. China’s interregional input-output tables between 30 provinces in 2010. (China Statistics Press, 2014).
Liu, W. et al. Theories and practice of constructing China’s interregional input output tables between 30 provinces in 2007. (China Statistics Press, 2012).
Mi, Z. et al. Data descriptor: A multiregional input-output table mapping China’s economic outputs and interdependencies in 2012. Sci. Data 5, 180155 (2018).
Zhang, Z., Shi, M. & Zhao, Z. The Compilation of China’s Interregional Input-Output Model 2002. Econ. Syst. Res. 27, 238–256 (2015).
Qu, S., Yang, Y., Wang, Z., Zou, J.P. & Xu, M. Great Divergence Exists in Chinese Provincial Trade-Related CO_{2} Emission Accounts. Environ. Sci. Technol. 54, 8527–8538 (2020).
Jahn, M. Extending the FLQ formula: a location quotient-based interregional input–output framework. Reg. Stud. 51, 1518–1529 (2017).
Többen, J. On the simultaneous estimation of physical and monetary commodity flows. Econ. Syst. Res. 29, 1–24 (2017).
Batty, M. Reilly’s Challenge: New Laws of Retail Gravitation Which Define Systems of Central Places. Environ. Plan. A Econ. Sp. 10, 185–219 (1978).
Junius, T. & Oosterhaven, J. The solution of updating or regionalizing a matrix with both positive and negative entries. Econ. Syst. Res. 15, 87–96 (2003).
Zheng, H. et al. Mapping Carbon and Water Networks in the North China Urban Agglomeration. One Earth 1, 126–137 (2019).
Yamada, M. Construction of a multiregional input-output table for Nagoya metropolitan area, Japan. J. Econ. Struct. 4, 11 (2015).
Nakano, S. & Nishimura, K. A non-survey multiregional input-output estimation allowing cross-hauling: Partitioning two regions into three or more parts. Ann. Reg. Sci. 50, 935–951 (2013).
Cai, M. Doubly constrained gravity models for interregional trade estimation. Pap. Reg. Sci. 100, 455–474 (2021).
Lindall, S., Olson, D. & Alward, G. Deriving Multi-Regional Models Using the IMPLAN National Trade Flows Model. J. Reg. Anal. Policy 36, 76–83 (2005).
IMPLAN. IMPLAN’s Gravity Model and Trade Flow RPCs. https://blog.implan.com/estimatingregionspecificforeigntraderates (2018).
Wei, W. et al. A 2015 inventory of embodied carbon emissions for Chinese power transmission infrastructure projects. Sci. Data 7, 318 (2020).
Zheng, H. Chinese Provincial Multi-Regional Input-Output database for 2012, 2015, and 2017. Zenodo https://doi.org/10.5281/zenodo.5079345 (2021).
Jiang, M. et al. Improving Subnational Input–Output Analyses Using Regional Trade Data: A Case-Study and Comparison. Environ. Sci. Technol. 54, 12732–12741 (2020).
Steen-Olsen, K. et al. Accounting for value added embodied in trade and consumption: an intercomparison of global multiregional input–output databases. Econ. Syst. Res. 28 (2016).
Zheng, H. et al. How modifications of China’s energy data affect carbon mitigation targets. Energy Policy 116, 337–343 (2018).
Acknowledgements
We acknowledge support from the National Natural Science Foundation of China (41921005, 71904125, 72173133), the Ministry of Science and Technology International Cooperation Project (2020ICR103), the UK Natural Environment Research Council (NE/P019900/1, NE/V002414/1), and the Norwegian Research Council (287690/F20).
Author information
Contributions
H.Z. and D.G. designed the study. H.Z. led the project and conducted modelling. H.Z., Y.B. and J.M. wrote and revised the manuscript. H.Z., W.W., Z.Z. and M.S. collected raw data.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.
About this article
Cite this article
Zheng, H., Bai, Y., Wei, W. et al. Chinese provincial multiregional inputoutput database for 2012, 2015, and 2017. Sci Data 8, 244 (2021). https://doi.org/10.1038/s41597021010235
DOI: https://doi.org/10.1038/s41597021010235