Introduction

Although the PM2.5 concentration in China's atmosphere has declined steadily and significantly in recent years, the average atmospheric PM2.5 concentration in China’s cities (30 μg/m3 in 2021) is still six times the latest World Health Organization standard (5 μg/m3), and days with PM2.5 as the primary pollutant account for the highest proportion of all days with substandard air quality (39.7%, see https://www.mee.gov.cn/hjzl/sthjzk/zghjzkgb). In addition, the Global Air Quality Report 2021 released by IQAir shows that the 15 cities with the highest atmospheric PM2.5 concentrations in East Asia are all located in China (https://www.iqair.com/us/blog/press-releases/WAQR_2021_PR). These data suggest that PM2.5 remains one of the biggest threats to urban air quality in China.

To alleviate the air pollution problems represented by PM2.5, China has continuously strengthened the tax system on emissions of PM2.5 precursors such as SO2, NOx,1 etc. Before 2018, the pollutant discharge fee system in mainland China effectively played the role of an environmental protection tax system2,3; it was formally transformed into an environmental protection tax system from 2018 onward4.

The tax on SO2 emissions has always been one of the main components of environmental protection taxes (discharge fees). In this paper, the discharge fee and the environmental protection tax paid for SO2 emissions are collectively referred to as the SO2 emission tax to avoid confusion. Since 2007, China's environmental protection tax system has undergone several rounds of reform, and the SO2 emission tax is the component that has experienced the earliest, most frequent, and largest reforms5, whose primary orientation has been to raise the tax rate. However, it remains to be tested whether a relatively higher SO2 emission tax rate can simultaneously reduce PM2.5 pollution in both the local area and adjacent areas. This is the core issue of this paper.

Some studies have tested the pollution reduction effect of the environmental protection tax system from different perspectives. Blackman finds that the sewage discharge fees in Colombia effectively reduced pollution6, and Wang et al. find that the reform from sewage charge to environmental protection tax effectively reduced pollutant emissions7. Guo et al.5 use a quasi-natural experiment to confirm that China's pilot sewage charge reform in 2007 significantly reduced the emission intensity of SO2.

Many previous studies regard PM2.5 as the key policy objective or policy evaluation scale of environmental protection tax (sewage charge), and pay attention to the relationship between PM2.5 and policy strength. For example, Ye and An find that the carbon tax significantly improves the air quality in countries with high concentrations of PM2.5 in their quasi-natural experiment8; Xu et al.9 and Hu et al.10 both find through mathematical model analysis that higher tax rates can simultaneously reduce greenhouse gas and PM2.5 emissions; Rith et al.11 find that imposing higher tax rates on gasoline and vehicles would help reduce pollutant emissions and public health risks in Manila. Han and Li12 find that the environmental protection tax rate and tax revenue for air pollutants significantly affected the concentration of PM2.5 in China; Chien et al.13 find that environmental taxes in the United States have played an important role in reducing haze pollution such as PM2.5.

As an improvement of the above studies, this paper establishes a quasi-natural experimental framework based on the spatial Difference-in-Differences (Spatial-DID) model and tests the direct effect (local effect) and indirect effect (spatial spillover effect) of SO2 emission tax rate reform on PM2.5. Compared with previous studies, the main marginal contributions of this paper are:

First, although many previous studies have regarded PM2.5 as the goal and evaluation criterion of the environmental protection tax (discharge fee) policy7,8,13, they usually pay less attention to the spatial spillover effect of PM2.5. Using panel data from 285 Chinese cities (2004–2019), this paper conducts a more detailed test of the spatial spillover effect of the SO2 emission tax policy. The deficiencies revealed by these tests can also inform future improvements to the relevant policies and systems.

Second, the Difference-in-Differences (DID) design in this paper treats the adoption, in some cities, of SO2 emission tax rates higher than the legal minimum standard (and therefore higher than in the other cities, which always follow the newest legal minimum tax rate, see Fig. 1) as the policy reform of interest. Compared with most DID studies in the same field4,5,8,14, this design helps to identify the historical changes and regional differences in the tax rate accurately.

Figure 1
Adjustment of SO2 emission tax rate in China. Note: The basic map resources come from Institute of Geographic Sciences and Natural Resources Research (IGSNRR).

The rest of this paper is structured as follows: section “Mechanisms and hypotheses” analyzes the theoretical mechanism and puts forward the hypotheses, section “Methodology and data” introduces the methodology and data, section “Results and discussions” reports and discusses the results, and section “Conclusions” concludes.

Mechanisms and hypotheses

According to the pollution haven hypothesis, a higher SO2 emission tax rate may force some highly polluting production activities to relocate (through the relocation of production departments, outsourcing, etc.) and thus increase pollution in surrounding areas15,16. This spatial spillover effect and its mechanism still need to be tested, so the main hypothesis to be tested in this paper is:

H1: Adopting SO2 emission tax rates higher than the national legal minimum standard (and therefore higher than in other regions that always follow the minimum standard) increases the atmospheric PM2.5 concentration in the surrounding cities (i.e. the spatial spillover effect is positive).

Because higher SO2 emission tax rates can cause high-pollution industrial firms to migrate across regions16, this paper also tests two sub-hypotheses to explain the mechanism of the spatial spillover effect (if hypothesis H1 is confirmed):

H1-a: Adopting SO2 emission tax rates higher than the national legal minimum standard increases the atmospheric PM2.5 concentration of the surrounding cities by increasing their industrial SO2 emission intensity.

H1-b: Adopting SO2 emission tax rates higher than the national legal minimum standard increases the atmospheric PM2.5 concentration of the surrounding cities by promoting their accumulation of industrial production factors.

Methodology and data

Empirical research design

Spatial-DID model

This paper establishes a Spatial-DID model based on the spatial Durbin model (SDM), and then judges whether it should degenerate into a spatial lag model (SLM) or a spatial error model (SEM)17,18 according to the test methods provided by Elhorst19. The expression of the base-line Spatial-DID model in this paper is:

$$\begin{aligned} PM_{i,t} & = \rho_{0} \sum\limits_{j = 1}^{n} {W_{ij} PM_{j,t} } + \alpha_{1} POST_{i,t} \times FSDIOX_{i,t} + \rho_{1} \sum\limits_{j = 1}^{n} {W_{ij} (POST_{j,t} \times FSDIOX_{j,t} )} \\ & \;\;\quad + X_{i,t}^{{}} \beta + \sum\limits_{j = 1}^{n} {W_{ij} X_{j,t} } \theta + f_{i} + \mu_{t} + \varepsilon_{i,t} \\ \end{aligned}$$
(1)

In Eq. (1), the interpreted (dependent) variable \(PM_{i,t}\) is the atmospheric PM2.5 concentration of sample city i in period t, and \(\sum\limits_{j = 1}^{n} {W_{ij} PM_{j,t} }\) is the spatial lagged term of the interpreted variable. The subscript t represents the period (year), while the subscripts i and j are the sequence numbers used to distinguish different cities. The letter n represents the total number of sample cities, and \(W_{ij}\) is the element in the i-th row and j-th column of the row-standardized spatial weight matrix W. Before row standardization, W is defined as:

$$W_{ij} = \begin{cases} 1/d_{ij}, & i \ne j \\ 0, & i = j \end{cases} \qquad i, j = 1, 2, \ldots, n$$
(2)
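The construction in Eq. (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not define \(d_{ij}\), so it is taken here to be the great-circle distance between city centroids, and the function names and coordinates are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def inverse_distance_weights(coords):
    """Build W as in Eq. (2), then row-standardize so that each row sums to 1.

    Assumes no two cities share the same coordinates (d_ij > 0 for i != j).
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / haversine_km(*coords[i], *coords[j])
    for i in range(n):
        row_sum = sum(w[i])
        if row_sum > 0:
            w[i] = [x / row_sum for x in w[i]]
    return w

# Hypothetical (latitude, longitude) centroids, for illustration only.
coords = [(39.90, 116.40), (31.23, 121.47), (23.13, 113.26)]
W = inverse_distance_weights(coords)
```

After row standardization, each row of `W` gives the relative weight of every other city, with nearer cities weighted more heavily.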

\(FSDIOX_{i,t}\) in Eq. (1) is the SO2 emission tax rate, and \(POST_{i,t}\) is a dummy variable for the adjustment of the SO2 emission tax rate (0 before adjustment and 1 after adjustment; it is defined in detail in section “Data for main explanatory variable (DID variable)” below). Their product (\(POST_{i,t} \times FSDIOX_{i,t}\)) is the DID term20 of the Spatial-DID model, and \(\sum\limits_{j = 1}^{n} {W_{ij} (POST_{j,t} \times FSDIOX_{j,t} )}\) is the spatial lagged term of the DID variable. The terms \(f_{i}\) and \(\mu_{t}\) stand for the individual fixed effect and the time fixed effect, respectively, and \(\varepsilon_{i,t}\) is an independent and identically distributed random disturbance term. The k-dimensional row vector \(X_{i,t}\) contains all control variables, and \(\sum\limits_{j = 1}^{n} {W_{ij} X_{j,t} }\) contains their spatial lagged terms.

In this paper, the scripts provided by Elhorst19 are used to estimate Eq. (1). To address endogeneity and estimation bias, the bias-corrected quasi-maximum likelihood estimator (BC-QMLE) provided by Lee and Yu21 is selected as the estimation method for Eq. (1).

Based on the estimated parameters, this paper calculates the impact of the DID variable on local PM2.5 concentrations (direct effect c) and the spatial spillover effect on PM2.5 concentrations in surrounding areas (indirect effect cw). In the following, the policy effects are judged based on these two types of effects19,22.
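As an illustration of how direct and indirect effects are commonly derived from spatial Durbin estimates, the sketch below uses the standard summary measures from the spatial econometrics literature: the matrix of partial derivatives \((I - \rho_{0} W)^{-1}(\alpha_{1} I + \rho_{1} W)\), its average diagonal element (direct effect), and its average off-diagonal row sum (indirect effect). It is not the authors' script. The function names are hypothetical, and the parameter values are made up rather than taken from Table 2.

```python
def mat_inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def sdm_effects(w, rho0, alpha1, rho1):
    """Return (direct, indirect) effects of the DID variable in a spatial Durbin model."""
    n = len(w)
    a = [[(1.0 if i == j else 0.0) - rho0 * w[i][j] for j in range(n)] for i in range(n)]
    b = [[(alpha1 if i == j else 0.0) + rho1 * w[i][j] for j in range(n)] for i in range(n)]
    inv = mat_inverse(a)
    # s[i][j] = marginal effect of the DID variable in city j on PM in city i
    s = [[sum(inv[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    direct = sum(s[i][i] for i in range(n)) / n
    total = sum(sum(row) for row in s) / n
    return direct, total - direct

# Toy row-standardized weights for three cities and hypothetical parameters.
W3 = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
direct, indirect = sdm_effects(W3, rho0=0.5, alpha1=-1.0, rho1=2.0)
```

With these made-up inputs the direct effect is about -0.4 and the indirect effect about 2.4, which shows how a negative local effect and a positive spillover can coexist in one model.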

Data and variable selection

Data for interpreted variable (PM)

The interpreted variable of the Spatial-DID model is the annual average PM2.5 concentration (PM). Its data source is the raster data of global PM2.5 concentration estimated by ACAG (Atmospheric Composition Analysis Group) using satellite observation data (see Fig. A.1 in the appendices23).

Data for main explanatory variable (DID variable)

The main explanatory variable of the Spatial-DID model in this paper is the DID term POST × FSDIOX in Eq. (1), in which the policy meaning of the dummy variable POST is a major innovation of this paper. From 2007, by order of the State Council of China, some pilot cities began to gradually raise the tax rate on SO2 emissions from 0.63 Yuan/Kg (the rate in force since 2004) to higher levels (ultimately no less than 1.26 Yuan/Kg). In September 2014, China extended the legal minimum SO2 emission tax rate of 1.26 Yuan/Kg to all regions. Then, on January 1st, 2018, China began to convert the pollutant discharge fee system into an environmental protection tax system following the principle of "tax shifting"4. After 2018, at least 19 provinces in mainland China directly determined their environmental protection tax rates according to their previous discharge fee rates (see Fig. 1).

It is noteworthy that, after several reforms, only some cities have adopted a tax rate higher than the legal minimum standard, while the rest simply follow the legal minimum tax rate in each period (see Fig. 1). Therefore, if the first tax rate adjustment in each city is simply regarded as a quasi-natural experiment4,5,8,14, the treatment group and the control group may become indistinguishable, and most of the historical changes and regional heterogeneity in the SO2 emission tax rate will be ignored.

Therefore, in this paper, POST = 1 means the tax rates of SO2 emissions in the sample cities (206 cities in total) have been adjusted to be higher than the legal minimum standard (POST = 0 when this reform has not been launched). Other cities that always follow the statutory minimum tax rate are included in the control group (POST ≡0).

Moreover, this paper further combines the continuous value of the actual tax rate to construct a DID variable POST × FSDIOX20,24, in which FSDIOX is defined as the real tax rate on SO2 emissions (Yuan/Kg, which is deflated by using the price index of the province where the city is located).

Of course, considering the lagging nature of the policy effects, the above-mentioned DID variable has been adjusted to its 1-period lagged term.
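A minimal Python sketch of this variable construction follows. It assumes a provincial price index with base 100, and the record layout, function name, and numbers are hypothetical: the nominal rate is deflated to obtain FSDIOX, multiplied by POST, and then lagged by one period within each city.

```python
def build_lagged_did(records):
    """Return {(city, year): one-period lag of POST x FSDIOX}.

    records: dicts with keys city, year, post, nominal_rate, price_index.
    A city's first observed year has no lag and is therefore omitted.
    """
    current = {}
    for r in records:
        # FSDIOX: real tax rate, deflated by the price index (assumed base = 100)
        real_rate = r["nominal_rate"] / (r["price_index"] / 100.0)
        current[(r["city"], r["year"])] = r["post"] * real_rate
    lagged = {}
    for (city, year) in current:
        prev = (city, year - 1)
        if prev in current:
            lagged[(city, year)] = current[prev]
    return lagged

# Hypothetical records for one city, for illustration only.
records = [
    {"city": "A", "year": 2006, "post": 0, "nominal_rate": 0.63, "price_index": 100.0},
    {"city": "A", "year": 2007, "post": 1, "nominal_rate": 1.26, "price_index": 105.0},
    {"city": "A", "year": 2008, "post": 1, "nominal_rate": 1.26, "price_index": 110.0},
]
did = build_lagged_did(records)
```

In this toy example the 2008 value of the lagged DID variable is the 2007 real rate (1.26 / 1.05), while the 2007 value is 0 because the reform had not yet started in 2006.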

Data for control variables

The first category of control variables is the population density (POPUD), economic development (real GDP per capita, GDPPC), and technological progress (R&D personnel as a percentage of total urban employment, RDEMP) selected with reference to the STIRPAT model (Stochastic impacts by Regression on Population, Affluence, and Technology)25. Since economic development may have nonlinear effects26, the squared term of the variable GDPPC is also included in Eq. (1).

The second category is the factors related to infrastructure and energy. The specific indicators used in this paper are per capita urban road area (ROAD, to reflect the improvement of urban infrastructure) and gas popularization rate (GASR, to simultaneously reflect the popularity of clean energy and the perfection of energy infrastructure).

The third category is natural environment and climate factors. Specifically, the Normalized Difference Vegetation Index (NDVI) is taken from the data of the Institute of Geographic Sciences and Natural Resources Research (IGSNRR, http://www.resdc.cn/), Chinese Academy of Sciences (CAS). Temperature (TEMP) is taken from the MERRA-2 raster dataset of the Global Modeling and Assimilation Office (GMAO)27. In addition, the data on atmospheric pressure (PRSD), wind speed (WIN), and air humidity (RHU) are extracted from the Dataset of Daily Climate Data from Chinese Surface Stations, provided by the National Meteorological Information Center of China (http://data.cma.cn).

The data of the variables GDPPC, POPUD, RDEMP, ROAD, and GASR come from The Yearbook of China's Cities or The Yearbook of China's Urban Construction. These yearbooks directly provide the above data at the prefecture level, from which the data of 285 cities at or above the prefecture level are extracted into the panel data of this paper (China has 337 such cities in total, of which 52 are excluded from the sample because of missing data in their series).

The original data of the variables PM, NDVI, and TEMP are geographic rasters. The original data of PRSD, WIN, and RHU are monitoring station data, which this paper transforms into raster form using the Kriging interpolation method. Subsequently, this paper calculates the annual mean of the rasters within the boundaries of the 285 sample cities to convert all of the above geographic rasters into panel data.

Finally, this paper unifies the data of the above variables into the panel data of 285 sample cities, and the data period is 2004–2019. The descriptive statistics of all the above data are reported in Table 1.

Table 1 Descriptive statistics.

Results and discussions

Empirical results

Parallel trend test

Referring to Beck31, the parallel trend test model of Spatial-DID model is defined as:

$$\begin{aligned} PM_{i,t} & = \rho_{0} \sum\limits_{j = 1}^{n} {W_{ij} PM_{j,t} } + \sum\limits_{{k = 1 - T_{b} }}^{5} {\alpha_{k} T_{i,t}^{ - k} } + \sum\limits_{{s = 1 + T_{b} }}^{10} {\alpha_{s} T_{i,t}^{ + s} } + \sum\limits_{{k = 1 - T_{b} }}^{5} {\rho_{k} \sum\limits_{j = 1}^{n} {W_{ij} T_{j,t}^{ - k} } } + \sum\limits_{{s = 1 + T_{b} }}^{10} {\rho_{s} \sum\limits_{j = 1}^{n} {W_{ij} T_{j,t}^{ + s} } } \\ & \quad + X_{i,t} \beta + \sum\limits_{j = 1}^{n} {W_{ij} X_{j,t} } \theta + f_{i} + \mu_{t} + \varepsilon_{i,t} \end{aligned}$$
(3)

In Eq. (3), suppose for example that the policy reform in city i is launched in 2009. If t = 2009 − k (k ∈ Z+ and k ≤ p) or t = 2009 + s (s ∈ Z+ and s ≤ l), then \(T_{i,t}^{ - k}\) (or \(T_{i,t}^{ + s}\)) equals 1; otherwise \(T_{i,t}^{ - k}\) (or \(T_{i,t}^{ + s}\)) equals 0. According to the time span of the panel data, this paper sets the maximum values of k and s (p and l) to 5 and 10, respectively, and selects the first period before the policy launch (period −1) as the base period (Tb) of the parallel trend test.
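The event-time dummies can be illustrated with a short Python sketch. It follows the text literally: pre-reform dummies for k = 2, …, 5 (period −1 is the omitted base) and post-reform dummies for s = 1, …, 10. The function and its parameter names are hypothetical. Because the sum limits in Eq. (3) could also be read as starting the post-reform index at the reform year itself, the starting value is exposed as `min_post`. How observations outside the event window are treated is not stated in the text, so they simply receive zeros here.

```python
def event_dummies(year, reform_year, max_pre=5, max_post=10, base_pre=1, min_post=1):
    """Return {label: 0 or 1} event-time dummies for one city-year.

    reform_year is None for never-treated cities, whose dummies are all 0.
    """
    # One dummy per pre-reform period except the omitted base period.
    dummies = {f"T-{k}": 0 for k in range(1, max_pre + 1) if k != base_pre}
    # One dummy per post-reform period, starting at min_post.
    dummies.update({f"T+{s}": 0 for s in range(min_post, max_post + 1)})
    if reform_year is None:
        return dummies
    rel = year - reform_year  # event time relative to the reform launch
    key = f"T-{-rel}" if rel < 0 else f"T+{rel}"
    if key in dummies:  # base period and out-of-window years stay all-zero
        dummies[key] = 1
    return dummies

# A city that launched the reform in 2009, observed in 2007 (two periods before).
example = event_dummies(2007, 2009)
```

Here `example["T-2"]` is 1 and every other dummy is 0, while the same city observed in 2008 (the base period) gets zeros throughout.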

Referring to Jia et al.20, the test results of the parallel trend hypothesis are indicated by both the direct and indirect effects (95% confidence intervals are also reported, see Fig. 2). It can be seen that before the base period (the period before the policy reform31), the direct and indirect effects of \(T_{i,t}^{ - k}\) never deviate significantly from 0, indicating that the policy has not yet taken effect. After the base period, however, the two types of effects of \(T_{i,t}^{ + s}\) begin to deviate significantly from 0: the direct effect is mainly negative and the indirect effect is mainly positive. These results support the parallel trend hypothesis.

Figure 2
Parallel trend test results.

Selective test of spatial econometric models

Spatial econometric analysis presupposes spatial autocorrelation in the explained variable. The results of the spatial autocorrelation tests based on the global Moran's I test and the Moran scatter plot (reported in Appendix A2) show that PM2.5 concentration has significant positive spatial autocorrelation (p-value < 0.05). Therefore, to control for the spatial spillover effect of PM2.5 (cross-boundary transmission of pollution), it is necessary to select a spatial econometric model as the empirical test tool.
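For readers unfamiliar with the statistic, the global Moran's I can be computed as in the Python sketch below. It implements the textbook formula \(I = \frac{n}{S_{0}} \cdot \frac{\sum_{i}\sum_{j} W_{ij}(x_{i} - \bar{x})(x_{j} - \bar{x})}{\sum_{i}(x_{i} - \bar{x})^{2}}\), where \(S_{0}\) is the sum of all weights. The toy weights and values are hypothetical, and the significance test reported in the appendix is not reproduced here.

```python
def morans_i(values, w):
    """Global Moran's I for a list of values and a spatial weight matrix w."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)  # sum of all weights
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Four hypothetical cities on a line; neighbours share a border (row-standardized).
W_line = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
clustered = morans_i([10.0, 12.0, 30.0, 32.0], W_line)    # similar values adjoin
alternating = morans_i([10.0, 30.0, 12.0, 32.0], W_line)  # dissimilar values adjoin
```

The clustered pattern yields a positive statistic and the alternating pattern a negative one, which is the sense in which a positive Moran's I indicates that high-PM2.5 cities tend to neighbour other high-PM2.5 cities.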

The empirical results of this base-line model (Model A) are reported in Table 2. The spatial lag coefficient of the interpreted variable (PM2.5 concentration) is significantly positive (0.722, p-value < 0.01), which also confirms the significant positive spatial autocorrelation of PM2.5 concentration, meaning that Model A has identified and extracted the cross-boundary impact of surrounding areas on local PM2.5 pollution.

Table 2 Estimation results.

The Wald and LR test statistics for the spatial error and spatial lag specifications in Table 2 show the model selection test results. All Wald and LR test statistics are significant at the 1% level, indicating that the null hypotheses that the SDM should degenerate into the spatial lag model (SLM) or the spatial error model (SEM) are both rejected. Therefore, the SDM is selected as the modeling basis of the Spatial-DID model in this paper.
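For reference, the restrictions behind these tests can be written in the notation of Eq. (1). This is the standard formulation from the spatial panel literature rather than something stated explicitly in this paper, and the symbols \(\ln L_{\mathrm{SDM}}\), \(\ln L_{R}\) (log-likelihoods of the SDM and of the restricted model) and q (number of restrictions) are introduced here only for illustration:

$$\begin{aligned} H_{0}^{\mathrm{SLM}} &: \rho_{1} = 0,\; \theta = 0 \\ H_{0}^{\mathrm{SEM}} &: \rho_{1} + \rho_{0} \alpha_{1} = 0,\; \theta + \rho_{0} \beta = 0 \\ LR &= 2\left( \ln L_{\mathrm{SDM}} - \ln L_{R} \right) \sim \chi^{2}(q) \end{aligned}$$

Under each null hypothesis the LR statistic is asymptotically chi-squared, so rejecting both nulls favours keeping the spatially lagged explanatory variables of the SDM.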

Effects of main explanatory variable

Table 2 shows that the SO2 emission tax policy reform produces a significant negative direct effect on the PM2.5 concentration in the local atmosphere (p-value < 0.01) and a significantly positive indirect (spatial spillover) effect (p-value < 0.01) on the atmospheric PM2.5 concentration of surrounding areas, which is consistent with the parallel trend test results (Fig. 2). Therefore, hypothesis H1 is confirmed.

Placebo test

The placebo test checks whether the empirical results are driven by unobserved factors. It randomly assigns false treatment groups and false time points of policy reform to construct a false DID variable, and then re-estimates the Spatial-DID model and its effects with this false DID variable. This paper repeated the test 1000 times and obtained the distribution of direct and indirect effects (see Fig. 3). Fig. 3 shows that the direct and indirect effects of the 1000 placebo tests are concentrated around 0, i.e. the placebo effects are close to null, and are clearly different from the effect of the real DID variable (grey vertical line). This shows that the empirical results of this paper are not caused by unobserved factors20, but truly identify and measure the spatial spillover effects brought about by the policy reform.
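The random assignment step of such a placebo test might look like the Python sketch below. It is only an illustration: the number of sample cities (285), treated cities (206), sample years, and repetitions (1000) come from the text, whereas the function names, the rule for drawing the fake reform year, and the seed are assumptions. The re-estimation of the Spatial-DID model for each draw is not shown.

```python
import random

def placebo_assignment(cities, years, n_treated, rng):
    """Randomly pick fake treated cities and a fake reform year for each of them."""
    fake_treated = rng.sample(cities, n_treated)
    # Assumption: skip the first sample year so every fake-treated city
    # has at least one pre-reform period.
    return {c: rng.choice(years[1:]) for c in fake_treated}

def fake_post(city, year, assignment):
    """0-1 placebo POST dummy for one city-year."""
    return int(city in assignment and year >= assignment[city])

rng = random.Random(0)  # fixed seed so the draws are reproducible
cities = [f"city_{i}" for i in range(285)]
years = list(range(2004, 2020))
draws = [placebo_assignment(cities, years, 206, rng) for _ in range(1000)]
```

Each element of `draws` defines one false DID variable. Estimating the model once per draw and collecting the resulting direct and indirect effects would give distributions like those in Fig. 3.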

Figure 3
Placebo test results.

Robustness tests

In addition to the base-line model (Model A), this paper also establishes a total of 10 auxiliary models, of which Models B to F are robustness test models and Models G to K are heterogeneity analysis models. The estimation results and discussion of these models are presented in sections “Robustness tests” and “Heterogeneity analysis”.

The purpose of Models B to F in this subsection is to judge whether the empirical results of Model A are robust and credible. If the models still yield empirical conclusions that are generally consistent with Model A after some partial modifications, the empirical results of Model A are robust. In addition, within this series of robustness tests, propensity score matching (Model B) and instrumental variables (Model C) are used to determine whether the empirical conclusions are affected by selection bias and endogeneity, respectively, thus further supporting the validity of Model A.

Propensity score matching

To avoid the selection bias that arises when the treatment group samples do not satisfy the assumption of random selection20,32, this paper performs propensity score matching (PSM) on the panel data year by year based on all control variables (reported in Table A.4 of the appendices). The Spatial-DID model using the matched data is called the PSM-Spatial-DID model (Model B), in which the number of sample cities drops from 285 to 246. Table 2 shows that the direction (sign) of the marginal effects (P-values < 0.01) of the DID variable in Model B is still consistent with the base-line model (Model A), indicating that the empirical results in this paper are not significantly affected by selection bias.
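The matching step can be illustrated with a simple nearest-neighbour rule. The paper does not state which matching algorithm or caliper it uses, so the one-to-one matching with replacement, the caliper of 0.05, the function name, and the scores below are all assumptions made for illustration. Estimating the propensity scores themselves (for example with a logit model on the control variables) is not shown.

```python
def nearest_neighbor_match(treated, control, caliper=0.05):
    """Match each treated city to the control city with the closest propensity score.

    treated, control: {city: propensity score}. Matching is with replacement;
    treated cities whose nearest control lies outside the caliper stay unmatched.
    """
    matches = {}
    for t_city, t_score in treated.items():
        best = min(control, key=lambda c: abs(control[c] - t_score), default=None)
        if best is not None and abs(control[best] - t_score) <= caliper:
            matches[t_city] = best
    return matches

# Hypothetical propensity scores for a single year.
treated = {"A": 0.62, "B": 0.90}
control = {"X": 0.60, "Y": 0.30}
matches = nearest_neighbor_match(treated, control)
```

In this toy example city A is matched to X, while B has no control within the caliper and would be dropped, which is the kind of trimming that reduces the sample in a matched model.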

Instrumental variable approach

Referring to Shehata33 and Vega and Elhorst34, this paper treats the spatial lagged interpreted variable (W × PM), the DID variable (POST × FSDIOX), and its spatial lagged term as endogenous variables, and uses instrumental variables (IV) and two-stage least squares (2SLS) to overcome endogeneity, thus extending the base-line Spatial-DID model to the Spatial-DID-2SLS model (Model C). The IV of the DID variable is the sewage treatment capacity of the sample cities (WDCAPA). This IV is chosen because cities with larger sewage treatment capacity usually tend to make up for the large investment in pollution control by increasing the tax (fee) rate on SO2 emissions. In addition, following Shehata33, the second- to fourth-order spatial lagged terms of the DID variable are selected as IVs of the spatial lagged interpreted variable; following Vega and Elhorst34 and Shao et al.35, the first- to third-order spatial lagged terms of the main IV (WDCAPA) are used as IVs of the spatial lagged DID variable.

It can be seen from Table 2 and Table A.5 (in the appendices) that the direct and indirect effects of Model C are not substantially different from the base-line model (Model A), indicating that endogeneity cannot significantly affect the effectiveness of the Spatial-DID model.

Replace the spatial weight matrix

An alternative spatial weight matrix is \(W_{e} = W \times E\), where matrix E is defined as:

$$E_{ij} = \begin{cases} \dfrac{1}{GDPPC_{i} - GDPPC_{j}}, & i \ne j \\ 0, & i = j \end{cases}$$
(4)

In the above formula, GDPPC is the average real GDP per capita of the sample cities from 2004 to 2019. The estimated results of the Spatial-DID model using \(W_{e}\) as the spatial weight matrix (Model D) are reported in Table 2. There are only limited differences between the parameter estimates and effects of Model A and Model D, which reflects the robustness of the empirical results.

Replace the interpreted variable

By comparing the base-line model with models in which the interpreted variable is replaced, this paper further tests the robustness and credibility of the empirical conclusions. Two alternative interpreted variables are used. The first is the PM2.5 atmospheric concentration (PMCH) from the China High Air Pollutants (CHAP) dataset36, measured in μg/m3. Its original data come from simulations based on China's ground monitoring data and satellite remote sensing images using artificial intelligence methods. The second alternative interpreted variable is the PM2.5 emission flux (PME), which directly reflects the density of PM2.5 (Kg/m2) generated and discharged into the atmosphere by each sample city. Its data source is the Hemispheric Transport of Air Pollution (HTAP) dataset. The models using these two interpreted variables are Model E and Model F, respectively. Table 2 shows that the empirical results of Model E and Model F are not fundamentally different from those of Model A, which means that neither changing the source of the PM2.5 atmospheric concentration data (to a data source from China) nor considering only the local emission of PM2.5 essentially changes the empirical results of this paper.

Heterogeneity analysis

Referring to Yu and Zhang (2021), this paper further adds the interaction term of city heterogeneity variable and DID term to Eq. (1) to separate and identify the impact of regional characteristics or the interference of other policies. The following heterogeneity variables are selected in this paper:

(a) Eastern cities (EAST), a 0–1 dummy variable that distinguishes eastern China from other regions.

(b) City level (CLEVEL), which represents the cities’ administrative levels with values between 1 and 3.

(c) Pollutant emission rights trading pilot cities (PET), a 0–1 dummy variable representing the launch of trading.

(d) Key regions dummy variable (PCAP) of the Air Pollution Prevention and Control Action Plan (APPCAP), which represents the launch of the APPCAP in the key regions. Here, the key regions are the three major areas that are the focus of the APPCAP: the Beijing-Tianjin-Hebei region, the Yangtze River Delta, and the Pearl River Delta (http://www.gov.cn/zhengce/content/2013-09/13/content_4561.htm).

(e) NOx emission tax rate reform (REFNF). NOx is also a main precursor of PM2.5, and during 2004–2019 the NOx emission tax rates were also adjusted in many regions of China. REFNF is defined analogously to the DID term of the SO2 emission tax rate reform in Eq. (1) (POST × FSDIOX): it equals the variable POSTN (a 0–1 dummy variable representing the adoption of NOx emission tax rates higher than the national statutory minimum standard) multiplied by the real NOx emission tax rate FNOX.

The results of the heterogeneity analysis are reported in Table 2. The direct effect of the interaction term POST × FSDIOX × EAST is not significant, but its indirect effect is significantly negative (P-value < 0.05), indicating that the policy reform has a relatively weaker PM2.5 pollution aggravation effect on the surrounding areas of eastern cities, which is similar to the conclusion of Jia et al.20. The interaction term POST × FSDIOX × CLEVEL produces both negative direct effects and negative indirect effects (P-values ≤ 0.05), indicating that the policy reform in cities of higher administrative level produces a larger local PM2.5 pollution suppression effect and a weaker surrounding PM2.5 pollution aggravation effect. This may be because China's higher administrative level cities are usually subject to stricter environmental supervision1.

At the same time, the heterogeneity analysis also shows the synergy between different environmental regulation policies. Table 2 shows that the interaction terms POST × FSDIOX × PET (in Model I) and POST × FSDIOX × REFNF (in Model K) both produce negative indirect effects (P-values ≤ 0.1), which shows that pollutant emission rights trading and the NOx emission tax rate reform can bring beneficial spatial spillover effects when combined with the reform of SO2 emission tax rates. The interaction term POST × FSDIOX × PCAP (in Model J) also produces a negative but insignificant indirect effect, which may be due to the late start (after 2013) and gradual implementation of the APPCAP (see http://www.gov.cn/zhengce/content/2013-09/13/content_4561.htm).

Moreover, Table 2 shows that the interaction with the above policies does not fundamentally change the spatial spillover effect of the SO2 emission tax rate reform (the indirect effects of POST × FSDIOX in Table 2).

Mechanism analysis (mediation effects)

In this paper, the following mediation effect analysis method is used to test the hypotheses (H1-a and H1-b) involving the generation mechanism of the spatial spillover effects:

First, the following model is used to estimate the impact of the DID variable on the mediator \(M_{i,t}\):

$$\begin{aligned} M_{i,t} & = \rho_{0} \sum\limits_{j = 1}^{n} {W_{ij} M_{j,t} } + \alpha_{1} POST_{i,t} \times FSDIOX_{i,t} + \rho_{1} \sum\limits_{j = 1}^{n} {W_{ij} (POST_{j,t} \times FSDIOX_{j,t} )} \\ & \quad + X_{i,t} \beta + \sum\limits_{j = 1}^{n} {W_{ij} X_{j,t} } \theta + f_{i} + \mu_{t} + \varepsilon_{i,t} \end{aligned}$$
(5)

According to the parameter estimates of Eq. (5), the direct effect (effect a) and indirect effect (i.e. spatial spillover effect aw) of the DID variable on the mediator \(M_{i,t}\) can be calculated. Then, the mediator \(M_{i,t}\) is added to the following equation:

$$\begin{aligned} PM_{i,t} & = \rho_{0} \sum\limits_{j = 1}^{n} {W_{ij} PM_{j,t} } + \alpha_{0} M_{i,t} + \alpha_{1} POST_{i,t} \times FSDIOX_{i,t} + \rho_{2} \sum\limits_{j = 1}^{n} {W_{ij} M_{j,t} } \\ & \quad + \rho_{1} \sum\limits_{j = 1}^{n} {W_{ij} (POST_{j,t} \times FSDIOX_{j,t} )} + X_{i,t} \beta + \sum\limits_{j = 1}^{n} {W_{ij} X_{j,t} } \theta + f_{i} + \mu_{t} + \varepsilon_{i,t} \end{aligned}$$
(6)

According to the parameters of Eq. (6), the direct effect (effect b) of the mediator \(M_{i,t}\) on PM2.5 can be calculated19. After identifying and stripping out the mediation mechanism contained in the direct effect (consisting of effects a and b) and the mediation mechanism contained in the indirect effect (consisting of effects aw and b), the actual direct effect (effect c′) and the actual spatial spillover effect (effect cw′) can be measured. Based on these direct and indirect effects (rather than the parameters20), the mediation effects are analyzed following the logic and criteria of Baron and Kenny37, MacKinnon et al.38, etc.

According to the hypothesis in section “Mechanisms and hypotheses”, this paper uses two categories of mediators: the first is the SO2 emission intensity (SDIOXINT), which is indicated by the proportion of SO2 emissions to industrial outputs, and its time span is 2004–2017. The second category contains two types of industrial production factors: (a) INDEMP, which means the proportion of industrial labor in the city to total employment; (b) INDFA, which means the real industrial fixed assets per employee.

It can be seen from the right-hand panel of Fig. 4 that the policy reform greatly increases the industrial SO2 emission intensity of surrounding cities (effect aw, P-value ≤ 0.01), which in turn increases the PM2.5 pollution in surrounding cities through effect b. After separating out this mediation effect, the actual indirect effect cw′ (4.303, P-value ≤ 0.05) of the policy reform on the PM2.5 concentration of surrounding cities is smaller than the original effect cw (4.505) (in Model A), indicating that the policy reform results in more serious surrounding PM2.5 pollution by increasing the intensity of industrial SO2 emissions in surrounding cities. Based on these results, hypothesis H1-a is confirmed.

Figure 4

The mediation effects of industrial SO2 emission intensity. Notes: (a) The t-statistics are marked in parentheses. *, **, and *** denote significance at 10%, 5%, and 1% levels, respectively. (b) Some mediation effect studies39 refer to effect c′ as a direct effect and to effect a × b as an indirect effect; readers should note that these definitions differ from those used in this paper.

Fig. 5 likewise shows that the SO2 emission tax policy reform has a positive effect (effect aw) on the variables INDEMP and INDFA in the surrounding cities (P-values ≤ 0.05), and that both variables further increase the concentration of PM2.5 in surrounding cities (effect b). Comparing Fig. 5a–c, the actual indirect effects cw′ (P-values ≤ 0.05) after separating out these mediation effects are both far smaller than the original effect cw (6.385, in Model A), indicating that the policy reform can worsen surrounding PM2.5 pollution by increasing the concentration of industrial production factors in surrounding cities. Based on these results, hypothesis H1-b is also confirmed. Together, these results provide strong evidence for the existence of the pollution haven effect in the SO2 emission tax policy reform15,16.

Figure 5

The mediation effects of industrial production factors. Note: The t-statistics are marked in parentheses. *, **, and *** denote significance at 10%, 5%, and 1% levels, respectively.

Conclusions

After considering the primary results, we conclude the following:

  1. The policy reform of SO2 emission tax (enabling SO2 emission tax rates higher than the legal minimum standard at that time) can significantly reduce the concentration of PM2.5 locally, but significantly increase PM2.5 concentrations in surrounding areas.

  2. The policy reform of SO2 emission tax aggravates PM2.5 pollution in surrounding areas relatively less for eastern cities, and the aggravating effect on surrounding PM2.5 pollution is also weaker when the reform takes place in high-level cities. Pollutant emission rights trading and the reform of NOx emission tax rates can produce beneficial spatial spillover effects when combined with the reform of SO2 emission tax rates.

  3. Reform of the SO2 emission tax policy can aggravate PM2.5 pollution in surrounding areas by promoting the accumulation of industrial production factors and raising SO2 emission intensity there.

The main policy suggestions arising from these findings are:

  1. The most direct countermeasure is to expand the geographical scope of SO2 emission tax rates above the legal minimum standard, so that more of China’s cities adopt such rates and share the beneficial impact of the discharge tax policy. Alternatively, China’s cities should improve the ecological compensation mechanism for enterprises whose interests are harmed by the heavier tax burden and provide sufficient subsidies or incentives for their green transformation; the relocation costs of highly polluting enterprises should also be raised through inter-city cooperation.

  2. Promote the flow and sharing of resources more flexibly, for example by adopting a more flexible employment mechanism for green technology R&D personnel and emissions trading practitioners and by establishing closer green technology sharing and collaboration. The resulting spillover of resources needed for environmental governance, in terms of R&D, operation, and management, can then offset the negative spatial spillover effect that taxes on pollutant emissions may cause.

A limitation of this paper is that, owing to limited data availability for prefecture-level cities, it cannot use an index of actual collection strength, such as the total amount of discharge tax collected. Addressing this limitation is a possible direction for our future research.