arising from: R. F. Savaris et al.; Scientific Reports https://doi.org/10.1038/s41598021840921 (2021).
Savaris et al.^{1} aim at “verifying if staying at home had an impact on mortality rates.” This short note shows that the methodology they have applied in their paper does not allow them to do so. An estimated coefficient \(\beta \approx 0\) does not imply that there is no association between the variables in either country. Rather, their pairwise difference regressions are computing coefficients that are weightedaverages of regionspecific time series regressions, such that it is possible that the association is significant in both regions but their weightedaverage is close to zero. Therefore, the results do not back up the conclusions of the paper.
Consider two regions: A and B. Suppose that the true relationships between the change in deaths per million \((\Delta Y_t^i)\) and the change in an index of staying at home \((\Delta X_t^i)\) at epidemiological week t in countries \(i=A,B\) are the following:
For simplicity in exposition, assume that \(\Delta X_t^A, \Delta X_t^B, \varepsilon _t^A, \varepsilon _t^B\) are all zero mean, iid processes. By subtracting the second equation from the first and defining \(\Delta Y_t \equiv \Delta Y_t^A  \Delta Y_t^B\) and \(\Delta X_t \equiv \Delta X_t^A  \Delta X_t^B\), we can write:
where \(\eta _t \equiv (\beta _A  \beta ) \Delta X_t^A  (\beta _B\beta ) \Delta X_t^B + (\varepsilon _t^A  \varepsilon _t^B)\). It is easy to see that, for \(\beta _i \ne \beta\), estimation of \(\beta\) will not be consistent, since, by construction, \(cov(\Delta X_t,\eta _t) \ne 0\).
If nonetheless one estimates (1) by ordinary least squares, what does the regression coefficient \(\beta\) converge to? It turns out that it converges to a varianceweighted average of \(\beta _A\), \(\beta _B\), as summarized in the following proposition.
Let \(\Delta X_t^A, \Delta X_t^B, \varepsilon _t^A, \varepsilon _t^B, \beta _A\), \(\beta _B, \beta\) be all as above. Then \({\hat{\beta }}\), the ordinary least squares coefficient of regressing \(\Delta Y_t\) on \(\Delta X_t\), converges in probability to:
with \(w \equiv \frac{{\mathbb {E}}[(\Delta X_t^A)^2]}{{\mathbb {E}}[(\Delta X_t^A)^2] + {\mathbb {E}}[(\Delta X_t^B) ^2] }\).
FormalPara ProofUnder the stated assumptions, \({\hat{\beta }} = \frac{\sum _{t}^T \Delta X_t \Delta Y_t}{\sum _{t}^T \Delta X_t^2} \xrightarrow {p} \frac{ {\mathbb {E}}[\Delta Y_t \Delta X_t] }{{\mathbb {E}}[\Delta X_t^2] } \equiv \beta\). One can calculate the population parameter \(\beta\) analytically:
Note that \(\frac{ {\mathbb {E}}[ \Delta Y_t^A \Delta X_t^A]}{ {\mathbb {E}}[(\Delta X_t^A)^2] } = \beta _A\) and \(\frac{ {\mathbb {E}}[ \Delta Y_t^B \Delta X_t^B]}{ {\mathbb {E}}[(\Delta X_t^B)^2] } = \beta _B\). Using that and the definition of w we arrive at the desired result. \(\square\)
The intuition regarding the (2) in the Proposition is simple. Whenever the variance of \(\Delta X_t^A\) is large relative to country B, \(w \rightarrow 1\) and \(\beta \rightarrow \beta _A\). Similarly, if the variance of \(\Delta X_t^B\) is large relative to country A, \(w \rightarrow 0\) and \(\beta \rightarrow \beta _B\).
What does this mean for the analysis of Savaris et al.^{1}? In general, it means that one cannot interpret their estimated \({\hat{\beta }}\) without knowing the underlying relative variances. Additionally, one cannot infer that an insignificant (or even numerically zero) \({\hat{\beta }}\) implies absence of association in either country.
To see that, suppose countries A and B have identical variance in their independent variables, but \(\beta _A\), \(\beta _B\) are different. In country A, the policymaker adjusts stayathome orders in response to the increase in deaths, such that the change in the percentage of the public staying at home is positively correlated with the change in deaths. In country B, the policymaker does not act, such that the change in share of population staying at home is negatively correlated with contacts, infections, and deaths.
Consider the case in which \(\beta _B =  \beta _A\). Then, since the regions have identical variance, \(w=1/2\) and \(\beta = 0\) even though the true association is nonzero in both countries. The regression coefficients in Savaris et al.^{1} should not lead one to conclude that, in either country, there is no association between the change in mobility and the change in deaths per million. Figure 1 shows the result of 10,000 simulated \(\hat{\beta}\) in which \(\beta _A = 10\) and \(\beta _B = 10\). In this case, \(var(X^A_t) = var(X^B_t)\) and variables are iid and normally distributed. As expected, sample estimates are distributed around the population value of \(\beta =0\).
For \(\beta _A \ne \beta _B\), then, regionspecific dynamics are heterogeneous and, as shown by Pesaran & Smith^{2}, aggregating or pooling slopes can lead to biased estimates, making individual regressions for each group member preferable. If authors assume that \(\beta _A = \beta _B\) for each pair in their sample – i.e., homogeneous \(\beta\) –, then dynamic panels would have many advantages in terms of efficiency and use of instruments to circumvent endogeneity. In either case, their pairwise approach would not be appropriate.
In order to verify if “staying at home had an impact on mortality rates,” it would be necessary to address many other issues in the analysis, including, but not limited to, omitted variable bias, measurement error, and endogeneity of the regressors. However, as shown above, even in a purely correlational analysis, with no causality claims, the applied methodology will simply deliver a weightedaverage of coefficients across the two regions. An estimated coefficient \(\beta \approx 0\) does not imply that there is no association between the variables in either country. Therefore, their conclusion does not follow from their regressions.
References
Savaris, R. F., Pumi, G., Dalzochio, J. & Kunst, R. Stayathome policy is a case of exception fallacy: an internetbased ecological study. Scientific Reports 11, 5313. issn: 20452322 (2021).
Pesaran, M. H. & Smith, R. Estimating longrun relationships from dynamic heterogeneous panels. J. Econ. 68, 79–113 (1995).
Author information
Authors and Affiliations
Contributions
This article is solo authored.
Corresponding author
Ethics declarations
Competing interests
The author declares no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Góes, C. Pairwise difference regressions are just weighted averages. Sci Rep 11, 23044 (2021). https://doi.org/10.1038/s41598021020963
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598021020963
This article is cited by

What scientists have learnt from COVID lockdowns
Nature (2022)

Retraction Note: Stayathome policy is a case of exception fallacy: an internetbased ecological study
Scientific Reports (2021)

Impact of mobility reduction on COVID19 mortality: absence of evidence might be due to methodological issues
Scientific Reports (2021)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.