Inadequate methods undermine a study of malaria, deforestation and trade

Kuschnig, Nikolas

doi:10.1038/s41467-021-22514-4

Matters Arising
Open access
Published: 18 June 2021

Nikolas Kuschnig ORCID: orcid.org/0000-0002-6642-2543¹

Nature Communications volume 12, Article number: 3762 (2021)

The Original Article was published on 09 March 2020

Arising from Chaves et al. Nature Communications https://doi.org/10.1038/s41467-020-14954-1 (2020)

In a recent study, Chaves et al.¹ find international consumption and trade to be major drivers of ‘malaria risk’ via deforestation. Their analysis is based on a counterfactual ‘malaria risk’ footprint, defined as the number of malaria cases in the absence of two malaria interventions, which is constructed using linear regression. In this letter, I argue that their study hinges on an obscured weighting scheme and suffers from methodological flaws, such as disregard for sources of bias. When addressed properly, these issues nullify the results, overturning the significance and reversing the direction of the claimed relationship. Nonetheless, I see great potential in the mixed methods approach and conclude with recommendations for future studies.

To construct ‘malaria risk’, Chaves et al.¹ regress malaria cases on cumulative tree cover loss and two malaria intervention variables, expressed in shares of usage. Their globally aggregated data cover the period from 2000 until 2015 on a yearly basis. Data on malaria cases and tree cover loss are available for 26 countries in tropical biomes, while the two intervention variables are only available for 13 of these countries in Africa. Figure 1 shows the time series under scrutiny; additional information on the data is provided in Supplementary Note 1.

**Fig. 1: Time series under consideration.**

Chaves et al.¹ specify their regression model as (see their paper for notation)

$$\sum_{r} I_{r}(t) = \beta_{0} + \beta_{L} \sum_{r} L_{r}(t) + \beta_{n} n(t) + \beta_{a} a(t). \tag{1}$$

However, the actual model is a weighted regression of the type

$$w(t) \sum_{r} I_{r}(t) = \beta_{0} + \beta_{L} w(t) \sum_{r} L_{r}(t) + \beta_{n} w(t) n(t) + \beta_{a} w(t) a(t) + \epsilon(t), \tag{2}$$

where w(t) is a scalar weight and ϵ(t) is an error term at time t. Weights were constructed by replicating observations, meaning that ∑_t w(t) ≠ 1. The sample size is not adjusted accordingly, so standard errors are too small by a factor of 2.08 on average (see Table 1, column two). The weighting was obscured by its omission from the Methods and by the fact that the replicated rows only become visible after unhiding them in the spreadsheet provided in their replication files. Chaves et al.¹ weight 2005 at 42.86%, 2001 at 17.86%, and 2014 at 16.07%. The unweighted model, as it is specified in the paper, removes the significance and switches the sign of the forest loss coefficient, as can be seen in columns one and three of Table 1.
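
The effect of weighting by replication on standard errors can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, and the replication counts (24, 10 and 9 copies for 2005, 2001 and 2014, with every other year appearing once, for 56 rows in total) are hypothetical values chosen only because they reproduce the shares quoted above. The sketch fits the same replicated data twice, once with the inflated row count and once with the true number of observations in the degrees of freedom.

```python
import numpy as np

def ols(X, y, dof):
    """OLS coefficients and standard errors for given residual degrees of freedom."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
n, p = 16, 4  # 16 yearly observations (2000 to 2015), 4 coefficients

# Synthetic regressors and outcome; these are NOT the data of the study.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 0.5, -0.3, 0.2]) + rng.normal(size=n)

# Hypothetical replication counts: index 0 is the year 2000.
counts = np.ones(n, dtype=int)
counts[[1, 5, 14]] = [10, 24, 9]  # 2001, 2005, 2014
N = counts.sum()                  # 56 rows after replication

X_rep = np.repeat(X, counts, axis=0)
y_rep = np.repeat(y, counts)

# Naive fit treats the 56 replicated rows as independent observations.
beta_naive, se_naive = ols(X_rep, y_rep, dof=N - p)
# Adjusted fit keeps the weights but uses the true sample size of 16.
beta_adj, se_adj = ols(X_rep, y_rep, dof=n - p)

ratio = se_adj / se_naive
print(ratio)  # sqrt((N - p) / (n - p)) = sqrt(52 / 12), about 2.08, for every coefficient
```

Both fits return identical coefficients; only the residual variance estimate differs. The ratio of standard errors is therefore exactly sqrt((N − p)/(n − p)), whatever the data, which under the assumed counts is about 2.08.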

Table 1 Comparison of original regression results to alternatives.

The study by Chaves et al.¹ seeks to estimate a causal effect of deforestation on malaria incidence. Valid estimates of this relation can only be obtained using appropriate techniques and assumptions that require theoretical justification². The authors do not consider these intricacies and offer no explanation of why their ‘malaria risk’ measure may be interpreted as it is. Instead, they disregard a number of statistical issues that I discuss below.

Chaves et al.¹ base their model selection on achieving a ‘sufficient’ R²—a procedure that is well known to be inadequate³. To illustrate this, consider a regression of birth rates on stork population. Common seasonal patterns lead to high correlation and high values of R². However, we learn very little about the actual relationship and estimates will be spurious. Chaves et al.¹ claim that any model adaptation would only marginally increase R² and hence necessarily mimic their results. This is factually incorrect, missing the relative nature of R². See column (4) of Table 1 for a demonstration of how an additional variable can affect results.
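
The point about the relative nature of R² can be made concrete with a small simulation. The sketch below uses purely synthetic data and assumed parameter values (none of them are taken from the study): an outcome that follows a time trend, and a regressor that also trends upwards but has a true effect of −1. A model without the trend reaches a high R² with the wrong sign, and adding the trend raises R² only marginally while reversing the coefficient.

```python
import numpy as np

def fit(X, y):
    """OLS fit returning the coefficients and R-squared."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta, r2

rng = np.random.default_rng(1)
n = 16
t = np.arange(n, dtype=float)                  # linear time trend
x = t + rng.normal(scale=0.3, size=n)          # regressor that trends upwards
y = 3.0 * t - 1.0 * x + rng.normal(scale=0.1, size=n)  # true effect of x is -1

const = np.ones(n)
beta_short, r2_short = fit(np.column_stack([const, x]), y)    # trend omitted
beta_long, r2_long = fit(np.column_stack([const, x, t]), y)   # trend included

print(f"without trend: slope on x = {beta_short[1]:+.2f}, R2 = {r2_short:.4f}")
print(f"with trend:    slope on x = {beta_long[1]:+.2f}, R2 = {r2_long:.4f}")
```

In this setup a ‘sufficient’ R² is reached by the misspecified model, so the statistic alone cannot discriminate between the two specifications.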

Obtaining unbiased estimates from a linear regression relies on the exogeneity assumption, i.e. no correlation between explanatory variables and the error term. This assumption is commonly violated by simultaneity or omitted variables⁴. Simultaneity occurs when variables are determined contemporaneously, e.g. due to reciprocal causation. Regressing a disease’s incidence on its interventions is a textbook example of this phenomenon. Valid inference could only be drawn using elaborate methods, such as instrumental variables, or, if theoretically justifiable, by assuming no effects of malaria incidence on the use of nets and therapy. Omitted variable bias occurs when the dependent and explanatory variables are both affected by a third factor. Chaves et al.¹ cite Garg⁵ and Berazneva and Byker⁶, who establish causal links between deforestation and malaria for specific regions. These studies rely on panel data, allowing for subnational heterogeneity, and an extensive set of control variables in order to distil a causal effect. Chaves et al.¹ themselves observe a number of malaria determinants in their appendix, which are also drivers of deforestation⁶. Yet, the authors do not take any of these factors into account. The distortion caused by this oversight becomes noticeable when including a linear time trend, as one of many omitted variables (see Table 1, column (4)).
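
Simultaneity bias can be sketched with a stylised two-equation system. The parameter values and variable names below are assumptions chosen for illustration only: incidence falls with net usage (true effect −1), while net usage rises with incidence (response 0.8). Regressing incidence on net usage then recovers neither relation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
true_effect = -1.0  # assumed causal effect of net usage on incidence
gamma = 0.8         # assumed response of net usage to incidence
s1, s2 = 2.0, 1.0   # assumed standard deviations of the two shocks

e1 = rng.normal(scale=s1, size=n)  # shocks to incidence
e2 = rng.normal(scale=s2, size=n)  # shocks to net usage

# Reduced form of the system
#   incidence = true_effect * nets + e1
#   nets      = gamma * incidence + e2
denom = 1.0 - true_effect * gamma
incidence = (e1 + true_effect * e2) / denom
nets = (gamma * e1 + e2) / denom

# OLS of incidence on net usage, ignoring the feedback.
X = np.column_stack([np.ones(n), nets])
beta, *_ = np.linalg.lstsq(X, incidence, rcond=None)
ols_slope = beta[1]

# Probability limit of the OLS slope implied by the system above.
plim = (gamma * s1**2 + true_effect * s2**2) / (gamma**2 * s1**2 + s2**2)
print(f"true effect {true_effect:+.2f}, OLS slope {ols_slope:+.3f}, plim {plim:+.3f}")
```

With these assumed values the OLS slope converges to roughly +0.62, so the intervention would appear to increase incidence even though its true effect is negative.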

In their study, Chaves et al.¹ perform a time series regression without considering any of the associated complexities. Crucially, their model relies on stationarity of variables, i.e. their distributions, hence moments such as the mean, must be constant over time⁴. Non-stationary variables generally lead to the spurious regression problem⁷. Results would then indicate strong correlation between variables, but do not imply causation. In the study’s model, we cannot reject non-stationarity for any of the variables considered and we find autocorrelated residuals—all at any reasonable level of significance (see Supplementary Table 1 for test results). The variable of interest, cumulative forest loss, is even non-stationary by design. When dealing with this issue in two simple ways, we find completely different results—namely sign-switching and insignificant coefficients. See columns (4) and (5) of Table 1 for a model accounting for a linear time trend and one where the relation of yearly changes of variables is modelled.
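
The spurious regression problem is easy to reproduce by simulation. The following sketch, under assumed settings (50 observations, 500 replications, a rule-of-thumb cutoff of |t| > 2), regresses one random walk on another, independent one, and then repeats the exercise on first differences. It illustrates the general phenomenon and does not re-analyse the data of the study.

```python
import numpy as np

def slope_t_stat(x, y):
    """t-statistic of the slope in an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

rng = np.random.default_rng(3)
n_obs, n_sims = 50, 500
reject_levels = 0
reject_diffs = 0

for _ in range(n_sims):
    # Two independent random walks: non-stationary and unrelated by construction.
    x = np.cumsum(rng.normal(size=n_obs))
    y = np.cumsum(rng.normal(size=n_obs))
    reject_levels += abs(slope_t_stat(x, y)) > 2.0
    # First differences of a random walk are stationary white noise.
    reject_diffs += abs(slope_t_stat(np.diff(x), np.diff(y))) > 2.0

rate_levels = reject_levels / n_sims
rate_diffs = reject_diffs / n_sims
print(f"share 'significant' in levels:      {rate_levels:.2f}")
print(f"share 'significant' in differences: {rate_diffs:.2f}")
```

In levels, a large share of regressions report a ‘significant’ relation between series that are unrelated by construction, whereas in differences the share stays close to the nominal level. A cumulative series such as cumulative forest loss behaves like the level series here, since it sums up yearly increments.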

Putting aside inadequate methods, there are a number of simplifications that neglect important complexities of both malaria and deforestation dynamics. By aggregating data, Chaves et al.¹ implicitly assume international homogeneity of malaria dynamics. This assumption is striking, given weak empirical support⁸ and the spatial mismatch of malaria and forest loss. Malaria predominantly occurs in Africa, with 93% of global cases in 2018⁹, while forest loss mostly stems from other regions¹⁰. Furthermore, Chaves et al.¹ silently equate the distinct concepts of forest loss, deforestation and commodity-driven deforestation. With the Hansen et al.¹⁰ data, they use information on forest loss, which is only partly due to deforestation¹⁰,¹¹. Deforestation, in turn, is driven by multiple factors, including but not limited to commodity production¹². Since commodity-driven deforestation is only a subset of forest loss, with arguably special dynamics, this distinction is relevant for the conclusions that can be drawn.

To sum up, the study by Chaves et al.¹ constitutes an important attempt at linking malaria, deforestation and trade, but falls short of this ambitious goal. Their use of an unorthodox weighting scheme lacks justification and pushes results towards showing a link between deforestation and malaria. Their model is plagued by a number of serious methodological issues, including simultaneity, omitted variables and non-stationarity. Each one of them individually is enough to invalidate results. Still, I hope this direction is pursued further and offer some recommendations: (a) be transparent with assumptions made, (b) approach interdisciplinary problems with an interdisciplinary team, (c) be precise and careful with the notion of causality.

Data availability

All data used for this work stem from the original research paper by Chaves et al.¹ and can be found in their online repository at https://doi.org/10.5281/zenodo.3630653.

Code availability

All code used for this work can be found in Supplementary Software 1 or online at https://gist.github.com/nk027/44af20da3e337f69e0052870ef21e8ed.

References

1. Chaves, L. S. M. et al. Global consumption and international trade in deforestation-associated commodities could influence malaria risk. Nat. Commun. 11, 1–10 (2020).
2. Morgan, S. L. & Winship, C. Counterfactuals and Causal Inference (Cambridge University Press, Cambridge, 2015).
3. Wooldridge, J. M. Introductory Econometrics: A Modern Approach (Cengage Learning, Mason, 2016).
4. Hayashi, F. Econometrics (Princeton University Press, Princeton, 2000).
5. Garg, T. Ecosystems and human health: the local benefits of forest cover in Indonesia. J. Environ. Econ. Manag. 98, 102271 (2019).
6. Berazneva, J. & Byker, T. S. Does forest loss increase human disease? Evidence from Nigeria. Am. Econ. Rev. 107, 516–21 (2017).
7. Granger, C. W. & Newbold, P. Spurious regressions in econometrics. J. Econom. 2, 111–120 (1974).
8. Bauhoff, S. & Busch, J. Does deforestation increase malaria prevalence? Evidence from satellite data and health surveys. World Dev. 127, 104734 (2020).
9. World Health Organization (WHO). World Malaria Report 2019 (World Health Organization, 2019).
10. Hansen, M. C. et al. High-resolution global maps of 21st-century forest cover change. Science 342, 850–853 (2013).
11. Curtis, P. G., Slay, C. M., Harris, N. L., Tyukavina, A. & Hansen, M. C. Classifying drivers of global forest loss. Science 361, 1108–1111 (2018).
12. Busch, J. & Ferretti-Gallon, K. What drives deforestation and what stops it? A meta-analysis. Rev. Environ. Econ. Policy 11, 3–23 (2017).

Acknowledgements

This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 725525).

Author information

Authors and Affiliations

Vienna University of Economics and Business (WU), Vienna, Austria
Nikolas Kuschnig

Contributions

N.K. performed the research and wrote the paper.

Corresponding author

Correspondence to Nikolas Kuschnig.

Ethics declarations

Competing interests

The author declares no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Software 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kuschnig, N. Inadequate methods undermine a study of malaria, deforestation and trade. Nat Commun 12, 3762 (2021). https://doi.org/10.1038/s41467-021-22514-4

Received: 15 May 2020
Accepted: 11 March 2021
Published: 18 June 2021
DOI: https://doi.org/10.1038/s41467-021-22514-4
