Using machine learning to assess the livelihood impact of electricity access

Ratledge, Nathan; Cadamuro, Gabe; de la Cuesta, Brandon; Stigler, Matthieu; Burke, Marshall

doi:10.1038/s41586-022-05322-8

Article
Published: 16 November 2022

Nature volume 611, pages 491–495 (2022)

Abstract

In many regions of the world, sparse data on key economic outcomes inhibit the development, targeting and evaluation of public policy^1,2. We demonstrate how advancements in satellite imagery and machine learning (ML) can help ameliorate these data and inference challenges. In the context of an expansion of the electrical grid across Uganda, we show how a combination of satellite imagery and computer vision can be used to develop local-level livelihood measurements appropriate for inferring the causal impact of electricity access on livelihoods. We then show how ML-based inference techniques deliver more reliable estimates of the causal impact of electrification than traditional alternatives when applied to these data. We estimate that grid access improves village-level asset wealth in rural Uganda by up to 0.15 standard deviations, more than doubling the growth rate during our study period relative to untreated areas. Our results provide country-scale evidence on the impact of grid-based infrastructure investment and our methods provide a low-cost, generalizable approach to future policy evaluation in data-sparse environments.

**Fig. 1: Electricity access rates, Ugandan electrification and asset wealth training data.**

**Fig. 2: Wealth predictions in Uganda over time.**

**Fig. 3: Electricity access increases household wealth in electrified communities compared with unelectrified communities.**

Data availability

Data and R code to replicate all the main results are available on GitHub at https://github.com/nwrat/RCQSB2022_public.

References

Devarajan, S. Africa’s statistical tragedy. Rev. Income Wealth 59, S9–S15 (2013).
Burke, M., Driscoll, A., Lobell, D. & Ermon, S. Using satellite imagery to understand and promote sustainable development. Science 371, eabe8628 (2021).
Yeh, C. et al. Using publicly available satellite imagery and deep learning to understand economic well-being in Africa. Nat. Commun. 11, 2583 (2020).
Jean, N. et al. Combining satellite imagery and machine learning to predict poverty. Science 353, 790–794 (2016).
Chi, G., Fang, H., Chatterjee, S. & Blumenstock, J. E. Micro-estimates of wealth for all low-and middle-income countries. Proc. Natl Acad. Sci. USA 119, e2113658119 (2022).
Steele, J. E. et al. Mapping poverty using mobile phone and satellite data. J. R. Soc. Interface 14, 20160690 (2017).
Pokhriyal, N. & Jacques, D. Combining disparate data sources for improved poverty prediction and mapping. Proc. Natl Acad. Sci. USA 114, E9783–E9792 (2017).
Huang, L., Hsiang, S. & Gonzalez-Navarro, M. Using satellite imagery and deep learning to evaluate the impact of anti-poverty programs. Preprint at https://arxiv.org/abs/2104.11772 (2021).
The World Bank. Access to electricity (% of population)—Uganda. https://data.worldbank.org/indicator/EG.ELC.ACCS.ZS?locations=UG (2021).
International Energy Agency (IEA). World energy outlook 2019 (2019).
International Energy Agency (IEA). Africa energy outlook 2019 (2019).
Lenz, L., Munyehirwe, A., Peters, J. & Sievert, M. Does large-scale infrastructure investment alleviate poverty? Impacts of Rwanda’s electricity access roll-out program. World Dev. 89, 88–110 (2017).
Chakravorty, U., Emerick, K. & Ravago, M.-L. Lighting up the last mile: the benefits and costs of extending electricity to the rural poor. Resources for the Future Discussion Paper 16–22 (2016).
Dinkelman, T. The effects of rural electrification on employment: new evidence from South Africa. Am. Econ. Rev. 101, 3078–3108 (2011).
Lee, K., Miguel, E. & Wolfram, C. Does household electrification supercharge economic development? J. Econ. Perspect. 34, 122–144 (2020).
Lee, K. et al. Electrification for “under grid” households in rural Kenya. Dev. Eng. 1, 26–35 (2016).
Bayer, P., Kennedy, R., Yang, J. & Urpelainen, J. The need for impact evaluation in electricity access research. Energy Policy 137, 111099 (2020).
Bernard, T. Impact analysis of rural electrification projects in sub-Saharan Africa. World Bank Res. Obs. 27, 33–51 (2012).
Jaeger, D. A., Joyce, T. J. & Kaestner, R. A cautionary tale of evaluating identifying assumptions: did reality TV really cause a decline in teenage childbearing? J. Bus. Econ. Stat. 38, 317–326 (2020).
Kahn-Lang, A. & Lang, K. The promise and pitfalls of differences-in-differences: reflections on 16 and Pregnant and other applications. J. Bus. Econ. Stat. 38, 613–620 (2020).
Sahn, D. E. & Stifel, D. Exploring alternative measures of welfare in the absence of expenditure data. Rev. Income Wealth 49, 463–489 (2003).
Filmer, D. & Scott, K. Assessing asset indices. Demography 49, 359–392 (2012).
He, K., Zhang, X., Ren, S. & Sun, J. in Proc. European Conference on Computer Vision – ECCV 2016 (eds Leibe, B., Matas, J., Sebe, N. & Welling, M.) 630–645 (2016).
Athey, S., Bayati, M., Doudchenko, N., Imbens, G. & Khosravi, K. Matrix completion methods for causal panel data models. J. Am. Stat. Assoc. 116, 1716–1730 (2021).
Doudchenko, N. & Imbens, G. Balancing, regression, difference-in-differences and synthetic control methods: a synthesis. Preprint at https://arxiv.org/abs/1610.07748 (2016).
Jedwab, R. & Storeygard, A. The average and heterogeneous effects of transportation investments: evidence from Sub-Saharan Africa 1960–2010. J. Eur. Econ. Assoc. 20, 1–38 (2022).
Uganda National Roads Authority. Connecting Uganda. https://www.unra.go.ug/home (2021).
Collins Bartholomew Ltd. Collins mobile coverage explorer (2014).
World Bank Group. Poverty maps of Uganda (2018).
World Bank Group. Uganda systematic country diagnostic: boosting inclusive growth and accelerating poverty reduction (2015).
Burlig, F. & Preonas, L. Out of the darkness and into the light? Development effects of rural electrification. Energy Inst. Haas WP 268, 26 (2016).
Lee, K., Miguel, E. & Wolfram, C. Experimental evidence on the economics of rural electrification. J. Polit. Econ. 128, 1523–1565 (2020).
Filmer, D. & Pritchett, L. H. Estimating wealth effects without expenditure data—or tears: an application to educational enrollments in states of India. Demography 38, 115–132 (2001).
Omulo, G., Banadda, N. & Kiggundu, N. Harnessing of banana ripening process for banana juice extraction in Uganda. Afr. J. Food Sci. 6, 108–117 (2015).
Ministry of Energy and Mineral Development. Uganda’s Sustainable Energy for All (SE4ALL) initiative action agenda (2015).
Ugandan Energy Sector GIS Working Group. Distribution lines operational (2016) (2017).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2017).
OpenStreetMap contributors. Planet dump retrieved from https://planet.osm.org (2019).
Goodman-Bacon, A. Difference-in-differences with variation in treatment timing. J. Econom. 225, 254–277 (2021).
Callaway, B. & Sant’Anna, P. H. C. Difference-in-differences with multiple time periods. J. Econom. 225, 200–230 (2021).
Abadie, A., Diamond, A. & Hainmueller, J. Synthetic control methods for comparative case studies: estimating the effect of California’s tobacco control program. J. Am. Stat. Assoc. 105, 493–505 (2010).

Acknowledgements

We thank seminar participants at Stanford and Atlas AI for helpful comments and colleagues in Uganda for their help in locating and verifying the electricity grid maps. We thank A. Driscoll and J. Li for excellent research assistance. N.R. thanks the TomKat Center for Sustainable Energy at Stanford for financial support.

Author information

Authors and Affiliations

Emmett Interdisciplinary Program in Environment and Resources, Stanford University, Palo Alto, CA, USA
Nathan Ratledge
TomKat Center for Sustainable Energy, Stanford University, Palo Alto, CA, USA
Nathan Ratledge
Atlas AI, Palo Alto, CA, USA
Gabe Cadamuro
King Center on Global Development, Stanford University, Palo Alto, CA, USA
Brandon de la Cuesta
Environmental Policy Lab, ETH Zurich, Zurich, Switzerland
Matthieu Stigler
Center on Food Security and the Environment, Stanford University, Palo Alto, CA, USA
Marshall Burke
Doerr School of Sustainability, Stanford University, Palo Alto, CA, USA
Marshall Burke
National Bureau of Economic Research, Cambridge, MA, USA
Marshall Burke

Contributions

N.R. conceived the study, collected the data, led the econometric analysis and wrote the paper. G.C. designed and performed the CNN modelling, developed the bias penalty term and wrote the paper. B.d.l.C. contributed to econometric analysis and wrote the paper. M.S. contributed to the econometric analysis and wrote the paper. M.B. advised the project, contributed to econometric analysis and wrote the paper.

Corresponding author

Correspondence to Marshall Burke.

Ethics declarations

Competing interests

M.B. is a cofounder of Atlas AI, a company that uses machine learning to measure economic outcomes in the developing world. G.C. is an employee of Atlas AI.

Peer review

Peer review information

Nature thanks Gang He, J. Henderson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Data and analysis pipeline.

Starting with three forms of raw data on the left, a CNN was trained and modified to predict annual estimates of community-level asset wealth with reduced bias across 25 countries and 27,174 communities in SSA. Two ML approaches (MC and SC-EN) were then used to predict unit-specific unobserved counterfactuals for locations that received electrification between 2010 and 2012.

Extended Data Fig. 2 DHS from SSA used in this study.

DHS years in our sample run from 2005 to 2017. We used asset wealth estimates from more than 640,000 households across 27,174 villages to train our CNN wealth-prediction models.

Extended Data Fig. 3 Survey variables in DHS used in the creation of an asset wealth index in the training data.

The column on the right gives the variable code used in the original DHS data.

Extended Data Fig. 4 Alternative approaches to constructing the asset wealth index result in indices that are highly correlated.

a, Compares the base wealth index constructed from the variables listed in Extended Data Fig. 3 to an index that also includes the DHS’ ‘has electricity’ variable. The r², intercept and slope coefficient from regressing the alternative index on the base index are shown at the top. b, Compares the base index to the base index plus the electricity variable but minus three electrical appliances (TV, refrigerator and phone). c, Compares the base index to the base index minus three electrical appliances.
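The comparison described in this legend can be illustrated with a minimal sketch: regress one index on another and report the slope, intercept and r². The data below are synthetic and the function name `compare_indices` is hypothetical; this is not the authors' replication code.

```python
import numpy as np

def compare_indices(base, alternative):
    """Regress the alternative index on the base index with simple OLS."""
    base = np.asarray(base, dtype=float)
    alternative = np.asarray(alternative, dtype=float)
    # polyfit with deg=1 returns (slope, intercept)
    slope, intercept = np.polyfit(base, alternative, deg=1)
    fitted = intercept + slope * base
    ss_res = np.sum((alternative - fitted) ** 2)
    ss_tot = np.sum((alternative - alternative.mean()) ** 2)
    return {"slope": slope, "intercept": intercept, "r2": 1.0 - ss_res / ss_tot}

# Synthetic stand-ins for two versions of a wealth index (assumed values).
rng = np.random.default_rng(0)
base_index = rng.normal(size=500)
alt_index = 0.05 + 0.95 * base_index + rng.normal(scale=0.1, size=500)

stats = compare_indices(base_index, alt_index)
print(stats)
```

A slope near 1, an intercept near 0 and a high r² would indicate that the two constructions rank and scale communities almost identically.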

Extended Data Fig. 5 Our custom loss function corrects for prediction biases at different points in the wealth distribution and modestly reduces overall predictive performance (r²) in held-out Uganda data.

a–e, DHS observed wealth index values in Uganda on the x axis (n = 1,798) and CNN-predicted values on the y axis. The red line is a 45° line and the black line is the line of best fit from regressing predicted on observed values; the slope coefficient and regression r² are shown in the lower-right corner of each subplot. Higher penalties on quintile-specific bias lead to slope coefficients closer to 1 and slight reductions in r². However, when the penalty term gets too large, as in the 7.5 case in e, regression coefficients begin to depart from 1 again. f, Relationship between the slope coefficient and r² for each subplot. The results from the entire sub-Saharan dataset are shown in gold and those for the Uganda-only sample in black.
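One way to picture a quintile-specific bias penalty is the sketch below. The exact form of the authors' loss is not given in this legend, so the functional form here (mean squared error plus λ_b times the sum of squared mean residuals within each quintile of the observed values) is an assumption, and `quintile_bias_penalized_loss` is a hypothetical name.

```python
import numpy as np

def quintile_bias_penalized_loss(y_true, y_pred, lambda_b):
    """MSE plus a penalty on the mean residual within each observed quintile.

    Assumed form for illustration only, not the published loss.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_pred - y_true) ** 2)
    # Quintile boundaries of the observed outcome; digitize yields bins 0..4.
    edges = np.quantile(y_true, [0.2, 0.4, 0.6, 0.8])
    bins = np.digitize(y_true, edges)
    penalty = 0.0
    for q in range(5):
        mask = bins == q
        if mask.any():
            penalty += np.mean(y_pred[mask] - y_true[mask]) ** 2
    return mse + lambda_b * penalty

# Predictions compressed towards the mean are biased in the tails.
rng = np.random.default_rng(0)
observed = rng.normal(size=1000)
compressed = 0.5 * observed

loss_unpenalized = quintile_bias_penalized_loss(observed, compressed, lambda_b=0.0)
loss_penalized = quintile_bias_penalized_loss(observed, compressed, lambda_b=5.0)
print(loss_unpenalized, loss_penalized)
```

Under this assumed form, predictions that shrink towards the mean incur an extra cost in the bottom and top quintiles, which is the kind of bias that a regression slope below 1 reflects.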

Extended Data Fig. 6 Cross-validation shows that MC and SC-EN can accurately estimate the average of held-out control values.

We split the control sample into ten equally sized, random folds and use MC, SC-EN and DiD to predict cluster-specific wealth values in the held-out test fold for each year, 2011–2016. We use each fold as a test set one time and in the training set nine times. The mean average differences across all years are 0.017, 0.018 and 0.013, respectively, signifying that MC and SC-EN can accurately predict unobserved values among control locations.
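The fold structure of this check can be sketched as follows. The predictor is a simple DiD-style stand-in (each held-out unit's pre-period mean plus the average change among training units), not MC or SC-EN, and the panel is synthetic; all names and the number of pre-period years are assumptions.

```python
import numpy as np

def did_style_prediction(pre_test, pre_train, post_train):
    """Shift each held-out unit's pre-period mean by the average change
    observed among training units in each post-period year."""
    avg_change = post_train.mean(axis=0) - pre_train.mean()
    return pre_test.mean(axis=1, keepdims=True) + avg_change[None, :]

def cross_validate_controls(pre, post, n_folds=10, seed=0):
    """Mean absolute gap between predicted and observed fold-year averages."""
    rng = np.random.default_rng(seed)
    n_units = pre.shape[0]
    # Random, near-equally sized folds; each is the test set exactly once.
    folds = np.array_split(rng.permutation(n_units), n_folds)
    diffs = []
    for test_idx in folds:
        train_mask = np.ones(n_units, dtype=bool)
        train_mask[test_idx] = False
        pred = did_style_prediction(pre[test_idx], pre[train_mask], post[train_mask])
        diffs.append(np.abs(pred.mean(axis=0) - post[test_idx].mean(axis=0)))
    return float(np.mean(diffs))

# Synthetic control panel: unit effects, a common trend and noise.
rng = np.random.default_rng(1)
n_units, n_pre, n_post = 500, 5, 6  # six post years, e.g. 2011-2016
unit_effect = rng.normal(0.0, 1.0, (n_units, 1))
year_effect = np.linspace(0.0, 0.5, n_pre + n_post)[None, :]
panel = unit_effect + year_effect + rng.normal(0.0, 0.1, (n_units, n_pre + n_post))

mean_abs_diff = cross_validate_controls(panel[:, :n_pre], panel[:, n_pre:])
print(mean_abs_diff)
```

Swapping `did_style_prediction` for another estimator would leave the fold logic unchanged, which is what makes the three methods directly comparable on the same held-out units.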

Extended Data Fig. 7 Simulated data indicate that MC has smaller prediction error than DiD when pre-trend bias is present, and both estimators are equally attenuated under Berkson-type measurement error.

a, We simulate a setting in which a treatment has a true positive effect equal to 1 but units are trending differently in the absence of treatment and can have unit-specific trends that are correlated with treatment. We test this situation across several time periods (4 to 22 years, in which treatment begins in t/2 + 1) and varying growth rates of upward-trending units being selected into the treated group (0.1 to 0.25; darker blue or red lines indicate treated units with increasingly higher average growth rates). We then compare DiD and MC causal estimates and find that MC is less biased in the presence of treatment-correlated time-trending unobservables, particularly as time series lengthen. Blue dashed lines are DiD estimates that pool all years; blue solid lines are DiD estimates using only one post-treatment year; and red lines are MC estimates. b,c, We create a sample of ‘observed’ data that has a smaller variance than the true distribution, representing a common challenge in data imputation. Using our ‘observed’ sample data in b, we again simulate a setting that has a positive treatment effect equal to 1 but no differential pre-trends. Under this Berkson-type error, DiD and MC are similarly attenuated, underscoring the importance of bias reduction across the outcome distribution in the CNN training process.
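The pre-trend bias in panel a can be reproduced in a small simulation. The sketch below covers only the pooled DiD estimator (MC is not implemented), and the noise levels, sample size and function name `simulate_did` are assumptions chosen for illustration.

```python
import numpy as np

def simulate_did(n_periods, extra_growth, n_units=1000, effect=1.0, seed=0):
    """Pooled DiD estimate when treated units have steeper linear trends."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n_periods + 1)
    start = n_periods // 2 + 1  # treatment begins in period t/2 + 1
    treated = np.arange(n_units) < n_units // 2
    # Unit-specific trends; treated units grow faster on average.
    growth = rng.normal(0.0, 0.02, n_units) + extra_growth * treated
    y = growth[:, None] * t[None, :] + rng.normal(0.0, 0.1, (n_units, n_periods))
    post = t >= start
    y[np.ix_(treated, post)] += effect  # true treatment effect
    change_treated = y[treated][:, post].mean() - y[treated][:, ~post].mean()
    change_control = y[~treated][:, post].mean() - y[~treated][:, ~post].mean()
    return change_treated - change_control

est_short = simulate_did(n_periods=4, extra_growth=0.1)
est_long = simulate_did(n_periods=22, extra_growth=0.1)
print(est_short, est_long)
```

With linear trends and an even number of periods, the pooled DiD estimate overshoots the true effect by roughly the growth gap times half the panel length, so the bias grows as the time series lengthens.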

Extended Data Fig. 8 Treatment effect estimates are robust to using larger inclusion buffers around electricity grid locations to identify treatment and control samples.

Our baseline analysis uses a 2-km buffer around the electrical grid lines to identify treatment and control units. Treatment effect estimates using either a 3-km or a 4-km buffer are slightly larger but qualitatively similar, suggesting that our results are robust to plausible alternative buffers. Error bars represent 95% CIs, and are based on 100 bootstrapped model runs.
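A buffer test of this kind reduces to a point-to-line distance check. The sketch below assumes planar coordinates in metres (for example after projecting to a local UTM zone) and uses made-up geometry; how buffer membership maps to treatment and control status follows the paper's main text and is not reproduced here.

```python
import numpy as np

def distance_to_segment(point, seg_start, seg_end):
    """Euclidean distance from a point to a line segment (planar coordinates)."""
    p, a, b = (np.asarray(v, dtype=float) for v in (point, seg_start, seg_end))
    ab = b - a
    denom = ab @ ab
    # Projection of p onto the segment, clamped to its end points.
    t = 0.0 if denom == 0 else np.clip((p - a) @ ab / denom, 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def within_buffer(point, polyline, buffer_m=2000.0):
    """True if the point lies within buffer_m metres of any polyline segment."""
    dists = [
        distance_to_segment(point, polyline[i], polyline[i + 1])
        for i in range(len(polyline) - 1)
    ]
    return min(dists) <= buffer_m

# Hypothetical grid line and village locations, in metres.
grid_line = [(0.0, 0.0), (10_000.0, 0.0), (10_000.0, 5_000.0)]
village_near = (5_000.0, 1_500.0)
village_far = (5_000.0, 3_500.0)

print(within_buffer(village_near, grid_line))                  # inside 2 km
print(within_buffer(village_far, grid_line))                   # outside 2 km
print(within_buffer(village_far, grid_line, buffer_m=4000.0))  # inside 4 km
```

Widening the buffer reclassifies locations such as `village_far`, which is why the estimates are re-run at 3 km and 4 km as a robustness check.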

Extended Data Fig. 9 The estimated treatment effect of electrification is higher when quintile-specific bias in wealth predictions is penalized more heavily.

Each estimate represents a separate estimate from MC or SC-EN using output from a model with varying quintile-specific bias, from λ_b = 0 up to λ_b = 7.5, as in Extended Data Fig. 5. Error bars represent 95% CIs, and are based on 100 bootstrapped model runs.
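Confidence intervals from 100 bootstrap runs can be formed in several ways; the legend does not say which was used. The sketch below assumes a percentile interval and resamples synthetic unit-level effects rather than re-running MC or SC-EN; `bootstrap_ci` is a hypothetical helper.

```python
import numpy as np

def bootstrap_ci(values, estimator=np.mean, n_boot=100, level=0.95, seed=0):
    """Percentile bootstrap interval from n_boot resamples of the units."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Resample units with replacement and recompute the estimate each time.
    estimates = np.array(
        [estimator(values[rng.integers(0, n, n)]) for _ in range(n_boot)]
    )
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(estimates, [alpha, 1.0 - alpha])
    return float(lower), float(upper)

# Synthetic unit-level effects centred on 0.15 s.d. (assumed values).
rng = np.random.default_rng(2)
unit_effects = rng.normal(0.15, 0.5, 400)

ci_low, ci_high = bootstrap_ci(unit_effects)
print(ci_low, ci_high)
```

In a full replication, the estimator inside the loop would be the complete model run on each resampled dataset, which is far more expensive than the mean used here.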

Extended Data Fig. 10 Road improvements were modest in scale during our study period in Uganda and not associated with electrical grid extensions.

Red lines denote electricity grid extensions made between 2011 and 2012, which are used to identify treated communities in our main analysis. Yellow lines denote road improvements (resurfacing) made before (2010) or during (2011–2012) our treatment period. Blue lines mark road improvements completed in the post-treatment period, and black dots in the main figure and inset panel represent DHS survey locations. During the entire period (2010–2016), road-upgrade projects averaged 184 km per year, totalling 1,102 km—less than 1% of the road network in Uganda. The location of road improvements shows almost no overlap with electricity grid development during our study period.

Extended Data Fig. 11 Cellular phone ownership and coverage are not associated with electrical grid expansion.

a, Cell phone ownership in newly electrified versus unelectrified communities based on household-level DHS responses in 2006, 2009, 2011, 2014 and 2016. A DiD estimate is shown in the lower-right corner, indicating that cell ownership in control communities grew faster post-treatment than in treated areas. b, Percentage of treated and control communities with cell phone coverage based on GSMA data from 2010 to 2016. Data again show that coverage grew faster in control communities post-treatment. c, As in b, except for the quality of cellular coverage (‘variable’ or ‘strong’ signal). Taken together, we find no evidence of cellular ownership or investment being associated with new electricity access, suggesting that the public and private roll-out of cellular phone services did not occur in tandem with electricity grid development.
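For reference, the canonical two-group, two-period DiD estimator can be written as below. The exact specification behind the estimate in panel a is not given in this legend, so this textbook form and its notation are assumptions.

```latex
\hat{\tau}_{\mathrm{DiD}}
  = \left(\bar{Y}_{\mathrm{treated,\,post}} - \bar{Y}_{\mathrm{treated,\,pre}}\right)
  - \left(\bar{Y}_{\mathrm{control,\,post}} - \bar{Y}_{\mathrm{control,\,pre}}\right)
```

Under this sign convention, faster growth in control communities than in treated communities corresponds to a negative estimate.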

Extended Data Fig. 12 Rejection rate for DiD estimator in pre-trends test.

Percentage of runs in which a DiD estimator rejected a null hypothesis of no difference in pre-treatment trends in asset wealth between treated and control locations, at either 95% or 90% confidence. ‘With Uganda’ and ‘without Uganda’ refer to whether or not Uganda was used in the training and validation steps. ‘Penalty’ indicates the quintile-bias penalty term (λ_b) used in each model run.

Extended Data Fig. 13 Effect of electrification on specific assets in the asset index.

Using DiD on DHS locations, we find statistically significant positive impacts across numerous individual assets, including three variables associated with home construction and three variables associated with electricity. We find weakly positive results for more expensive items, such as cars and motorcycles. Error bars represent 95% CIs.

Supplementary information

Peer Review File

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ratledge, N., Cadamuro, G., de la Cuesta, B. et al. Using machine learning to assess the livelihood impact of electricity access. Nature 611, 491–495 (2022). https://doi.org/10.1038/s41586-022-05322-8

Received: 01 September 2021
Accepted: 05 September 2022
Published: 16 November 2022
Issue Date: 17 November 2022
DOI: https://doi.org/10.1038/s41586-022-05322-8
