Using machine learning to assess the livelihood impact of electricity access

Abstract

In many regions of the world, sparse data on key economic outcomes inhibit the development, targeting and evaluation of public policy1,2. We demonstrate how advancements in satellite imagery and machine learning (ML) can help ameliorate these data and inference challenges. In the context of an expansion of the electrical grid across Uganda, we show how a combination of satellite imagery and computer vision can be used to develop local-level livelihood measurements appropriate for inferring the causal impact of electricity access on livelihoods. We then show how ML-based inference techniques deliver more reliable estimates of the causal impact of electrification than traditional alternatives when applied to these data. We estimate that grid access improves village-level asset wealth in rural Uganda by up to 0.15 standard deviations, more than doubling the growth rate during our study period relative to untreated areas. Our results provide country-scale evidence on the impact of grid-based infrastructure investment and our methods provide a low-cost, generalizable approach to future policy evaluation in data-sparse environments.

Fig. 1: Electricity access rates, Ugandan electrification and asset wealth training data.
Fig. 2: Wealth predictions in Uganda over time.
Fig. 3: Electricity access increases household wealth in electrified communities compared with unelectrified communities.

Data availability

Data and R code to replicate all the main results are available on GitHub at https://github.com/nwrat/RCQSB2022_public.

References

  1. Devarajan, S. Africa’s statistical tragedy. Rev. Income Wealth 59, S9–S15 (2013).

  2. Burke, M., Driscoll, A., Lobell, D. & Ermon, S. Using satellite imagery to understand and promote sustainable development. Science 371, eabe8628 (2021).

  3. Yeh, C. et al. Using publicly available satellite imagery and deep learning to understand economic well-being in Africa. Nat. Commun. 11, 2583 (2020).

  4. Jean, N. et al. Combining satellite imagery and machine learning to predict poverty. Science 353, 790–794 (2016).

  5. Chi, G., Fang, H., Chatterjee, S. & Blumenstock, J. E. Micro-estimates of wealth for all low- and middle-income countries. Proc. Natl Acad. Sci. USA 119, e2113658119 (2022).

  6. Steele, J. E. et al. Mapping poverty using mobile phone and satellite data. J. R. Soc. Interface 14, 20160690 (2017).

  7. Pokhriyal, N. & Jacques, D. Combining disparate data sources for improved poverty prediction and mapping. Proc. Natl Acad. Sci. USA 114, E9783–E9792 (2017).

  8. Huang, L., Hsiang, S. & Gonzalez-Navarro, M. Using satellite imagery and deep learning to evaluate the impact of anti-poverty programs. Preprint at https://arxiv.org/abs/2104.11772 (2021).

  9. The World Bank. Access to electricity (% of population)—Uganda. https://data.worldbank.org/indicator/EG.ELC.ACCS.ZS?locations=UG (2021).

  10. International Energy Agency (IEA). World energy outlook 2019 (2019).

  11. International Energy Agency (IEA). Africa energy outlook 2019 (2019).

  12. Lenz, L., Munyehirwe, A., Peters, J. & Sievert, M. Does large-scale infrastructure investment alleviate poverty? Impacts of Rwanda’s electricity access roll-out program. World Dev. 89, 88–110 (2017).

  13. Chakravorty, U., Emerick, K. & Ravago, M.-L. Lighting up the last mile: the benefits and costs of extending electricity to the rural poor. Resources for the Future Discussion Paper 16–22 (2016).

  14. Dinkelman, T. The effects of rural electrification on employment: new evidence from South Africa. Am. Econ. Rev. 101, 3078–3108 (2011).

  15. Lee, K., Miguel, E. & Wolfram, C. Does household electrification supercharge economic development? J. Econ. Perspect. 34, 122–144 (2020).

  16. Lee, K. et al. Electrification for “under grid” households in rural Kenya. Dev. Eng. 1, 26–35 (2016).

  17. Bayer, P., Kennedy, R., Yang, J. & Urpelainen, J. The need for impact evaluation in electricity access research. Energy Policy 137, 111099 (2020).

  18. Bernard, T. Impact analysis of rural electrification projects in sub-Saharan Africa. World Bank Res. Obs. 27, 33–51 (2012).

  19. Jaeger, D. A., Joyce, T. J. & Kaestner, R. A cautionary tale of evaluating identifying assumptions: did reality TV really cause a decline in teenage childbearing? J. Bus. Econ. Stat. 38, 317–326 (2020).

  20. Kahn-Lang, A. & Lang, K. The promise and pitfalls of differences-in-differences: reflections on 16 and Pregnant and other applications. J. Bus. Econ. Stat. 38, 613–620 (2020).

  21. Sahn, D. E. & Stifel, D. Exploring alternative measures of welfare in the absence of expenditure data. Rev. Income Wealth 49, 463–489 (2003).

  22. Filmer, D. & Scott, K. Assessing asset indices. Demography 49, 359–392 (2012).

  23. He, K., Zhang, X., Ren, S. & Sun, J. in Proc. European Conference on Computer Vision – ECCV 2016 (eds Leibe, B., Matas, J., Sebe, N. & Welling, M.) 630–645 (2016).

  24. Athey, S., Bayati, M., Doudchenko, N., Imbens, G. & Khosravi, K. Matrix completion methods for causal panel data models. J. Am. Stat. Assoc. 116, 1716–1730 (2021).

  25. Doudchenko, N. & Imbens, G. Balancing, regression, difference-in-differences and synthetic control methods: a synthesis. Preprint at https://arxiv.org/abs/1610.07748 (2016).

  26. Jedwab, R. & Storeygard, A. The average and heterogeneous effects of transportation investments: evidence from Sub-Saharan Africa 1960–2010. J. Eur. Econ. Assoc. 20, 1–38 (2022).

  27. Uganda National Roads Authority. Connecting Uganda. https://www.unra.go.ug/home (2021).

  28. Collins Bartholomew Ltd. Collins mobile coverage explorer (2014).

  29. World Bank Group. Poverty maps of Uganda (2018).

  30. World Bank Group. Uganda systematic country diagnostic: boosting inclusive growth and accelerating poverty reduction (2015).

  31. Burlig, F. & Preonas, L. Out of the darkness and into the light? Development effects of rural electrification. Energy Inst. Haas WP 268, 26 (2016).

  32. Lee, K., Miguel, E. & Wolfram, C. Experimental evidence on the economics of rural electrification. J. Polit. Econ. 128, 1523–1565 (2020).

  33. Filmer, D. & Pritchett, L. H. Estimating wealth effects without expenditure data—or tears: an application to educational enrollments in states of India. Demography 38, 115–132 (2001).

  34. Omulo, G., Banadda, N. & Kiggundu, N. Harnessing of banana ripening process for banana juice extraction in Uganda. Afr. J. Food Sci. 6, 108–117 (2015).

  35. Ministry of Energy and Mineral Development. Uganda’s Sustainable Energy for All (SE4ALL) initiative action agenda (2015).

  36. Ugandan Energy Sector GIS Working Group. Distribution lines operational (2016) (2017).

  37. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2017).

  38. OpenStreetMap contributors. Planet dump retrieved from https://planet.osm.org (2019).

  39. Goodman-Bacon, A. Difference-in-differences with variation in treatment timing. J. Econom. 225, 254–277 (2021).

  40. Callaway, B. & Sant’Anna, P. H. C. Difference-in-differences with multiple time periods. J. Econom. 225, 200–230 (2021).

  41. Abadie, A., Diamond, A. & Hainmueller, J. Synthetic control methods for comparative case studies: estimating the effect of California’s tobacco control program. J. Am. Stat. Assoc. 105, 493–505 (2010).

Acknowledgements

We thank seminar participants at Stanford and Atlas AI for helpful comments and colleagues in Uganda for their help in locating and verifying the electricity grid maps. We thank A. Driscoll and J. Li for excellent research assistance. N.R. thanks the TomKat Center for Sustainable Energy at Stanford for financial support.

Author information

Contributions

N.R. conceived the study, collected the data, led the econometric analysis and wrote the paper. G.C. designed and performed the CNN modelling, developed the bias penalty term and wrote the paper. B.d.l.C. contributed to econometric analysis and wrote the paper. M.S. contributed to the econometric analysis and wrote the paper. M.B. advised the project, contributed to econometric analysis and wrote the paper.

Corresponding author

Correspondence to Marshall Burke.

Ethics declarations

Competing interests

M.B. is a cofounder at Atlas AI, a company that uses machine learning to measure economic outcomes in the developing world. G.C. is an employee at Atlas AI.

Peer review

Peer review information

Nature thanks Gang He, J. Henderson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Data and analysis pipeline.

Starting with three forms of raw data on the left, a CNN was trained and modified to predict annual estimates of community-level asset wealth with reduced bias across 25 countries and 27,174 communities in SSA. Two ML approaches (MC and SC-EN) were then used to predict unit-specific unobserved counterfactuals for locations that received electrification between 2010 and 2012.

Extended Data Fig. 2 DHS from SSA used in this study.

DHS years in our sample run from 2005 to 2017. We used asset wealth estimates from more than 640,000 households across 27,174 villages to train our CNN wealth-prediction models.

Extended Data Fig. 3 Survey variables in DHS used in the creation of an asset wealth index in the training data.

The column on the right gives the variable code used in the original DHS data.

Extended Data Fig. 4 Alternative approaches to constructing the asset wealth index result in indices that are highly correlated.

a, Compares the base wealth index constructed from the variables listed in Extended Data Fig. 3 to an index that also includes the DHS’ ‘has electricity’ variable. The r², intercept and slope coefficient from regressing the alternative index on the base index are shown at the top. b, Compares the base index to the base index plus the electricity variable but minus three electrical appliances (TV, refrigerator and phone). c, Compares the base index to the base index minus three electrical appliances.
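The comparison described in this caption amounts to a simple least-squares regression of one index on another. The following is a minimal sketch only, using synthetic stand-in data: the function name `compare_indices` and the simulated indices are assumptions for illustration and are not taken from the authors' replication code (which is written in R).

```python
import numpy as np

def compare_indices(base, alt):
    """Regress the alternative index on the base index by OLS.

    Returns (slope, intercept, r2). Hypothetical helper for illustration.
    """
    base = np.asarray(base, dtype=float)
    alt = np.asarray(alt, dtype=float)
    slope, intercept = np.polyfit(base, alt, 1)   # degree-1 least-squares fit
    fitted = slope * base + intercept
    ss_res = np.sum((alt - fitted) ** 2)          # residual sum of squares
    ss_tot = np.sum((alt - alt.mean()) ** 2)      # total sum of squares
    return slope, intercept, 1.0 - ss_res / ss_tot

# Synthetic stand-ins for two highly correlated wealth indices (not real DHS data).
rng = np.random.default_rng(0)
base_index = rng.normal(size=500)
alt_index = 0.95 * base_index + 0.02 + rng.normal(scale=0.1, size=500)

slope, intercept, r2 = compare_indices(base_index, alt_index)
print(f"slope={slope:.3f} intercept={intercept:.3f} r2={r2:.3f}")
```

A slope near 1, an intercept near 0 and a high r² would indicate that the two constructions rank communities almost identically.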

Extended Data Fig. 5 Our custom loss function corrects for prediction biases at different points in the wealth distribution and modestly reduces overall predictive performance (r²) in held-out Uganda data.

a–e, DHS observed wealth index values in Uganda on the x axis (n = 1,798) and CNN-predicted values on the y axis. The red line is a 45° line and the black line is the line of best fit from regressing predicted on observed values; the slope coefficient and regression r² are shown in the lower-right corner of each subplot. Higher penalties on quintile-specific bias lead to slope coefficients closer to 1 and slight reductions in r². However, when the penalty term gets too large, as in the 7.5 case in e, regression coefficients begin to depart from 1 again. f, Relationship between the slope coefficient and r² for each subplot. The results from the entire sub-Saharan dataset are shown in gold and those for the Uganda-only sample in black.
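The caption does not give the functional form of the custom loss. As a hedged illustration of the general idea, the sketch below adds to the mean squared error a penalty on the squared mean residual within each quintile of observed wealth, weighted by a term playing the role of λb. The exact form, the function name `quintile_penalized_loss` and the toy data are assumptions, not the authors' implementation.

```python
import numpy as np

def quintile_penalized_loss(y_true, y_pred, lambda_b=1.0):
    """MSE plus lambda_b times the sum of squared mean residuals per quintile.

    Assumed form for illustration; the published loss may differ.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_pred - y_true) ** 2)
    # Assign each observation to a quintile (0..4) of observed wealth.
    edges = np.quantile(y_true, [0.2, 0.4, 0.6, 0.8])
    quintile = np.digitize(y_true, edges)
    penalty = 0.0
    for q in range(5):
        mask = quintile == q
        if mask.any():
            # Mean signed error in this quintile: zero when predictions are unbiased there.
            penalty += np.mean(y_pred[mask] - y_true[mask]) ** 2
    return mse + lambda_b * penalty

# Toy example: predictions shrunk toward the mean over-predict the poorest
# quintile and under-predict the richest, so the penalty is positive.
y_obs = np.linspace(-2.0, 2.0, 100)
y_shrunk = 0.5 * y_obs
loss_unpenalized = quintile_penalized_loss(y_obs, y_shrunk, lambda_b=0.0)
loss_penalized = quintile_penalized_loss(y_obs, y_shrunk, lambda_b=5.0)
print(loss_unpenalized, loss_penalized)
```

Under this assumed form, setting the weight to 0 recovers plain MSE, whereas larger weights trade some overall fit for predictions whose slope against observed wealth is closer to 1, which is the trade-off the panels describe.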

Extended Data Fig. 6 Cross-validation shows that MC and SC-EN can accurately estimate the average of held-out control values.

We split the control sample into ten equally sized, random folds and use MC, SC-EN and DiD to predict cluster-specific wealth values in the held-out test fold for each year, 2011–2016. We use each fold as a test set one time and in the training set nine times. The mean average differences across all years are 0.017, 0.018 and 0.013, respectively, signifying that MC and SC-EN can accurately predict unobserved values among control locations.
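The fold structure of this exercise can be sketched as follows. This is an illustrative reading only: it uses synthetic panel data and a simple DiD-style predictor (year shifts from training units, levels from each held-out unit's own pre-period); MC and SC-EN are not implemented here, and the names `did_predict` and `cross_validate` are hypothetical.

```python
import numpy as np

def did_predict(y_train, y_test_pre, n_pre):
    """Predict post-period values for held-out units with a DiD-style rule."""
    train_pre_mean = y_train[:, :n_pre].mean()
    year_shift = y_train[:, n_pre:].mean(axis=0) - train_pre_mean  # one shift per post year
    unit_level = y_test_pre.mean(axis=1, keepdims=True)            # held-out units' pre-period level
    return unit_level + year_shift

def cross_validate(y, n_pre, n_folds=10, seed=0):
    """Average absolute mean difference between predicted and held-out values."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(y.shape[0]), n_folds)
    diffs = []
    for test_idx in folds:
        train_mask = np.ones(y.shape[0], dtype=bool)
        train_mask[test_idx] = False          # each fold is the test set exactly once
        pred = did_predict(y[train_mask], y[test_idx, :n_pre], n_pre)
        diffs.append(np.mean(pred - y[test_idx, n_pre:]))
    return float(np.mean(np.abs(diffs)))

# Synthetic control panel: unit effects + common year effects + noise (not real data).
rng = np.random.default_rng(1)
n_units, n_years, n_pre = 200, 12, 6
panel = (rng.normal(size=(n_units, 1))
         + 0.05 * np.arange(n_years)
         + rng.normal(scale=0.1, size=(n_units, n_years)))
cv_error = cross_validate(panel, n_pre)
print(round(cv_error, 4))
```

Any estimator that maps training units and held-out pre-period values to post-period predictions could be substituted for `did_predict` in the same loop.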

Extended Data Fig. 7 Simulated data indicate that MC has smaller prediction error than DiD when pre-trend bias is present, and both estimators are equally attenuated under Berkson-type measurement error.

a, We simulate a setting in which a treatment has a true positive effect equal to 1 but units are trending differently in the absence of treatment and can have unit-specific trends that are correlated with treatment. We test this situation across several time periods (4 to 22 years, in which treatment begins in t/2 + 1) and varying growth rates of upward-trending units being selected into the treated group (0.1 to 0.25; darker blue or red lines indicate treated units with increasingly higher average growth rates). We then compare DiD and MC causal estimates and find that MC is less biased in the presence of treatment-correlated time-trending unobservables, particularly as time series lengthen. Blue dashed lines are DiD estimates that pool all years; blue solid lines are DiD estimates using only one post-treatment year; and red lines are MC estimates. b,c, We create a sample of ‘observed’ data that has a smaller variance than the true distribution, representing a common challenge in data imputation. Using our ‘observed’ sample data in b, we again simulate a setting that has a positive treatment effect equal to 1 but no differential pre-trends. Under this Berkson-type error, DiD and MC are similarly attenuated, underscoring the importance of bias reduction across the outcome distribution in the CNN training process.
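The pooled-DiD side of the simulation in panel a can be illustrated with a short sketch. The data-generating process below (linear unit trends, a true effect of 1, treatment starting in period t/2 + 1) follows the caption, but the specific parameter values, noise levels and the function name `simulate_did` are assumptions; the MC estimator is not implemented here.

```python
import numpy as np

def simulate_did(n_units=400, n_periods=10, extra_growth=0.1, tau=1.0, seed=0):
    """Pooled DiD estimate when treated units have extra_growth higher trend."""
    rng = np.random.default_rng(seed)
    treated = np.arange(n_units) < n_units // 2
    t = np.arange(1, n_periods + 1)
    post = t > n_periods // 2                      # treatment begins in period n_periods/2 + 1
    growth = rng.normal(0.05, 0.01, n_units) + extra_growth * treated
    level = rng.normal(0.0, 1.0, n_units)
    y = level[:, None] + growth[:, None] * t[None, :]
    y += tau * (treated[:, None] & post[None, :])  # true treatment effect
    y += rng.normal(0.0, 0.1, y.shape)
    # Post-minus-pre change per unit, then treated-minus-control difference.
    change = y[:, post].mean(axis=1) - y[:, ~post].mean(axis=1)
    return float(change[treated].mean() - change[~treated].mean())

did_parallel = simulate_did(extra_growth=0.0)               # parallel trends hold
did_trending = simulate_did(extra_growth=0.1)               # treated units grow faster
did_trending_long = simulate_did(n_periods=20, extra_growth=0.1)
print(did_parallel, did_trending, did_trending_long)
```

In this setup the pooled DiD bias equals the extra growth rate multiplied by the gap between the average post-period and pre-period time index, so it grows with the length of the series. That is consistent with the caption's point that differential trends matter more as time series lengthen.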

Extended Data Fig. 8 Treatment effect estimates are robust to using larger inclusion buffers around electricity grid locations to identify treatment and control samples.

Our baseline analysis uses a 2-km buffer around the electrical grid lines to identify treatment and control units. Treatment effect estimates using either a 3-km or a 4-km buffer are slightly larger but qualitatively similar, suggesting that our results are robust to plausible alternative buffers. Error bars represent 95% CIs, and are based on 100 bootstrapped model runs.
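As a rough illustration of how a 95% interval can be derived from 100 bootstrap replicates, the sketch below resamples synthetic unit-level effects and takes percentile bounds. This is a simplification under stated assumptions: in the study each bootstrap replicate is a full model run, whereas here the "model" is just a sample mean, and `bootstrap_ci` and the simulated effects are hypothetical.

```python
import numpy as np

def bootstrap_ci(unit_effects, n_boot=100, level=0.95, seed=0):
    """Percentile bootstrap CI for the mean of unit-level effects."""
    rng = np.random.default_rng(seed)
    unit_effects = np.asarray(unit_effects, dtype=float)
    n = unit_effects.size
    # Resample units with replacement and recompute the mean in each replicate.
    boot_means = np.array([
        unit_effects[rng.integers(0, n, size=n)].mean() for _ in range(n_boot)
    ])
    alpha = 1.0 - level
    lower, upper = np.quantile(boot_means, [alpha / 2, 1.0 - alpha / 2])
    return float(unit_effects.mean()), float(lower), float(upper)

# Synthetic unit-level treatment effects (arbitrary values, not study results).
rng = np.random.default_rng(2)
effects = rng.normal(loc=0.1, scale=0.5, size=300)
estimate, ci_low, ci_high = bootstrap_ci(effects)
print(estimate, ci_low, ci_high)
```

With only 100 replicates the percentile bounds are themselves somewhat noisy; more replicates tighten them at the cost of additional model runs.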

Extended Data Fig. 9 The estimated treatment effect of electrification is higher when quintile-specific bias in wealth predictions is penalized more heavily.

Each estimate represents a separate estimate from MC or SC-EN using output from a model with varying quintile-specific bias, from λb = 0 up to λb = 7.5, as in Extended Data Fig. 5. Error bars represent 95% CIs, and are based on 100 bootstrapped model runs.

Extended Data Fig. 10 Road improvements were modest in scale during our study period in Uganda and not associated with electrical grid extensions.

Red lines denote electricity grid extensions made between 2011 and 2012, which are used to identify treated communities in our main analysis. Yellow lines denote road improvements (resurfacing) made before (2010) or during (2011–2012) our treatment period. Blue lines mark road improvements completed in the post-treatment period, and black dots in the main figure and inset panel represent DHS survey locations. During the entire period (2010–2016), road-upgrade projects averaged 184 km per year, totalling 1,102 km—less than 1% of the road network in Uganda. The location of road improvements shows almost no overlap with electricity grid development during our study period.

Extended Data Fig. 11 Cellular phone ownership and coverage is not associated with electrical grid expansion.

a, Cell phone ownership in newly electrified versus unelectrified communities based on household-level DHS responses in 2006, 2009, 2011, 2014 and 2016. A DiD estimate is shown in the lower-right corner, indicating that cell ownership in control communities grew faster post-treatment than in treated areas. b, Percentage of treated and control communities with cell phone coverage based on GSMA data from 2010 to 2016. Data again show that coverage grew faster in control communities post-treatment. c, As in b, except for the quality of cellular coverage (‘variable’ or ‘strong’ signal). Together, we do not find evidence of cellular ownership or investment being associated with new electricity access, suggesting that the public and private roll-out of cellular phone services did not occur in tandem with electricity grid development.

Extended Data Fig. 12 Rejection rate for DiD estimator in pre-trends test.

Percentage of runs in which a DiD estimator rejected a null hypothesis of no difference in pre-treatment trends in asset wealth between treated and control locations, at either 95% or 90% confidence. ‘With Uganda’ and ‘without Uganda’ refer to whether or not Uganda was used in the training and validation steps. ‘Penalty’ indicates the quintile-bias penalty term (λb) used in each model run.

Extended Data Fig. 13 Effect of electrification on specific assets in the asset index.

Using DiD on DHS locations, we find statistically significant positive impacts across numerous individual assets, including three variables associated with home construction and three variables associated with electricity. We find weakly positive results for more expensive items, such as cars and motorcycles. Error bars represent 95% CIs.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ratledge, N., Cadamuro, G., de la Cuesta, B. et al. Using machine learning to assess the livelihood impact of electricity access. Nature 611, 491–495 (2022). https://doi.org/10.1038/s41586-022-05322-8

  • DOI: https://doi.org/10.1038/s41586-022-05322-8
