
  • Perspective

Optimally generate policy-based evidence before scaling

Abstract

Social scientists have increasingly turned to the experimental method to understand human behaviour. One critical issue that makes solving social problems difficult is scaling an idea up from a small group to a larger, more diverse group in more varied situations. The urgency of scaling policies affects us every day, whether in protecting the health and safety of a community or in enhancing the opportunities of future generations. Yet when we scale up ideas, most experience a ‘voltage drop’: on scaling, the cost–benefit profile depreciates considerably. Here I argue that, to reduce voltage drops, we must optimally generate policy-based evidence. Optimality requires answering two crucial questions: what information should be generated, and in what sequence? The economics underlying the science of scaling provides insights into these questions, insights that are in some cases at odds with conventional approaches. For example, there are important situations in which I advocate flipping the traditional social science research model to an approach that, from the beginning, produces the type of policy-based evidence that the science of scaling demands. To do so, I propose augmenting efficacy trials with relevant tests of scale in the original discovery process, which forces the scientist to start from a recognition of the big picture: what information do I need to have confidence in scaling?
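To make the ‘voltage drop’ concrete, here is a toy Monte Carlo sketch (my illustration, not taken from the paper; every effect size, adherence rate and sample size below is a hypothetical assumption). It shows how an effect estimated in a small, high-fidelity efficacy trial can shrink once the same programme reaches a broader, more heterogeneous population with weaker adherence:

    # Illustrative sketch only: a toy 'voltage drop' simulation.
    # All numbers are hypothetical assumptions, not estimates from the paper.
    import numpy as np

    rng = np.random.default_rng(0)

    def trial_effect(n, true_effect, adherence, noise_sd=1.0):
        """Mean measured effect when only a fraction of the treated adhere."""
        adheres = rng.random(n) < adherence
        outcomes = np.where(adheres, true_effect, 0.0) + rng.normal(0.0, noise_sd, n)
        return outcomes.mean()

    # Small efficacy trial: selected sites, high programme fidelity.
    small = trial_effect(n=200, true_effect=0.5, adherence=0.95)

    # At scale: a more heterogeneous population (lower average true effect)
    # and weaker delivery fidelity (lower adherence).
    large = trial_effect(n=20_000, true_effect=0.3, adherence=0.6)

    print(f"efficacy-trial estimate: {small:.2f}")
    print(f"at-scale estimate:       {large:.2f}")  # the 'voltage drop'

Under these assumed numbers, the at-scale estimate falls well below the efficacy-trial estimate even though the analysis is unchanged; only the population and the delivery conditions differ, which is exactly the information a test of scale is meant to surface early.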


Fig. 1: Adding option C thinking to A/B testing.
Fig. 2: Two dimensions that affect optimal data generation sequencing.
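One way to read the ‘option C thinking’ of Fig. 1 is as a third experimental arm that delivers the treatment under at-scale conditions alongside the usual A/B comparison. The sketch below is a hypothetical illustration of such a design (the arm names, effect sizes and at-scale penalty are my assumptions, not the paper’s):

    # Hypothetical A/B/C design: arm C delivers the same treatment as arm B,
    # but under at-scale conditions (e.g. non-expert staff, larger caseloads).
    # All effect sizes are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(1)
    n_per_arm = 1_000

    arms = {
        "A: control":           rng.normal(0.00, 1.0, n_per_arm),
        "B: ideal delivery":    rng.normal(0.50, 1.0, n_per_arm),
        "C: at-scale delivery": rng.normal(0.25, 1.0, n_per_arm),
    }

    control_mean = arms["A: control"].mean()
    for name, outcomes in arms.items():
        print(f"{name}: effect vs control = {outcomes.mean() - control_mean:+.2f}")

The gap between arms B and C previews the voltage drop before any scale-up decision is made, which is the kind of policy-based evidence the abstract argues should be generated from the beginning.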


Acknowledgements

Many thanks to K. Milkman, A. Mobarak and D. Yeager for comments that markedly improved the message of this study. F. Fatchen and D. Franks provided research assistance.

Author information


Corresponding author

Correspondence to John A. List.

Ethics declarations

Competing interests

The author declares no competing interests.

Peer review

Peer review information

Nature thanks Katherine Milkman, Ahmed Mobarak and David Yeager for their contribution to the peer review of this work.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

List, J.A. Optimally generate policy-based evidence before scaling. Nature 626, 491–499 (2024). https://doi.org/10.1038/s41586-023-06972-y


