Policy-makers are increasingly turning to behavioural science for insights about how to improve citizens’ decisions and outcomes [1]. Typically, different scientists test different intervention ideas in different samples using different outcomes over different time intervals [2]. The lack of comparability of such individual investigations limits their potential to inform policy. Here, to address this limitation and accelerate the pace of discovery, we introduce the megastudy: a massive field experiment in which the effects of many different interventions are compared in the same population on the same objectively measured outcome for the same duration. In a megastudy targeting physical exercise among 61,293 members of an American fitness chain, 30 scientists from 15 different US universities worked in small independent teams to design a total of 54 different four-week digital programmes (or interventions) encouraging exercise. We show that 45% of these interventions significantly increased weekly gym visits by 9% to 27%; the top-performing intervention offered microrewards for returning to the gym after a missed workout. Only 8% of interventions induced behaviour change that was significant and measurable after the four-week intervention. Conditioning on the 45% of interventions that increased exercise during the intervention, we detected carry-over effects that were proportionally similar to those measured in previous research [3–6]. Forecasts by impartial judges failed to predict which interventions would be most effective, underscoring the value of testing many ideas at once and, therefore, the potential for megastudies to improve the evidentiary value of behavioural science.
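As a rough illustration of why a shared population and a shared outcome make interventions directly comparable, the sketch below estimates each arm's effect against the same placebo group. It is a simplified difference in means with a normal-approximation 95% confidence interval, not the regression analysis used in the paper; the function name `effect_vs_control`, the arm labels and all visit counts are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def effect_vs_control(treated, control, z=1.96):
    """Difference in mean weekly visits with a normal-approximation 95% CI."""
    diff = mean(treated) - mean(control)
    # Standard error of a difference between two independent sample means.
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, diff - z * se, diff + z * se

# Hypothetical weekly gym visits per participant (made-up numbers).
placebo = [1, 2, 1, 2, 1, 2]
arms = {
    "arm_a": [2, 3, 2, 3, 2, 3],
    "arm_b": [1, 2, 2, 1, 2, 1],
}

# Every arm is compared with the same control group on the same outcome.
results = {name: effect_vs_control(visits, placebo) for name, visits in arms.items()}
for name, (diff, low, high) in results.items():
    print(f"{name}: {diff:+.2f} visits/week (95% CI {low:.2f} to {high:.2f})")
```

Because every arm is measured against the same control group over the same period, the estimates can be ranked against one another, which is not possible when each idea is tested in a separate study with its own sample and outcome.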
The data analysed in this paper were provided by 24 Hour Fitness and we have their legal permission to share the deidentified data. We have therefore made deidentified data available at https://osf.io/9av87/?view_only=8bb9282111c24f81a19c2237e7d7eba3. Furthermore, tables of all of the preregistration links for each of the substudies with the interventions and the prediction studies are available in Supplementary Tables 2 and 30.
The code to replicate the analyses and figures in the paper and Supplementary Information is available online (https://osf.io/9av87/?view_only=8bb9282111c24f81a19c2237e7d7eba3).
1. Behavioural Insights and Public Policy: Lessons from Around the World (OECD Publishing, 2017).
2. Benartzi, S. et al. Should governments invest more in nudging? Psychol. Sci. 28, 1041–1055 (2017).
3. Charness, G. & Gneezy, U. Incentives to exercise. Econometrica 77, 909–931 (2009).
4. Acland, D. & Levy, M. R. Naiveté, projection bias, and habit formation in gym attendance. Manage. Sci. 61, 146–160 (2015).
5. Royer, H., Stehr, M. & Sydnor, J. Incentives, commitments, and habit formation in exercise: evidence from a field experiment with workers at a Fortune-500 company. Am. Econ. J. Appl. Econ. 7, 51–84 (2015).
6. Beshears, J., Lee, H. N., Milkman, K. L., Mislavsky, R. & Wisdom, J. Creating exercise habits using incentives: the tradeoff between flexibility and routinization. Manage. Sci. 67, 4139–4171 (2020).
7. DellaVigna, S. & Linos, E. RCTs to Scale: Comprehensive Evidence from Two Nudge Units 65 (National Bureau of Economic Research, 2020).
8. DellaVigna, S. & Pope, D. What motivates effort? Evidence and expert forecasts. Rev. Econ. Stud. 85, 1029–1069 (2018).
9. DellaVigna, S. & Pope, D. Predicting experimental results: who knows what? J. Polit. Econ. 126, 2410–2456 (2018).
10. DellaVigna, S., Pope, D. & Vivalt, E. Predict science to improve science. Science 366, 428–429 (2019).
11. Kristal, A. S. & Whillans, A. V. What we can learn from five naturalistic field experiments that failed to shift commuter behaviour. Nat. Hum. Behav. 4, 169–176 (2020).
12. Donoho, D. 50 years of data science. J. Comput. Graph. Stat. 26, 745–766 (2017).
13. Liberman, M. Fred Jelinek. Comput. Linguist. 36, 595–599 (2010).
14. Lai, C. K. et al. Reducing implicit racial preferences: I. A comparative investigation of 17 interventions. J. Exp. Psychol. Gen. 143, 1765–1785 (2014).
15. Lai, C. K. et al. Reducing implicit racial preferences: II. Intervention effectiveness across time. J. Exp. Psychol. Gen. 145, 1001–1016 (2016).
16. Mellers, B. et al. Psychological strategies for winning a geopolitical forecasting tournament. Psychol. Sci. 25, 1106–1115 (2014).
17. Open Science Collaboration. Estimating the reproducibility of psychological science. Science 349, aac4716 (2015).
18. Milkman, K. L., Minson, J. A. & Volpp, K. G. M. Holding the hunger games hostage at the gym: an evaluation of temptation bundling. Manage. Sci. 60, 283–299 (2014).
19. Ward, B. W., Clarke, T. C., Nugent, C. N. & Schiller, J. S. Early Release of Selected Estimates Based on Data From the 2015 National Health Interview Survey 120 (National Center for Health Statistics, 2015).
20. Lee, I.-M. et al. Effect of physical inactivity on major non-communicable diseases worldwide: an analysis of burden of disease and life expectancy. Lancet 380, 219–229 (2012).
21. Gollwitzer, P. M. Implementation intentions: strong effects of simple plans. Am. Psychol. 54, 493–503 (1999).
22. Milkman, K. L., Beshears, J., Choi, J. J., Laibson, D. & Madrian, B. C. Using implementation intentions prompts to enhance influenza vaccination rates. Proc. Natl Acad. Sci. USA 108, 10415–10420 (2011).
23. Rogers, T., Milkman, K. L., John, L. K. & Norton, M. I. Beyond good intentions: prompting people to make plans improves follow-through on important tasks. Behav. Sci. Pol. 1, 33–41 (2015).
24. Karlan, D., McConnell, M., Mullainathan, S. & Zinman, J. Getting to the top of mind: how reminders increase saving. Manage. Sci. 62, 3393–3411 (2016).
25. Homonoff, T. A. Can small incentives have large effects? The impact of taxes versus bonuses on disposable bag use. Am. Econ. J. Econ. Pol. 10, 177–210 (2018).
26. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
27. Allcott, H. Social norms and energy conservation. J. Publ. Econ. 95, 1082–1095 (2011).
28. Chapman, G. B., Li, M., Colby, H. & Yoon, H. Opting in vs opting out of influenza vaccination. JAMA 304, 43–44 (2010).
29. Milkman, K. L. et al. A megastudy of text-based nudges encouraging patients to get vaccinated at an upcoming doctor’s appointment. Proc. Natl Acad. Sci. USA 118, e2101165118 (2021).
30. Lee, M. R. & Shen, M. Winner’s curse: bias estimation for total effects of features in online controlled experiments. In Proc. 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 491–499 (ACM, 2018).
31. White, H. Asymptotic Theory for Econometricians (Elsevier, 1984).
32. Dubner, S. J. How goes the behavior-change revolution? (Ep. 382). Freakonomics https://freakonomics.com/podcast/live-philadelphia/ (2019).
Support for this research was provided in part by the Robert Wood Johnson Foundation, the AKO Foundation, J. Alexander, M. J. Leder, W. G. Lichtenstein, the Pershing Square Fund for Research on the Foundations of Human Behavior from Harvard University and by Roybal Center grants (P30AG034546 and 5P30AG034532) from the National Institute on Aging. The views expressed here do not necessarily reflect the views of any of these individuals or entities. We thank 24 Hour Fitness for partnering with the Behavior Change for Good Initiative at the University of Pennsylvania to make this research possible.
The authors declare no competing interests. The authors did not receive commercial benefits from the fitness chain or speaking/consulting fees related to any of the interventions presented here.
Peer review information: Nature thanks Charles Shearer and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Extended data figures and tables
Shown here are the measured change (blue) and the change predicted by third-party observers (gold) in whether participants visited the gym, induced by each of our megastudy’s 53 experimental conditions relative to a Placebo Control condition during the four-week intervention period. Error bars represent 95% confidence intervals. See Extended Data Table 7 for the complete OLS regression results graphed here in blue, Supplementary Information 11 for more details about the prediction data graphed here in gold, and Supplementary Table 1 for full descriptions of each treatment condition in our megastudy. Sample weights were included in the pooled third-party prediction data to ensure equal weighting of each of our three participant samples (professors, practitioners and Prolific respondents). The superscripts a–e denote the different incentive amounts offered in different versions of the bonus for returning after missed workouts, higher incentives and rigidity rewarded conditions, which are described in Supplementary Table 1. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.
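The equal weighting of the three forecaster samples mentioned in this caption can be sketched as follows. The caption does not give the weighting formula, so this assumes the simplest scheme: each respondent's weight is inversely proportional to the size of their sample, so that every sample contributes the same total weight. The function names and all forecast values are hypothetical.

```python
from collections import Counter

def equal_group_weights(groups):
    """One weight per respondent so that every group has the same total weight."""
    counts = Counter(groups)
    n_groups = len(counts)
    total = len(groups)
    # Each group's weights sum to total / n_groups, whatever the group size.
    return [total / (n_groups * counts[g]) for g in groups]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical forecasts of one intervention's effect (made-up numbers).
groups = ["professor", "professor", "practitioner",
          "prolific", "prolific", "prolific"]
predictions = [2.0, 4.0, 6.0, 1.0, 1.0, 1.0]

weights = equal_group_weights(groups)
pooled = weighted_mean(predictions, weights)
unweighted = sum(predictions) / len(predictions)
print(f"equal-weighted: {pooled:.3f}, unweighted: {unweighted:.3f}")
```

Under this scheme the pooled forecast equals the average of the three per-sample means, so a large sample (here the Prolific respondents) cannot dominate the pooled prediction.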
Cite this article
Milkman, K.L., Gromet, D., Ho, H. et al. Megastudies improve the impact of applied behavioural science. Nature 600, 478–483 (2021). https://doi.org/10.1038/s41586-021-04128-4