Quantifying the dynamics of failure across science, startups and security

An Author Correction to this article was published on 05 June 2020

This article has been updated

Abstract

Human achievements are often preceded by repeated attempts that fail, but little is known about the mechanisms that govern the dynamics of failure. Here, building on previous research relating to innovation [1–7], human dynamics [8–11] and learning [12–17], we develop a simple one-parameter model that mimics how successful future attempts build on past efforts. Solving this model analytically suggests that a phase transition separates the dynamics of failure into regions of progression or stagnation and predicts that, near the critical threshold, agents who share similar characteristics and learning strategies may experience fundamentally different outcomes following failures. Above the critical point, agents exploit incremental refinements to systematically advance towards success, whereas below it, they explore disjoint opportunities without a pattern of improvement. The model makes several empirically testable predictions, demonstrating that those who eventually succeed and those who do not may initially appear similar, but can be characterized by fundamentally distinct failure dynamics in terms of the efficiency and quality associated with each subsequent attempt. We collected large-scale data from three disparate domains and traced repeated attempts by investigators to obtain National Institutes of Health (NIH) grants to fund their research, innovators to successfully exit their startup ventures, and terrorist organizations to claim casualties in violent attacks. We find broadly consistent empirical support across all three domains, which systematically verifies each prediction of our model. Together, our findings unveil detectable yet previously unknown early signals that enable us to identify failure dynamics that will lead to ultimate success or failure. Given the ubiquitous nature of failure and the paucity of quantitative approaches to understand it, these results represent an initial step towards the deeper understanding of the complex dynamics underlying failure.

Fig. 1: Mechanisms of chance and learning.
Fig. 2: The k model.
Fig. 3: Testing model predictions.

Data availability

This paper makes use of restricted access data from the National Institutes of Health (NIH), protected by the Privacy Act of 1974 as amended (5 U.S.C. 552a). Deidentified data necessary to reproduce all plots and statistical analyses are freely available at https://yian-yin.github.io/quantifyFailure. Those wishing to access the raw data can apply for access following the procedures outlined in the NIH Data Access Policy document (http://report.nih.gov/pdf/DataAccessPolicy.pdf). The VentureXpert database is available from Thomson Reuters. The Global Terrorism Database is publicly available at https://www.start.umd.edu/gtd/.

Code availability

Code is available at https://yian-yin.github.io/quantifyFailure.

Change history

  • 05 June 2020

    An amendment to this paper has been published and can be accessed via a link at the top of the paper.

References

  1. Fortunato, S. et al. Science of science. Science 359, eaao0185 (2018).

  2. Harford, T. Adapt: Why Success Always Starts with Failure (Farrar, Straus and Giroux, 2011).

  3. Wuchty, S., Jones, B. F. & Uzzi, B. The increasing dominance of teams in production of knowledge. Science 316, 1036–1039 (2007).

  4. Jones, B. F. The burden of knowledge and the “death of the renaissance man”: is innovation getting harder? Rev. Econ. Stud. 76, 283–317 (2009).

  5. Sinatra, R., Wang, D., Deville, P., Song, C. & Barabási, A.-L. Quantifying the evolution of individual scientific impact. Science 354, aaf5239 (2016).

  6. Liu, L. et al. Hot streaks in artistic, cultural, and scientific careers. Nature 559, 396–399 (2018).

  7. Hu, Y., Havlin, S. & Makse, H. A. Conditions for viral influence spreading through multiplex correlated social networks. Phys. Rev. X 4, 021031 (2014).

  8. Barabási, A.-L. The origin of bursts and heavy tails in human dynamics. Nature 435, 207–211 (2005).

  9. González, M. C., Hidalgo, C. A. & Barabási, A.-L. Understanding individual human mobility patterns. Nature 453, 779–782 (2008).

  10. Castellano, C., Fortunato, S. & Loreto, V. Statistical physics of social dynamics. Rev. Mod. Phys. 81, 591–646 (2009).

  11. Malmgren, R. D., Stouffer, D. B., Campanharo, A. S. & Amaral, L. A. N. On universality in human correspondence activity. Science 325, 1696–1700 (2009).

  12. Argote, L. Organizational Learning: Creating, Retaining and Transferring Knowledge (Springer Science & Business Media, 2012).

  13. Sitkin, S. B. Learning through failure: the strategy of small losses. Res. Organ. Behav. 14, 231–266 (1992).

  14. Yelle, L. E. The learning curve: historical review and comprehensive survey. Decis. Sci. 10, 302–328 (1979).

  15. Dutton, J. M. & Thomas, A. Treating progress functions as a managerial opportunity. Acad. Manage. Rev. 9, 235–247 (1984).

  16. Huber, G. P. Organizational learning: the contributing processes and the literatures. Organ. Sci. 2, 88–115 (1991).

  17. Cannon, M. D. & Edmondson, A. C. Failing to learn and learning to fail (intelligently): how great organizations put failure to work to innovate and improve. Long Range Plann. 38, 299–319 (2005).

  18. Kaplan, S. N. & Lerner, J. in Measuring Entrepreneurial Businesses: Current Knowledge and Challenges (Univ. Chicago Press, 2016).

  19. Eggers, J. P. & Song, L. Dealing with failure: serial entrepreneurs and the costs of changing industries between ventures. Acad. Manage. J. 58, 1785–1803 (2015).

  20. National Consortium for the Study of Terrorism and Responses to Terrorism. Global Terrorism Database (GTD) https://www.start.umd.edu/research-projects/global-terrorism-database-gtd (2018).

  21. Clauset, A. & Gleditsch, K. S. The developmental dynamics of terrorist organizations. PLoS ONE 7, e48633 (2012).

  22. Johnson, N. et al. Pattern in escalations in insurgent and terrorist activity. Science 333, 81–84 (2011).

  23. Newell, A. & Rosenbloom, P. S. in Cognitive Skills and their Acquisition 1 (ed. Anderson, J. R.) 1–55 (Erlbaum, 1981).

  24. Anderson, J. R. Acquisition of cognitive skill. Psychol. Rev. 89, 369–406 (1982).

  25. Muth, J. F. Search theory and the manufacturing progress function. Manage. Sci. 32, 948–962 (1986).

  26. Wright, T. P. Factors affecting the cost of airplanes. J. Aeronaut. Sci. 3, 122–128 (1936).

  27. March, J. G. Exploration and exploitation in organizational learning. Organ. Sci. 2, 71–87 (1991).

  28. Foster, J. G., Rzhetsky, A. & Evans, J. A. Tradition and innovation in scientists’ research strategies. Am. Sociol. Rev. 80, 875–908 (2015).

  29. Arbesman, S. The Half-life of Facts: Why Everything We Know Has an Expiration Date (Penguin, 2013).

  30. Madsen, P. M. & Desai, V. Failing to learn? The effects of failure and success on organizational learning in the global orbital launch vehicle industry. Acad. Manage. J. 53, 451–476 (2010).

  31. Argote, L., Beckman, S. L. & Epple, D. The persistence and transfer of learning in industrial settings. Manage. Sci. 36, 140–154 (1990).

  32. Kuhn, T. S. The Structure of Scientific Revolutions (Chicago Univ. Press, 2012).

  33. Merton, R. K. Singletons and multiples in scientific discovery: a chapter in the sociology of science. Proc. Am. Phil. Soc. 105, 470–486 (1961).

  34. Gompers, P., Kovner, A., Lerner, J. & Scharfstein, D. Performance persistence in entrepreneurship. J. Financ. Econ. 96, 18–32 (2010).

  35. de Holan, P. M. & Phillips, N. Remembrance of things past? The dynamics of organizational forgetting. Manage. Sci. 50, 1603–1613 (2004).

  36. Schelling, T. C. Micromotives and Macrobehavior (WW Norton & Company, 2006).

  37. Watts, D. J. A simple model of global cascades on random networks. Proc. Natl Acad. Sci. USA 99, 5766–5771 (2002).

  38. Holme, P. & Newman, M. E. Nonequilibrium phase transition in the coevolution of networks and opinions. Phys. Rev. E 74, 056108 (2006).

  39. Ginther, D. K. et al. Race, ethnicity, and NIH research awards. Science 333, 1015–1019 (2011).

  40. Boudreau, K. J., Guinan, E. C., Lakhani, K. R. & Riedl, C. Looking across and looking beyond the knowledge frontier: intellectual distance, novelty, and resource allocation in science. Manage. Sci. 62, 2765–2783 (2016).

  41. Bromham, L., Dinnage, R. & Hua, X. Interdisciplinary research has consistently lower funding success. Nature 534, 684–687 (2016).

  42. Banal-Estanol, A., Macho-Stadler, I. & Pérez Castrillo, D. Key Success Drivers in Public Research Grants: Funding the Seeds of Radical Innovation in Academia? CESifo Working Paper Series 5852 (CESifo, 2016).

  43. Ma, A., Mondragón, R. J. & Latora, V. Anatomy of funded research in science. Proc. Natl Acad. Sci. USA 112, 14760–14765 (2015).

  44. Levitt, B. & March, J. G. Organizational learning. Annu. Rev. Sociol. 14, 319–338 (1988).

  45. Argote, L. & Epple, D. Learning curves in manufacturing. Science 247, 920–924 (1990).

  46. Merton, R. K. et al. The Matthew effect in science. Science 159, 56–63 (1968).

  47. Huang, J., Ertekin, S. & Giles, C. L. Efficient name disambiguation for large-scale databases. In European Conference on Principles of Data Mining and Knowledge Discovery 536–544 (Springer, 2006).

  48. Shen, H. Inequality quantified: Mind the gender gap. Nature 495, 22–24 (2013).

  49. Larivière, V., Ni, C., Gingras, Y., Cronin, B. & Sugimoto, C. R. Bibliometrics: global gender disparities in science. Nature 504, 211–213 (2013).

  50. Yang, T. & Aldrich, H. E. Who’s the boss? Explaining gender inequality in entrepreneurial teams. Am. Sociol. Rev. 79, 303–327 (2014).

  51. Argote, L., Insko, C. A., Yovetich, N. & Romero, A. A. Group learning curves: the effects of turnover and task complexity on group performance. J. Appl. Soc. Psychol. 25, 512–529 (1995).

  52. Bailey, C. D. Forgetting and the learning curve: a laboratory study. Manage. Sci. 35, 340–352 (1989).

Acknowledgements

We thank C. Song, A. Clauset, B. Uzzi, B. Jones, E. Finkel, J. Van Mieghem, A. Bassamboo and Y. Xie for helpful discussions, and H. Sauermann and S. Havlin for suggesting extensions of the model, leading us to discover the kα and kαδ models. This work is supported by the Air Force Office of Scientific Research under award numbers FA9550-15-1-0162, FA9550-17-1-0089 and FA9550-19-1-0354, National Science Foundation grant SBE 1829344, the Alfred P. Sloan Foundation G-2019-12485, and Northwestern University Data Science Initiative. This work does not reflect the position of NIH.

Author information

Contributions

D.W. conceived the project and designed the experiments; Y.Y. and Y.W. collected data and performed empirical analyses with help from D.W. and J.A.E.; Y.Y. and D.W. carried out theoretical calculations; all authors collaboratively designed the model and interpreted results; D.W. and Y.Y. wrote the manuscript; all authors edited the manuscript.

Corresponding author

Correspondence to Dashun Wang.

Ethics declarations

Competing interests

Y.W. and D.W. serve as special volunteers (unpaid) to the NIH. The remaining authors declare no competing interests.

Additional information

Peer review information Nature thanks Shlomo Havlin and Henry Sauermann for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 The k model.

a–f, Simulation results from the model (α = 0.6) for the cases of k = 0 (a, d) and k → ∞ (b, e) in terms of the average quality (a–c) and efficiency (d–f) of each attempt. k = 0 recovers the chance model, predicting a constant quality (c) and efficiency (f). k → ∞ predicts temporal scaling that characterizes the dynamics of failure (e) with improved quality (b), recovering predictions from learning curves and Wright’s law. g–j, Illustration of mapping between failure dynamics (g, h) and canonical ensembles (i, j). The canonical system is characterized by three different states a, b, c with corresponding energy densities \(E_a(h)\), \(E_b(h)\), \(E_c(h)\). Here we assume \(E_a(h) = (2\varepsilon h - 1)^2\), \(E_b(h) = (2h - 1)^2\) and \(E_c(h) = [2\varepsilon(1 - h) - 1]^2\) where \(\varepsilon \to 0^+\). The introduction of ε is to distinguish state a from state c, both of which can be approximated in the limiting condition \(E_a(h) = E_c(h) = 0\). We map \(f \to (2\Gamma - 1)^2\), \(N \to \ln[n]\), \(h \to K\) and \(E_i(h) = [2\Gamma_i(K) - 1]^2\). In this case, the two transition points \(k^*\) and \(k^* + 1\) correspond to h = 0 and 1 in the canonical ensemble systems.

Extended Data Fig. 2 Predicting temporal dynamics in science, entrepreneurship and security.

a–c, We compare the goodness of fit for three different models in temporal dynamics in NIH grants (a, n = 10,345), startups (b, n = 275) and terrorist attacks (c, n = 136). For each individual sample, we take all but the last inter-event time for model fitting (n = 1, …, N − 1), comparing model predictions for the last inter-event time. The tested functional forms are power law, \(t_n = a n^b\); exponential, \(t_n = a b^n\); and linear, \(t_n = a + bn\). We then calculate the frequency that each model reaches minimum error, defined as \(|\log(t_N) - \log(\hat{t}_N)|\), among all three forms. The power-law model offers consistently better predictions. d–f, As in a–c, but using \(|t_N - \hat{t}_N|\) as the loss function.
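The comparison described in this caption can be sketched in a few lines. This is only an illustration under stated assumptions: the caption does not say how the three forms were fitted, so the sketch uses ordinary least squares (in log space for the power-law and exponential forms), and the function names `fit_line`, `predict_last` and `best_model` are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = c0 + c1 * x; returns (c0, c1)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    return my - c1 * mx, c1

def predict_last(times):
    """Fit each form on t_1..t_{N-1} and predict the held-out t_N."""
    N = len(times)
    if N < 3:
        raise ValueError("need at least three inter-event times")
    ns = list(range(1, N))          # attempt indices used for fitting
    ts = times[:-1]
    log_n = [math.log(n) for n in ns]
    log_t = [math.log(t) for t in ts]
    # Power law t_n = a * n**b  <=>  log t = log a + b * log n
    c0, c1 = fit_line(log_n, log_t)
    power = math.exp(c0) * N ** c1
    # Exponential t_n = a * b**n  <=>  log t = log a + n * log b
    c0, c1 = fit_line(ns, log_t)
    expo = math.exp(c0 + c1 * N)
    # Linear t_n = a + b * n (assumption: fitted on the raw scale)
    c0, c1 = fit_line(ns, ts)
    linear = c0 + c1 * N
    return {"power_law": power, "exponential": expo, "linear": linear}

def best_model(times):
    """Return the form with the smallest |log t_N - log t_hat_N| and all errors."""
    actual = times[-1]
    errors = {}
    for name, pred in predict_last(times).items():
        # The log error is undefined for non-positive predictions,
        # which the linear form can produce; treat those as infinite error.
        errors[name] = abs(math.log(actual) - math.log(pred)) if pred > 0 else math.inf
    return min(errors, key=errors.get), errors
```

Applying `best_model` to each individual's sequence of inter-event times and tallying the winners would give the per-model frequencies the caption refers to; swapping the error for `abs(actual - pred)` gives the loss used in panels d–f.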

Extended Data Fig. 3 Predicting ultimate success in science, entrepreneurship and security.

a–c, Area under the receiver operating characteristic curve (AUC) of the prediction task. We apply two logistic regression models (Supplementary Information 6.1) to predict ultimate success in NIH grants (a), startups (b) and terrorist attacks (c). The centres and error bars of AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation over 50 randomized iterations (green, model 1; red, model 2). d, e, As in a but predicting ultimate success in NIH grants for male (d) and female (e) investigators.
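For readers unfamiliar with the reported quantities, the sketch below shows one way to compute an AUC score and a mean ± s.e.m. summary. It does not reproduce the logistic regression models of Supplementary Information 6.1: the scores are assumed to come from any fitted classifier, the helper names `auc` and `mean_sem` are hypothetical, and how scores were pooled across folds and iterations is not stated in the caption.

```python
import math

def auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def mean_sem(values):
    """Mean and standard error of the mean (sample s.d. / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)
```

In a cross-validated setting, `auc` would be evaluated on each held-out fold and the resulting list of scores passed to `mean_sem`.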

Extended Data Fig. 4 Model validations.

a, b, An illustration of the component dynamics. We extract all MeSH terms associated with the nth attempt, \(S_n\), and calculate the number of new terms \(m_n\), defined as \(|S_n - (S_{n-1} \cup \cdots \cup S_{n-k})|\). b, Testing component dynamics in NIH grant applications. We calculate the dynamics of \(M_n = \langle m_n \rangle / \langle m_1 \rangle\) using different k and compare it with \(T_n\). The centres and error bars of \(M_n\) show the mean ± s.e.m. (n = 5,899) for different k. The shaded area shows mean ± s.e.m. of \(T_n\) (log scale) measured on the same subset. All k > 3 lead to similar trends between \(M_n\) and \(T_n\). c–e, Length of failure streak after randomization in science (c), entrepreneurship (d) and security (e). We take the samples used in Fig. 1 and shuffle the success/failure label from each attempt. This operation keeps both the overall success rate and the total number of attempts for each individual constant. f–h, Temporal scaling patterns within the successful group in science (f), entrepreneurship (g) and security (h). We separated the successful group into two subgroups (narrow winners and clear winners) based on eventual performance (0.9 in evaluation score for D1, 0.5 in investment amount for D2 and 1 in wounded individuals for D3). The shaded area shows mean ± s.e.m. of \(T_n\) (log scale).
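The definition of the number of new terms in panel a translates directly into set operations. The following is a minimal sketch, assuming each attempt is represented as a Python set of term strings and that attempts with fewer than k predecessors are compared against whatever history exists; the function name `new_term_counts` is hypothetical.

```python
def new_term_counts(term_sets, k):
    """For each attempt n, count the terms in S_n that do not appear in any of
    the k preceding attempts: m_n = |S_n - (S_{n-1} U ... U S_{n-k})|."""
    if k < 1:
        raise ValueError("k must be at least 1")
    counts = []
    for i, current in enumerate(term_sets):
        # Union of up to k previous attempts (empty for the first attempt).
        previous = set().union(*term_sets[max(0, i - k):i])
        counts.append(len(current - previous))
    return counts

attempts = [{"a", "b"}, {"b", "c"}, {"a", "d"}]
print(new_term_counts(attempts, k=1))  # [2, 1, 2]
print(new_term_counts(attempts, k=2))  # [2, 1, 1]
```

Averaging these counts over individuals at each n and dividing by the average at n = 1 would give the normalized quantity the caption compares against the temporal pattern.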

Extended Data Fig. 5 Robustness check on definition of unsuccessful group.

a–l, Robustness check as we change the threshold of inactivity to 3 years. a–c, Failure streak in science (a), entrepreneurship (b) and security (c). Blue circles represent real data from the successful group and dashed lines represent fitted Weibull distributions. d–f, Temporal scaling patterns in science (d), entrepreneurship (e) and security (f). The shaded area shows mean ± s.e.m. of \(T_n\) (log scale). g–i, Performance dynamics in science (g, n = 641, 231, 578, 190, from left to right), entrepreneurship (h, n = 248, 1,332, 237, 1,312, from left to right) and security (i, n = 238, 198, 236, 199, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before the last attempt (at least 5 for D1, 3 for D2 and 2 for D3) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.566, 0.671 and 0.349), but quickly diverge for second failures (two-sided Welch’s t-test; P = 2.09 × 10⁻², 4.95 × 10⁻³ and 7.77 × 10⁻²). The successful group also shows significant improvement in performance (one-sided Welch’s t-test; P = 7.03 × 10⁻², 2.37 × 10⁻² and 2.32 × 10⁻²), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.717, 0.176 and 0.786). Data are mean ± s.e.m. j–l, AUC score of predicting ultimate success in science (j), entrepreneurship (k) and security (l). The centres and error bars of AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation over 50 randomized iterations. m–x, As in a–l but using 7 years as the threshold of inactivity. Sample sizes are s: n = 620, 101, 559, 76; t: n = 248, 977, 237, 989; u: n = 216, 152, 214, 153. P values in s–u (from bottom to top) are P = 0.883 (s), 0.671 (t), 0.456 (u); P = 2.25 × 10⁻² (s), 1.38 × 10⁻³ (t), 8.34 × 10⁻² (u); P = 4.59 × 10⁻² (s), 2.37 × 10⁻² (t), 3.33 × 10⁻² (u); P = 0.838 (s), 0.446 (t), 0.775 (u). *P < 0.1, **P < 0.05, ***P < 0.01, NS, not significant (P ≥ 0.1).
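The group comparisons in this and the following robustness captions rely on Welch’s t-test, which does not assume equal variances in the two groups. As a hedged illustration (not the authors' analysis code), the test statistic and the Welch–Satterthwaite degrees of freedom can be computed as follows; the function name `welch_t` is hypothetical.

```python
import math

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom
    for two independent samples with possibly unequal variances."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)
    se_a, se_b = var_a / na, var_b / nb   # squared standard errors
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch–Satterthwaite approximation of the degrees of freedom
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (na - 1) + se_b ** 2 / (nb - 1))
    return t, df
```

Turning `t` and `df` into a P value requires the Student t distribution; in practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the whole test in one call.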

Extended Data Fig. 6 Robustness check on D1.

a–c, Failure streak as we change the score threshold to 55 (a), exclude revisions as successes (b) and only focus on new principal investigators without previous R01 grants (c). Blue circles represent real data from successful groups and dashed lines represent fitted Weibull distributions. d–f, Temporal scaling patterns as we change the score threshold to 55 (d), exclude revisions as successes (e) and only focus on new principal investigators without previous R01 grants (f). The shaded area shows mean ± s.e.m. of \(T_n\) (log scale). g–i, Performance dynamics as we change the score threshold to 55 (g, n = 768, 189, 686, 170, from left to right), exclude revisions as successes (h, n = 252, 145, 216, 123, from left to right) and only focus on new principal investigators without previous R01 grants (i, n = 1,164, 308, 1,530, 334, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before their last attempt (at least 5 for g and h, and 3 for i) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.242, 0.819, 0.289) but quickly diverge for second failures (two-sided Welch’s t-test; P = 3.40 × 10⁻⁴, 3.40 × 10⁻², 9.70 × 10⁻⁷). The successful group also shows a significant improvement in performance (one-sided Welch’s t-test; P = 4.23 × 10⁻², 3.04 × 10⁻², 1.92 × 10⁻⁴), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.863, 0.754, 0.997). Data are mean ± s.e.m. j–l, AUC score of predicting ultimate success as we change the score threshold to 55 (j), exclude revisions as successes (k) and only focus on new principal investigators without previous R01 grants (l). The centres and error bars of AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation over 50 randomized iterations. *P < 0.1, **P < 0.05, ***P < 0.01, NS, P ≥ 0.1.

Extended Data Fig. 7 Robustness check on D2.

a–c, Failure streak as we change the threshold of high-value mergers and acquisitions (M&A) to 5% (a), exclude M&As as successes (b) and classify unicorns as successes (c). Blue circles represent real data from successful groups and dashed lines represent fitted Weibull distributions. d–f, Temporal scaling patterns as we change the threshold of high-value M&A to 5% (d), exclude M&As as successes (e) and include unicorns as successes (f). The shaded area shows mean ± s.e.m. of \(T_n\) (log scale). g–i, Performance dynamics as we change the threshold of high-value M&A to 5% (g, n = 251, 1,304, 243, 1,284, from left to right), exclude M&As as successes (h, n = 248, 1,335, 237, 1,315, from left to right) and include unicorns as successes (i, n = 257, 1,330, 244, 1,311, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before their last attempt (at least 3) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.937, 0.647, 0.620) but quickly diverge for second failures (two-sided Welch’s t-test; P = 9.92 × 10⁻³, 4.94 × 10⁻³, 6.33 × 10⁻³). The successful group also shows a significant improvement in performance (one-sided Welch’s t-test; P = 2.16 × 10⁻², 2.37 × 10⁻², 2.77 × 10⁻²), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.224, 0.158, 0.167). Data are mean ± s.e.m. j–l, AUC score for predicting ultimate success as we change threshold of high-value M&A to 5% (j), exclude M&As as successes (k) and include unicorns as successes (l). The centres and error bars of AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation over 50 randomized iterations. *P < 0.1, **P < 0.05, ***P < 0.01, NS, P ≥ 0.1.

Extended Data Fig. 8 Robustness check on D3.

a–c, Failure streak as we focus on all samples (a), samples of human-targeted attacks (b) and include vague data on fatalities (c). Blue circles represent real data from successful groups and dashed lines represent fitted Weibull distributions. d–f, Temporal scaling patterns as we focus on all samples (d), samples of human-targeted attacks (e) and include vague data on fatalities (f). The shaded area shows mean ± s.e.m. of \(T_n\) (log scale). g–i, Performance dynamics as we focus on all samples (g, n = 231, 231, 229, 232, from left to right), samples of human-targeted attacks (h, n = 176, 173, 173, 174, from left to right) and include vague data on fatalities (i, n = 227, 147, 225, 148, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before their last attempt (at least 2) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.400, 0.859, 0.395), but quickly diverge for second failures (two-sided Welch’s t-test; P = 2.08 × 10⁻³, 6.70 × 10⁻³, 3.76 × 10⁻³). The successful group also shows a significant improvement in performance (one-sided Welch’s t-test; P = 2.55 × 10⁻², 5.65 × 10⁻², 3.77 × 10⁻²), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.970, 0.901, 0.967). Data are mean ± s.e.m. j–l, AUC score of predicting ultimate success as we focus on all samples (j), samples of human-targeted attacks (k) and include vague data on fatalities (l). The centres and error bars of AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation over 50 randomized iterations. m–o, Temporal scaling patterns as we change the threshold for the successful group to fatal attacks that killed at least 5 (m), 10 (n) and 100 (o) people. *P < 0.1, **P < 0.05, ***P < 0.01, NS, P ≥ 0.1.

Extended Data Fig. 9 Additional robustness checks.

a–i, Robustness check as we control for temporal variation. a–c, Failure streak in science (a), entrepreneurship (b) and security (c). Blue circles represent real data of successful groups and dashed lines represent fitted Weibull distributions. d–f, Temporal scaling patterns in science (d), entrepreneurship (e) and security (f). The shaded area shows mean ± s.e.m. of \(T_n\) (log scale). g–i, Performance dynamics in science (g, n = 628, 145, 571, 123, from left to right), entrepreneurship (h, n = 248, 1,332, 237, 1,312, from left to right) and security (i, n = 231, 173, 229, 174, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before their last attempt (at least 5 for D1, 3 for D2 and 2 for D3) appear indistinguishable for first failures (two-sided weighted Welch’s t-test; P = 0.814, 0.728, 0.330) but quickly diverge for second failures (two-sided weighted Welch’s t-test; P = 1.80 × 10⁻², 3.10 × 10⁻², 4.56 × 10⁻²). The successful group also shows significant improvement in performance (one-sided weighted Welch’s t-test; P = 2.10 × 10⁻², 1.92 × 10⁻², 4.53 × 10⁻²), which is absent for the unsuccessful group (one-sided weighted Welch’s t-test; P = 0.755, 0.175, 0.903). Data are mean ± s.e.m. j–l, Performance dynamics as we compare first and halfway attempts in science (j, n = 628, 145, 582, 111, from left to right), entrepreneurship (k, n = 248, 1,332, 240, 1,294, from left to right) and security (l, n = 231, 173, 228, 175, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before their last attempt (at least 5 for D1, 3 for D2 and 2 for D3) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.898, 0.671, 0.289) but diverge for halfway failures (two-sided Welch’s t-test; P = 2.18 × 10⁻⁵, 1.34 × 10⁻², 1.34 × 10⁻²). The successful group also shows significant improvement in performance (one-sided Welch’s t-test; P = 2.35 × 10⁻², 4.54 × 10⁻², 3.69 × 10⁻²), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.992, 0.252, 0.955). Data are mean ± s.e.m. m–o, Performance dynamics as we compare the first and penultimate attempts in science (m, n = 628, 145, 896, 87, from left to right), entrepreneurship (n, n = 248, 1,332, 227, 1,199, from left to right) and security (o, n = 231, 173, 230, 173, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before the last attempt (at least 5 for D1, 3 for D2 and 2 for D3) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.898, 0.671, 0.289) but diverge for penultimate failures (two-sided Welch’s t-test; P = 8.50 × 10⁻⁸, 3.12 × 10⁻², 1.13 × 10⁻²). The successful group also shows a significant improvement in performance (one-sided Welch’s t-test; P = 5.79 × 10⁻⁹, 4.30 × 10⁻², 1.33 × 10⁻²), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.980, 0.138, 0.923). Data are mean ± s.e.m. p–r, The correlation between length of failure streak and initial performance (samples with repeated failures) in science (p, n = 12,171), entrepreneurship (q, n = 2,086) and security (r, n = 441). Correlation is weak across all three datasets (Pearson correlation; r = −0.051, −0.011, −0.107 for p, q, r, respectively). s–u, Length of failure streak still follows fat-tailed distributions conditional on bottom 10% initial performance samples in science (s, n = 6,339), entrepreneurship (t, n = 2,438) and security (u, n = 1,092). Two-sided Kolmogorov–Smirnov test between sample and exponential distributions rejects the hypothesis that the two distributions are identical with P < 0.01. *P < 0.1, **P < 0.05, ***P < 0.01, NS, P ≥ 0.1.

Extended Data Fig. 10 Generalization of the k model.

a, The α parameter connects the potential to improve (1 − x) with the likelihood of creating new versions p through \(p = (1 - x)^{\alpha}\). b, Phase diagram of the kα model. The two-dimensional parameter space is separated into three regimes, with boundaries at \(k\alpha = 1\) and \((k - 1)\alpha = 1\). c, The impact of δ parameter on scaling exponent γ for given k = 1, 2, 3 and α = 0.4, 0.8, 1.2. We find that δ may affect the temporal scaling parameter when it is small, but has no further effect beyond a certain point \(\delta^* = \min(\alpha, 1/(k - 1))\). d, Phase diagram of the kαδ model for k = 3, with boundaries at \(\alpha = \delta\), \((k - 1)\delta = 1\), \((k - 1)\delta + \alpha = 1\), \(k\alpha = 1\) and \((k - 1)\alpha = 1\), respectively.
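Two relations in this caption are simple enough to evaluate directly: the likelihood of creating a new version, p, as a power α of the potential to improve (1 − x), and the saturation point δ* given as the minimum of α and 1/(k − 1). The sketch below is illustrative only. It assumes x lies in [0, 1], treats 1/(k − 1) as unbounded when k = 1 (the caption lists k = 1 but does not spell out that case), and uses hypothetical function names.

```python
import math

def new_version_probability(x, alpha):
    """p = (1 - x) ** alpha, assuming x in [0, 1] and alpha > 0."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x is assumed to lie in [0, 1]")
    return (1.0 - x) ** alpha

def delta_star(alpha, k):
    """delta* = min(alpha, 1 / (k - 1)) for integer k >= 1.
    Assumption: for k = 1 the second term is treated as infinite."""
    if k < 1:
        raise ValueError("k must be at least 1")
    cap = math.inf if k == 1 else 1.0 / (k - 1)
    return min(alpha, cap)

for k in (1, 2, 3):
    print(k, [delta_star(a, k) for a in (0.4, 0.8, 1.2)])
```

For k = 3 the loop prints saturation points of 0.4, 0.5 and 0.5 for α = 0.4, 0.8 and 1.2, that is, δ* is capped by 1/(k − 1) once α exceeds it.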

Supplementary information

Supplementary Information

This file contains the following sections: 1 Data description; 2 Related work and models; 3 Modeling failure dynamics; 4 Generalized models; 5 Empirical measurements; 6 Prediction task; 7 Robustness checks; and Supplementary Tables 1-4 and additional references.

Reporting summary

About this article

Cite this article

Yin, Y., Wang, Y., Evans, J.A. et al. Quantifying the dynamics of failure across science, startups and security. Nature 575, 190–194 (2019). https://doi.org/10.1038/s41586-019-1725-y

  • DOI: https://doi.org/10.1038/s41586-019-1725-y
