Abstract
Human achievements are often preceded by repeated attempts that fail, but little is known about the mechanisms that govern the dynamics of failure. Here, building on previous research relating to innovation^{1,2,3,4,5,6,7}, human dynamics^{8,9,10,11} and learning^{12,13,14,15,16,17}, we develop a simple one-parameter model that mimics how successful future attempts build on past efforts. Solving this model analytically suggests that a phase transition separates the dynamics of failure into regions of progression or stagnation and predicts that, near the critical threshold, agents who share similar characteristics and learning strategies may experience fundamentally different outcomes following failures. Above the critical point, agents exploit incremental refinements to systematically advance towards success, whereas below it, they explore disjoint opportunities without a pattern of improvement. The model makes several empirically testable predictions, demonstrating that those who eventually succeed and those who do not may initially appear similar, but can be characterized by fundamentally distinct failure dynamics in terms of the efficiency and quality associated with each subsequent attempt. We collected large-scale data from three disparate domains and traced repeated attempts by investigators to obtain National Institutes of Health (NIH) grants to fund their research, innovators to successfully exit their startup ventures, and terrorist organizations to claim casualties in violent attacks. We find broadly consistent empirical support across all three domains, which systematically verifies each prediction of our model. Together, our findings unveil detectable yet previously unknown early signals that enable us to identify failure dynamics that will lead to ultimate success or failure. Given the ubiquitous nature of failure and the paucity of quantitative approaches to understand it, these results represent an initial step towards the deeper understanding of the complex dynamics underlying failure.
Data availability
This paper makes use of restricted access data from the National Institutes of Health (NIH), protected by the Privacy Act of 1974 as amended (5 U.S.C. 552a). De-identified data necessary to reproduce all plots and statistical analyses are freely available at https://yianyin.github.io/quantifyFailure. Those wishing to access the raw data can apply for access following the procedures outlined in the NIH Data Access Policy document (http://report.nih.gov/pdf/DataAccessPolicy.pdf). The VentureXpert database is available from Thomson Reuters. The Global Terrorism Database is publicly available at https://www.start.umd.edu/gtd/.
Code availability
Code is available at https://yianyin.github.io/quantifyFailure.
Change history
05 June 2020
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
References
 1.
Fortunato, S. et al. Science of science. Science 359, eaao0185 (2018).
 2.
Harford, T. Adapt: Why Success Always Starts with Failure (Farrar, Straus and Giroux, 2011).
 3.
Wuchty, S., Jones, B. F. & Uzzi, B. The increasing dominance of teams in production of knowledge. Science 316, 1036–1039 (2007).
 4.
Jones, B. F. The burden of knowledge and the “death of the renaissance man”: is innovation getting harder? Rev. Econ. Stud. 76, 283–317 (2009).
 5.
Sinatra, R., Wang, D., Deville, P., Song, C. & Barabási, A.-L. Quantifying the evolution of individual scientific impact. Science 354, aaf5239 (2016).
 6.
Liu, L. et al. Hot streaks in artistic, cultural, and scientific careers. Nature 559, 396–399 (2018).
 7.
Hu, Y., Havlin, S. & Makse, H. A. Conditions for viral influence spreading through multiplex correlated social networks. Phys. Rev. X 4, 021031 (2014).
 8.
Barabási, A.-L. The origin of bursts and heavy tails in human dynamics. Nature 435, 207–211 (2005).
 9.
González, M. C., Hidalgo, C. A. & Barabási, A.-L. Understanding individual human mobility patterns. Nature 453, 779–782 (2008).
 10.
Castellano, C., Fortunato, S. & Loreto, V. Statistical physics of social dynamics. Rev. Mod. Phys. 81, 591–646 (2009).
 11.
Malmgren, R. D., Stouffer, D. B., Campanharo, A. S. & Amaral, L. A. N. On universality in human correspondence activity. Science 325, 1696–1700 (2009).
 12.
Argote, L. Organizational Learning: Creating, Retaining and Transferring Knowledge (Springer Science & Business Media, 2012).
 13.
Sitkin, S. B. Learning through failure: the strategy of small losses. Res. Organ. Behav. 14, 231–266 (1992).
 14.
Yelle, L. E. The learning curve: historical review and comprehensive survey. Decis. Sci. 10, 302–328 (1979).
 15.
Dutton, J. M. & Thomas, A. Treating progress functions as a managerial opportunity. Acad. Manage. Rev. 9, 235–247 (1984).
 16.
Huber, G. P. Organizational learning: the contributing processes and the literatures. Organ. Sci. 2, 88–115 (1991).
 17.
Cannon, M. D. & Edmondson, A. C. Failing to learn and learning to fail (intelligently): how great organizations put failure to work to innovate and improve. Long Range Plann. 38, 299–319 (2005).
 18.
Kaplan, S. N. & Lerner, J. in Measuring Entrepreneurial Businesses: Current Knowledge and Challenges (Univ. Chicago Press, 2016).
 19.
Eggers, J. P. & Song, L. Dealing with failure: serial entrepreneurs and the costs of changing industries between ventures. Acad. Manage. J. 58, 1785–1803 (2015).
 20.
National Consortium for the Study of Terrorism and Responses to Terrorism. Global Terrorism Database (GTD) https://www.start.umd.edu/research-projects/global-terrorism-database-gtd (2018).
 21.
Clauset, A. & Gleditsch, K. S. The developmental dynamics of terrorist organizations. PLoS ONE 7, e48633 (2012).
 22.
Johnson, N. et al. Pattern in escalations in insurgent and terrorist activity. Science 333, 81–84 (2011).
 23.
Newell, A. & Rosenbloom, P. S. in Cognitive Skills and their Acquisition 1 (ed. Anderson, J. R.) 1–55 (Erlbaum, 1981).
 24.
Anderson, J. R. Acquisition of cognitive skill. Psychol. Rev. 89, 369–406 (1982).
 25.
Muth, J. F. Search theory and the manufacturing progress function. Manage. Sci. 32, 948–962 (1986).
 26.
Wright, T. P. Factors affecting the cost of airplanes. J. Aeronaut. Sci. 3, 122–128 (1936).
 27.
March, J. G. Exploration and exploitation in organizational learning. Organ. Sci. 2, 71–87 (1991).
 28.
Foster, J. G., Rzhetsky, A. & Evans, J. A. Tradition and innovation in scientists’ research strategies. Am. Sociol. Rev. 80, 875–908 (2015).
 29.
Arbesman, S. The Half-Life of Facts: Why Everything We Know Has an Expiration Date (Penguin, 2013).
 30.
Madsen, P. M. & Desai, V. Failing to learn? The effects of failure and success on organizational learning in the global orbital launch vehicle industry. Acad. Manage. J. 53, 451–476 (2010).
 31.
Argote, L., Beckman, S. L. & Epple, D. The persistence and transfer of learning in industrial settings. Manage. Sci. 36, 140–154 (1990).
 32.
Kuhn, T. S. The Structure of Scientific Revolutions (Chicago Univ. Press, 2012).
 33.
Merton, R. K. Singletons and multiples in scientific discovery: a chapter in the sociology of science. Proc. Am. Phil. Soc. 105, 470–486 (1961).
 34.
Gompers, P., Kovner, A., Lerner, J. & Scharfstein, D. Performance persistence in entrepreneurship. J. Financ. Econ. 96, 18–32 (2010).
 35.
de Holan, P. M. & Phillips, N. Remembrance of things past? the dynamics of organizational forgetting. Manage. Sci. 50, 1603–1613 (2004).
 36.
Schelling, T. C. Micromotives and Macrobehavior (WW Norton & Company, 2006).
 37.
Watts, D. J. A simple model of global cascades on random networks. Proc. Natl Acad. Sci. USA 99, 5766–5771 (2002).
 38.
Holme, P. & Newman, M. E. Nonequilibrium phase transition in the coevolution of networks and opinions. Phys. Rev. E 74, 056108 (2006).
 39.
Ginther, D. K. et al. Race, ethnicity, and NIH research awards. Science 333, 1015–1019 (2011).
 40.
Boudreau, K. J., Guinan, E. C., Lakhani, K. R. & Riedl, C. Looking across and looking beyond the knowledge frontier: intellectual distance, novelty, and resource allocation in science. Manage. Sci. 62, 2765–2783 (2016).
 41.
Bromham, L., Dinnage, R. & Hua, X. Interdisciplinary research has consistently lower funding success. Nature 534, 684–687 (2016).
 42.
Banal-Estanol, A., Macho-Stadler, I. & Pérez Castrillo, D. Key Success Drivers in Public Research Grants: Funding the Seeds of Radical Innovation in Academia? CESifo Working Paper Series 5852 (CESifo, 2016).
 43.
Ma, A., Mondragón, R. J. & Latora, V. Anatomy of funded research in science. Proc. Natl Acad. Sci. USA 112, 14760–14765 (2015).
 44.
Levitt, B. & March, J. G. Organizational learning. Annu. Rev. Sociol. 14, 319–338 (1988).
 45.
Argote, L. & Epple, D. Learning curves in manufacturing. Science 247, 920–924 (1990).
 46.
Merton, R. K. et al. The Matthew effect in science. Science 159, 56–63 (1968).
 47.
Huang, J., Ertekin, S. & Giles, C. L. Efficient name disambiguation for large-scale databases. In European Conference on Principles of Data Mining and Knowledge Discovery 536–544 (Springer, 2006).
 48.
Shen, H. Inequality quantified: Mind the gender gap. Nature 495, 22–24 (2013).
 49.
Larivière, V., Ni, C., Gingras, Y., Cronin, B. & Sugimoto, C. R. Bibliometrics: global gender disparities in science. Nature 504, 211–213 (2013).
 50.
Yang, T. & Aldrich, H. E. Who’s the boss? Explaining gender inequality in entrepreneurial teams. Am. Sociol. Rev. 79, 303–327 (2014).
 51.
Argote, L., Insko, C. A., Yovetich, N. & Romero, A. A. Group learning curves: the effects of turnover and task complexity on group performance. J. Appl. Soc. Psychol. 25, 512–529 (1995).
 52.
Bailey, C. D. Forgetting and the learning curve: a laboratory study. Manage. Sci. 35, 340–352 (1989).
Acknowledgements
We thank C. Song, A. Clauset, B. Uzzi, B. Jones, E. Finkel, J. Van Mieghem, A. Bassamboo and Y. Xie for helpful discussions, and H. Sauermann and S. Havlin for suggesting extensions of the model, leading us to discover the k–α and k–α–δ models. This work is supported by the Air Force Office of Scientific Research under award numbers FA9550-15-1-0162, FA9550-17-1-0089 and FA9550-19-1-0354, National Science Foundation grant SBE 1829344, the Alfred P. Sloan Foundation G-2019-12485, and Northwestern University Data Science Initiative. This work does not reflect the position of NIH.
Author information
Contributions
D.W. conceived the project and designed the experiments; Y.Y. and Y.W. collected data and performed empirical analyses with help from D.W. and J.A.E.; Y.Y. and D.W. carried out theoretical calculations; all authors collaboratively designed the model and interpreted results; D.W. and Y.Y. wrote the manuscript; all authors edited the manuscript.
Ethics declarations
Competing interests
Y.W. and D.W. serve as special volunteers (unpaid) to the NIH. The remaining authors declare no competing interests.
Additional information
Peer review information Nature thanks Shlomo Havlin and Henry Sauermann for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 The k model.
a–f, Simulation results from the model (α = 0.6) for the cases of k = 0 (a, d) and k → ∞ (b, e) in terms of the average quality (a–c) and efficiency (d–f) of each attempt. k = 0 recovers the chance model, predicting a constant quality (c) and efficiency (f). k → ∞ predicts temporal scaling that characterizes the dynamics of failure (e) with improved quality (b), recovering predictions from learning curves and Wright’s law. g–j, Illustration of mapping between failure dynamics (g, h) and canonical ensembles (i, j). The canonical system is characterized by three different states a, b, c with corresponding energy densities E_{a}(h), E_{b}(h), E_{c}(h). Here we assume E_{a}(h) = (2εh − 1)^{2}, E_{b}(h) = (2h − 1)^{2} and E_{c}(h) = [2ε(1 − h) − 1]^{2} where ε → 0^{+}. The introduction of ε is to distinguish state a from state c, both of which can be approximated in the limiting condition E_{a}(h) = E_{c}(h) = 0. We map f → (2Γ − 1)^{2}, N → ln[n], h → K and E_{i}(h) = [2Γ_{i}(K) − 1]^{2}. In this case, the two transition points k* and k* + 1 correspond to h = 0 and 1 in the canonical ensemble systems.
Extended Data Fig. 2 Predicting temporal dynamics in science, entrepreneurship and security.
a–c, We compare the goodness of fit for three different models in temporal dynamics in NIH grants (a, n = 10,345), startups (b, n = 275) and terrorist attacks (c, n = 136). For each individual sample, we take all but the last inter-event time for model fitting (n = 1, …, N − 1), comparing model predictions for the last inter-event time. The tested functional forms are power law, t_{n} = an^{b}; exponential, t_{n} = ab^{−n}; and linear, t_{n} = a + bn. We then calculate the frequency that each model reaches minimum error, defined as \(|\log ({t}_{N})-\log ({\hat{t}}_{N})|\), among all three forms. The power-law model offers consistently better predictions. d–f, As in a–c, but using \(|{t}_{N}-{\hat{t}}_{N}|\) as the loss function.
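The comparison described in this caption can be sketched in a few lines. The sketch below is illustrative only and is not the authors' code: the function names are hypothetical, and fitting the power-law and exponential forms by least squares in log space is an assumption, since the caption does not state the fitting procedure.

```python
import math

def least_squares(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def predict_last(times):
    """Fit each form on t_1..t_{N-1} and predict t_N.

    times: positive inter-event times t_1, ..., t_N (N >= 3).
    Returns a dict mapping model name to the predicted t_N.
    """
    N = len(times)
    ns = list(range(1, N))            # n = 1, ..., N - 1
    ts = times[:-1]
    log_ts = [math.log(t) for t in ts]

    # Power law t_n = a * n^b is linear in (log n, log t). Assumed fitting choice.
    c, b = least_squares([math.log(n) for n in ns], log_ts)
    power = math.exp(c) * N ** b

    # Exponential t_n = a * b^(-n) is linear in (n, log t). Assumed fitting choice.
    c, s = least_squares(ns, log_ts)
    expo = math.exp(c + s * N)

    # Linear t_n = a + b * n.
    a, b = least_squares(ns, ts)
    linear = a + b * N

    return {"power law": power, "exponential": expo, "linear": linear}

def best_model(times, log_loss=True):
    """Name of the form with the smallest error on the held-out t_N."""
    preds = predict_last(times)
    actual = times[-1]

    def loss(p):
        if log_loss:
            # |log t_N - log t_hat_N|; non-positive predictions get infinite loss.
            return abs(math.log(actual) - math.log(p)) if p > 0 else math.inf
        return abs(actual - p)        # |t_N - t_hat_N|

    return min(preds, key=lambda name: loss(preds[name]))
```

Calling `best_model` once per individual and tallying the returned names would give the kind of frequency the panels report; `log_loss=False` switches to the absolute-error loss used in d–f.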
Extended Data Fig. 3 Predicting ultimate success in science, entrepreneurship and security.
a–c, Area under the receiver operating characteristic curve (AUC) of the prediction task. We apply two logistic regression models (Supplementary Information 6.1) to predict ultimate success in NIH grants (a), startups (b) and terrorist attacks (c). The centres and error bars of AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation over 50 randomized iterations (green, model 1; red, model 2). d, e, As in a but predicting ultimate success in NIH grants for male (d) and female (e) investigators.
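For readers unfamiliar with the reported quantities, the following minimal sketch shows how an AUC score and a mean ± s.e.m. summary can be computed from predicted scores. It does not reproduce the logistic regression models of Supplementary Information 6.1; the function names are hypothetical, and the s.e.m. here is assumed to be the sample standard deviation divided by the square root of the number of scores.

```python
import math

def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    labels: 1 for ultimate success, 0 otherwise. Ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def mean_sem(values):
    """Mean and standard error of the mean of a list of scores."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    return mean, math.sqrt(var / n)
```

In a cross-validation setting, `auc` would be evaluated on each held-out fold and `mean_sem` applied to the resulting list of scores.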
Extended Data Fig. 4 Model validations.
a, b, An illustration of the component dynamics. We extract all MeSH terms associated with the nth attempt, S_{n}, and calculate the number of new terms m_{n}, defined as \(|{S}_{n}\setminus ({S}_{n-1}\cup \cdots \cup {S}_{n-k})|\). b, Testing component dynamics in NIH grant applications. We calculate the dynamics of M_{n} = 〈m_{n}〉/〈m_{1}〉 using different k and compare it with T_{n}. The centres and error bars of M_{n} show the mean ± s.e.m. (n = 5,899) for different k. The shaded area shows mean ± s.e.m. of T_{n} (log scale) measured on the same subset. All k > 3 lead to similar trends between M_{n} and T_{n}. c–e, Length of failure streak after randomization in science (c), entrepreneurship (d) and security (e). We take the samples used in Fig. 1 and shuffle the success/failure label from each attempt. This operation keeps both the overall success rate and the total number of attempts for each individual constant. f–h, Temporal scaling patterns within the successful group in science (f), entrepreneurship (g) and security (h). We separated the successful group into two subgroups (narrow winners and clear winners) based on eventual performance (0.9 in evaluation score for D_{1}, 0.5 in investment amount for D_{2} and 1 in wounded individuals for D_{3}). The shaded area shows mean ± s.e.m. of T_{n} (log scale).
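The new-term count m_{n} defined in this caption is the size of a set difference, which can be illustrated as follows. This is a sketch with a hypothetical function name and toy term sets, not the authors' pipeline; treating the first attempt as having no history (so that m_{1} is the size of S_{1}) is an assumption.

```python
def new_term_counts(term_sets, k):
    """Number of new terms per attempt.

    term_sets: list of sets, where term_sets[i] holds the terms of attempt i + 1.
    k: how many preceding attempts count as history.
    Returns m_n = size of S_n minus (S_{n-1} union ... union S_{n-k}).
    """
    counts = []
    for i, current in enumerate(term_sets):
        # Union of up to k preceding attempts; empty for the first attempt.
        recent = set().union(*term_sets[max(0, i - k):i])
        counts.append(len(set(current) - recent))
    return counts
```

For example, with attempts `[{"a", "b"}, {"b", "c"}, {"a", "c", "d"}]`, the third attempt has two new terms for k = 1 (`a` and `d`) but only one for k = 2 (`d`), because `a` already appeared two attempts earlier.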
Extended Data Fig. 5 Robustness check on definition of unsuccessful group.
a–l, Robustness check as we change the threshold of inactivity to 3 years. a–c, Failure streak in science (a), entrepreneurship (b) and security (c). Blue circles represent real data from the successful group and dashed lines represent fitted Weibull distributions. d–f, Temporal scaling patterns in science (d), entrepreneurship (e) and security (f). The shaded area shows mean ± s.e.m. of T_{n} (log scale). g–i, Performance dynamics in science (g, n = 641, 231, 578, 190, from left to right), entrepreneurship (h, n = 248, 1,332, 237, 1,312, from left to right) and security (i, n = 238, 198, 236, 199, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before the last attempt (at least 5 for D_{1}, 3 for D_{2} and 2 for D_{3}) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.566, 0.671 and 0.349), but quickly diverge for second failures (two-sided Welch’s t-test; P = 2.09 × 10^{−2}, 4.95 × 10^{−3} and 7.77 × 10^{−2}). The successful group also shows significant improvement in performance (one-sided Welch’s t-test; P = 7.03 × 10^{−2}, 2.37 × 10^{−2} and 2.32 × 10^{−2}), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.717, 0.176 and 0.786). Data are mean ± s.e.m. j–l, AUC score of predicting ultimate success in science (j), entrepreneurship (k) and security (l). The centres and error bars of AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation over 50 randomized iterations. m–x, As in a–l but using 7 years as the threshold of inactivity. Sample sizes are s: n = 620, 101, 559, 76; t: n = 248, 977, 237, 989; u: n = 216, 152, 214, 153. P values in s–u (from bottom to top) are P = 0.883 (s), 0.671 (t), 0.456 (u); P = 2.25 × 10^{−2} (s), 1.38 × 10^{−3} (t), 8.34 × 10^{−2} (u); P = 4.59 × 10^{−2} (s), 2.37 × 10^{−2} (t), 3.33 × 10^{−2} (u); P = 0.838 (s), 0.446 (t), 0.775 (u). *P < 0.1, **P < 0.05, ***P < 0.01, NS, not significant (P ≥ 0.1).
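Welch's t-test, used throughout these captions, compares two group means without assuming equal variances. The sketch below computes only the test statistic and the Welch–Satterthwaite degrees of freedom using the standard formulas; the function name is hypothetical, and converting the statistic to a P value requires the Student t distribution (for instance, `scipy.stats.ttest_ind` with `equal_var=False` returns both).

```python
import math

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    ma = sum(sample_a) / na
    mb = sum(sample_b) / nb
    # Unbiased sample variances of each group.
    va = sum((x - ma) ** 2 for x in sample_a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in sample_b) / (nb - 1)
    se2 = va / na + vb / nb           # squared standard error of the difference
    t = (ma - mb) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df
```

A two-sided test asks whether the two means differ in either direction, whereas a one-sided test asks only whether one group's mean exceeds the other's, which matches how the captions test for improvement between first and later failures.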
Extended Data Fig. 6 Robustness check on D_{1}.
a–c, Failure streak as we change the score threshold to 55 (a), exclude revisions as successes (b) and only focus on new principal investigators without previous R01 grants (c). Blue circles represent real data from successful groups and dashed lines represent fitted Weibull distributions. d–f, Temporal scaling patterns as we change the score threshold to 55 (d), exclude revisions as successes (e) and only focus on new principal investigators without previous R01 grants (f). The shaded area shows mean ± s.e.m. of T_{n} (log scale). g–i, Performance dynamics as we change the score threshold to 55 (g, n = 768, 189, 686, 170, from left to right), exclude revisions as successes (h, n = 252, 145, 216, 123, from left to right) and only focus on new principal investigators without previous R01 grants (i, n = 1,164, 308, 1,530, 334, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before their last attempt (at least 5 for g and h, and 3 for i) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.242, 0.819, 0.289) but quickly diverge for second failures (two-sided Welch’s t-test; P = 3.40 × 10^{−4}, 3.40 × 10^{−2}, 9.70 × 10^{−7}). The successful group also shows a significant improvement in performance (one-sided Welch’s t-test; P = 4.23 × 10^{−2}, 3.04 × 10^{−2}, 1.92 × 10^{−4}), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.863, 0.754, 0.997). Data are mean ± s.e.m. j–l, AUC score of predicting ultimate success as we change the score threshold to 55 (j), exclude revisions as successes (k) and only focus on new principal investigators without previous R01 grants (l). The centres and error bars of AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation over 50 randomized iterations. *P < 0.1, **P < 0.05, ***P < 0.01, NS, P ≥ 0.1.
Extended Data Fig. 7 Robustness check on D_{2}.
a–c, Failure streak as we change the threshold of high-value mergers and acquisitions (M&A) to 5% (a), exclude M&As as successes (b) and classify unicorns as successes (c). Blue circles represent real data from successful groups and dashed lines represent fitted Weibull distributions. d–f, Temporal scaling patterns as we change the threshold of high-value M&A to 5% (d), exclude M&As as successes (e) and include unicorns as successes (f). The shaded area shows mean ± s.e.m. of T_{n} (log scale). g–i, Performance dynamics as we change the threshold of high-value M&A to 5% (g, n = 251, 1,304, 243, 1,284, from left to right), exclude M&As as successes (h, n = 248, 1,335, 237, 1,315, from left to right) and include unicorns as successes (i, n = 257, 1,330, 244, 1,311, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before their last attempt (at least 3) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.937, 0.647, 0.620) but quickly diverge for second failures (two-sided Welch’s t-test; P = 9.92 × 10^{−3}, 4.94 × 10^{−3}, 6.33 × 10^{−3}). The successful group also shows a significant improvement in performance (one-sided Welch’s t-test; P = 2.16 × 10^{−2}, 2.37 × 10^{−2}, 2.77 × 10^{−2}), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.224, 0.158, 0.167). Data are mean ± s.e.m. j–l, AUC score for predicting ultimate success as we change the threshold of high-value M&A to 5% (j), exclude M&As as successes (k) and include unicorns as successes (l). The centres and error bars of AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation over 50 randomized iterations. *P < 0.1, **P < 0.05, ***P < 0.01, NS, P ≥ 0.1.
Extended Data Fig. 8 Robustness check on D_{3}.
a–c, Failure streak as we focus on all samples (a), samples of human-targeted attacks (b) and include vague data on fatalities (c). Blue circles represent real data from successful groups and dashed lines represent fitted Weibull distributions. d–f, Temporal scaling patterns as we focus on all samples (d), samples of human-targeted attacks (e) and include vague data on fatalities (f). The shaded area shows mean ± s.e.m. of T_{n} (log scale). g–i, Performance dynamics as we focus on all samples (g, n = 231, 231, 229, 232, from left to right), samples of human-targeted attacks (h, n = 176, 173, 173, 174, from left to right) and include vague data on fatalities (i, n = 227, 147, 225, 148, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before their last attempt (at least 2) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.400, 0.859, 0.395), but quickly diverge for second failures (two-sided Welch’s t-test; P = 2.08 × 10^{−3}, 6.70 × 10^{−3}, 3.76 × 10^{−3}). The successful group also shows a significant improvement in performance (one-sided Welch’s t-test; P = 2.55 × 10^{−2}, 5.65 × 10^{−2}, 3.77 × 10^{−2}), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.970, 0.901, 0.967). Data are mean ± s.e.m. j–l, AUC score of predicting ultimate success as we focus on all samples (j), samples of human-targeted attacks (k) and include vague data on fatalities (l). The centres and error bars of AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation over 50 randomized iterations. m–o, Temporal scaling patterns as we change the threshold for the successful group to fatal attacks that killed at least 5 (m), 10 (n) and 100 (o) people. *P < 0.1, **P < 0.05, ***P < 0.01, NS, P ≥ 0.1.
Extended Data Fig. 9 Additional robustness checks.
a–i, Robustness check as we control for temporal variation. a–c, Failure streak in science (a), entrepreneurship (b) and security (c). Blue circles represent real data of successful groups and dashed lines represent fitted Weibull distributions. d–f, Temporal scaling patterns in science (d), entrepreneurship (e) and security (f). The shaded area shows mean ± s.e.m. of T_{n} (log scale). g–i, Performance dynamics in science (g, n = 628, 145, 571, 123, from left to right), entrepreneurship (h, n = 248, 1,332, 237, 1,312, from left to right) and security (i, n = 231, 173, 229, 174, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before their last attempt (at least 5 for D_{1}, 3 for D_{2} and 2 for D_{3}) appear indistinguishable for first failures (two-sided weighted Welch’s t-test; P = 0.814, 0.728, 0.330) but quickly diverge for second failures (two-sided weighted Welch’s t-test; P = 1.80 × 10^{−2}, 3.10 × 10^{−2}, 4.56 × 10^{−2}). The successful group also shows significant improvement in performance (one-sided weighted Welch’s t-test; P = 2.10 × 10^{−2}, 1.92 × 10^{−2}, 4.53 × 10^{−2}), which is absent for the unsuccessful group (one-sided weighted Welch’s t-test; P = 0.755, 0.175, 0.903). Data are mean ± s.e.m. j–l, Performance dynamics as we compare first and halfway attempts in science (j, n = 628, 145, 582, 111, from left to right), entrepreneurship (k, n = 248, 1,332, 240, 1,294, from left to right) and security (l, n = 231, 173, 228, 175, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before their last attempt (at least 5 for D_{1}, 3 for D_{2} and 2 for D_{3}) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.898, 0.671, 0.289) but diverge for halfway failures (two-sided Welch’s t-test; P = 2.18 × 10^{−5}, 1.34 × 10^{−2}, 1.34 × 10^{−2}). The successful group also shows significant improvement in performance (one-sided Welch’s t-test; P = 2.35 × 10^{−2}, 4.54 × 10^{−2}, 3.69 × 10^{−2}), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.992, 0.252, 0.955). Data are mean ± s.e.m. m–o, Performance dynamics as we compare the first and penultimate attempts in science (m, n = 628, 145, 896, 87, from left to right), entrepreneurship (n, n = 248, 1,332, 227, 1,199, from left to right) and security (o, n = 231, 173, 230, 173, from left to right). The successful and unsuccessful groups that experienced a large number of consecutive failures before the last attempt (at least 5 for D_{1}, 3 for D_{2} and 2 for D_{3}) appear indistinguishable for first failures (two-sided Welch’s t-test; P = 0.898, 0.671, 0.289) but diverge for penultimate failures (two-sided Welch’s t-test; P = 8.50 × 10^{−8}, 3.12 × 10^{−2}, 1.13 × 10^{−2}). The successful group also shows a significant improvement in performance (one-sided Welch’s t-test; P = 5.79 × 10^{−9}, 4.30 × 10^{−2}, 1.33 × 10^{−2}), which is absent for the unsuccessful group (one-sided Welch’s t-test; P = 0.980, 0.138, 0.923). Data are mean ± s.e.m. p–r, The correlation between length of failure streak and initial performance (samples with repeated failures) in science (p, n = 12,171), entrepreneurship (q, n = 2,086) and security (r, n = 441). Correlation is weak across all three datasets (Pearson correlation; r = −0.051, −0.011, −0.107 for p, q, r, respectively). s–u, Length of failure streak still follows fat-tailed distributions conditional on bottom 10% initial performance samples in science (s, n = 6,339), entrepreneurship (t, n = 2,438) and security (u, n = 1,092). Two-sided Kolmogorov–Smirnov test between sample and exponential distributions rejects the hypothesis that the two distributions are identical with P < 0.01. *P < 0.1, **P < 0.05, ***P < 0.01, NS, P ≥ 0.1.
Extended Data Fig. 10 Generalization of the k model.
a, The α parameter connects the potential to improve (1 − x) with the likelihood of creating new versions p through p = (1 − x)^{α}. b, Phase diagram of the k–α model. The two-dimensional parameter space is separated into three regimes, with boundaries at kα = 1 and (k − 1)α = 1. c, The impact of δ parameter on scaling exponent γ for given k = 1, 2, 3 and α = 0.4, 0.8, 1.2. We find that δ may affect the temporal scaling parameter when it is small, but has no further effect beyond a certain point δ* = min(α, 1/(k − 1)). d, Phase diagram of the k–α–δ model for k = 3, with boundaries at α = δ, (k − 1)δ = 1, (k − 1)δ + α = 1, kα = 1 and (k − 1)α = 1, respectively.
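The two relations stated in this caption for the k–α model, p = (1 − x)^{α} and the boundaries kα = 1 and (k − 1)α = 1, can be written out directly. The sketch below is illustrative: the function names are hypothetical, the regimes are identified only by how many boundaries have been crossed (the caption does not name them), and assigning points exactly on a boundary to the upper regime is an arbitrary choice.

```python
def new_version_probability(x, alpha):
    """p = (1 - x)^alpha, for a current quality x in [0, 1]."""
    return (1.0 - x) ** alpha

def k_alpha_regime(k, alpha):
    """Number of phase boundaries crossed by the point (k, alpha).

    0: k * alpha < 1
    1: k * alpha >= 1 and (k - 1) * alpha < 1
    2: (k - 1) * alpha >= 1
    """
    return int(k * alpha >= 1) + int((k - 1) * alpha >= 1)
```

For α = 1 the boundaries fall at k = 1 and k = 2, which is consistent with the two transition points k* and k* + 1 mentioned in the caption of Extended Data Fig. 1.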
Supplementary information
Supplementary Information
This file contains the following sections: 1 Data description; 2 Related work and models; 3 Modeling failure dynamics; 4 Generalized models; 5 Empirical measurements; 6 Prediction task; 7 Robustness checks; and Supplementary Tables 1–4 and additional references.
About this article
Cite this article
Yin, Y., Wang, Y., Evans, J.A. et al. Quantifying the dynamics of failure across science, startups and security. Nature 575, 190–194 (2019). https://doi.org/10.1038/s41586-019-1725-y