Letter

Hot streaks in artistic, cultural, and scientific careers

Nature (2018)

Abstract

The hot streak, loosely defined as ‘winning begets more winnings’, highlights a specific period during which an individual’s performance is substantially better than his or her typical performance. Although hot streaks have been widely debated in sports1,2, gambling3,4,5 and financial markets6,7 over the past several decades, little is known about whether they apply to individual careers. Here, building on rich literature on the lifecycle of creativity8,9,10,11,12,13,14,15,16,17,18,19,20,21,22, we collected large-scale career histories of individual artists, film directors and scientists, tracing the artworks, films and scientific publications they produced. We find that, across all three domains, hit works within a career show a high degree of temporal regularity, with each career being characterized by bursts of high-impact works occurring in sequence. We demonstrate that these observations can be explained by a simple hot-streak model, allowing us to probe quantitatively the hot streak phenomenon governing individual careers. We find this phenomenon to be remarkably universal across diverse domains: hot streaks are ubiquitous yet usually unique across different careers. The hot streak emerges randomly within an individual’s sequence of works, is temporally localized, and is not associated with any detectable change in productivity. We show that, because works produced during hot streaks garner substantially more impact, the uncovered hot streaks fundamentally drive the collective impact of an individual, and ignoring this leads us to systematically overestimate or underestimate the future impact of a career. These results not only deepen our quantitative understanding of patterns that govern individual ingenuity and success, but also may have implications for identifying and nurturing individuals whose work will have lasting impact.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Gilovich, T., Vallone, R. & Tversky, A. The hot hand in basketball: on the misperception of random sequences. Cognit. Psychol. 17, 295–314 (1985).

  2. Miller, J. B. & Sanjurjo, A. Surprised by the gambler’s and hot hand fallacies? A truth in the law of small numbers. IGIER Working Paper No. 552 (2016).

  3. Ayton, P. & Fischer, I. The hot hand fallacy and the gambler’s fallacy: two faces of subjective randomness? Mem. Cognit. 32, 1369–1378 (2004).

  4. Rabin, M. & Vayanos, D. The gambler’s and hot-hand fallacies: theory and applications. Rev. Econ. Stud. 77, 730–778 (2010).

  5. Xu, J. & Harvey, N. Carry on winning: the gamblers’ fallacy creates hot hand effects in online gambling. Cognition 131, 173–180 (2014).

  6. Hendricks, D., Patel, J. & Zeckhauser, R. Hot hands in mutual funds: short-run persistence of relative performance, 1974–1988. J. Finance 48, 93–130 (1993).

  7. Kahneman, D. & Riepe, M. W. Aspects of investor psychology. J. Portfol. Manage. 24, 52–65 (1998).

  8. Lehman, H. C. Age and Achievement (Princeton Univ. Press, Princeton, 1953).

  9. Merton, R. K. The Matthew effect in science. Science 159, 56–63 (1968).

  10. Simonton, D. K. Age and outstanding achievement: what do we know after a century of research? Psychol. Bull. 104, 251–267 (1988).

  11. Jones, B. F. The burden of knowledge and the death of the renaissance man: is innovation getting harder? Rev. Econ. Stud. 76, 283–317 (2009).

  12. Petersen, A. M. et al. Reputation and impact in academic careers. Proc. Natl Acad. Sci. USA 111, 15316–15321 (2014).

  13. Stringer, M. J., Sales-Pardo, M. & Nunes Amaral, L. A. Effectiveness of journal ranking schemes as a tool for locating information. PLoS One 3, e1683 (2008).

  14. Radicchi, F., Fortunato, S. & Castellano, C. Universality of citation distributions: toward an objective measure of scientific impact. Proc. Natl Acad. Sci. USA 105, 17268–17272 (2008).

  15. Galenson, D. W. Old Masters and Young Geniuses: The Two Life Cycles of Artistic Creativity (Princeton Univ. Press, Princeton, 2011).

  16. Wang, D., Song, C. & Barabási, A.-L. Quantifying long-term scientific impact. Science 342, 127–132 (2013).

  17. Way, S. F., Morgan, A. C., Clauset, A. & Larremore, D. B. The misleading narrative of the canonical faculty productivity trajectory. Proc. Natl Acad. Sci. USA 114, E9216–E9223 (2017).

  18. Sinatra, R., Wang, D., Deville, P., Song, C. & Barabási, A.-L. Quantifying the evolution of individual scientific impact. Science 354, aaf5239 (2016).

  19. Fortunato, S. et al. Science of science. Science 359, eaao0185 (2018).

  20. Zeng, A. et al. The science of science: From the perspective of complex systems. Phys. Rep. 714–715, 1–73 (2017).

  21. Simonton, D. K. Scientific Genius: A Psychology of Science (Cambridge Univ. Press, Cambridge, 1988).

  22. Duch, J. et al. The possible role of resource requirements and academic career-choice risk on gender differences in publication rate and impact. PLoS One 7, e51332 (2012).

  23. Price, D. S. A general theory of bibliometric and other cumulative advantage processes. J. Assoc. Inf. Sci. Technol. 27, 292–306 (1976).

  24. Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).

  25. Wasserman, M., Zeng, X. H. T. & Amaral, L. A. N. Cross-evaluation of metrics to estimate the significance of creative works. Proc. Natl Acad. Sci. USA 112, 1281–1286 (2015).

  26. Uzzi, B., Mukherjee, S., Stringer, M. & Jones, B. Atypical combinations and scientific impact. Science 342, 468–472 (2013).

  27. Wuchty, S., Jones, B. F. & Uzzi, B. The increasing dominance of teams in production of knowledge. Science 316, 1036–1039 (2007).

  28. Shockley, W. On the statistics of individual variations of productivity in research laboratories. Proc. IRE 45, 279–290 (1957).

  29. Redner, S. Citation statistics from 110 years of physical review. Phys. Today 58, 49 (2005).

  30. Moreira, J. A., Zeng, X. H. T. & Amaral, L. A. N. The distribution of the asymptotic number of citations to sets of publications by a researcher or from an academic department are consistent with a discrete lognormal model. PLoS One 10, e0143108 (2015).

  31. Pan, R. K., Petersen, A. M., Pammolli, F. & Fortunato, S. The memory of science: inflation, myopia, and the knowledge network. Preprint at https://arxiv.org/abs/1607.05606 (2016).

  32. Palla, G., Barabási, A.-L. & Vicsek, T. Quantifying social group evolution. Nature 446, 664–667 (2007).

Acknowledgements

We thank A.-L. Barabasi, R. Burt, J. Chown, J. Evans, C. Jin, L. Nordgren, W. Ocasio, B. Uzzi, Y. Yin, the Kellogg Insights and all members of NICO for invaluable comments. This work is supported by the Air Force Office of Scientific Research (AFOSR) under award numbers FA9550-15-1-0162 and FA9550-17-1-0089, and by Northwestern University’s Data Science Initiative. R.S. acknowledges support from AFOSR grant FA9550-15-1-0364 and from the Central European University Intellectual Themes Initiative ‘Just Data’.

Reviewer information

Nature thanks J. Walsh, J. West and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

Affiliations

  1. Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA

    Lu Liu, Yang Wang & Dashun Wang

  2. Kellogg School of Management, Northwestern University, Evanston, IL, USA

    Lu Liu, Yang Wang & Dashun Wang

  3. College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, USA

    Lu Liu & C. Lee Giles

  4. Department of Network and Data Science, and Department of Mathematics and its Applications, Central European University, Budapest, Hungary

    Roberta Sinatra

  5. Center for Complex Network Research, Northeastern University, Boston, MA, USA

    Roberta Sinatra

  6. Complexity Science Hub, Vienna, Austria

    Roberta Sinatra

  7. Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, USA

    C. Lee Giles

  8. Department of Physics, University of Miami, Coral Gables, FL, USA

    Chaoming Song

  9. McCormick School of Engineering, Northwestern University, Evanston, IL, USA

    Dashun Wang

Authors

  1. Lu Liu

  2. Yang Wang

  3. Roberta Sinatra

  4. C. Lee Giles

  5. Chaoming Song

  6. Dashun Wang

Contributions

D.W. conceived the project and designed the experiments; L.L. and Y.W. collected data and performed empirical analyses with help from R.S., C.L.G., C.S. and D.W.; L.L., Y.W., C.S. and D.W. did theoretical calculations; all authors discussed and interpreted results; D.W. and L.L. wrote the manuscript; all authors edited the manuscript.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Dashun Wang.

Extended data figures and tables

  1. Extended Data Fig. 1 Additional results on hot streaks in artistic, cultural, and scientific careers.

    a–c, The cumulative distribution P(≥Ni/N) for the order of the top three highest-impact works within a career for artists (a), directors (b) and scientists (c). Ni denotes the order of the ith highest-impact work within a career. The colours denote different hit works, and the dashed grey line denotes P(≥Ni/N) for a uniform distribution. d–f, ϕ(N**, N***) for the second- and third-highest-impact works within a career. ϕ(N**, N***) is also overrepresented along the diagonal. g–i, ϕ(N*, N***) for the first- and third-highest-impact works within a career. j–r, We shuffled the order of each work in a career while keeping their impact intact. The diagonal patterns in d–i and Fig. 1a–c disappeared for shuffled careers. s–u, ϕ(N*, N**) predicted by the hot-streak model successfully recovered the diagonal patterns observed in a–c. For d–u and Fig. 1a–c, we applied the same binning procedure to data, using bins that ranged from 0 to 1 with increments of 0.1.

  2. Extended Data Fig. 2 Measuring the length of streaks using different thresholds.

    a, The distribution of auction price P(Price) for artists. Blue dots denote data, and the red line is a log-normal distribution with average μ = 7.9 and standard deviation σ = 1.5. b, The distribution of film rating P(Rating) for directors. The red line is a normal distribution with average μ = 7.1 and standard deviation σ = 1.2. c, The distribution of raw and rescaled C10 (inset) for scientists. The red line is a log-normal distribution, with μ = 2.3 and σ = 1.3 for c and μ = −0.4 and σ = 0.8 for the inset. d–f, Definitions of the longest streak L for artists (d), directors (e) and scientists (f). Dots are coloured orange above the threshold, blue otherwise. The lower panel highlights the longest streak in a career. g–i, P(L) for real careers and P(Ls) for shuffled careers using the mean impact within a career as the threshold. j–l, As in g–i, but using the top 10% impact as the threshold to calculate L and Ls. In all cases, P(L) has a wider tail than P(Ls), indicating that high-impact works in real careers tend to cluster together.

  3. Extended Data Fig. 3 Varying career length.

    To test the robustness of our results, we repeated our measurements by controlling for the career length of individuals. a–i, Artists and directors with careers of at least 20 years and scientists with careers of at least 30 years. a–c, P(≥Ni/N) of the top three highest-impact works within a career. d–f, R(ΔN/N) among the top three highest-impact works in a career. g–i, P(L) for real careers and P(Ls) for shuffled careers. j–r, As in a–i but for artists and directors with careers of at least 30 years and scientists with careers of at least 40 years. These results demonstrate that the patterns observed in Fig. 1 hold for individuals with different career lengths.

  4. Extended Data Fig. 4 Artistic careers from different eras.

    a, R(ΔN/N) for artists who started their careers before 1850. b, P(L) for real careers and P(Ls) for shuffled careers for artists who started their careers before 1850. c, R(ΔN/N) for artists who started careers between 1850 and 1900. d, P(L) for real careers and P(Ls) for shuffled careers for artists who started their careers between 1850 and 1900. These results demonstrate that the patterns observed in Fig. 1 hold for artists from different eras.

  5. Extended Data Fig. 5 Additional properties of hot streaks.

    a–c, Correlations between ΓH and Γ0 for artists (a; n = 3,166), directors (b; n = 5,098) and scientists (c; n = 18,121). The blue background denotes the kernel density of data, dots represent binning results of data, and the red lines depict a linear fit. Inset, the relationship between ΔΓ (= ΓH − Γ0) and Γ0. d–f, The distribution of τH/T, representing the duration of hot streaks over total career lengths. The temporally localized nature of a hot streak is also captured by its proportion over career length τH/T.

  6. Extended Data Fig. 6 Comparison of g(t) between the null model and the hot-streak model.

    a–c, g(t) of three scientists in our data set with mid-career (a), late-career (b) and early-career (c) onset of hot streaks. Red dots denote data, the blue line is the null model’s prediction based on early performance, and the red line captures the predictions from the hot-streak model, with dashed grey lines denoting the start and end of hot streaks. d, The difference between our hot-streak model and the null model for each individual, Δg(t). Dashed lines with corresponding colours denote the start of the hot streak. d illustrates the discrepancies in estimating an individual’s future impact if we ignore the uncovered hot streaks. e, The distribution of the BIC measure, P(BIC), showing that the hot-streak model outperforms the null model in describing g(t) after accounting for model complexity. f, The distribution of the MAPE measure, P(MAPE), showing that the hot-streak model outperforms the null model in describing g(t). g, The uncertainty envelope of g(t) for an individual in our data set. Blue dots denote data, and the red line is the fitting result of equation (4). Shaded area illustrates predicted uncertainty (one standard deviation). h, The fraction of g(t) falling within the envelope for the null model (blue) and our hot-streak model (red). Fraction = 1.0 indicates that the entire g(t) trajectory falls within the envelope. i, Average MAPE of our hot-streak model and the null model for individuals with early-career, mid-career and late-career onset of hot streaks. The difference is largest for individuals with early-onset hot streaks and smallest for those with late-onset ones.

  7. Extended Data Fig. 7 Testing alternative hot-streak dynamics.

    a–f, Illustrative examples of Γ(N) for the hot-streak model (a), right trapezoid function (b), isosceles trapezoid function (c), quadratic function (d), tent function (e) and left trapezoid function (f). g, The distribution of the relative position P(Ñ) of the three highest-impact works among the six highest-impact works within a career for artists, where Ñ denotes the relative order among the top six hits. h–m, P(Ñ) predicted by corresponding models shown in a–f, respectively, according to artists’ real productivity profiles. To test whether data agree with model predictions, we measured their statistical difference using the P value of the Kolmogorov–Smirnov test for discrete distributions. We colour the distributions green if we cannot reject the hypothesis that the data and the model predictions come from the same distributions, and red otherwise. Among the six models, the hot-streak model is the only model whose predictions are consistent with the data in terms of the relative ordering among the six highest-impact works observed in real careers. n, The proportion of real careers that are captured by the model with the smallest BIC among different hypotheses. The hot-streak model again stands out as the best model to describe real careers. We repeated the analyses for directors (p–v) and scientists (x–ad); the conclusions remained the same across all three domains.

  8. Extended Data Fig. 8 Testing Markovian hypotheses.

    Here we test whether the observed patterns can be explained by Markovian dynamics that introduce correlations between neighbouring data points. We first test the assumptions of the Markovian hypothesis from the data (a–f). a–c, The distribution of N, N + 1 differences between adjacent data points observed in real careers for artists (a, n = 3,480), directors (b, n = 6,233) and scientists (c, n = 20,040). d–f, The autocorrelation measured in real careers for artists (d, n = 3,480), directors (e, n = 6,233), and scientists (f, n = 20,040). a–f suggest that there is little short-range correlation in data across the three domains. We test three variants of Markovian models (g–l). The details of these models are outlined in Supplementary Information S6.2. g–i, ϕ(N*, N**) of the top two highest-impact works within a career for three Markovian models using scientists’ profiles as input. j–l, The distribution of the longest streak length P(L) and P(Ls) using median impact within a career as threshold for the three Markovian models. g–l demonstrate that the three Markovian models failed to capture the observed colocations among hits.

  9. Extended Data Fig. 9 Additional examples of Γ.

    a–c, Each subplot denotes the fitting result on Γ sequence for a randomly selected career for artists (a), directors (b) and scientists (c). Blue dots denote the moving average Γ(N) from data and red lines denote the best fitting result of the hot-streak model for each individual.

  10. Extended Data Fig. 10 Individuals with one or more hot streaks.

    a–c, The distribution of average impacts for individuals with one or more than one hot streak for artists (a), directors (b) and scientists (c). Blue dots denote individuals with one hot streak, and red dots denote individuals with at least two hot streaks. d–f, The distribution of the number of works P(N) within a career for individuals with one or more than one hot streak for artists (d), directors (e) and scientists (f). g–i, The distribution of career length P(τ) for individuals with one or more than one hot streak for artists (g), directors (h) and scientists (i). j–l, The distribution of PH) for individuals with one or more than one hot streak for artists (j), directors (k) and scientists (l). Between those who have one or two hot streaks, there is no detectable difference in terms of typical performance metrics, including impact, productivity and career length, suggesting that the hot streak captures an orthogonal dimension to current metrics characterizing individual careers.
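
The longest-streak measurement described for Extended Data Fig. 2 (the longest run L of consecutive works above a threshold, compared with Ls from careers whose work order is shuffled) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy impact values, the number of shuffles and the random seed are all assumptions made for illustration, and the threshold used here is the mean impact within the career, one of the thresholds the legend mentions.

```python
import random
from statistics import mean


def longest_streak(impacts, threshold):
    """Length of the longest run of consecutive works with impact above the threshold."""
    best = run = 0
    for impact in impacts:
        run = run + 1 if impact > threshold else 0
        best = max(best, run)
    return best


def shuffled_streaks(impacts, threshold, n_shuffles=1000, seed=0):
    """Longest streak in each of n_shuffles random reorderings of the same works.

    Shuffling keeps every impact value but destroys the temporal order,
    which gives a null distribution for the streak length.
    """
    rng = random.Random(seed)
    works = list(impacts)
    streaks = []
    for _ in range(n_shuffles):
        rng.shuffle(works)
        streaks.append(longest_streak(works, threshold))
    return streaks


# Hypothetical career: ten works in chronological order, with invented impacts.
career = [1.0, 1.2, 0.8, 5.0, 6.1, 4.7, 5.5, 0.9, 1.1, 1.0]
threshold = mean(career)  # mean impact within the career

L = longest_streak(career, threshold)
Ls = shuffled_streaks(career, threshold)

print("longest streak in the real order:", L)
print("mean longest streak after shuffling:", mean(Ls))
```

In this toy career the four high-impact works are adjacent, so the real order reaches the maximum possible streak, whereas most random reorderings split them up. Repeating the comparison over many careers is what would yield distributions analogous to P(L) and P(Ls).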

Supplementary information

  1. Supplementary Information

    This file contains Supplementary Information S1–S6, which includes Supplementary Tables S1 and S2.

  2. Reporting Summary

About this article

DOI

https://doi.org/10.1038/s41586-018-0315-8
