Detecting evolutionary forces in language change


Both language and genes evolve by transmission over generations with opportunity for differential replication of forms [1]. The understanding that gene frequencies change at random by genetic drift, even in the absence of natural selection, was a seminal advance in evolutionary biology [2]. Stochastic drift must also occur in language as a result of randomness in how linguistic forms are copied between speakers [3,4]. Here we quantify the strength of selection relative to stochastic drift in language evolution. We use time series derived from large corpora of annotated texts dating from the 12th to 21st centuries to analyse three well-known grammatical changes in English: the regularization of past-tense verbs [5,6,7,8,9], the introduction of the periphrastic ‘do’ [10], and variation in verbal negation [11]. We reject stochastic drift in favour of selection in some cases but not in others. In particular, we infer selection towards the irregular forms of some past-tense verbs, which is likely driven by changing frequencies of rhyming patterns over time. We show that stochastic drift is stronger for rare words, which may explain why rare forms are more prone to replacement than common ones [6,9,12]. This work provides a method for testing selective theories of language change against a null model and reveals an underappreciated role for stochasticity in language evolution.
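The claim that drift is stronger for rare words can be illustrated with a minimal sketch of neutral copying. The code below assumes a Wright-Fisher-style null model in which each generation resamples a fixed number of tokens from the previous generation's form frequencies. The function names, the token counts (50 and 2,000), and the use of token count as a stand-in for population size are illustrative assumptions, not the authors' implementation.

```python
import random


def wright_fisher_drift(n_tokens, p0, generations, rng):
    """Simulate neutral copying of a two-form variable.

    Each generation, n_tokens tokens are copied at random from the
    previous generation, so the new count of form A is binomial(n_tokens, p).
    Returns the list of frequencies of form A, starting with p0.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        count = sum(1 for _ in range(n_tokens) if rng.random() < p)
        p = count / n_tokens
        freqs.append(p)
    return freqs


def mean_abs_change(n_tokens, replicates=100, generations=20, seed=0):
    """Average absolute frequency change after `generations` of pure drift."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        traj = wright_fisher_drift(n_tokens, 0.5, generations, rng)
        total += abs(traj[-1] - traj[0])
    return total / replicates


# Hypothetical token counts standing in for a rare and a common word.
rare = mean_abs_change(n_tokens=50)
common = mean_abs_change(n_tokens=2000)
print(f"mean |change|, rare word:   {rare:.3f}")
print(f"mean |change|, common word: {common:.3f}")
```

With no selection at all, the rare word's frequency wanders much further than the common word's, which is the sense in which drift alone could make rare forms more prone to replacement.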


Figure 1: A null model of language change.
Figure 2: Verb regularization and irregularization.
Figure 3: The rise of the periphrastic ‘do’ in Early Modern English.
Figure 4: Evolution of verbal negation.


References

  1. Cavalli-Sforza, L. & Feldman, M. Cultural Transmission and Evolution: A Quantitative Approach (Princeton Univ. Press, 1980)
  2. Crow, J. F. & Kimura, M. An Introduction to Population Genetics Theory (Harper & Row, 1970)
  3. Bentley, R. A., Hahn, M. W. & Shennan, S. J. Random drift and culture change. Proc. R. Soc. Lond. B 271, 1443–1450 (2004)
  4. Reali, F. & Griffiths, T. L. Words as alleles: connecting language evolution with Bayesian learners to models of genetic drift. Proc. R. Soc. Lond. B 277, 429–436 (2010)
  5. Pinker, S. Rules of language. Science 253, 530–535 (1991)
  6. Lieberman, E., Michel, J.-B., Jackson, J., Tang, T. & Nowak, M. A. Quantifying the evolutionary dynamics of language. Nature 449, 713–716 (2007)
  7. Michel, J.-B. et al. Quantitative analysis of culture using millions of digitized books. Science 331, 176–182 (2011)
  8. Reali, F. & Griffiths, T. L. The evolution of frequency distributions: relating regularization to inductive biases through iterated learning. Cognition 111, 317–328 (2009)
  9. Hooper, J. B. in Current Progress in Historical Linguistics: Proc. 2nd International Conference on Historical Linguistics (ed. Christie, W. M.) 96–105 (North-Holland, 1976)
  10. Ellegård, A. The Auxiliary Do: The Establishment and Regulation of its Use in English (Almqvist & Wiksell, 1953)
  11. Jespersen, O. Negation in English and Other Languages (AF Høst, 1917)
  12. Pagel, M., Atkinson, Q. D. & Meade, A. Frequency of word-use predicts rates of lexical evolution throughout Indo-European history. Nature 449, 717–720 (2007)
  13. Schleicher, A. Darwinism Tested by the Science of Language (John Camden Hotten, 1869)
  14. Darwin, C. The Descent of Man, and Selection in Relation to Sex (Murray, 1888)
  15. Haeckel, E. Natürliche Schöpfungs-Geschichte (Georg Reimer, 1868)
  16. Bybee, J. L. & Moder, C. L. Morphological classes as natural categories. Language 59, 251–270 (1983)
  17. Davies, M. Expanding horizons in historical linguistics with the 400-million word Corpus of Historical American English. Corpora 7, 121–157 (2012)
  18. Labov, W. Principles of Linguistic Change Vol. 3 (Blackwell, 2010)
  19. Kroch, A. S. Reflexes of grammar in patterns of language change. Lang. Var. Change 1, 199–244 (1989)
  20. Christiansen, M. H., Chater, N. & Culicover, P. W. Creating Language: Integrating Evolution, Acquisition, and Processing (MIT Press, 2016)
  21. Croft, W. Explaining Language Change: An Evolutionary Approach (Pearson Education, 2000)
  22. Prasada, S. & Pinker, S. Generalisation of regular and irregular morphological patterns. Lang. Cogn. Process. 8, 1–56 (1993)
  23. Dahl, O. Inflationary Effects in Language and Elsewhere (John Benjamins, 2001)
  24. Hawkins, J. A. A parsing theory of word order universals. Linguist. Inq. 21, 223–261 (1990)
  25. Blythe, R. A. & Croft, W. S-curves and the mechanisms of propagation in language change. Language 88, 269–304 (2012)
  26. Wright, S. Evolution in Mendelian populations. Genetics 16, 97–159 (1931)
  27. Kandler, A. & Shennan, S. A non-equilibrium neutral model for analysing cultural change. J. Theor. Biol. 330, 18–25 (2013)
  28. Hahn, M. W. & Bentley, R. A. Drift as a mechanism for cultural change: an example from baby names. Proc. R. Soc. Lond. B 270, S120–S123 (2003)
  29. Blythe, R. A. Neutral evolution: a null model for language dynamics. Adv. Complex Syst. 15, 1150015 (2012)
  30. Baxter, G. J., Blythe, R. A., Croft, W. & McKane, A. J. Modeling language change: an evaluation of Trudgill’s theory of the emergence of New Zealand English. Lang. Var. Change 21, 257–296 (2009)
  31. Cuskley, C. F. et al. Internal and external dynamics in language: evidence from verb regularity in a historical corpus of English. PLoS One 9, e102882 (2014)
  32. Feder, A. F., Kryazhimskiy, S. & Plotkin, J. B. Identifying signatures of selection in genetic time series. Genetics 196, 509–522 (2014)
  33. Jakobson, R., Waugh, L. R. & Monville-Burston, M. On Language (Harvard Univ. Press, 1995)
  34. Zipf, G. K. Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology (Addison–Wesley, 1949)
  35. Ullman, M. T. Acceptability ratings of regular and irregular past-tense forms: evidence for a dual-system model of language from word frequency and phonological neighbourhood effects. Lang. Cogn. Process. 14, 47–67 (1999)
  36. Crawford, V. P. & Sobel, J. Strategic information transmission. Econometrica 50, 1431–1451 (1982)
  37. Ringe, D., Warnow, T. & Taylor, A. Indo-European and computational cladistics. Trans. Philol. Soc. 100, 59–129 (2002)
  38. Gray, R. D. & Atkinson, Q. D. Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426, 435–439 (2003)
  39. Pagel, M. in The Princeton Guide to Evolution (ed. Losos, J.) Ch. VIII.9 (Princeton Univ. Press, 2013)
  40. Pagel, M. Human language as a culturally transmitted replicator. Nat. Rev. Genet. 10, 405–415 (2009)
  41. Lupyan, G. & Dale, R. in Language Structure and Environment: Social, Cultural, and Natural Factors (eds De Busser, R. & LaPolla, R. J.) Ch. 11 (2015)
  42. Tamariz, M., Ellison, T. M., Barr, D. J. & Fay, N. Cultural selection drives the evolution of human communication systems. Proc. R. Soc. Lond. B 281, 20140488 (2014)



We thank H. Bacovcin, T. Kroch, and M. Liberman. R.C. acknowledges support from the University of Pennsylvania Research Foundation. J.B.P. acknowledges support from the David & Lucile Packard Foundation, the US Defense Advanced Research Projects Agency (D12AP00025), and the US Army Research Office (W911NF-12-1-0552).

Author information




M.G.N., C.A.A., R.C., and J.B.P. conceived the study, designed the analysis, and wrote the paper.

Corresponding author

Correspondence to Joshua B. Plotkin.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Reviewer Information Nature thanks R. A. Bentley and the other anonymous reviewers for their contribution to the peer review of this work.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Figure 1 Time series of changing rhyming patterns.

Each panel shows the time series of a polymorphic verb (black lines), repeated from Fig. 2a, and the frequency of similar-sounding monomorphic regular (orange) and irregular (blue) verbs in the Corpus of Historical American English. The tokens included are all tenses of those lemmas that possess a pronunciation known to the Carnegie Mellon University Pronouncing Dictionary in both the lemma and the simple past tense. The list of verbs incorporated in each time series is given in Extended Data Table 2. For 17 polymorphic verbs we find no similar-sounding monomorphic irregular verbs (all-orange panels). The title of each panel indicates the sign of the maximum-likelihood selection coefficient, either regular → irregular or irregular → regular.

Extended Data Table 1 FIT results for past-tense verbs
Extended Data Table 2 List of similar-sounding monomorphic verbs for each past-tense conjugation of polymorphic verbs
Extended Data Table 3 FIT results for do-support
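As a rough illustration of how a time series of form frequencies might be tested against neutral drift, the sketch below assumes that "FIT" in the table titles above refers to the frequency increment test of Feder et al. (ref. 32). It rescales successive frequency changes so that, under pure drift, they are approximately normal with mean zero, then computes a one-sample t-statistic on them. The function names and both example series are invented for illustration and are not data from the paper.

```python
import math
from statistics import mean, stdev


def frequency_increments(times, freqs):
    """Rescaled increments Y_i = (v_i - v_{i-1}) / sqrt(2 v_{i-1} (1 - v_{i-1}) (t_i - t_{i-1})).

    Under neutral drift these are approximately normal with mean zero.
    Requires every frequency except the last to lie strictly between 0 and 1.
    """
    ys = []
    for i in range(1, len(freqs)):
        v_prev, v = freqs[i - 1], freqs[i]
        dt = times[i] - times[i - 1]
        if not 0.0 < v_prev < 1.0:
            raise ValueError("frequencies must lie strictly between 0 and 1")
        ys.append((v - v_prev) / math.sqrt(2.0 * v_prev * (1.0 - v_prev) * dt))
    return ys


def fit_t_statistic(times, freqs):
    """One-sample t-statistic of the rescaled increments against mean zero."""
    ys = frequency_increments(times, freqs)
    return mean(ys) / (stdev(ys) / math.sqrt(len(ys)))


# Made-up example series (not data from the paper).
years = [1800, 1820, 1840, 1860, 1880, 1900]
rising = [0.10, 0.18, 0.30, 0.45, 0.60, 0.72]    # steady directional change
wobbling = [0.50, 0.53, 0.49, 0.52, 0.48, 0.51]  # no net trend

t_rising = fit_t_statistic(years, rising)
t_wobbling = fit_t_statistic(years, wobbling)
print(f"t (rising series):   {t_rising:.2f}")
print(f"t (wobbling series): {t_wobbling:.2f}")
```

A large absolute t-statistic argues against drift alone, and its sign gives the inferred direction of selection. A p-value would come from comparing the statistic with a Student's t distribution with one fewer degree of freedom than the number of increments.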


Supplementary information

Life Sciences Reporting Summary (PDF 71 kb)

Supplementary Information

This file contains Supplementary Text S1.1–S1.7 and Figure S1 (temporal trends in the usage of 36 verbs, in the simple past tense and in all tenses). (PDF 371 kb)


Cite this article

Newberry, M., Ahern, C., Clark, R. et al. Detecting evolutionary forces in language change. Nature 551, 223–226 (2017).
