There are more than 7,000 languages spoken in the world today [1]. It has been argued that the natural and social environment of languages drives this diversity [2–13]. However, a fundamental question is how strong environmental pressures are, and whether neutral drift suffices as a mechanism to explain diversification. We estimate the phylogenetic signals of geographic dimensions, distance to water, climate and population size on more than 6,000 phylogenetic trees of 46 language families. Phylogenetic signals of environmental factors are generally stronger than expected under the null hypothesis of no relationship with the shape of family trees. Importantly, in most cases they are also not compatible with neutral drift models of constant-rate change across the family tree branches. Our results suggest that language diversification is driven by further adaptive and non-adaptive pressures. Language diversity cannot be understood without modelling the pressures that physical, ecological and social factors exert on language users in different environments across the globe.
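To make the two comparisons in the abstract concrete, here is a minimal sketch in Python of one common phylogenetic signal statistic, Blomberg's K, together with a permutation null (no relationship between trait values and tree shape) and a Brownian-motion simulation (neutral drift at a constant rate). This is an illustration under stated assumptions, not the authors' pipeline: the reference list points to R and the phytools package for the actual analysis, and the function names, the four-tip toy tree and the trait values below are all hypothetical.

```python
import numpy as np


def blomberg_k(x, C):
    """Blomberg's K for tip values x, given the phylogenetic covariance matrix C.

    C[i, j] is the branch length shared by tips i and j (root to their
    most recent common ancestor); C[i, i] is the root-to-tip distance.
    """
    x = np.asarray(x, dtype=float)
    C = np.asarray(C, dtype=float)
    n = len(x)
    inv_C = np.linalg.inv(C)
    ones = np.ones(n)
    # Phylogenetically weighted estimate of the ancestral (root) value.
    a = (ones @ inv_C @ x) / inv_C.sum()
    d = x - a
    # Observed ratio of the plain to the tree-corrected sum of squares.
    observed_ratio = (d @ d) / (d @ inv_C @ d)
    # The same ratio as expected under Brownian motion on this tree.
    expected_ratio = (np.trace(C) - n / inv_C.sum()) / (n - 1)
    return observed_ratio / expected_ratio


def permutation_p_value(x, C, n_perm=999, seed=0):
    """Share of tip-shuffled data sets whose K is at least the observed K."""
    rng = np.random.default_rng(seed)
    k_obs = blomberg_k(x, C)
    k_null = np.array([blomberg_k(rng.permutation(x), C) for _ in range(n_perm)])
    return (1 + np.sum(k_null >= k_obs)) / (n_perm + 1)


def simulate_k_under_drift(C, n_sim=999, seed=0):
    """K values for traits simulated by constant-rate Brownian motion on the tree."""
    C = np.asarray(C, dtype=float)
    rng = np.random.default_rng(seed)
    traits = rng.multivariate_normal(np.zeros(len(C)), C, size=n_sim)
    return np.array([blomberg_k(t, C) for t in traits])


# Hypothetical toy tree ((A:1,B:1):1,(C:1,D:1):1) written as a covariance matrix.
toy_C = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])
# Made-up trait values (for example, a climate variable) that cluster by clade.
toy_trait = np.array([1.0, 1.1, 5.0, 5.2])

if __name__ == "__main__":
    print("K:", blomberg_k(toy_trait, toy_C))
    print("permutation p:", permutation_p_value(toy_trait, toy_C))
    print("median K under drift:", np.median(simulate_k_under_drift(toy_C)))
```

K is scaled so that values near 1 match what Brownian motion would produce on the given tree. Values well above the shuffled distribution indicate signal beyond the no-relationship null, and values that fall outside the simulated drift distribution are hard to reconcile with constant-rate neutral change. A four-tip tree is far too small for a meaningful test; it only keeps the arithmetic easy to follow.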
References
Hammarström, H., Forkel, R., Haspelmath, M. & Bank, S. Glottolog 3.2 (Max Planck Institute for the Science of Human History, 2018); http://glottolog.org
Nichols, J. Linguistic diversity and the first settlement of the New World. Language 66, 475–521 (1990).
Nettle, D. Using social impact theory to simulate language change. Lingua 108, 95–117 (1999).
Coupé, C., Hombert, J.-M., Marsico, E. & Pellegrino, F. in East Flows the Great River: Festschrift in Honor of Prof. William S-Y. WANG on his 80th Birthday (eds Peng, G. & Shi, F.) 76–103 (City Univ. Hong Kong Press, Hong Kong, 2013).
Gavin, M. C. et al. Toward a mechanistic understanding of linguistic diversity. Bioscience 63, 524–535 (2013).
Mace, R. & Pagel, M. A latitudinal gradient in the density of human languages in North America. Proc. Biol. Sci. 261, 117–121 (1995).
Collard, I. F. & Foley, R. A. Latitudinal patterns and environmental determinants of recent human cultural diversity: do humans follow biogeographical rules? Evol. Ecol. Res. 4, 371–383 (2002).
Moore, J. L. et al. The distribution of cultural and biological diversity in Africa. Proc. Biol. Sci. 269, 1645–1653 (2002).
Dimmendaal, G. J. Language ecology and linguistic diversity on the African continent. Lang. Linguist. Compass 2, 840–858 (2008).
Axelsen, J. B. & Manrubia, S. River density and landscape roughness are universal determinants of linguistic diversity. Proc. Biol. Sci. 281, 20133029 (2014).
Gavin, M. C. & Sibanda, N. The island biogeography of languages. Glob. Ecol. Biogeogr. 21, 958–967 (2012).
Gavin, M. C. et al. Process-based modelling shows how climate and demography shape language diversity. Glob. Ecol. Biogeogr. 26, 584–591 (2017).
Currie, T. E. & Mace, R. Political complexity predicts the spread of ethnolinguistic groups. Proc. Natl Acad. Sci. USA 106, 7339–7344 (2009).
Cavalli-Sforza, L. L., Menozzi, P. & Piazza, A. The History and Geography of Human Genes (Princeton Univ. Press, Princeton, 1994).
Everett, C., Blasi, D. E. & Roberts, S. G. Climate, vocal folds, and tonal languages: connecting the physiological and geographic dots. Proc. Natl Acad. Sci. USA 112, 1322–1327 (2015).
Lupyan, G. & Dale, R. Why are there different languages? The role of adaptation in linguistic diversity. Trends Cogn. Sci. 20, 649–660 (2016).
Dediu, D., Janssen, R. & Moisik, S. R. Language is not isolated from its wider environment: vocal tract influences on the evolution of speech and language. Lang. Commun. 54, 9–20 (2017).
Welmers, W. E. African Language Structures (University of California Press, Berkeley and Los Angeles, 1973).
McMahon, A. M. Understanding Language Change (Cambridge Univ. Press, Cambridge, 1994).
Sapir, E. Language: An Introduction to the Study of Speech (Harcourt, Brace & World, New York, 1921).
Jones, M. C. & Singh, I. Exploring Language Change (Routledge, New York, 2005).
Blomberg, S. P., Garland, T. Jr. & Ives, A. R. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution 57, 717–745 (2003).
Symonds, M. R. & Blomberg, S. P. in Modern Phylogenetic Comparative Methods and Their Application in Evolutionary Biology (ed. Garamszegi, L. Z.) 105–130 (Springer, Heidelberg, 2014).
Verkerk, A. Diachronic change in Indo-European motion event encoding. J. Hist. Linguist. 4, 40–83 (2014).
Verkerk, A. The correlation between motion event encoding and path verb lexicon size in the Indo-European language family. Folia Linguist. Hist. 35, 307–358 (2014).
Bentz, C., Verkerk, A., Kiela, D., Hill, F. & Buttery, P. Adaptive communication: languages with more non-native speakers tend to have fewer word forms. PLoS ONE 10, e0128254 (2015).
Everett, C. Evidence for direct geographic influences on linguistic sounds: the case of ejectives. PLoS ONE 8, e65275 (2013).
Lupyan, G. & Dale, R. Language structure is partly determined by social structure. PLoS ONE 5, e8559 (2010).
Dale, R. & Lupyan, G. Understanding the origins of morphological diversity: the linguistic niche hypothesis. Adv. Complex Syst. 15, 1150017 (2012).
Bentz, C. & Winter, B. Languages with more second language speakers tend to lose nominal case. Lang. Dynam. Change 3, 1–27 (2013).
Revell, L. J., Harmon, L. J. & Collar, D. C. Phylogenetic signal, evolutionary process, and rate. Syst. Biol. 57, 591–601 (2008).
Thomason, S. G. & Kaufman, T. Language Contact, Creolization, and Genetic Linguistics (Univ. California Press, Berkeley & Oxford, 1988).
Diamond, J. M. Guns, Germs and Steel: The Fates of Human Societies (W. W. Norton, New York & London, 1999).
Güldemann, T. & Hammarström, H. in Language Dispersal, Diversification and Contact (eds Crevels, M. & Muysken, P.) (Oxford Univ. Press, Oxford, 2017).
Lupyan, G. & Dale, R. in Language Structure and Environment (eds De Busser, R. & LaPolla, R. J.) 289–316 (John Benjamins Publishing Company, Amsterdam, 2015).
Dediu, D. Making genealogical language classifications available for phylogenetic analysis: Newick trees, unified identifiers, and branch length. Lang. Dynam. Change 8, 1–21 (2018).
Lewis, M. P., Simons, G. F. & Fennig, C. D. Ethnologue: Languages of the World 17th edn (SIL International, Dallas, 2013); http://www.ethnologue.com
Dryer, M. S. & Haspelmath, M. The World Atlas of Language Structures Online (Max Planck Digital Library, 2013); http://wals.info/
Nichols, J., Witzlack-Makarevich, A. & Bickel, B. The AUTOTYP Genealogy and Geography Database 2013 Release (2013); https://www.spw.uzh.ch/autotyp/
Jäger, G. Global-scale phylogenetic linguistic inference from lexical resources. Preprint at http://arxiv.org/abs/1802.06079 (2018).
Münkemüller, T. et al. How to measure and test phylogenetic signal. Methods Ecol. Evol. 3, 743–756 (2012).
R Development Core Team R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2017).
Chamberlain, S. rgbif: Interface to the Global ‘Biodiversity’ Information Facility ‘API’, R package version 0.9.5 (2016); https://CRAN.R-project.org/package=rgbif
Pagel, M. Inferring evolutionary processes from phylogenies. Zool. Scr. 26, 331–348 (1997).
Pagel, M. Inferring the historical patterns of biological evolution. Nature 401, 877–884 (1999).
Freckleton, R. P., Harvey, P. H. & Pagel, M. Phylogenetic analysis and comparative data: a test and review of evidence. Am. Nat. 160, 712–726 (2002).
Revell, L. J. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3, 217–223 (2012).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer, New York, 2016).
Yu, G., Smith, D. K., Zhu, H., Guan, Y. & Lam, T. T-Y. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods Ecol. Evol. 8, 28–36 (2017).
Kahle, D. & Wickham, H. ggmap: spatial visualization with ggplot2. R J. 5, 144–161 (2013).
Acknowledgements
C.B. and G.J. were funded by the German Research Foundation (DFG FOR 2237; project ‘Words, Bones, Genes, Tools: Tracking Linguistic, Cultural, and Biological Trajectories of the Human Past’) and the ERC Advanced Grant 324246 EVOLAEMP. D.D. was funded by The Netherlands Organisation for Scientific Research VIDI grant 276-70-022 and the European Institutes for Advanced Study Fellowship Program. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Supplementary information
Supplementary Results 1–4, Supplementary Methods 1–6, Supplementary Note 1
Dediu’s forest data
Maximum likelihood trees data
Environmental variables data
All phylogenetic signals data
Phylogenetic signals for distances to lakes, rivers, and oceans data
Wilcoxon test results by tree set
Wilcoxon results by family
R analysis code files
About this article
Nature Communications (2019)