Acoustic regularities in infant-directed speech and song across cultures

Abstract

When interacting with infants, humans often alter their speech and song in ways thought to support communication. Theories of human child-rearing, informed by data on vocal signalling across species, predict that such alterations should appear globally. Here, we show acoustic differences between infant-directed and adult-directed vocalizations across cultures. We collected 1,615 recordings of infant- and adult-directed speech and song produced by 410 people in 21 urban, rural and small-scale societies. Infant-directedness was reliably classified from acoustic features only, with acoustic profiles of infant-directedness differing across language and music but in consistent fashions. We then studied listener sensitivity to these acoustic features. We played the recordings to 51,065 people from 187 countries, recruited via an English-language website, who guessed whether each vocalization was infant-directed. Their intuitions were more accurate than chance, predictable in part by common sets of acoustic features and robust to the effects of linguistic relatedness between vocalizer and listener. These findings inform hypotheses of the psychological functions and evolution of human communication.

Fig. 1: Cross-cultural regularities in infant-directed vocalizations.
Fig. 2: How people alter their voices when vocalizing to infants.
Fig. 3: Naive listeners distinguish infant-directed vocalizations from adult-directed vocalizations across cultures.
Fig. 4: Human inferences about infant-directedness are predictable from acoustic features of vocalizations.

Data availability

The audio corpus is available at https://doi.org/10.5281/zenodo.5525161. All data, including supplementary fieldsite-level data and the recording collection protocol, are available at https://github.com/themusiclab/infant-speech-song and are permanently archived at https://doi.org/10.5281/zenodo.6562398. The preregistration for the auditory analyses is at https://osf.io/5r72u.

Code availability

Analysis and visualization code, a reproducible R Markdown manuscript and code for the naive listener experiment are available at https://github.com/themusiclab/infant-speech-song and are permanently archived at https://doi.org/10.5281/zenodo.6562398.

Acknowledgements

This research was supported by the Harvard University Department of Psychology (M.M.K. and S.A.M.); the Harvard College Research Program (H.L.-R.); the Harvard Data Science Initiative (S.A.M.); the National Institutes of Health Director’s Early Independence Award DP5OD024566 (S.A.M. and C.B.H.); the Academy of Finland grant no. 298513 (J. Antfolk); the Royal Society of New Zealand Te Aparangi Rutherford Discovery Fellowship RDF-UOA1101 (Q.D.A. and T.A.V.); the Social Sciences and Humanities Research Council of Canada (L.K.C.); the Polish Ministry of Science and Higher Education grant no. N43/DBS/000068 (G.J.); the Fogarty International Center (P.M., A. Siddaiah and C.D.P.); the National Heart, Lung, and Blood Institute and the National Institute of Neurological Disorders and Stroke award no. D43 TW010540 (P.M. and A. Siddaiah); the National Institute of Allergy and Infectious Diseases award no. R15-AI128714-01 (P.M.); the Max Planck Institute for Evolutionary Anthropology (C.T.R. and C.M.); a British Academy Research Fellowship and grant no. SRG-171409 (G.D.S.); the Institute for Advanced Study in Toulouse, under an Agence nationale de la recherche grant, Investissements d’Avenir ANR-17-EURE-0010 (L.G. and J. Stieglitz); the Fondation Pierre Mercier pour la Science (C.S.); and the Natural Sciences and Engineering Research Council of Canada (S.E.T.). We thank the participants and their families for providing recordings; L. Sugiyama for supporting pilot data collection; J. Du, E. Pillsworth, P. Wiessner and J. Ziker for collecting or attempting to collect additional recordings; N. Nicolas for research assistance in the Republic of the Congo; Z. Jurewicz for research assistance in Toronto; M. Delfi and R. Sakaliou for research assistance in Indonesia; W. Naiou and A. Altrin for research assistance in Vanuatu; S. Atwood, A. Bergson, D. Li, L. Lopez and E. Radytė for project-wide research assistance; and J. Kominsky, L. Powell and L. Yurdum for feedback on the manuscript. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

S.A.M. and M.M.K. conceived of the research, provided funding and coordinated the recruitment of collaborators and creation of the corpus. S.A.M. and M.M.K. designed the protocol for collecting vocalization recordings with input from D.A., who piloted it in the field. L.G., A.G., G.J., C.T.R., M.B.N., A. Martin, L.K.C., S.E.T., J. Song, M.K., A. Siddaiah, T.A.V., Q.D.A., J. Antfolk, P.M., A. Schachner, C.D.P., G.D.S., S.K., M.S., S.A.C., J.Q.P., C.S., J. Stieglitz, C.M., R.R.S. and B.M.W. collected the field recordings, with support from E.A., A. Salenius, J. Andelin, S.C.C., M.A. and A. Mabulla. S.A.M., C.M.B. and J. Simson designed and implemented the online experiment. C.J.M. and H.L.-R. processed all recordings and designed the acoustic feature extraction with S.A.M. and M.M.K.; C.M.B. provided associated research assistance. C.M. designed the fieldsite questionnaire with assistance from M.B. and C.J.M., who collected the data from the principal investigators. C.B.H. and S.A.M. led analyses, with additional contributions from C.J.M., M.B., D.K. and M.M.K. C.B.H. and S.A.M. designed the figures. C.B.H. wrote computer code, with contributions from S.A.M., C.J.M. and M.B. D.K. conducted code review. C.J.M., H.L.-R., M.M.K. and S.A.M. wrote the initial manuscript. C.B.H. and S.A.M. wrote the first revision, with contributions from C.J.M. and M.B. S.A.M. wrote the second and third revisions, with contributions from C.B.H. and C.J.M.

Corresponding authors

Correspondence to Courtney B. Hilton, Cody J. Moser or Samuel A. Mehr.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Variation across societies of infant-directed alterations.

Estimated differences between infant-directed and adult-directed vocalizations, for each acoustic feature, in each fieldsite (corresponding with the doughnut plots in Fig. 2). The estimates are derived from the random-effect components of the mixed-effects model reported in the main text. Cells of the table are shaded to facilitate the visibility of corpus-wide consistency (or inconsistency): redder cells represent features where infant-directed vocalizations have higher estimates than adult-directed vocalizations and bluer cells represent features with the reverse pattern. Within speech and song, acoustic features are ordered by their degree of cross-cultural regularity; some features showed the same direction of effect in all 21 societies (for example, for speech, median pitch and pitch variability), whereas others were more variable.
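The idea of ordering features by cross-cultural regularity can be illustrated with a deliberately simplified sketch. It does not reproduce the mixed-effects model from the main text: it assumes raw per-fieldsite mean differences stand in for the random-effect estimates, and the records, site names and feature names below are hypothetical toy values.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (fieldsite, directed_to, feature, value).
records = [
    ("site_A", "infant", "median_pitch", 260.0),
    ("site_A", "adult", "median_pitch", 210.0),
    ("site_B", "infant", "median_pitch", 240.0),
    ("site_B", "adult", "median_pitch", 200.0),
    ("site_A", "infant", "tempo", 1.9),
    ("site_A", "adult", "tempo", 2.1),
    ("site_B", "infant", "tempo", 2.3),
    ("site_B", "adult", "tempo", 2.0),
]

def site_differences(rows):
    """Mean infant-directed minus mean adult-directed value, per feature and site."""
    values = defaultdict(list)
    for site, target, feature, value in rows:
        values[(feature, site, target)].append(value)
    diffs = defaultdict(dict)
    for feature, site, target in list(values):
        # Only sites with both an infant- and an adult-directed sample are compared.
        if target != "infant" or (feature, site, "adult") not in values:
            continue
        infant_mean = mean(values[(feature, site, "infant")])
        adult_mean = mean(values[(feature, site, "adult")])
        diffs[feature][site] = infant_mean - adult_mean
    return diffs

def regularity(diffs_for_feature):
    """Share of sites agreeing with the majority direction (0.5 to 1.0).

    A difference of exactly zero is counted as not positive.
    """
    positive = [d > 0 for d in diffs_for_feature.values()]
    share_positive = sum(positive) / len(positive)
    return max(share_positive, 1 - share_positive)

diffs = site_differences(records)
# Most regular features first, mirroring the ordering described in the caption.
ranking = sorted(diffs, key=lambda f: regularity(diffs[f]), reverse=True)
```

In this toy data the pitch feature moves in the same direction at both sites while the tempo feature does not, so pitch is ranked first.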

Extended Data Fig. 2 Principal-components analysis of acoustic features.

As an alternative approach to the acoustics data, we ran a principal-components analysis on the full 94 acoustic variables, to test whether an unsupervised method also yielded opposing trends in acoustic features across the different vocalization types. It did. The first three components explained 39% of total variability in the acoustic features. Moreover, the clearest differences between vocalization types accorded with the LASSO and mixed-effects modelling (Figs. 1b and 2). The first principal component most strongly differentiated speech and song, overall; the second most strongly differentiated infant-directed song from adult-directed song; and the third most strongly differentiated infant-directed speech from adult-directed speech. The violins indicate kernel density estimations and the boxplots represent the medians (centres), interquartile ranges (bounds of boxes) and 1.5 × IQR (whiskers). Significance values are computed via two-sided Wilcoxon signed-rank tests (n = 1,570 recordings); *p < 0.05, **p < 0.01, ***p < 0.001. Feature loadings are in Supplementary Table 7.
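As a rough illustration of how the share of variability explained by the leading components can be computed, the following sketch runs a principal-components analysis with NumPy on synthetic data. It is not the authors' code (their analysis is in R and is archived in the repository listed under Code availability). The choice to standardize each feature, the function name `pca_explained_variance` and the synthetic matrix are all assumptions made for the example.

```python
import numpy as np

def pca_explained_variance(features, n_components=3):
    """Return (scores, explained-variance ratios) for the leading components."""
    X = np.asarray(features, dtype=float)
    # Assumption: standardize each column so that features measured on
    # different scales contribute comparably.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized matrix; the rows of Vt are the component loadings.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    variances = s**2 / (Z.shape[0] - 1)
    ratio = variances / variances.sum()
    # Project each recording onto the leading components.
    scores = Z @ Vt[:n_components].T
    return scores, ratio[:n_components]

# Synthetic stand-in for an acoustic feature table: 200 "recordings" by
# 10 "features", generated from 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.5 * rng.normal(size=(200, 10))

scores, ratio = pca_explained_variance(X)
print("Share of variance in first three components:", ratio.sum())
```

The component scores returned here are what one would then compare across vocalization types, for example with the Wilcoxon tests mentioned in the caption.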

Extended Data Fig. 3 Screenshots from the naive listener experiment.

On each trial, participants heard a randomly selected vocalization from the corpus and were asked to quickly guess to whom the vocalization was directed: an adult or a baby. The experiment used large emoji and was designed to display comparably on desktop computers (a) or tablets/smartphones (b).

Extended Data Fig. 4 Response biases in the naive listener experiment.

a, Listeners showed reliable biases: regardless of whether a vocalization was infant- or adult-directed, the listeners gave speech recordings substantially fewer “baby” responses than expected by chance, and gave song recordings substantially more “baby” responses. The grey points represent average ratings for each of the recordings in the corpus that were used in the experiment (after exclusions, n = 1,138 recordings from the corpus of 1,615), split by speech and song; the orange and blue points indicate the means of each vocalization type; and the horizontal dashed line represents hypothetical chance level of 50%. b, Despite the response biases, within speech and song, the raw data nevertheless showed clear differences between infant-directed and adult-directed vocalizations, that is, by comparing infant-directedness scores within the same voice, across infant-directed and adult-directed vocalizations (visible here in the steep negative slopes of the grey lines). The main text results report only d’ statistics for these data, for simplicity, but the main effects are nonetheless visible here in the raw data. The points indicate average ratings for each recording; the grey lines connecting the points indicate the pairs of vocalizations produced by the same voice; the half-violins are kernel density estimations; the boxplots represent the medians, interquartile ranges and 95% confidence intervals (indicated by the notches); and the horizontal dashed lines indicate the response bias levels (from a).
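For readers unfamiliar with d’, the sketch below shows the standard signal-detection definitions of sensitivity and response bias, treating a "baby" response to an infant-directed recording as a hit and a "baby" response to an adult-directed recording as a false alarm. The exact procedure used in the paper is not described in this caption, so the log-linear correction, the function name `sdt_measures` and the counts are assumptions for illustration.

```python
from statistics import NormalDist

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from response counts.

    hits:               "baby" responses to infant-directed recordings
    misses:             "adult" responses to infant-directed recordings
    false_alarms:       "baby" responses to adult-directed recordings
    correct_rejections: "adult" responses to adult-directed recordings
    """
    # Assumption: log-linear correction (add 0.5 per cell) so that rates of
    # exactly 0 or 1 do not produce infinite z-scores.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    # Negative criterion: a bias towards responding "baby";
    # positive criterion: a bias towards responding "adult".
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion

# Hypothetical counts for a listener who says "baby" often (as for song):
d_song, c_song = sdt_measures(80, 20, 60, 40)
# Hypothetical counts for a listener who says "baby" rarely (as for speech):
d_speech, c_speech = sdt_measures(40, 60, 20, 80)
```

Because d’ is the difference between the two z-transformed rates, it can be positive under either bias, which is why sensitivity remains measurable despite the response biases shown in panel a.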

Extended Data Fig. 5 Response-time analysis of naive listener experiment.

We recorded the response times of participants in their mobile or desktop browsers, using jsPsych (see Methods), and asked whether, when responding correctly, participants more rapidly detected infant-directedness in speech or song. They did not: a mixed-effects regression predicting the difference in response time between infant-directed and adult-directed vocalizations (within speech or song), adjusting hierarchically for fieldsite and world region, yielded no significant differences (ps > .05 from two-sided linear combination tests; no adjustments made for multiple comparisons). The grey points represent average ratings for each of the recordings in the corpus that were used in the experiment (after exclusions, n = 1,138 recordings from the corpus of 1,615), split by speech and song; the grey lines connecting the points indicate the pairs of vocalizations produced by the same participant; the half-violins are kernel density estimations; and the boxplots represent the medians, interquartile ranges and 95% confidence intervals (indicated by the notches).

Supplementary information

Supplementary Information

Supplementary Methods, Results, Figs. 1–4 and Tables 1–7.

Reporting Summary

About this article

Cite this article

Hilton, C.B., Moser, C.J., Bertolo, M. et al. Acoustic regularities in infant-directed speech and song across cultures. Nat Hum Behav 6, 1545–1556 (2022). https://doi.org/10.1038/s41562-022-01410-x

  • DOI: https://doi.org/10.1038/s41562-022-01410-x
