Abstract
When interacting with infants, humans often alter their speech and song in ways thought to support communication. Theories of human child-rearing, informed by data on vocal signalling across species, predict that such alterations should appear globally. Here, we show acoustic differences between infant-directed and adult-directed vocalizations across cultures. We collected 1,615 recordings of infant- and adult-directed speech and song produced by 410 people in 21 urban, rural and small-scale societies. Infant-directedness was reliably classified from acoustic features only, with acoustic profiles of infant-directedness differing across language and music but in consistent fashions. We then studied listener sensitivity to these acoustic features. We played the recordings to 51,065 people from 187 countries, recruited via an English-language website, who guessed whether each vocalization was infant-directed. Their intuitions were more accurate than chance, predictable in part by common sets of acoustic features and robust to the effects of linguistic relatedness between vocalizer and listener. These findings inform hypotheses of the psychological functions and evolution of human communication.
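The abstract's central classification result — that infant-directedness can be recovered from acoustic features alone — can be illustrated with a minimal sketch. The study itself fit LASSO-penalized models (see Code availability for the actual R pipeline); the toy version below instead fits plain logistic regression by gradient descent on two synthetic stand-in features (median pitch and pitch variability, two features the paper reports as reliably elevated in infant-directed speech). All numbers, distributions and feature choices here are illustrative assumptions, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for two acoustic features: infant-directed (ID)
# vocalizations are simulated with higher median pitch (Hz) and higher
# pitch variability than adult-directed (AD) ones.
n = 400
adult = np.column_stack([rng.normal(200, 30, n), rng.normal(20, 5, n)])
infant = np.column_stack([rng.normal(260, 30, n), rng.normal(35, 5, n)])
X = np.vstack([adult, infant])
y = np.concatenate([np.zeros(n), np.ones(n)])  # 0 = AD, 1 = ID

# Standardize features, then fit logistic regression by gradient descent.
X = (X - X.mean(axis=0)) / X.std(axis=0)
w = np.zeros(X.shape[1])
b = 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w + b)))   # predicted P(infant-directed)
    w -= 0.1 * X.T @ (p - y) / len(y)    # gradient of mean log loss
    b -= 0.1 * np.mean(p - y)

pred = (1 / (1 + np.exp(-(X @ w + b)))) > 0.5
accuracy = float(np.mean(pred == y))
print(f"training accuracy: {accuracy:.2f}")  # well above the 0.5 chance level
```

On data this cleanly separated the classifier performs far above chance; the real analysis faced 94 correlated features across 21 societies, which is why the paper used a regularized (LASSO) model rather than the unpenalized fit sketched here.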
Data availability
The audio corpus is available at https://doi.org/10.5281/zenodo.5525161. All data, including supplementary fieldsite-level data and the recording collection protocol, are available at https://github.com/themusiclab/infant-speech-song and are permanently archived at https://doi.org/10.5281/zenodo.6562398. The preregistration for the auditory analyses is at https://osf.io/5r72u.
Code availability
Analysis and visualization code, a reproducible R Markdown manuscript and code for the naive listener experiment are available at https://github.com/themusiclab/infant-speech-song and are permanently archived at https://doi.org/10.5281/zenodo.6562398.
Acknowledgements
This research was supported by the Harvard University Department of Psychology (M.M.K. and S.A.M.); the Harvard College Research Program (H.L.-R.); the Harvard Data Science Initiative (S.A.M.); the National Institutes of Health Director’s Early Independence Award DP5OD024566 (S.A.M. and C.B.H.); the Academy of Finland grant no. 298513 (J. Antfolk); the Royal Society of New Zealand Te Aparangi Rutherford Discovery Fellowship RDF-UOA1101 (Q.D.A. and T.A.V.); the Social Sciences and Humanities Research Council of Canada (L.K.C.); the Polish Ministry of Science and Higher Education grant no. N43/DBS/000068 (G.J.); the Fogarty International Center (P.M., A. Siddaiah and C.D.P.); the National Heart, Lung, and Blood Institute and the National Institute of Neurological Disorders and Stroke award no. D43 TW010540 (P.M. and A. Siddaiah); the National Institute of Allergy and Infectious Diseases award no. R15-AI128714-01 (P.M.); the Max Planck Institute for Evolutionary Anthropology (C.T.R. and C.M.); a British Academy Research Fellowship and grant no. SRG-171409 (G.D.S.); the Institute for Advanced Study in Toulouse, under an Agence nationale de la recherche grant, Investissements d’Avenir ANR-17-EURE-0010 (L.G. and J. Stieglitz); the Fondation Pierre Mercier pour la Science (C.S.); and the Natural Sciences and Engineering Research Council of Canada (S.E.T.). We thank the participants and their families for providing recordings; L. Sugiyama for supporting pilot data collection; J. Du, E. Pillsworth, P. Wiessner and J. Ziker for collecting or attempting to collect additional recordings; N. Nicolas for research assistance in the Republic of the Congo; Z. Jurewicz for research assistance in Toronto; M. Delfi and R. Sakaliou for research assistance in Indonesia; W. Naiou and A. Altrin for research assistance in Vanuatu; S. Atwood, A. Bergson, D. Li, L. Lopez and E. Radytė for project-wide research assistance; and J. Kominsky, L. Powell and L. Yurdum for feedback on the manuscript. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
S.A.M. and M.M.K. conceived of the research, provided funding and coordinated the recruitment of collaborators and creation of the corpus. S.A.M. and M.M.K. designed the protocol for collecting vocalization recordings with input from D.A., who piloted it in the field. L.G., A.G., G.J., C.T.R., M.B.N., A. Martin, L.K.C., S.E.T., J. Song, M.K., A. Siddaiah, T.A.V., Q.D.A., J. Antfolk, P.M., A. Schachner, C.D.P., G.D.S., S.K., M.S., S.A.C., J.Q.P., C.S., J. Stieglitz, C.M., R.R.S. and B.M.W collected the field recordings, with support from E.A., A. Salenius, J. Andelin, S.C.C., M.A. and A. Mabulla. S.A.M., C.M.B. and J. Simson designed and implemented the online experiment. C.J.M. and H.L.-R. processed all recordings and designed the acoustic feature extraction with S.A.M. and M.M.K.; C.M.B. provided associated research assistance. C.M. designed the fieldsite questionnaire with assistance from M.B. and C.J.M., who collected the data from the principal investigators. C.B.H. and S.A.M. led analyses, with additional contributions from C.J.M., M.B., D.K. and M.M.K. C.B.H. and S.A.M. designed the figures. C.B.H. wrote computer code, with contributions from S.A.M., C.J.M. and M.B. D.K. conducted code review. C.J.M., H.L.-R., M.M.K. and S.A.M. wrote the initial manuscript. C.B.H. and S.A.M. wrote the first revision, with contributions from C.J.M. and M.B. S.A.M. wrote the second and third revisions, with contributions from C.B.H. and C.J.M.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Variation across societies of infant-directed alterations.
Estimated differences between infant-directed and adult-directed vocalizations, for each acoustic feature, in each fieldsite (corresponding with the doughnut plots in Fig. 2). The estimates are derived from the random-effect components of the mixed-effects model reported in the main text. Cells of the table are shaded to facilitate the visibility of corpus-wide consistency (or inconsistency): redder cells represent features where infant-directed vocalizations have higher estimates than adult-directed vocalizations and bluer cells represent features with the reverse pattern. Within speech and song, acoustic features are ordered by their degree of cross-cultural regularity; some features showed the same direction of effect in all 21 societies (for example, for speech, median pitch and pitch variability), whereas others were more variable.
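The caption's notion of cross-cultural regularity — a feature whose infant-directed minus adult-directed difference has the same sign in all 21 societies — reduces to a sign-consistency count over the per-fieldsite random-effect estimates. A minimal sketch, using made-up per-site estimates rather than the model's actual output:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-fieldsite estimates of the infant-directed minus
# adult-directed difference for three speech features (21 sites each).
# Feature names and values are illustrative, not the study's estimates.
estimates = {
    "median pitch":      rng.normal(0.8, 0.2, 21),   # consistently positive
    "pitch variability": rng.normal(0.6, 0.15, 21),  # consistently positive
    "roughness":         rng.normal(0.1, 0.4, 21),   # mixed sign across sites
}

# A feature is maximally regular when all 21 sites share one sign.
consistency = {}
for feature, est in estimates.items():
    n_positive = int(np.sum(est > 0))
    consistency[feature] = max(n_positive, 21 - n_positive)
    print(f"{feature}: same direction in {consistency[feature]}/21 societies")
```

Ordering features by this count reproduces the caption's ranking logic: features near 21/21 sit at the top of the table, mixed-sign features toward the bottom.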
Extended Data Fig. 2 Principal-components analysis of acoustic features.
As an alternative approach to the acoustics data, we ran a principal-components analysis on the full 94 acoustic variables, to test whether an unsupervised method also yielded opposing trends in acoustic features across the different vocalization types. It did. The first three components explained 39% of total variability in the acoustic features. Moreover, the clearest differences between vocalization types accorded with the LASSO and mixed-effects modelling (Figs. 1b and 2). The first principal component most strongly differentiated speech and song, overall; the second most strongly differentiated infant-directed song from adult-directed song; and the third most strongly differentiated infant-directed speech from adult-directed speech. The violins indicate kernel density estimations and the boxplots represent the medians (centres), interquartile ranges (bounds of boxes) and 1.5 × IQR (whiskers). Significance values are computed via two-sided Wilcoxon signed-rank tests (n = 1,570 recordings); *p < 0.05, **p < 0.01, ***p < 0.001. Feature loadings are in Supplementary Table 7.
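The unsupervised pipeline described above — principal components of the acoustic feature matrix, followed by two-sided Wilcoxon signed-rank tests on paired component scores — can be sketched as follows. The matrix here is synthetic (200 recordings by 10 features rather than the study's 1,570 by 94), with one dominant axis of variation injected so the sketch has something for PC1 to find; the pairing of recordings into two "vocalization types" is likewise an illustrative assumption.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(2)

# Synthetic stand-in for the acoustic feature matrix: 200 recordings x 10
# features, with a dominant trend injected into the first feature.
X = rng.normal(size=(200, 10))
X[:, 0] += np.linspace(0, 3, 200)

# PCA via singular value decomposition of the centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)   # variance ratio per component
scores = Xc @ Vt.T                # component scores per recording

top3 = float(explained[:3].sum())
print(f"first three components explain {top3:.0%} of total variance")

# Paired comparison of PC1 scores for two vocalization types from the
# same 100 hypothetical voices (two-sided Wilcoxon signed-rank test).
pc1_a, pc1_b = scores[:100, 0], scores[100:, 0]
stat, p = wilcoxon(pc1_a, pc1_b, alternative="two-sided")
print(f"Wilcoxon signed-rank p = {p:.3g}")
```

The signed-rank test is the natural choice here because each voice contributes one recording of each type, giving matched pairs, and no distributional assumption is needed on the component scores.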
Extended Data Fig. 3 Screenshots from the naive listener experiment.
On each trial, participants heard a randomly selected vocalization from the corpus and were asked to quickly guess to whom the vocalization was directed: an adult or a baby. The experiment used large emoji and was designed to display comparably on desktop computers (a) or tablets/smartphones (b).
Extended Data Fig. 4 Response biases in the naive listener experiment.
a, Listeners showed reliable biases: regardless of whether a vocalization was infant- or adult-directed, listeners gave speech recordings substantially fewer “baby” responses than expected by chance and gave song recordings substantially more. The grey points represent average ratings for each of the recordings in the corpus that were used in the experiment (after exclusions, n = 1,138 recordings from the corpus of 1,615), split by speech and song; the orange and blue points indicate the means of each vocalization type; and the horizontal dashed line represents the hypothetical chance level of 50%. b, Despite these response biases, the raw data nevertheless showed clear differences between infant-directed and adult-directed vocalizations within both speech and song: comparing infant-directedness scores across the two vocalizations produced by the same voice reveals the effect (visible here in the steep negative slopes of the grey lines). The main text reports only d′ statistics for these data, for simplicity, but the main effects are nonetheless visible here in the raw data. The points indicate average ratings for each recording; the grey lines connecting the points indicate the pairs of vocalizations produced by the same voice; the half-violins are kernel density estimations; the boxplots represent the medians, interquartile ranges and 95% confidence intervals (indicated by the notches); and the horizontal dashed lines indicate the response bias levels (from a).
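The d′ statistics and response biases the caption refers to are the standard equal-variance signal-detection indices, computed from a listener group's hit and false-alarm rates. A minimal sketch, using hypothetical rates rather than the study's values:

```python
from scipy.stats import norm


def dprime_and_bias(hit_rate, false_alarm_rate):
    """Equal-variance signal-detection indices.

    d' measures sensitivity (the ability to tell infant-directed from
    adult-directed vocalizations independent of bias); criterion c
    measures response bias (c < 0 indicates a bias toward answering
    "baby" regardless of the true target).
    """
    z_hit = norm.ppf(hit_rate)
    z_fa = norm.ppf(false_alarm_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


# Hypothetical rates for song: listeners often answer "baby" even for
# adult-directed song (high false-alarm rate, hence a bias), yet still
# discriminate the two types (d' > 0).
d_song, c_song = dprime_and_bias(hit_rate=0.85, false_alarm_rate=0.45)
print(f"song: d' = {d_song:.2f}, c = {c_song:.2f}")
```

Separating sensitivity from bias in this way is what lets panel b show reliable discrimination even though panel a shows that the raw “baby” response rates for song sit above 50% for both vocalization types.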
Extended Data Fig. 5 Response-time analysis of naive listener experiment.
We recorded the response times of participants in their mobile or desktop browsers, using jsPsych (see Methods), and asked whether, when responding correctly, participants more rapidly detected infant-directedness in speech or song. They did not: a mixed-effects regression predicting the difference in response time between infant-directed and adult-directed vocalizations (within speech or song), adjusting hierarchically for fieldsite and world region, yielded no significant differences (ps > .05 from two-sided linear combination tests; no adjustments made for multiple comparisons). The grey points represent average ratings for each of the recordings in the corpus that were used in the experiment (after exclusions, n = 1,138 recordings from the corpus of 1,615), split by speech and song; the grey lines connecting the points indicate the pairs of vocalizations produced by the same participant; the half-violins are kernel density estimations; and the boxplots represent the medians, interquartile ranges and 95% confidence intervals (indicated by the notches).
Supplementary information
Supplementary Information
Supplementary Methods, Results, Figs. 1–4 and Tables 1–7.
Cite this article
Hilton, C.B., Moser, C.J., Bertolo, M. et al. Acoustic regularities in infant-directed speech and song across cultures. Nat Hum Behav 6, 1545–1556 (2022). https://doi.org/10.1038/s41562-022-01410-x