How adults understand what young children say

Abstract

Children’s early speech often bears little resemblance to that of adults, and yet parents and other caregivers are able to interpret that speech and react accordingly. Here we investigate how adult listeners’ inferences reflect sophisticated beliefs about what children are trying to communicate, as well as how children are likely to pronounce words. Using a Bayesian framework for modelling spoken word recognition, we find that computational models can replicate adult interpretations of children’s speech only when they include strong, context-specific prior expectations about the messages that children will want to communicate. This points to a critical role of adult cognitive processes in supporting early communication and reveals how children can actively prompt adults to take actions on their behalf even when they have only a nascent understanding of the adult language. We discuss the wide-ranging implications of the powerful listening capabilities of adults for theories of first language acquisition.
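The role that the abstract assigns to prior expectations can be illustrated with a toy calculation. The sketch below is not the authors' implementation (their code is in the GitHub repository listed under Code availability). It assumes a three-word lexicon, an edit-distance likelihood of the form exp(-beta * distance) and two made-up priors; the names `LEXICON`, `HEARD`, `FLAT_PRIOR`, `CHILD_PRIOR` and the value of `beta` are all hypothetical.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance between two sequences (Wagner-Fischer dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def posterior(heard, lexicon, prior, beta=2.0):
    """Return P(word | heard), proportional to exp(-beta * distance) * prior[word]."""
    scores = {
        word: math.exp(-beta * edit_distance(heard, phones)) * prior[word]
        for word, phones in lexicon.items()
    }
    total = sum(scores.values())
    return {word: score / total for word, score in scores.items()}


# Hypothetical toy data: adult pronunciations as tuples of phoneme symbols.
LEXICON = {
    "doggie": ("d", "ao", "g", "iy"),
    "foggy": ("f", "ao", "g", "iy"),
    "goalie": ("g", "ow", "l", "iy"),
}
# What the child produced (assumed): "goggie".
HEARD = ("g", "ao", "g", "iy")

# A prior with no expectations, and an assumed prior over what a toddler is likely to say.
FLAT_PRIOR = {word: 1 / len(LEXICON) for word in LEXICON}
CHILD_PRIOR = {"doggie": 0.90, "foggy": 0.02, "goalie": 0.08}

for name, prior in [("flat", FLAT_PRIOR), ("child-context", CHILD_PRIOR)]:
    post = posterior(HEARD, LEXICON, prior)
    print(name, {word: round(p, 3) for word, p in post.items()})
```

Under the flat prior, "doggie" and "foggy" are equally probable because both are one substitution away from what was heard. Only the child-specific prior separates them, which is the qualitative pattern the abstract describes.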

Fig. 1: Schematic overview of the Bayesian spoken word recognition models and experiments.
Fig. 2: Performance identifying intelligible/unintelligible vocalizations by model.
Fig. 3: Average posterior surprisal of the transcribers’ recovered word interpretation under each model with the phoneme-based likelihood.
Fig. 4: Properties and performance of the phoneme-specific likelihood.
Fig. 5: Child-specific model performance.
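
The Fig. 3 caption refers to average posterior surprisal. As a rough illustration of that metric, and assuming the conventional definition of surprisal as the negative base-2 logarithm of a probability, the sketch below scores how well a model's posterior anticipates the word a transcriber recovered. The function names and the example posteriors are hypothetical, not taken from the article.

```python
import math


def surprisal_bits(posterior, target):
    """Surprisal, in bits, of the target word under a posterior over candidate words."""
    p = posterior.get(target, 0.0)
    if p <= 0.0:
        return math.inf  # the model gave the recovered word no probability
    return -math.log2(p)


def mean_surprisal(items):
    """Average surprisal over (posterior, recovered_word) pairs."""
    values = [surprisal_bits(post, word) for post, word in items]
    return sum(values) / len(values)


# Hypothetical posteriors for two child vocalizations, paired with the
# word a transcriber recovered for each one.
ITEMS = [
    ({"doggie": 0.5, "foggy": 0.5}, "doggie"),
    ({"water": 0.25, "walker": 0.75}, "water"),
]

print(mean_surprisal(ITEMS))
```

Lower values mean the model placed more probability on the interpretation the transcriber recovered.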

Data availability

All data used to train language models come from public child language transcripts retrieved through the Child Language Data Exchange System (CHILDES; ref. 50) using childes-db (ref. 74). Test datasets come from the Providence corpus (ref. 27), which has been made publicly available through the PhonBank project (ref. 75; https://phonbank.talkbank.org/phon/Eng-NA/Providence.zip). For this project, data were obtained through childes-db (ref. 74; https://childes-db.stanford.edu).

Code availability

All model training and analysis code is available through our GitHub repository at https://github.com/smeylan/child-directed-listening. Fine-tuned models and pre-processed child transcripts can be accessed through our Open Science Framework repository at https://osf.io/v7c3e/.

References

  1. Chomsky, N. Aspects of the Theory of Syntax (MIT Press, 1965).

  2. Pinker, S. Formal models of language learning. Cognition 7, 217–283 (1979).

  3. Saffran, J. R., Aslin, R. N. & Newport, E. L. Statistical learning by 8-month-old infants. Science 274, 1926–1928 (1996).

  4. Landauer, T. K. & Dumais, S. T. A solution to Plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol. Rev. 104, 211 (1997).

  5. Dupoux, E. Cognitive science in the era of artificial intelligence: a roadmap for reverse-engineering the infant language-learner. Cognition 173, 43–59 (2018).

  6. Hoff, E. How social contexts support and shape language development. Dev. Rev. 26, 55–88 (2006).

  7. Onnis, L. Caregiver communication to the child as moderator and mediator of genes for language. Behav. Brain Res. 325, 197–202 (2017).

  8. Markus, J., Mundy, P., Morales, M., Delgado, C. E. F. & Yale, M. Individual differences in infant skills as predictors of child–caregiver joint attention and language. Soc. Dev. 9, 302–315 (2000).

  9. Roseberry, S., Hirsh-Pasek, K. & Golinkoff, R. M. Skype me! Socially contingent interactions help toddlers learn language. Child Dev. 85, 956–970 (2014).

  10. Rowland, C. F., Pine, J. M., Lieven, E. V. & Theakston, A. L. Determinants of acquisition order in wh-questions: re-evaluating the role of caregiver speech. J. Child Lang. 30, 609–635 (2003).

  11. Stein, A., Malmberg, L. E., Sylva, K., Barnes, J. & Leach, P. The influence of maternal depression, caregiving, and socioeconomic status in the post-natal year on children’s language development. Child Care Health Dev. 34, 603–612 (2008).

  12. Fusaroli, R., Weed, E., Fein, D. & Naigles, L. Caregiver linguistic alignment to autistic and typically developing children. Cognition 236, 105422 (2023).

  13. Newport, E. L. Motherese: The Speech of Mothers to Young Children (Univ. Pennsylvania, 1975).

  14. Huttenlocher, J., Haight, W., Bryk, A., Seltzer, M. & Lyons, T. Early vocabulary growth: relation to language input and gender. Dev. Psychol. 27, 236 (1991).

  15. Hart, B. & Risley, T. R. Meaningful Differences in the Everyday Experience of Young American Children (Paul H Brookes Publishing, 1995).

  16. Rowe, M. L. A longitudinal investigation of the role of quantity and quality of child-directed speech in vocabulary development. Child Dev. 83, 1762–1774 (2012).

  17. Golinkoff, R. M., Hoff, E., Rowe, M. L., Tamis-LeMonda, C. S. & Hirsh-Pasek, K. Language matters: denying the existence of the 30-million-word gap has serious consequences. Child Dev. 90, 985–992 (2019).

  18. Cartmill, E. A. et al. Quality of early parent input predicts child vocabulary 3 years later. Proc. Natl Acad. Sci. USA 110, 11278–11283 (2013).

  19. Weizman, Z. O. & Snow, C. E. Lexical input as related to children’s vocabulary acquisition: effects of sophisticated exposure and support for meaning. Dev. Psychol. 37, 265–279 (2001).

  20. Bergelson, E. et al. What do North American babies hear? A large-scale cross-corpus analysis. Dev. Sci. 22, e12724 (2019).

  21. Cristia, A., Dupoux, E., Gurven, M. & Stieglitz, J. Child-directed speech is infrequent in a forager-farmer population: a time allocation study. Child Dev. 90, 759–773 (2019).

  22. Golinkoff, R. M. ‘I beg your pardon?’: the preverbal negotiation of failed messages. J. Child Lang. 13, 455–476 (1986).

  23. Golinkoff, R. M. & Gordon, L. What makes communication run? Characteristics of immediate successes. First Lang. 8, 103–124 (1988).

  24. Tomasello, M., Conti-Ramsden, G. & Ewert, B. Young children’s conversations with their mothers and fathers: differences in breakdown and repair. J. Child Lang. 17, 115–130 (1990).

  25. Frank, M. C., Braginsky, M., Yurovsky, D. & Marchman, V. A. Variability and Consistency in Early Language Learning: The Wordbank Project (MIT Press, 2021).

  26. Demuth, K., Culbertson, J. & Alter, J. Word-minimality, epenthesis and coda licensing in the early acquisition of English. Lang. Speech 49, 137–174 (2006).

  27. Demuth, K. & McCullough, E. The prosodic (re)organization of children’s early English articles. J. Child Lang. 36, 173–200 (2009).

  28. Shannon, C. E. Prediction and entropy of printed English. Bell Syst. Tech. J. 30, 50–64 (1951).

  29. Levy, R. A noisy-channel model of human sentence comprehension under uncertain input. In Proc. 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP) 234–243 (Association for Computational Linguistics, 2008).

  30. Gibson, E., Bergen, L. & Piantadosi, S. T. Rational integration of noisy evidence and prior semantic expectations in sentence interpretation. Proc. Natl Acad. Sci. USA 110, 8051–8056 (2013).

  31. Meylan, S. C., Nair, S. & Griffiths, T. L. Evaluating models of robust word recognition with serial reproduction. Cognition 210, 104553 (2021).

  32. Norris, D. & McQueen, J. M. Shortlist B: a Bayesian model of continuous speech recognition. Psychol. Rev. 115, 357–395 (2008).

  33. Chater, N. & Oaksford, M. The Probabilistic Mind: Prospects for Bayesian Cognitive Science (Oxford Univ. Press, 2008).

  34. Perfors, A., Tenenbaum, J. B., Griffiths, T. L. & Xu, F. A tutorial introduction to Bayesian models of cognitive development. Cognition 120, 302–321 (2011).

  35. Miller, G. A., Heise, G. A. & Lichten, W. The intelligibility of speech as a function of the context of the test materials. J. Exp. Psychol. 41, 329 (1951).

  36. Howes, D. On the relation between the intelligibility and frequency of occurrence of English words. J. Acoust. Soc. Am. 29, 296–305 (1957).

  37. Norris, D., McQueen, J. M. & Cutler, A. Prediction, Bayesian inference and feedback in speech recognition. Lang. Cogn. Neurosci. 31, 4–18 (2016).

  38. Rohde, H. & Ettlinger, M. Integration of pragmatic and phonetic cues in spoken word recognition. J. Exp. Psychol. Learn. Mem. Cogn. 38, 967–983 (2012).

  39. Altmann, G. T. M. & Kamide, Y. Incremental interpretation at verbs: restricting the domain of subsequent reference. Cognition 73, 247–264 (1999).

  40. Kamide, Y., Altmann, G. T. M. & Haywood, S. L. The time-course of prediction in incremental sentence processing: evidence from anticipatory eye movements. J. Mem. Lang. 49, 133–156 (2003).

  41. Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M. & Sedivy, J. C. Integration of visual and linguistic information in spoken language comprehension. Science 268, 1632–1634 (1995).

  42. Kleinschmidt, D. F. & Jaeger, T. F. Robust speech perception: recognize the familiar, generalize to the similar, and adapt to the novel. Psychol. Rev. 122, 148–203 (2015).

  43. Reddy, D. R. (ed.) Speech Recognition: Invited Papers Presented at the 1974 IEEE Symposium (Elsevier, 1975).

  44. Wagner, R. A. & Fischer, M. J. The string-to-string correction problem. J. ACM 21, 168–173 (1974).

  45. Devlin, J., Chang, M., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Vol. 1, 4171–4186 (Association for Computational Linguistics, 2019).

  46. Radford, A. et al. Language Models are Unsupervised Multitask Learners (OpenAI, 2019).

  47. Meister, C. et al. Revisiting the uniform information density hypothesis. In Proc. 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP) 963–980 (Association for Computational Linguistics, 2021).

  48. Schrimpf, M. et al. The neural architecture of language: integrative modeling converges on predictive processing. Proc. Natl Acad. Sci. USA https://doi.org/10.1073/pnas.2105646118 (2021).

  49. Manning, C. & Schütze, H. Foundations of Statistical Natural Language Processing (MIT Press, 1999).

  50. MacWhinney, B. The CHILDES Project: Tools for Analyzing Talk. Transcription Format and Programs Vol. 1 (Psychology Press, 2000).

  51. Godfrey, J. J., Holliman, E. C. & McDaniel, J. Switchboard: telephone speech corpus for research and development. In IEEE International Conference on Acoustics, Speech, and Signal Processing Vol. 1, 517–520 (IEEE Computer Society, 1992).

  52. DeLong, E. R., DeLong, D. M. & Clarke-Pearson, D. L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44, 837–845 (1988).

  53. Levy, R. Expectation-based syntactic comprehension. Cognition 106, 1126–1177 (2008).

  54. Hale, J. A probabilistic Earley parser as a psycholinguistic model. In Proc. 2nd Meeting of the North American Chapter of the Association for Computational Linguistics N01-1021 (Association for Computational Linguistics, 2001).

  55. Barr, D. J., Levy, R., Scheepers, C. & Tily, H. J. Random effects structure for confirmatory hypothesis testing: keep it maximal. J. Mem. Lang. https://doi.org/10.1016/j.jml.2012.11.001 (2013).

  56. Chouinard, M. M. & Clark, E. V. Adult reformulations of child errors as negative evidence. J. Child Lang. 30, 637–669 (2003).

  57. Marcus, G. F. Negative evidence in language acquisition. Cognition 46, 53–85 (1993).

  58. Demetras, M. J., Post, K. N. & Snow, C. E. Feedback to first language learners: the role of repetitions and clarification questions. J. Child Lang. 13, 275–292 (1986).

  59. Dore, J. Holophrases, speech acts and language universals. J. Child Lang. 2, 21–40 (1975).

  60. Fenson, L. et al. MacArthur-Bates Communicative Development Inventories (Paul H. Brookes Publishing Company, 2007).

  61. Mohri, M., Pereira, F. & Riley, M. Weighted finite-state transducers in speech recognition. Comput. Speech Lang. 16, 69–88 (2002).

  62. Gorman, K. et al. The SIGMORPHON 2020 shared task on multilingual grapheme-to-phoneme conversion. In Proc. 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology 40–50 (Association for Computational Linguistics, 2020).

  63. Novak, J. R., Minematsu, N. & Hirose, K. Phonetisaurus: exploring grapheme-to-phoneme conversion with joint n-gram models in the WFST framework. Nat. Lang. Eng. 22, 907–938 (2016).

  64. Dempster, A. P., Laird, N. M. & Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B 39, 1–22 (1977).

  65. Gorman, K., Kirov, C., Roark, B. & Sproat, R. Structured abbreviation expansion in context. In Findings of the Association for Computational Linguistics: EMNLP 2021 995–1005 (Association for Computational Linguistics, 2021).

  66. Galescu, L. & Allen, J. F. Bi-directional conversion between graphemes and phonemes using a joint n-gram model. In 4th ISCA Tutorial and Research Workshop (ITRW) on Speech Synthesis (International Speech Communication Association, 2001).

  67. Novak, J. R., Minematsu, N. & Hirose, K. WFST-based grapheme-to-phoneme conversion: open source tools for alignment, model-building and decoding. In Proc. 10th International Workshop on Finite State Methods and Natural Language Processing 45–49 (Association for Computational Linguistics, 2012).

  68. Salazar, J., Liang, D., Nguyen, T. Q. & Kirchhoff, K. Masked language model scoring. In Proc. 58th Annual Meeting of the Association for Computational Linguistics 2699–2712 (Association for Computational Linguistics, 2020).

  69. Jawahar, G., Sagot, B. & Seddah, D. What does BERT learn about the structure of language? In Proc. 57th Annual Meeting of the Association for Computational Linguistics 3651–3657 (Association for Computational Linguistics, 2019).

  70. Wolf, T. et al. Transformers: state-of-the-art natural language processing. In Proc. 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations 38–45 (Association for Computational Linguistics, 2020).

  71. Hofmann, V., Pierrehumbert, J. & Schütze, H. Superbizarre is not superb: derivational morphology improves BERT’s interpretation of complex words. In Proc. 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing Vol. 1, 3594–3608 (Association for Computational Linguistics, 2021).

  72. Shibata, Y. et al. Byte Pair Encoding: A Text Compression Scheme That Accelerates Pattern Matching Technical Report DOI-TR-161 (Department of Informatics, Kyushu University, 1999).

  73. Chen, S. F. & Goodman, J. An empirical study of smoothing techniques for language modeling. Comput. Speech Lang. 13, 359–394 (1999).

  74. Sanchez, A. et al. childes-db: a flexible and reproducible interface to the child language data exchange system. Behav. Res. Methods 51, 1928–1941 (2019).

  75. Rose, Y. & MacWhinney, B. in The Oxford Handbook of Corpus Phonology (eds Durand, J. et al.) 380–401 (Oxford Univ. Press, 2014).

  76. Child-directed listening. Open Science Framework https://osf.io/v7c3e/ (2021).

  77. Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 27, 861–874 (2006).

Acknowledgements

We thank J. Mankewitz, S. Nair and R. Jansen for providing feedback on early drafts, as well as members of the Computational Psycholinguistics Lab at MIT and the Bergelson Lab at Duke for valuable discussion. We thank K. Gorman, T. Eisape and P. Qian for several helpful technical consultations. S. Zhi contributed to the implementation of the pronunciation module. This work was supported by NSF grants BCS-1551866 (R.P.L.), BCS-1844710 (R.P.L.) and BCS-2121074 (R.P.L.); NIH grants 1F32HD097982 (S.C.M.) and DP5 OD019812-01 (E.B.); and the CONVO grant to MIT Brain and Cognitive Sciences from the Simons Center for the Social Brain (R.P.L., S.C.M. and N.H.W.). R.F. received no specific funding for this work. The funders had no role in study design, data collection and analysis, the decision to publish or the preparation of the manuscript.

Author information

Contributions

S.C.M. and R.F. conceived the project and designed the analyses. S.C.M. and N.H.W. developed the models and conducted the analyses. E.B. and R.P.L. supervised the project. All authors wrote the manuscript and provided critical feedback.

Corresponding author

Correspondence to Stephan C. Meylan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Human Behaviour thanks Riccardo Fusaroli, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Meylan, S.C., Foushee, R., Wong, N.H. et al. How adults understand what young children say. Nat Hum Behav (2023). https://doi.org/10.1038/s41562-023-01698-3

  • DOI: https://doi.org/10.1038/s41562-023-01698-3
