  • Review Article
  • Published:

The speech neuroprosthesis

Abstract

Loss of speech after paralysis is devastating, but circumventing motor-pathway injury by directly decoding speech from intact cortical activity has the potential to restore natural communication and self-expression. Recent discoveries have defined how key features of speech production are facilitated by the coordinated activity of vocal-tract articulatory and motor-planning cortical representations. In this Review, we highlight such progress and how it has led to successful speech decoding, first in individuals implanted with intracranial electrodes for clinical epilepsy monitoring and subsequently in individuals with paralysis as part of early feasibility clinical trials to restore speech. We discuss high-spatiotemporal-resolution neural interfaces and the adaptation of state-of-the-art speech computational algorithms that have driven rapid and substantial progress in decoding neural activity into text, audible speech, and facial movements. Although restoring natural speech is a long-term goal, speech neuroprostheses already have performance levels that surpass communication rates offered by current assistive-communication technology. Given this accelerated rate of progress in the field, we propose key evaluation metrics for speed and accuracy, among others, to help standardize across studies. We finish by highlighting several directions to more fully explore the multidimensional feature space of speech and language, which will continue to accelerate progress towards a clinically viable speech neuroprosthesis.

Fig. 1: Key milestones in speech decoding.
Fig. 2: Articulatory control of speech.
Fig. 3: Decoding speech from neural activity.
Fig. 4: Evaluating and standardizing speech neuroprostheses.

References

  1. Felgoise, S. H., Zaccheo, V., Duff, J. & Simmons, Z. Verbal communication impacts quality of life in patients with amyotrophic lateral sclerosis. Amyotroph. Lateral Scler. Front. Degener. 17, 179–183 (2016).

  2. Das, J. M., Anosike, K. & Asuncion, R. M. D. Locked-in syndrome. StatPearls https://www.ncbi.nlm.nih.gov/books/NBK559026/ (StatPearls, 2021).

  3. Lulé, D. et al. Life can be worth living in locked-in syndrome. Prog. Brain Res. 177, 339–351 (2009).

  4. Pels, E. G. M., Aarnoutse, E. J., Ramsey, N. F. & Vansteensel, M. J. Estimated prevalence of the target population for brain–computer interface neurotechnology in the Netherlands. Neurorehabil. Neural Repair 31, 677–685 (2017).

  5. Koch Fager, S., Fried-Oken, M., Jakobs, T. & Beukelman, D. R. New and emerging access technologies for adults with complex communication needs and severe motor impairments: state of the science. Augment. Altern. Commun. 35, 13–25 (2019).

  6. Vansteensel, M. J. et al. Fully implanted brain–computer interface in a locked-in patient with ALS. N. Engl. J. Med. 375, 2060–2066 (2016).

  7. Utsumi, K. et al. Operation of a P300-based brain–computer interface in patients with Duchenne muscular dystrophy. Sci. Rep. 8, 1753 (2018).

  8. Pandarinath, C. et al. High performance communication by people with paralysis using an intracortical brain–computer interface. eLife 6, e18554 (2017).

  9. Willett, F. R., Avansino, D. T., Hochberg, L. R., Henderson, J. M. & Shenoy, K. V. High-performance brain-to-text communication via handwriting. Nature 593, 249–254 (2021).

  10. Chang, E. F. & Anumanchipalli, G. K. Toward a speech neuroprosthesis. JAMA 323, 413–414 (2020).

  11. Bull, P. & Frederikson, L. in Companion Encyclopedia of Psychology (Routledge, 1994).

  12. Moses, D. A. et al. Neuroprosthesis for decoding speech in a paralyzed person with anarthria. N. Engl. J. Med. 385, 217–227 (2021). The authors first demonstrated speech decoding in a person with vocal-tract paralysis by decoding cortical activity word-by-word into sentences, using a vocabulary of 50 words at a rate of 15 wpm.

  13. Angrick, M. et al. Online speech synthesis using a chronically implanted brain–computer interface in an individual with ALS. Preprint at medRxiv https://doi.org/10.1101/2023.06.30.23291352 (2023). The authors demonstrated speech synthesis of single words from cortical activity during attempted speech in a person with vocal-tract paralysis.

  14. Metzger, S. L. et al. A high-performance neuroprosthesis for speech decoding and avatar control. Nature https://doi.org/10.1038/s41586-023-06443-4 (2023). The authors reported demonstrations of speech synthesis and avatar animation (orofacial-movement decoding), along with improved text-decoding vocabulary size and speed, by using connectionist temporal classification loss to train models to map persistent-somatotopic representations on the sensorimotor cortex into sentences during silent speech (a large vocabulary was used at a speech rate of 78 wpm).

  15. Willett, F. R. et al. A high-performance speech neuroprosthesis. Nature https://doi.org/10.1038/s41586-023-06377-x (2023). The authors improved text decoding to an expansive vocabulary size at 62 wpm, by training models with connectionist temporal classification loss to decode sentences from multiunit activity from microelectrode arrays on precentral gyrus while a person with dysarthria silently attempted to speak.

  16. Card, N. S. et al. An accurate and rapidly calibrating speech neuroprosthesis. Preprint at medRxiv https://doi.org/10.1101/2023.12.26.23300110 (2023). The authors used a similar approach to Willett et al. (2023), demonstrating that doubling the number of microelectrode arrays in the precentral gyrus further improved text-decoding accuracy with a rate of 33 wpm.

  17. Bouchard, K. E., Mesgarani, N., Johnson, K. & Chang, E. F. Functional organization of human sensorimotor cortex for speech articulation. Nature 495, 327–332 (2013). Here, the authors demonstrated the dynamics of somatotopic organization and speech-articulator representations for the jaw, lips, tongue and larynx during production of syllables, directly connecting phonetic production with speech-motor control of vocal-tract movements.

  18. Carey, D., Krishnan, S., Callaghan, M. F., Sereno, M. I. & Dick, F. Functional and quantitative MRI mapping of somatomotor representations of human supralaryngeal vocal tract. Cereb. Cortex 27, 265–278 (2017).

  19. Ludlow, C. L. Central nervous system control of the laryngeal muscles in humans. Respir. Physiol. Neurobiol. 147, 205–222 (2005).

  20. Browman, C. P. & Goldstein, L. Articulatory gestures as phonological units. Phonology 6, 201–251 (1989).

  21. Ladefoged, P. & Johnson, K. A Course in Phonetics (Cengage Learning, 2014).

  22. Berry, J. J. Accuracy of the NDI wave speech research system. J. Speech Lang. Hear. Res. 54, 1295–1301 (2011).

  23. Liu, P. et al. A deep recurrent approach for acoustic-to-articulatory inversion. In 2015 IEEE International Conf. Acoustics, Speech and Signal Processing (ICASSP) https://doi.org/10.1109/ICASSP.2015.7178812 (2015).

  24. Chartier, J., Anumanchipalli, G. K., Johnson, K. & Chang, E. F. Encoding of articulatory kinematic trajectories in human speech sensorimotor cortex. Neuron 98, 1042–1054.e4 (2018). The authors demonstrated that, during continuous speech in able speakers, cortical activity on the ventral sensorimotor cortex encodes coordinated kinematic trajectories of speech articulators and gives rise to a low-dimensional representation of consonants and vowels.

  25. Illa, A. & Ghosh, P. K. Representation learning using convolution neural network for acoustic-to-articulatory inversion. In ICASSP 2019 — 2019 IEEE International Conf. Acoustics, Speech and Signal Processing (ICASSP) https://doi.org/10.1109/ICASSP.2019.8682506 (2019).

  26. Shahrebabaki, A. S., Salvi, G., Svendsen, T. & Siniscalchi, S. M. Acoustic-to-articulatory mapping with joint optimization of deep speech enhancement and articulatory inversion models. IEEE/ACM Trans. Audio Speech Lang. Process. 30, 135–147 (2022).

  27. Tychtl, Z. & Psutka, J. Speech production based on the mel-frequency cepstral coefficients. In 6th European Conf. Speech Communication and Technology (Eurospeech 1999) https://doi.org/10.21437/Eurospeech.1999-510 (ISCA, 1999).

  28. Belyk, M. & Brown, S. The origins of the vocal brain in humans. Neurosci. Biobehav. Rev. 77, 177–193 (2017).

  29. Simonyan, K. & Horwitz, B. Laryngeal motor cortex and control of speech in humans. Neuroscientist 17, 197–208 (2011).

  30. McCawley, J. D. in Tone (ed. Fromkin, V. A.) 113–131 (Academic, 1978).

  31. Murray, I. R. & Arnott, J. L. Toward the simulation of emotion in synthetic speech: a review of the literature on human vocal emotion. J. Acoust. Soc. Am. 93, 1097–1108 (1993).

  32. Chomsky, N. & Halle, M. The Sound Pattern of English (Harper, 1968).

  33. Baddeley, A. Working Memory (Clarendon/Oxford Univ. Press, 1986).

  34. Penfield, W. & Boldrey, E. Somatic motor and sensory representation in the cerebral cortex of man as studied by electrical stimulation. Brain 60, 389–443 (1937). The authors demonstrated evidence of somatotopy on sensorimotor cortex by localizing cortical-stimulation-induced movement and sensation for individual muscle groups.

  35. Penfield, W. & Roberts, L. Speech and Brain-Mechanisms (Princeton Univ. Press, 1959). This study provided insights into cortical control of speech and language through neurosurgical cases, including cortical resection, direct-cortical stimulation and seizure mapping.

  36. Cushing, H. A note upon the Faradic stimulation of the postcentral gyrus in conscious patients. Brain 32, 44–53 (1909). This study was one of the first that applied direct-cortical stimulation to localize function on the sensorimotor cortex.

  37. Roux, F.-E., Niare, M., Charni, S., Giussani, C. & Durand, J.-B. Functional architecture of the motor homunculus detected by electrostimulation. J. Physiol. 598, 5487–5504 (2020).

  38. Jensen, M. A. et al. A motor association area in the depths of the central sulcus. Nat. Neurosci. 26, 1165–1169 (2023).

  39. Eichert, N., Papp, D., Mars, R. B. & Watkins, K. E. Mapping human laryngeal motor cortex during vocalization. Cereb. Cortex 30, 6254–6269 (2020).

  40. Umeda, T., Isa, T. & Nishimura, Y. The somatosensory cortex receives information about motor output. Sci. Adv. 5, eaaw5388 (2019).

  41. Murray, E. A. & Coulter, J. D. Organization of corticospinal neurons in the monkey. J. Comp. Neurol. 195, 339–365 (1981).

  42. Arce, F. I., Lee, J.-C., Ross, C. F., Sessle, B. J. & Hatsopoulos, N. G. Directional information from neuronal ensembles in the primate orofacial sensorimotor cortex. J. Neurophysiol. https://doi.org/10.1152/jn.00144.2013 (2013).

  43. Mugler, E. M. et al. Differential representation of articulatory gestures and phonemes in precentral and inferior frontal gyri. J. Neurosci. 38, 9803–9813 (2018). The authors demonstrated that the ventral sensorimotor cortex, not Broca’s area in the inferior frontal gyrus, best represents speech-articulatory gestures.

  44. Dichter, B. K., Breshears, J. D., Leonard, M. K. & Chang, E. F. The control of vocal pitch in human laryngeal motor cortex. Cell 174, 21–31.e9 (2018). The authors uncovered the causal role of the dorsal laryngeal motor cortex in controlling vocal pitch through feedforward motor commands, as well as additional auditory properties.

  45. Belyk, M., Eichert, N. & McGettigan, C. A dual larynx motor networks hypothesis. Philos. Trans. R. Soc. B 376, 20200392 (2021).

  46. Lu, J. et al. Neural control of lexical tone production in human laryngeal motor cortex. Nat. Commun. 14, 6917 (2023).

  47. Silva, A. B. et al. A neurosurgical functional dissection of the middle precentral gyrus during speech production. J. Neurosci. 42, 8416–8426 (2022).

  48. Itabashi, R. et al. Damage to the left precentral gyrus is associated with apraxia of speech in acute stroke. Stroke 47, 31–36 (2016).

  49. Chang, E. F. et al. Pure apraxia of speech after resection based in the posterior middle frontal gyrus. Neurosurgery 87, E383–E389 (2020).

  50. Levy, D. F. et al. Apraxia of speech with phonological alexia and agraphia following resection of the left middle precentral gyrus: illustrative case. J. Neurosurg. Case Lessons 5, CASE22504 (2023).

  51. Willett, F. R. et al. Hand knob area of premotor cortex represents the whole body in a compositional way. Cell 181, 396–409.e26 (2020).

  52. Stavisky, S. D. et al. Neural ensemble dynamics in dorsal motor cortex during speech in people with paralysis. eLife 8, e46015 (2019). The authors demonstrated that, at single locations on the dorsal precentral gyrus (hand area), neurons are tuned to movements of each key speech articulator.

  53. Venezia, J. H., Thurman, S. M., Richards, V. M. & Hickok, G. Hierarchy of speech-driven spectrotemporal receptive fields in human auditory cortex. NeuroImage 186, 647–666 (2019).

  54. Mesgarani, N., Cheung, C., Johnson, K. & Chang, E. F. Phonetic feature encoding in human superior temporal gyrus. Science 343, 1006–1010 (2014).

  55. Akbari, H., Khalighinejad, B., Herrero, J. L., Mehta, A. D. & Mesgarani, N. Towards reconstructing intelligible speech from the human auditory cortex. Sci. Rep. 9, 874 (2019).

  56. Pasley, B. N. et al. Reconstructing speech from human auditory cortex. PLOS Biol. 10, e1001251 (2012).

  57. Binder, J. R. The Wernicke area. Neurology 85, 2170–2175 (2015).

  58. Binder, J. R. Current controversies on Wernicke’s area and its role in language. Curr. Neurol. Neurosci. Rep. 17, 58 (2017).

  59. Martin, S. et al. Word pair classification during imagined speech using direct brain recordings. Sci. Rep. 6, 25803 (2016).

  60. Pei, X., Barbour, D., Leuthardt, E. C. & Schalk, G. Decoding vowels and consonants in spoken and imagined words using electrocorticographic signals in humans. J. Neural Eng. 8, 046028 (2011).

  61. Martin, S. et al. Decoding spectrotemporal features of overt and covert speech from the human cortex. Front. Neuroeng. https://doi.org/10.3389/fneng.2014.00014 (2014).

  62. Proix, T. et al. Imagined speech can be decoded from low- and cross-frequency intracranial EEG features. Nat. Commun. 13, 48 (2022).

  63. Simanova, I., Hagoort, P., Oostenveld, R. & van Gerven, M. A. J. Modality-independent decoding of semantic information from the human brain. Cereb. Cortex 24, 426–434 (2014).

  64. Wandelt, S. K. et al. Online internal speech decoding from single neurons in a human participant. Preprint at medRxiv https://doi.org/10.1101/2022.11.02.22281775 (2022). The authors decoded neuronal activity from a microelectrode array in the supramarginal gyrus into a set of eight words while the participant in their study imagined speaking.

  65. Acharya, A. B. & Maani, C. V. Conduction aphasia. StatPearls https://www.ncbi.nlm.nih.gov/books/NBK537006/ (StatPearls, 2023).

  66. Price, C. J., Moore, C. J., Humphreys, G. W. & Wise, R. J. Segregating semantic from phonological processes during reading. J. Cogn. Neurosci. 9, 727–733 (1997).

  67. Huth, A. G., de Heer, W. A., Griffiths, T. L., Theunissen, F. E. & Gallant, J. L. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature 532, 453–458 (2016).

  68. Tang, J., LeBel, A., Jain, S. & Huth, A. G. Semantic reconstruction of continuous language from non-invasive brain recordings. Nat. Neurosci. 26, 858–866 (2023). The authors developed an approach to decode functional MRI activity during imagined speech into sentences with preserved semantic meaning, although word-by-word accuracy was limited.

  69. Andrews, J. P. et al. Dissociation of Broca’s area from Broca’s aphasia in patients undergoing neurosurgical resections. J. Neurosurg. https://doi.org/10.3171/2022.6.JNS2297 (2022).

  70. Mohr, J. P. et al. Broca aphasia: pathologic and clinical. Neurology 28, 311–324 (1978).

  71. Matchin, W. & Hickok, G. The cortical organization of syntax. Cereb. Cortex 30, 1481–1498 (2020).

  72. Chang, E. F., Kurteff, G. & Wilson, S. M. Selective interference with syntactic encoding during sentence production by direct electrocortical stimulation of the inferior frontal gyrus. J. Cogn. Neurosci. 30, 411–420 (2018).

  73. Thukral, A., Ershad, F., Enan, N., Rao, Z. & Yu, C. Soft ultrathin silicon electronics for soft neural interfaces: a review of recent advances of soft neural interfaces based on ultrathin silicon. IEEE Nanotechnol. Mag. 12, 21–34 (2018).

  74. Chow, M. S. M., Wu, S. L., Webb, S. E., Gluskin, K. & Yew, D. T. Functional magnetic resonance imaging and the brain: a brief review. World J. Radiol. 9, 5–9 (2017).

  75. Panachakel, J. T. & Ramakrishnan, A. G. Decoding covert speech from EEG — a comprehensive review. Front. Neurosci. 15, 642251 (2021).

  76. Lopez-Bernal, D., Balderas, D., Ponce, P. & Molina, A. A state-of-the-art review of EEG-based imagined speech decoding. Front. Hum. Neurosci. 16, 867281 (2022).

  77. Rabut, C. et al. A window to the brain: ultrasound imaging of human neural activity through a permanent acoustic window. Preprint at bioRxiv https://doi.org/10.1101/2023.06.14.544094 (2023).

  78. Kwon, J., Shin, J. & Im, C.-H. Toward a compact hybrid brain–computer interface (BCI): performance evaluation of multi-class hybrid EEG-fNIRS BCIs with limited number of channels. PLOS ONE 15, e0230491 (2020).

  79. Wittevrongel, B. et al. Optically pumped magnetometers for practical MEG-based brain–computer interfacing. In Brain–Computer Interface Research: A State-of-the-Art Summary 10 (eds Guger, C., Allison, B. Z. & Gunduz, A.) https://doi.org/10.1007/978-3-030-79287-9_4 (Springer International, 2021).

  80. Zheng, H. et al. The emergence of functional ultrasound for noninvasive brain–computer interface. Research 6, 0200 (2023).

  81. Fernández-de Thomas, R. J., Munakomi, S. & De Jesus, O. Craniotomy. StatPearls https://www.ncbi.nlm.nih.gov/books/NBK560922/ (StatPearls, 2024).

  82. Parvizi, J. & Kastner, S. Promises and limitations of human intracranial electroencephalography. Nat. Neurosci. 21, 474–483 (2018).

  83. Rubin, D. B. et al. Interim safety profile from the feasibility study of the BrainGate Neural Interface system. Neurology 100, e1177–e1192 (2023).

  84. Guenther, F. H. et al. A wireless brain–machine interface for real-time speech synthesis. PLoS ONE 4, e8218 (2009). The authors demonstrated above-chance online synthesis of formants, but not words or sentences, from neural activity recorded with an intracortical neurotrophic microelectrode in the precentral gyrus of an individual with anarthria.

  85. Brumberg, J., Wright, E., Andreasen, D., Guenther, F. & Kennedy, P. Classification of intended phoneme production from chronic intracortical microelectrode recordings in speech motor cortex. Front. Neurosci. https://doi.org/10.3389/fnins.2011.00065 (2011). In a follow-up study to Guenther et al. (2009), the authors demonstrated the above-chance classification accuracy of phonemes.

  86. Ray, S. & Maunsell, J. H. R. Different origins of gamma rhythm and high-gamma activity in macaque visual cortex. PLOS Biol. 9, e1000610 (2011).

  87. Ray, S., Crone, N. E., Niebur, E., Franaszczuk, P. J. & Hsiao, S. S. Neural correlates of high-gamma oscillations (60–200 Hz) in macaque local field potentials and their potential implications in electrocorticography. J. Neurosci. 28, 11526–11536 (2008).

  88. Crone, N. E., Boatman, D., Gordon, B. & Hao, L. Induced electrocorticographic gamma activity during auditory perception. Clin. Neurophysiol. 112, 565–582 (2001).

  89. Crone, N. E., Miglioretti, D. L., Gordon, B. & Lesser, R. P. Functional mapping of human sensorimotor cortex with electrocorticographic spectral analysis. II. Event-related synchronization gamma band. Brain 121, 2301–2315 (1998).

  90. Vakani, R. & Nair, D. R. in Handbook of Clinical Neurology Vol. 160 (eds Levin, K. H. & Chauvel, P.) Ch. 20, 313–327 (Elsevier, 2019).

  91. Lee, A. T. et al. Modern intracranial electroencephalography for epilepsy localization with combined subdural grid and depth electrodes with low and improved hemorrhagic complication rates. J. Neurosurg. 1, 1–7 (2022).

  92. Nair, D. R. et al. Nine-year prospective efficacy and safety of brain-responsive neurostimulation for focal epilepsy. Neurology 95, e1244–e1256 (2020).

  93. Degenhart, A. D. et al. Histological evaluation of a chronically-implanted electrocorticographic electrode grid in a non-human primate. J. Neural Eng. 13, 046019 (2016).

  94. Silversmith, D. B. et al. Plug-and-play control of a brain–computer interface through neural map stabilization. Nat. Biotechnol. 39, 326–335 (2021).

  95. Luo, S. et al. Stable decoding from a speech BCI enables control for an individual with ALS without recalibration for 3 months. Adv. Sci. https://doi.org/10.1002/advs.202304853 (2023). The authors demonstrated stability of electrocorticography-based speech decoding in a person with dysarthria by showing that, despite not re-training a model over the course of months, performance did not drop off.

  96. Nordhausen, C. T., Maynard, E. M. & Normann, R. A. Single unit recording capabilities of a 100 microelectrode array. Brain Res. 726, 129–140 (1996).

  97. Normann, R. A. & Fernandez, E. Clinical applications of penetrating neural interfaces and Utah Electrode Array technologies. J. Neural Eng. 13, 061003 (2016).

  98. Wilson, G. H. et al. Decoding spoken English from intracortical electrode arrays in dorsal precentral gyrus. J. Neural Eng. 17, 066007 (2020).

  99. Patel, P. R. et al. Utah array characterization and histological analysis of a multi-year implant in non-human primate motor and sensory cortices. J. Neural Eng. 20, 014001 (2023).

  100. Barrese, J. C. et al. Failure mode analysis of silicon-based intracortical microelectrode arrays in non-human primates. J. Neural Eng. 10, 066014 (2013).

  101. Woeppel, K. et al. Explant analysis of Utah electrode arrays implanted in human cortex for brain–computer-interfaces. Front. Bioeng. Biotechnol. https://doi.org/10.3389/fbioe.2021.759711 (2021).

  102. Wilson, G. H. et al. Long-term unsupervised recalibration of cursor BCIs. Preprint at bioRxiv https://doi.org/10.1101/2023.02.03.527022 (2023).

  103. Degenhart, A. D. et al. Stabilization of a brain–computer interface via the alignment of low-dimensional spaces of neural activity. Nat. Biomed. Eng. 4, 672–685 (2020).

  104. Karpowicz, B. M. et al. Stabilizing brain–computer interfaces through alignment of latent dynamics. Preprint at bioRxiv https://doi.org/10.1101/2022.04.06.487388 (2022).

  105. Fan, C. et al. Plug-and-play stability for intracortical brain–computer interfaces: a one-year demonstration of seamless brain-to-text communication. Preprint at arXiv https://doi.org/10.48550/arXiv.2311.03611 (2023).

  106. Herff, C. et al. Brain-to-text: decoding spoken phrases from phone representations in the brain. Front. Neurosci. https://doi.org/10.3389/fnins.2015.00217 (2015). The authors demonstrated that sequences of phonemes can be decoded from cortical activity in able speakers and assembled into sentences using language models, albeit with high error rates on increased vocabulary sizes.

  107. Mugler, E. M. et al. Direct classification of all American English phonemes using signals from functional speech motor cortex. J. Neural Eng. 11, 035015 (2014). The authors demonstrated that all English phonemes can be decoded from cortical activity of able speakers.

  108. Makin, J. G., Moses, D. A. & Chang, E. F. Machine translation of cortical activity to text with an encoder–decoder framework. Nat. Neurosci. 23, 575–582 (2020). The authors developed a recurrent neural network-based approach to decode cortical activity from able speakers word-by-word into sentences, with word error rates as low as 3%.

  109. Sun, P., Anumanchipalli, G. K. & Chang, E. F. Brain2Char: a deep architecture for decoding text from brain recordings. J. Neural Eng. 17, 066015 (2020). The authors trained a recurrent neural network with connectionist temporal classification loss to decode cortical activity from able speakers into sequences of characters, which were then built into sentences using language models, achieving word error rates as low as 7% with an over 1,000-word vocabulary.

  110. Anumanchipalli, G. K., Chartier, J. & Chang, E. F. Speech synthesis from neural decoding of spoken sentences. Nature 568, 493–498 (2019). The authors developed a biomimetic approach to synthesize full sentences from cortical activity in able speakers: articulatory kinematics were first decoded from cortical activity and an acoustic waveform was subsequently synthesized from this intermediate representation.

  111. Angrick, M. et al. Speech synthesis from ECoG using densely connected 3D convolutional neural networks. J. Neural Eng. 16, 036019 (2019). The authors developed a neural-network-based approach to synthesize single words from cortical activity in able speakers.

  112. Herff, C. et al. Generating natural, intelligible speech from brain activity in motor, premotor, and inferior frontal cortices. Front. Neurosci. https://doi.org/10.3389/fnins.2019.01267 (2019). The authors developed a concatenative speech-synthesis approach for single words in healthy speakers, tailored to limited-sized datasets.

  113. Salari, E. et al. Classification of articulator movements and movement direction from sensorimotor cortex activity. Sci. Rep. 9, 14165 (2019).

  114. Salari, E., Freudenburg, Z. V., Vansteensel, M. J. & Ramsey, N. F. Classification of facial expressions for intended display of emotions using brain–computer interfaces. Ann. Neurol. 88, 631–636 (2020).

  115. Berezutskaya, J. et al. Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models. Preprint at bioRxiv https://doi.org/10.1101/2022.08.02.502503 (2022).

  116. Martin, S. et al. Decoding inner speech using electrocorticography: progress and challenges toward a speech prosthesis. Front. Neurosci. https://doi.org/10.3389/fnins.2018.00422 (2018).

  117. Moses, D. A., Leonard, M. K., Makin, J. G. & Chang, E. F. Real-time decoding of question-and-answer speech dialogue using human cortical activity. Nat. Commun. 10, 3096 (2019).

  118. Ramsey, N. F. et al. Decoding spoken phonemes from sensorimotor cortex with high-density ECoG grids. NeuroImage 180, 301–311 (2018).

  119. Graves, A., Fernández, S., Gomez, F. & Schmidhuber, J. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In Proc. 23rd Int. Conf. Machine Learning — ICML ’06 https://doi.org/10.1145/1143844.1143891 (ACM Press, 2006).

  120. Metzger, S. L. et al. Generalizable spelling using a speech neuroprosthesis in an individual with severe limb and vocal paralysis. Nat. Commun. 13, 6510 (2022).

  121. Pandarinath, C. et al. Latent factors and dynamics in motor cortex and their application to brain–machine interfaces. J. Neurosci. 38, 9390–9401 (2018).

  122. Parrell, B. & Houde, J. Modeling the role of sensory feedback in speech motor control and learning. J. Speech Lang. Hear. Res. 62, 2963–2985 (2019).

  123. Houde, J. & Nagarajan, S. Speech production as state feedback control. Front. Hum. Neurosci. https://doi.org/10.3389/fnhum.2011.00082 (2011).

  124. Sitaram, R. et al. Closed-loop brain training: the science of neurofeedback. Nat. Rev. Neurosci. 18, 86–100 (2017).

    Article  CAS  PubMed  Google Scholar 

  125. Wairagkar, M., Hochberg, L. R., Brandman, D. M. & Stavisky, S. D. Synthesizing speech by decoding intracortical neural activity from dorsal motor cortex. In 2023 11th Int. IEEE/EMBS Conf. Neural Engineering (NER) https://doi.org/10.1109/NER52421.2023.10123880 (IEEE, 2023).

  126. Casanova, E. et al. YourTTS: towards zero-shot multi-speaker TTS and zero-shot voice conversion for everyone. In Proc. 39th Int. Conf. Machine Learning (eds Chaudhuri, K. et al.) Vol. 162, 2709–2720 (PMLR, 2022).

  127. Peters, B., O’Brien, K. & Fried-Oken, M. A recent survey of augmentative and alternative communication use and service delivery experiences of people with amyotrophic lateral sclerosis in the United States. Disabil. Rehabil. Assist. Technol. https://doi.org/10.1080/17483107.2022.2149866 (2022).

  128. Wu, P., Watanabe, S., Goldstein, L., Black, A. W. & Anumanchipalli, G. K. Deep speech synthesis from articulatory representations. In Proc. Interspeech 2022, 779–783 (2022). https://doi.org/10.21437/Interspeech.2022-10892.

  129. Cho, C. J., Wu, P., Mohamed, A. & Anumanchipalli, G. K. Evidence of vocal tract articulation in self-supervised learning of speech. In ICASSP 2023 — 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2023). https://doi.org/10.1109/icassp49357.2023.10094711.

  130. Mehrabian, A. Silent Messages: Implicit Communication of Emotions and Attitudes (Wadsworth, 1981).

  131. Jia, J., Wang, X., Wu, Z., Cai, L. & Meng, H. Modeling the correlation between modality semantics and facial expressions. In Proc. 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference 1–10 (2012).

  132. Sumby, W. H. & Pollack, I. Visual contribution to speech intelligibility in noise. J. Acoust. Soc. Am. 26, 212–215 (1954).

    Article  Google Scholar 

  133. Branco, M. P. et al. Brain–computer interfaces for communication: preferences of individuals with locked-in syndrome. Neurorehabil. Neural Repair. 35, 267–279 (2021).

    Article  PubMed  PubMed Central  Google Scholar 

  134. Patterson, J. R. & Grabois, M. Locked-in syndrome: a review of 139 cases. Stroke 17, 758–764 (1986).

    Article  CAS  PubMed  Google Scholar 

  135. Tomik, B. & Guiloff, R. J. Dysarthria in amyotrophic lateral sclerosis: a review. Amyotroph. Lateral Scler. 11, 4–15 (2010).

    Article  CAS  PubMed  Google Scholar 

  136. Thomas, T. M. et al. Decoding articulatory and phonetic components of naturalistic continuous speech from the distributed language network. J. Neural Eng. 20, 046030 (2023).

    Article  Google Scholar 

  137. Flinker, A. et al. Redefining the role of Broca’s area in speech. Proc. Natl Acad. Sci. USA 112, 2871–2875 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  138. Cogan, G. B. et al. Sensory–motor transformations for speech occur bilaterally. Nature 507, 94–98 (2014).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  139. Rainey, S., Martin, S., Christen, A. & Mégevand, P. & Fourneret, E. Brain recording, mind-reading, and neurotechnology: ethical issues from consumer devices to brain-based speech decoding. Sci. Eng. Ethics 26, 2295–2311 (2020).

    Article  PubMed  PubMed Central  Google Scholar 

  140. Nip, I. & Roth, C. R. in Encyclopedia of Clinical Neuropsychology (eds Kreutzer, J., DeLuca, J. & Caplan, B.) (Springer International, 2017).

  141. Xiong, W. et al. Toward human parity in conversational speech recognition. IEEEACM Trans. Audio Speech Lang. Process. 25, 2410–2423 (2017).

    Article  Google Scholar 

  142. Munteanu, C., Penn, G., Baecker, R., Toms, E. & James, D. Measuring the acceptable word error rate of machine-generated webcast transcripts. In Interspeech 2006 https://doi.org/10.21437/Interspeech.2006-40 (2006).

  143. Panayotov, V., Chen, G., Povey, D. & Khudanpur, S. Librispeech: an ASR corpus based on public domain audio books. In 2015 IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP) https://doi.org/10.1109/ICASSP.2015.7178964 (IEEE, 2015).

  144. Godfrey, J. J., Holliman, E. C. & McDaniel, J. SWITCHBOARD: telephone speech corpus for research and development. In Proc. ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing Vol. 1, 517–520 (1992).

  145. OpenAI. GPT-4 Technical Report. Preprint at https://arxiv.org/abs/2303.08774 (2023).

  146. Trnka, K., Yarrington, D., McCaw, J., McCoy, K. F. & Pennington, C. The effects of word prediction on communication rate for AAC. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers 173–176 (Association for Computational Linguistics, 2007).

  147. Venkatagiri, H. Effect of window size on rate of communication in a lexical prediction AAC system. Augment. Altern. Commun. 10, 105–112 (1994).

    Article  Google Scholar 

  148. Trnka, K., Mccaw, J., Mccoy, K. & Pennington, C. in Human Language Technologies 2007 173–176 (2008).

  149. Kayte, S. N., Mal, M., Gaikwad, S. & Gawali, B. Performance evaluation of speech synthesis techniques for English language. In Proc. Int. Congress on Information and Communication Technology (eds Satapathy, S. C., Bhatt, Y. C., Joshi, A. & Mishra, D. K.) 253–262 https://doi.org/10.1007/978-981-10-0755-2_27 (Springer, 2016).

  150. Wagner, P. et al. Speech synthesis evaluation — state-of-the-art assessment and suggestion for a novel research program. In 10th ISCA Workshop on Speech Synthesis (SSW 10) https://doi.org/10.21437/SSW.2019-19 (ISCA, 2019).

  151. Kubichek, R. Mel-cepstral distance measure for objective speech quality assessment. In Proc. IEEE Pacific Rim Conf. Communications Computers and Signal Processing Vol. 1, 125–128 (1993).

  152. Varshney, S., Farias, D., Brandman, D. M., Stavisky, S. D. & Miller, L. M. Using automatic speech recognition to measure the intelligibility of speech synthesized from brain signals. In 2023 11th Int. IEEE/EMBS Conf. Neural Engineering (NER) https://doi.org/10.1109/NER52421.2023.10123751 (IEEE, 2023).

  153. Radford, A. et al. Robust speech recognition via large-scale weak supervision. Preprint at http://arxiv.org/abs/2212.04356 (2022).

  154. Yates, A. J. Delayed auditory feedback. Psychol. Bull. 60, 213–232 (1963).

    Article  CAS  PubMed  Google Scholar 

  155. Zanette, D. Statistical patterns in written language. Preprint at https://arxiv.org/abs/1412.3336v1 (2014).

  156. Adolphs, S. & Schmitt, N. Lexical coverage of spoken discourse. Appl. Linguist. 24, 425–438 (2003).

    Article  Google Scholar 

  157. Laureys, S. et al. The locked-in syndrome: what is it like to be conscious but paralyzed and voiceless? in Progress in Brain Research Vol. 150 (ed. Laureys, S.) 495–611 (Elsevier, 2005).

  158. Peters, B. et al. Brain–computer interface users speak up: the Virtual Users’ Forum at the 2013 International Brain–Computer Interface Meeting. Arch. Phys. Med. Rehabil. 96, S33–S37 (2015).

    Article  PubMed  PubMed Central  Google Scholar 

  159. Huggins, J. E., Wren, P. A. & Gruis, K. L. What would brain–computer interface users want? Opinions and priorities of potential users with amyotrophic lateral sclerosis. Amyotroph. Lateral Scler. 12, 318–324 (2011).

    Article  PubMed  PubMed Central  Google Scholar 

  160. Kreuzberger, D., Kühl, N. & Hirschl, S. Machine learning operations (MLOps): overview, definition, and architecture. IEEE Access. 11, 31866–31879 (2023).

    Article  Google Scholar 

  161. Gordon, E. M. et al. A somato-cognitive action network alternates with effector regions in motor cortex. Nature https://doi.org/10.1038/s41586-023-05964-2 (2023).

  162. Degenhart, A. D. et al. Remapping cortical modulation for electrocorticographic brain–computer interfaces: a somatotopy-based approach in individuals with upper-limb paralysis. J. Neural Eng. 15, 026021 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  163. Kikkert, S., Pfyffer, D., Verling, M., Freund, P. & Wenderoth, N. Finger somatotopy is preserved after tetraplegia but deteriorates over time. eLife 10, e67713 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  164. Bruurmijn, M. L. C. M., Pereboom, I. P. L., Vansteensel, M. J., Raemaekers, M. A. H. & Ramsey, N. F. Preservation of hand movement representation in the sensorimotor areas of amputees. Brain 140, 3166–3178 (2017).

    Article  PubMed  Google Scholar 

  165. Guenther, F. H. Neural Control of Speech (MIT Press, 2016).

  166. Castellucci, G. A., Kovach, C. K., Howard, M. A., Greenlee, J. D. W. & Long, M. A. A speech planning network for interactive language use. Nature 602, 117–122 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  167. Murphy, E. et al. The spatiotemporal dynamics of semantic integration in the human brain. Nat. Commun. 14, 6336 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  168. Ozker, M., Doyle, W., Devinsky, O. & Flinker, A. A cortical network processes auditory error signals during human speech production to maintain fluency. PLOS Biol. 20, e3001493 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  169. Quirarte, J. A. et al. Language supplementary motor area syndrome correlated with dynamic changes in perioperative task-based functional MRI activations: case report. J. Neurosurg. 134, 1738–1742 (2020).

    Article  PubMed  Google Scholar 

  170. Bullock, L., Forseth, K. J., Woolnough, O., Rollo, P. S. & Tandon, N. Supplementary motor area in speech initiation: a large-scale intracranial EEG evaluation of stereotyped word articulation. Preprint at bioRxiv https://doi.org/10.1101/2023.04.04.535557 (2023).

  171. Oby, E. R. et al. New neural activity patterns emerge with long-term learning. Proc. Natl Acad. Sci. USA 116, 15210–15215 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  172. Luu, T. P., Nakagome, S., He, Y. & Contreras-Vidal, J. L. Real-time EEG-based brain–computer interface to a virtual avatar enhances cortical involvement in human treadmill walking. Sci. Rep. 7, 8895 (2017).

    Article  PubMed  PubMed Central  Google Scholar 

  173. Alimardani, M. et al. BrainComputer Interface and Motor Imagery Training: The Role of Visual Feedback and Embodiment. Evolving BCI Therapy — Engaging Brain State Dynamicshttps://doi.org/10.5772/intechopen.78695 (IntechOpen, 2018).

  174. Orsborn, A. L. et al. Closed-loop decoder adaptation shapes neural plasticity for skillful neuroprosthetic control. Neuron 82, 1380–1393 (2014).

    Article  CAS  PubMed  Google Scholar 

  175. Muller, L. et al. Thin-film, high-density micro-electrocorticographic decoding of a human cortical gyrus. In 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) https://doi.org/10.1109/EMBC.2016.7591001 (2016).

  176. Duraivel, S. et al. High-resolution neural recordings improve the accuracy of speech decoding. Nat. Commun. 14, 6938 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  177. Kaiju, T., Inoue, M., Hirata, M. & Suzuki, T. High-density mapping of primate digit representations with a 1152-channel µECoG array. J. Neural Eng. 18, 036025 (2021).

    Article  Google Scholar 

  178. Woods, V. et al. Long-term recording reliability of liquid crystal polymer µECoG arrays. J. Neural Eng. 15, 066024 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  179. Rachinskiy, I. et al. High-density, actively multiplexed µECoG array on reinforced silicone substrate. Front. Nanotechnol. https://doi.org/10.3389/fnano.2022.837328 (2022).

  180. Sun, J. et al. Intraoperative microseizure detection using a high-density micro-electrocorticography electrode array. Brain Commun. 4, fcac122 (2022).

    Article  PubMed  PubMed Central  Google Scholar 

  181. Ho, E. et al. The layer 7 cortical interface: a scalable and minimally invasive brain–computer interface platform. Preprint at bioRxiv https://doi.org/10.1101/2022.01.02.474656 (2022).

  182. Oxley, T. J. et al. Motor neuroprosthesis implanted with neurointerventional surgery improves capacity for activities of daily living tasks in severe paralysis: first in-human experience. J. NeuroIntervent. Surg. 13, 102–108 (2021).

    Article  Google Scholar 

  183. Chen, R., Canales, A. & Anikeeva, P. Neural recording and modulation technologies. Nat. Rev. Mater. 2, 1–16 (2017).

    Article  CAS  Google Scholar 

  184. Hong, G. & Lieber, C. M. Novel electrode technologies for neural recordings. Nat. Rev. Neurosci. 20, 330–345 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  185. Sahasrabuddhe, K. et al. The Argo: a high channel count recording system for neural recording in vivo. J. Neural Eng. 18, 015002 (2021).

    Article  PubMed  PubMed Central  Google Scholar 

  186. Musk, E. & Neuralink. An integrated brain–machine interface platform with thousands of channels. J. Med. Internet Res. 21, e16194 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  187. Paulk, A. C. et al. Large-scale neural recordings with single neuron resolution using neuropixels probes in human cortex. Nat. Neurosci. 25, 252–263 (2022).

    Article  CAS  PubMed  Google Scholar 

  188. Chung, J. E. et al. High-density single-unit human cortical recordings using the neuropixels probe. Neuron 110, 2409–2421.e3 (2022).

    Article  CAS  PubMed  Google Scholar 

  189. Kingma, D. P. & Welling, M. An introduction to variational autoencoders. Found. Trends Mach. Learn. 12, 307–392 (2019).

    Article  Google Scholar 

  190. Schneider, S., Lee, J. H. & Mathis, M. W. Learnable latent embeddings for joint behavioural and neural analysis. Nature 617, 360–368 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  191. Liu, R. et al. Drop, swap, and generate: a self-supervised approach for generating neural activity. Preprint at http://arxiv.org/abs/2111.02338 (2021).

  192. Cho, C. J., Chang, E. & Anumanchipalli, G. Neural latent aligner: cross-trial alignment for learning representations of complex, naturalistic neural data. In Proc. 40th Int. Conf. Machine Learning 5661–5676 (PMLR, 2023).

  193. Keshtkaran, M. R. et al. A large-scale neural network training framework for generalized estimation of single-trial population dynamics. Nat. Methods 19, 1572–1577 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  194. Berezutskaya, J. et al. Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models. J. Neural Eng. 20, 056010 (2023).

    Article  PubMed Central  Google Scholar 

  195. Touvron, H. et al. LLaMA: Open and Efficient Foundation Language Models. Preprint at https://doi.org/10.48550/arXiv.2302.13971 (2023).

  196. Graves, A. Sequence transduction with recurrent neural networks. Preprint at https://doi.org/10.48550/arXiv.1211.3711 (2012).

  197. Shi, Y. et al. Emformer: efficient memory transformer based acoustic model for low latency streaming speech recognition. Preprint at https://doi.org/10.48550/arXiv.2010.10759 (2020).

  198. Rapeaux, A. B. & Constandinou, T. G. Implantable brain machine interfaces: first-in-human studies, technology challenges and trends. Curr. Opin. Biotechnol. 72, 102–111 (2021).

    Article  CAS  PubMed  Google Scholar 

  199. Matsushita, K. et al. A fully implantable wireless ECoG 128-channel recording device for human brain–machine interfaces: W-HERBS. Front. Neurosci. 12, 511 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  200. Cajigas, I. et al. Implantable brain–computer interface for neuroprosthetic-enabled volitional hand grasp restoration in spinal cord injury. Brain Commun. 3, fcab248 (2021).

    Article  PubMed  PubMed Central  Google Scholar 

  201. Jarosiewicz, B. & Morrell, M. The RNS system: brain-responsive neurostimulation for the treatment of epilepsy. Expert Rev. Med. Dev. 18, 129–138 (2021).

    Article  CAS  Google Scholar 

  202. Lorach, H. et al. Walking naturally after spinal cord injury using a brain–spine interface. Nature 618, 126–133 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  203. Weiss, J. M., Gaunt, R. A., Franklin, R., Boninger, M. L. & Collinger, J. L. Demonstration of a portable intracortical brain–computer interface. Brain-Comput. Interfaces 6, 106–117 (2019).

    Article  Google Scholar 

  204. Kim, J. S., Kwon, S. U. & Lee, T. G. Pure dysarthria due to small cortical stroke. Neurology 60, 1178–1180 (2003).

    Article  PubMed  Google Scholar 

  205. Urban, P. P. et al. Left-hemispheric dominance for articulation: a prospective study on acute ischaemic dysarthria at different localizations. Brain 129, 767–777 (2006).

    Article  CAS  PubMed  Google Scholar 

  206. Wu, P. et al. Speaker-independent acoustic-to-articulatory speech inversion. Preprint at https://doi.org/10.48550/arXiv.2302.06774 (2023).

  207. Oppenheim, A. V., Schafer, R. W. & Schafer, R. W. Discrete-Time Signal Processing (Pearson, 2014).

  208. Kim, J. W., Salamon, J., Li, P. & Bello, J. P. CREPE: a convolutional representation for pitch estimation. Preprint at https://doi.org/10.48550/arXiv.1802.06182 (2018).

  209. Park, K. & Kim, J. g2pE. Github https://github.com/Kyubyong/g2p (2019).

  210. Duffy, J. R. Motor Speech Disorders: Substrates, Differential Diagnosis, and Management (Elsevier Health Sciences, 2019).

  211. Basilakos, A., Rorden, C., Bonilha, L., Moser, D. & Fridriksson, J. Patterns of poststroke brain damage that predict speech production errors in apraxia of speech and aphasia dissociate. Stroke 46, 1561–1566 (2015).

    Article  PubMed  PubMed Central  Google Scholar 

  212. Berthier, M. L. Poststroke aphasia: epidemiology, pathophysiology and treatment. Drugs Aging 22, 163–182 (2005).

    Article  CAS  PubMed  Google Scholar 

  213. Wilson, S. M. et al. Recovery from aphasia in the first year after stroke. Brain 146, 1021–1039 (2022).

    Article  PubMed Central  Google Scholar 

  214. Marzinske, M. Help for speech, language disorders. Mayo Clinic Health System https://www.mayoclinichealthsystem.org/hometown-health/speaking-of-health/help-is-available-for-speech-and-language-disorders (2022).

  215. Amyotrophic lateral sclerosis. CDC https://www.cdc.gov/als/WhatisALS.html (CDC, 2022).

  216. Sokolov, A. Inner Speech and Thought (Springer Science & Business Media, 2012).

  217. Alderson-Day, B. & Fernyhough, C. Inner speech: development, cognitive functions, phenomenology, and neurobiology. Psychol. Bull. 141, 931–965 (2015).

    Article  PubMed  PubMed Central  Google Scholar 

  218. Sankaran, N., Moses, D., Chiong, W. & Chang, E. F. Recommendations for promoting user agency in the design of speech neuroprostheses. Front. Hum. Neurosci. 17, 1298129 (2023).

    Article  PubMed  PubMed Central  Google Scholar 

  219. Sun, X. & Ye, B. The functional differentiation of brain–computer interfaces (BCIs) and its ethical implications. Humanit. Soc. Sci. Commun. 10, 1–9 (2023).

    Article  Google Scholar 

  220. Ienca, M., Haselager, P. & Emanuel, E. J. Brain leaks and consumer neurotechnology. Nat. Biotechnol. 36, 805–810 (2018).

    Article  CAS  PubMed  Google Scholar 

  221. Yuste, R. Advocating for neurodata privacy and neurotechnology regulation. Nat. Protoc. 18, 2869–2875 (2023).

    Article  CAS  PubMed  Google Scholar 

  222. Kamal, A. H. et al. A person-centered, registry-based learning health system for palliative care: a path to coproducing better outcomes, experience, value, and science. J. Palliat. Med. 21, S-61 (2018).

    Article  PubMed Central  Google Scholar 

  223. Alford, J. The multiple facets of co-production: building on the work of Elinor Ostrom. Public. Manag. Rev. 16, 299–316 (2014).

    Article  Google Scholar 

  224. Institute of Medicine (US) Roundtable on Value & Science-Driven Health Care. Clinical Data as the Basic Staple of Health Learning: Creating and Protecting a Public Good: Workshop Summary (National Academies Press, 2011).

Download references

Acknowledgements

The authors are incredibly grateful to the many people who enrolled in the studies described above. A.B.S. was supported by the National Institute on Deafness and Other Communication Disorders of the National Institutes of Health under award number F30DC021872. K.T.L. is supported by the National Science Foundation GRFP. J.R.L. and D.A.M. were supported by the National Institutes of Health grant U01 DC018671-01A1.

Author information

Contributions

E.F.C. and A.B.S. researched data for the article and contributed substantially to discussion of the content. All authors wrote the article and reviewed and/or edited the manuscript before submission.

Corresponding author

Correspondence to Edward F. Chang.

Ethics declarations

Competing interests

D.A.M., J.R.L. and E.F.C. are inventors on a pending provisional UCSF patent application that is relevant to the neural-decoding approaches surveyed in this work. E.F.C. is an inventor on patent application PCT/US2020/028926, D.A.M. and E.F.C. are inventors on patent application PCT/US2020/043706 and E.F.C. is an inventor on patent US9905239B2, which are broadly relevant to the neural-decoding approaches surveyed in this work. E.F.C. is a co-founder of Echo Neurotechnologies, LLC. All other authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Neuroscience thanks Gregory Cogan, who co-reviewed with Suseendrakumar Duraivel; Marcel van Gerven; Christian Herff; and Cynthia Chestek for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Anarthria

A speech-motor disorder characterized by an inability to move the vocal-tract muscles to articulate speech.

Aphasias

Disorders of understanding or expressing language.

Attempted speech

An instruction given to individuals with vocal-tract paralysis to attempt to speak as best they can, even if the attempt is not intelligible.

Concatenative synthesizer

A speech-synthesis approach that relies on matching neural activity with discrete units of a speech waveform that are then concatenated together.

Corticobulbar system

The pathway through which motor commands from the cortex reach the muscles of the vocal tract. At a high level, cortical motor neurons send axons via the corticobulbar tract to terminate in cranial nerve nuclei in the brainstem. Second-order motor neurons in the cranial nerve nuclei then send axons, which bundle to form cranial nerves, to innervate the muscles of the vocal tract.

Formants

The resonant frequencies of the vocal tract, which are critical for forming different vowel sounds.

Language models

Models that are trained to capture the statistical patterns of word occurrences in natural language.

Locked-in syndrome

A clinical condition in which an individual retains cognitive capacity but has limited voluntary motor function. Locked-in syndrome is a spectrum, ranging from fully locked-in states (no residual voluntary motor function) to partially locked-in states (some residual voluntary motor function, such as head movements).

Mime

An attempt to move vocal-tract muscles without attempting to vocalize.

Sensorimotor cortex

The area of the cortex composed of the precentral and postcentral gyri, which are primarily responsible for motor control and sensation, respectively.

Silently attempted speech

An instruction given to individuals with vocal-tract paralysis to attempt to speak as best they can, but without vocalizing.

Speech articulators

The vocal-tract muscle groups that are important for producing (articulating) speech, including the lips, jaw, tongue and larynx.

Syntax

The arrangement and structure of words to form coherent sentences.

Vocal-tract paralysis

An inability to contract and move the speech articulators, often caused by injury to descending motor-neuron tracts in the brainstem.

Zipf’s law

An empirical law stating that the frequency of an item is approximately inversely proportional to its frequency rank.
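
As a rough illustration of this definition, the idealized relationship can be written as f(r) = f(1) / r^s, where f(r) is the frequency of the item at rank r and the exponent s is close to 1. The sketch below is not taken from the article: the function names, the exponent parameter and the toy word list are assumptions chosen for illustration, and the counts were picked by hand so that they follow the idealized law exactly. Real text only approximates it.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, token, count) tuples, most frequent token first."""
    counts = Counter(tokens)
    # Sort by descending count; break ties alphabetically for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, tok, n) for rank, (tok, n) in enumerate(ordered, start=1)]


def zipf_expected(top_count, rank, exponent=1.0):
    """Count predicted at a given rank by an idealized Zipf law."""
    return top_count / rank ** exponent


# Hypothetical toy corpus, constructed so that the counts are 12, 6, 4 and 3.
tokens = ["the"] * 12 + ["of"] * 6 + ["and"] * 4 + ["to"] * 3

table = rank_frequency(tokens)
for rank, tok, n in table:
    # Compare the observed count with the count predicted from the top rank.
    print(rank, tok, n, zipf_expected(table[0][2], rank))
```

In this constructed example the observed and predicted counts coincide at every rank. With a real vocabulary, the exponent would normally be fitted to the data rather than fixed at 1.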

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Silva, A.B., Littlejohn, K.T., Liu, J.R. et al. The speech neuroprosthesis. Nat. Rev. Neurosci. (2024). https://doi.org/10.1038/s41583-024-00819-9

  • DOI: https://doi.org/10.1038/s41583-024-00819-9
