Brain–computer interfaces (BCIs) can restore communication to people who have lost the ability to move or speak. So far, a major focus of BCI research has been on restoring gross motor skills, such as reaching and grasping [1–5] or point-and-click typing with a computer cursor [6,7]. However, rapid sequences of highly dexterous behaviours, such as handwriting or touch typing, might enable faster rates of communication. Here we developed an intracortical BCI that decodes attempted handwriting movements from neural activity in the motor cortex and translates it to text in real time, using a recurrent neural network decoding approach. With this BCI, our study participant, whose hand was paralysed from spinal cord injury, achieved typing speeds of 90 characters per minute with 94.1% raw accuracy online, and greater than 99% accuracy offline with a general-purpose autocorrect. To our knowledge, these typing speeds exceed those reported for any other BCI, and are comparable to typical smartphone typing speeds of individuals in the age group of our participant (115 characters per minute) [8]. Finally, theoretical considerations explain why temporally complex movements, such as handwriting, may be fundamentally easier to decode than point-to-point movements. Our results open a new approach for BCIs and demonstrate the feasibility of accurately decoding rapid, dexterous movements years after paralysis.
All neural data needed to reproduce the findings in this study are publicly available at the Dryad repository (https://doi.org/10.5061/dryad.wh70rxwmv). The dataset contains neural activity recorded during the attempted handwriting of 1,000 sentences (43,501 characters) over 10.7 hours.
Code that implements an offline reproduction of the central findings in this study (high-performance neural decoding with an RNN) is publicly available on GitHub at https://github.com/fwillett/handwritingBCI.
Hochberg, L. R. et al. Reach and grasp by people with tetraplegia using a neurally controlled robotic arm. Nature 485, 372–375 (2012).
Collinger, J. L. et al. High-performance neuroprosthetic control by an individual with tetraplegia. Lancet 381, 557–564 (2013).
Aflalo, T. et al. Neurophysiology. Decoding motor imagery from the posterior parietal cortex of a tetraplegic human. Science 348, 906–910 (2015).
Bouton, C. E. et al. Restoring cortical control of functional movement in a human with quadriplegia. Nature 533, 247–250 (2016).
Ajiboye, A. B. et al. Restoration of reaching and grasping movements through brain-controlled muscle stimulation in a person with tetraplegia: a proof-of-concept demonstration. Lancet 389, 1821–1830 (2017).
Jarosiewicz, B. et al. Virtual typing by people with tetraplegia using a self-calibrating intracortical brain–computer interface. Sci. Transl. Med. 7, 313ra179 (2015).
Pandarinath, C. et al. High performance communication by people with paralysis using an intracortical brain–computer interface. eLife 6, e18554 (2017).
Palin, K., Feit, A. M., Kim, S., Kristensson, P. O. & Oulasvirta, A. How do people type on mobile devices? Observations from a study with 37,000 volunteers. In Proc. 21st International Conference on Human–Computer Interaction with Mobile Devices and Services 1–12 (Association for Computing Machinery, 2019).
Yousry, T. A. et al. Localization of the motor hand area to a knob on the precentral gyrus. A new landmark. Brain 120, 141–157 (1997).
Willett, F. R. et al. Hand knob area of premotor cortex represents the whole body in a compositional way. Cell 181, 396–409 (2020).
Williams, A. H. et al. Discovering precise temporal patterns in large-scale neural recordings through robust and interpretable time warping. Neuron 105, 246–259 (2020).
Hinton, G. et al. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29, 82–97 (2012).
Graves, A., Mohamed, A. & Hinton, G. Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing 6645–6649 (2013).
Xiong, W. et al. The Microsoft 2017 Conversational Speech Recognition System. Preprint at https://arxiv.org/abs/1708.06073 (2017).
He, Y. et al. Streaming end-to-end speech recognition for mobile devices. In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing 6381–6385 (2019).
Anumanchipalli, G. K., Chartier, J. & Chang, E. F. Speech synthesis from neural decoding of spoken sentences. Nature 568, 493–498 (2019).
Makin, J. G., Moses, D. A. & Chang, E. F. Machine translation of cortical activity to text with an encoder-decoder framework. Nat. Neurosci. 23, 575–582 (2020).
Chen, X. et al. High-speed spelling with a noninvasive brain–computer interface. Proc. Natl Acad. Sci. USA 112, E6058–E6067 (2015).
Dickey, A. S., Suminski, A., Amit, Y. & Hatsopoulos, N. G. Single-unit stability using chronically implanted multielectrode arrays. J. Neurophysiol. 102, 1331–1339 (2009).
Eleryan, A. et al. Tracking single units in chronic, large scale, neural recordings for brain machine interface applications. Front. Neuroeng. 7, 23 (2014).
Downey, J. E., Schwed, N., Chase, S. M., Schwartz, A. B. & Collinger, J. L. Intracortical recording stability in human brain–computer interface users. J. Neural Eng. 15, 046016 (2018).
Willett, F. R. et al. Signal-independent noise in intracortical brain–computer interfaces causes movement time properties inconsistent with Fitts’ law. J. Neural Eng. 14, 026010 (2017).
Gao, P. et al. A theory of multineuronal dimensionality, dynamics and measurement. Preprint at https://doi.org/10.1101/214262 (2017).
Musallam, S., Corneil, B. D., Greger, B., Scherberger, H. & Andersen, R. A. Cognitive control signals for neural prosthetics. Science 305, 258–262 (2004).
Santhanam, G., Ryu, S. I., Yu, B. M., Afshar, A. & Shenoy, K. V. A high-performance brain–computer interface. Nature 442, 195–198 (2006).
Cunningham, J. P., Yu, B. M., Gilja, V., Ryu, S. I. & Shenoy, K. V. Toward optimal target placement for neural prosthetic devices. J. Neurophysiol. 100, 3445–3457 (2008).
Pels, E. G. M., Aarnoutse, E. J., Ramsey, N. F. & Vansteensel, M. J. Estimated prevalence of the target population for brain–computer interface neurotechnology in the Netherlands. Neurorehabil. Neural Repair 31, 677–685 (2017).
Vansteensel, M. J. et al. Fully implanted brain–computer interface in a locked-in patient with ALS. N. Engl. J. Med. 375, 2060–2066 (2016).
Nijboer, F. et al. A P300-based brain–computer interface for people with amyotrophic lateral sclerosis. Clin. Neurophysiol. 119, 1909–1916 (2008).
Townsend, G. et al. A novel P300-based brain–computer interface stimulus presentation paradigm: moving beyond rows and columns. Clin. Neurophysiol. 121, 1109–1120 (2010).
McCane, L. M. et al. P300-based brain–computer interface (BCI) event-related potentials (ERPs): people with amyotrophic lateral sclerosis (ALS) vs. age-matched controls. Clin. Neurophysiol. 126, 2124–2131 (2015).
Wolpaw, J. R. et al. Independent home use of a brain–computer interface by people with amyotrophic lateral sclerosis. Neurology 91, e258–e267 (2018).
Bacher, D. et al. Neural point-and-click communication by a person with incomplete locked-in syndrome. Neurorehabil. Neural Repair 29, 462–471 (2015).
Mugler, E. M. et al. Direct classification of all American English phonemes using signals from functional speech motor cortex. J. Neural Eng. 11, 035015 (2014).
Nurmikko, A. Challenges for large-scale cortical interfaces. Neuron 108, 259–269 (2020).
Vázquez-Guardado, A., Yang, Y., Bandodkar, A. J. & Rogers, J. A. Recent advances in neurotechnologies with broad potential for neuroscience research. Nat. Neurosci. 23, 1522–1536 (2020).
Simeral, J. D., Kim, S.-P., Black, M. J., Donoghue, J. P. & Hochberg, L. R. Neural control of cursor trajectory and click by a human with tetraplegia 1000 days after implant of an intracortical microelectrode array. J. Neural Eng. 8, 025027 (2011).
Bullard, A. J., Hutchison, B. C., Lee, J., Chestek, C. A. & Patil, P. G. Estimating risk for future intracranial, fully implanted, modular neuroprosthetic systems: a systematic review of hardware complications in clinical deep brain stimulation and experimental human intracortical arrays. Neuromodulation 23, 411–426 (2020).
Nuyujukian, P. et al. Cortical control of a tablet computer by people with paralysis. PLoS One 13, e0204566 (2018).
Musk, E. An integrated brain–machine interface platform with thousands of channels. J. Med. Internet Res. 21, e16194 (2019).
Sahasrabuddhe, K. et al. The Argo: a high channel count recording system for neural recording in vivo. J. Neural Eng. 18, 015002 (2021).
Sussillo, D., Stavisky, S. D., Kao, J. C., Ryu, S. I. & Shenoy, K. V. Making brain–machine interfaces robust to future neural variability. Nat. Commun. 7, 13749 (2016).
Dyer, E. L. et al. A cryptography-based approach for movement decoding. Nat. Biomed. Eng. 1, 967–976 (2017).
Degenhart, A. D. et al. Stabilization of a brain–computer interface via the alignment of low-dimensional spaces of neural activity. Nat. Biomed. Eng. 4, 672–685 (2020).
We thank participant T5 and his caregivers for their dedicated contributions to this research, N. Lam, E. Siauciunas and B. Davis for administrative support and E. Woodrum for the drawings in Figs. 1a, 2a. F.R.W. and D.T.A. acknowledge the support of the Howard Hughes Medical Institute. L.R.H. acknowledges the support of the Office of Research and Development, Rehabilitation R&D Service, US Department of Veterans Affairs (A2295R, N2864C); the National Institute of Neurological Disorders and Stroke and BRAIN Initiative (UH2NS095548); and the National Institute on Deafness and Other Communication Disorders (R01-DC009899, U01-DC017844). K.V.S. and J.M.H. acknowledge the support of the National Institute on Deafness and Other Communication Disorders (R01-DC014034, U01-DC017844); the National Institute of Neurological Disorders and Stroke (UH2-NS095548, U01-NS098968); L. and P. Garlick; S. and B. Reeves; and the Wu Tsai Neurosciences Institute at Stanford. K.V.S. acknowledges the support of the Simons Foundation Collaboration on the Global Brain 543045 and the Howard Hughes Medical Institute (K.V.S. is a Howard Hughes Medical Institute Investigator). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
The MGH Translational Research Center has a clinical research support agreement with Neuralink, Paradromics and Synchron, for which L.R.H. provides consultative input. J.M.H. is a consultant for Neuralink, and serves on the Medical Advisory Board of Enspire DBS. K.V.S. consults for Neuralink and CTRL-Labs (part of Facebook Reality Labs) and is on the scientific advisory boards of MIND-X, Inscopix and Heal. F.R.W., J.M.H. and K.V.S. are inventors on patent application US 2021/0064135 A1 (the applicant is Stanford University), which covers the neural decoding approach taken in this work. All other authors have no competing interests.
Peer review information Nature thanks Karim Oweiss and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
We used a two-layer gated recurrent unit (GRU) recurrent neural network architecture to convert sequences of neural firing rate vectors xt (which were temporally smoothed and binned at 20 ms) into sequences of character probability vectors yt and ‘new character’ probability scalars zt. The yt vectors describe the probability of each character being written at that moment in time, and the zt scalars go high whenever the RNN detects that T5 is beginning to write any new character. Note that the top RNN layer runs at a slower frequency than the bottom layer, which we found improved the speed of training by making it easier to hold information in memory for long time periods. Thus, the RNN outputs are updated only once every 100 ms. Also, note that we used a day-specific affine transform to account for day-to-day changes in the neural activity (bottom row)—this helps the RNN to account for changes in neural tuning caused by electrode array micromotion or brain plasticity when training data are combined across multiple days.
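The data flow described in this caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (their code is in the GitHub repository listed above): the weights are random and untrained, and the hidden size (64), the number of output characters (26) and all function and variable names are assumptions chosen for illustration. Only the 20-ms bins, the 100-ms output rate, the 192 electrodes and the day-specific affine input transform come from the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

class GRULayer:
    """Minimal GRU cell with random, untrained weights (illustration only)."""
    def __init__(self, n_in, n_hidden, rng):
        # Index 0: update gate, 1: reset gate, 2: candidate state.
        self.W = rng.normal(0, 0.1, (3, n_hidden, n_in))
        self.U = rng.normal(0, 0.1, (3, n_hidden, n_hidden))
        self.b = np.zeros((3, n_hidden))
        self.n_hidden = n_hidden

    def step(self, x, h):
        z = sigmoid(self.W[0] @ x + self.U[0] @ h + self.b[0])
        r = sigmoid(self.W[1] @ x + self.U[1] @ h + self.b[1])
        c = np.tanh(self.W[2] @ x + self.U[2] @ (r * h) + self.b[2])
        return (1 - z) * h + z * c

def decode(x_seq, day_affine, gru1, gru2, W_y, w_z, skip=5):
    """Bottom layer runs on every 20-ms bin; top layer and outputs every `skip` bins."""
    A, b = day_affine                      # day-specific affine input transform
    h1 = np.zeros(gru1.n_hidden)
    h2 = np.zeros(gru2.n_hidden)
    y_out, z_out = [], []
    for t, x in enumerate(x_seq):
        h1 = gru1.step(A @ x + b, h1)
        if (t + 1) % skip == 0:            # 5 bins x 20 ms = 100 ms
            h2 = gru2.step(h1, h2)
            y_out.append(softmax(W_y @ h2))   # character probabilities y_t
            z_out.append(sigmoid(w_z @ h2))   # 'new character' probability z_t
    return np.array(y_out), np.array(z_out)

rng = np.random.default_rng(0)
n_elec, n_hidden, n_chars = 192, 64, 26    # hidden size and n_chars are assumptions
gru1 = GRULayer(n_elec, n_hidden, rng)
gru2 = GRULayer(n_hidden, n_hidden, rng)
W_y = rng.normal(0, 0.1, (n_chars, n_hidden))
w_z = rng.normal(0, 0.1, n_hidden)
day_affine = (np.eye(n_elec), np.zeros(n_elec))  # identity stands in for a fitted transform
x_seq = rng.normal(size=(100, n_elec))           # 100 bins = 2 s of z-scored features
y, z = decode(x_seq, day_affine, gru1, gru2, W_y, w_z)
print(y.shape, z.shape)
```

In this sketch, 2 s of input yields 20 output steps, one per 100 ms. A trained system would learn one affine transform per recording day while sharing the GRU weights across days.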
a, Diagram of the session flow for copy-typing and free-typing sessions (each rectangle corresponds to one block of data). First, single-letter and sentence training data are collected (blue and red blocks). Next, the RNN is trained using the newly collected data plus all of the previous days’ data (purple block). Finally, the RNN is held fixed and evaluated (green blocks). b, Diagram of the data processing and RNN training process (purple block in a). First, the single-letter data are time-warped and averaged to create spatiotemporal templates of neural activity for each character. These templates are used to initialize the hidden Markov models (HMMs) for sentence labelling. After labelling, the observed data are cut apart and rearranged into new sequences of characters to make synthetic sentences. Finally, the synthetic sentences are combined with the real sentences to train the RNN. c, Diagram of a forced-alignment HMM used to label the sentence ‘few black taxis drive up major roads on quiet hazy nights’. The HMM states correspond to the sequence of characters in the sentence. d, The label quality can be verified with cross-correlation heat maps made by correlating the single character neural templates with the real data. The HMM-identified character start times form clear hotspots on the heat maps. Note that these heat maps are depicted only to qualitatively show label quality and are not used for training (only the character start times are needed to generate the targets for RNN training). e, To generate new synthetic sentences, the neural data corresponding to each labelled character in the real data are cut out of the data stream and put into a snippet library. These snippets are then pulled from the library at random, stretched or compressed in time by up to 30% (to add more artificial timing variability) and combined into new sentences.
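The synthetic-sentence step in panel e can be sketched as follows. This is an illustrative sketch, not the authors' code: the snippet library here is filled with random numbers, linear interpolation is an assumed resampling method, and the names `time_warp` and `make_synthetic_sentence` are hypothetical. Only the random snippet selection and the stretch or compression of up to 30% come from the caption.

```python
import numpy as np

def time_warp(snippet, factor):
    """Resample a (time, channels) snippet to about len * factor bins (linear interpolation)."""
    n_old = snippet.shape[0]
    n_new = max(2, int(round(n_old * factor)))
    old_t = np.linspace(0.0, 1.0, n_old)
    new_t = np.linspace(0.0, 1.0, n_new)
    return np.column_stack(
        [np.interp(new_t, old_t, snippet[:, c]) for c in range(snippet.shape[1])]
    )

def make_synthetic_sentence(text, library, rng, max_warp=0.3):
    """Concatenate randomly chosen, randomly time-warped snippets, one per character."""
    pieces, starts, t = [], [], 0
    for ch in text:
        candidates = library[ch]
        snippet = candidates[rng.integers(len(candidates))]
        factor = rng.uniform(1 - max_warp, 1 + max_warp)  # stretch or compress by up to 30%
        warped = time_warp(snippet, factor)
        starts.append(t)                                  # character start time, used as a label
        pieces.append(warped)
        t += warped.shape[0]
    return np.concatenate(pieces, axis=0), starts

# Toy library: two random 50-bin, 4-channel snippets per character (placeholder data).
rng = np.random.default_rng(0)
library = {ch: [rng.normal(size=(50, 4)) for _ in range(2)] for ch in "abc"}
data, starts = make_synthetic_sentence("abcab", library, rng)
print(data.shape, starts)
```

The returned start times play the role of the HMM-derived character labels, so each synthetic sentence comes with its own training targets.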
a, Training with synthetic data (left) and artificial white noise added to the inputs (right) were both essential for high performance. Data are shown from a grid search over both parameters, and lines show performance at the best value for the other parameter. Results indicate that both parameters are needed for high performance, even when the other is at the best value. Using synthetic data is more important when the size of the dataset is highly limited, as is the case when training on only a single day of data (blue line). Note that the inputs given to the RNN were z-scored, so the input white noise is in units of standard deviations of the input features. b, Artificial noise added to the feature means (random offsets and slow changes in the baseline firing rate) greatly improves the ability of the RNN to generalize to new blocks of data that occur later in the session, but does not help the RNN to generalize to new trials within blocks of data on which it was already trained. This is because feature means change slowly over time. For each parameter setting, three separate RNNs were trained (circles); results show low variability across RNN training runs. c, Training an RNN with all of the datasets combined improves performance relative to training on each day separately. Each circle shows the performance on one of seven days. d, Using separate input layers for each day is better than using a single layer across all days. e, Improvements in character error rates are summarized for each parameter. 95% CIs were computed with bootstrap resampling of single trials (n = 10,000). As shown in the table, all parameters show a statistically significant improvement for at least one condition (CIs do not intersect zero).
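The two kinds of artificial noise in panels a and b can be sketched as a single augmentation function. The noise magnitudes below are placeholders rather than the values found by the grid search, the random walk is one assumed way to model "slow changes in the baseline firing rate", and the function name `augment` is hypothetical.

```python
import numpy as np

def augment(features, rng, white_sd=1.0, offset_sd=0.5, drift_sd=0.02):
    """Return a noisy copy of z-scored features with shape (time, channels).

    white_sd : s.d. of independent noise added to every bin (units of feature s.d.)
    offset_sd: s.d. of a constant per-channel offset (random shift of the feature mean)
    drift_sd : per-bin step s.d. of a random walk (slow baseline drift; assumed model)
    """
    n_t, n_ch = features.shape
    white = rng.normal(0, white_sd, (n_t, n_ch))
    offset = rng.normal(0, offset_sd, (1, n_ch))
    drift = np.cumsum(rng.normal(0, drift_sd, (n_t, n_ch)), axis=0)
    return features + white + offset + drift

rng = np.random.default_rng(0)
clean = np.zeros((500, 192))   # 10 s of 20-ms bins on 192 electrodes (placeholder data)
noisy = augment(clean, rng)
print(noisy.shape)
```

Drawing fresh noise for every training batch means the network never sees exactly the same feature means twice, which is one plausible reading of why this helps generalization to later blocks.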
a, To visualize how much the neural recordings changed across time, decoded pen-tip trajectories were plotted for two example letters (m and z) for all 10 days of data (columns), using decoders trained on all other days (rows). Each session is labelled according to the number of days passed relative to 9 December 2019 (day 4). Results show that although patterns of neural activity clearly change over time, their essential structure is largely conserved (as decoders trained on past days transfer readily to future days). b, The correlation (Pearson’s r) between the neural activity patterns of each session was computed for each pair of sessions and plotted as a function of the number of days separating each pair. Blue circles show the correlation computed in the full neural space (all 192 electrodes), whereas red circles show the correlation in the ‘anchor’ space (top 10 principal components of the earlier session). High values indicate a high similarity in how characters are neurally encoded across days. The fact that correlations are higher in the anchor space suggests that the structure of the neural patterns stays largely the same as it slowly rotates into a new space, causing shrinkage in the original space but little change in structure. c, A visualization of how each character’s neural representation changes over time, as viewed through the top two PCs of the original ‘anchor’ space. Each circle represents the neural activity pattern for a single character, and each x symbol shows that same character on a later day (lines connect matching characters). Left, a pair of sessions with only two days between them (day −2 to 0); right, a pair of sessions with 11 days between them (day −2 to 9). The relative positioning of the neural patterns remains similar across days, but most conditions shrink noticeably towards the origin. 
This is consistent with the neural representations slowly rotating out of the original space into a new space, and suggests that scaling up the input features may help a decoder to transfer more accurately to a future session (by counteracting this shrinkage effect). d, Similar to Fig. 3b, copy-typing data from eight sessions were used to assess offline whether scaling up the decoder inputs improves performance when evaluating the decoder on a future session (when no decoder retraining is used). All session pairs (X, Y) were considered. Decoders were first initialized using all data from session X and earlier, then evaluated on session Y under different input-scaling factors (for example, an input scale of 1.5 means that input features were scaled up by 50%). Lines indicate the mean raw character error rate and shaded regions show 95% CIs. Results show that when long periods of time pass between sessions, input scaling improves performance. We therefore used an input-scaling factor of 1.5 when assessing decoder performance in the ‘no retraining’ conditions of Fig. 3.
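A toy simulation can make the rotation and shrinkage argument concrete. Everything below is simulated under stated assumptions: 26 characters, a 10-dimensional latent structure, and a later session that keeps 70% of its amplitude in the original axes. It shows why the correlation is higher in the anchor space than in the full space when patterns rotate into new dimensions. It is not a reanalysis of the recorded data.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation between two pattern matrices, flattened."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

def anchor_basis(patterns, n_pcs=10):
    """Top principal axes of the earlier session's patterns (characters x features)."""
    centred = patterns - patterns.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[:n_pcs].T                      # shape (features, n_pcs)

rng = np.random.default_rng(0)
n_chars, n_feat, n_latent = 26, 192, 10     # n_chars and n_latent are assumptions
latent = rng.normal(size=(n_chars, n_latent))

# Two mutually orthogonal sets of axes: the original space and a 'new' space.
q, _ = np.linalg.qr(rng.normal(size=(n_feat, 2 * n_latent)))
old_axes, new_axes = q[:, :n_latent], q[:, n_latent:]

keep = 0.7                                   # fraction of amplitude left in the old axes
early = latent @ old_axes.T
late = latent @ (keep * old_axes + np.sqrt(1 - keep**2) * new_axes).T

basis = anchor_basis(early)
r_full = pearson_r(early, late)                      # lowered by the rotation
r_anchor = pearson_r(early @ basis, late @ basis)    # structure preserved
shrink = np.linalg.norm(late @ basis) / np.linalg.norm(early @ basis)
print(r_full, r_anchor, shrink)
```

In this toy case the patterns seen through the anchor space are an exact scaled-down copy of the originals, so multiplying the inputs by 1 / shrink would undo the effect. Real recordings will not rotate this cleanly, which is presumably why the scaling factor was chosen empirically.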
a, Example noise vectors and covariance matrix for temporally correlated noise. On the left, example noise vectors are plotted (each line depicts a single example). Noise vectors are shown for all 100 time steps of neuron 1. On the right, the covariance matrix used to generate temporally correlated noise is plotted (dimensions = 200 × 200). The first 100 time steps describe the noise of neuron 1 and the last 100 time steps describe the noise of neuron 2. The diagonal band creates noise that is temporally correlated within each simulated neuron (but the two neurons are uncorrelated with each other). b, Classification accuracy when using a maximum likelihood classifier to classify between all four possible trajectories in the presence of temporally correlated noise. Even in the presence of temporally correlated noise, the time-varying trajectories are still much easier to classify. c, Example noise vectors and noise covariance matrix for noise that is correlated with the signal (that is, noise that is concentrated only in spatiotemporal dimensions that span the class means). Unlike the temporally correlated noise, this covariance matrix generates spatiotemporal noise that has correlations between time steps and neurons. d, Classification accuracy in the presence of signal-correlated noise. Again, time-varying trajectories are easier to classify than constant trajectories. See Supplementary Note 1 for a detailed interpretation of this figure.
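The basic effect can be reproduced with a simplified simulation. Note the departures from the figure: the sketch below uses independent (white) Gaussian noise instead of temporally correlated or signal-correlated noise, and 26 classes instead of four, so it illustrates the principle and does not replicate the panels. With white noise, the maximum likelihood classifier reduces to picking the nearest class mean. The firing-rate range, noise level and trial count are arbitrary assumptions.

```python
import numpy as np

def ml_accuracy(means, noise_sd, n_trials, rng):
    """Accuracy of a maximum likelihood classifier under i.i.d. Gaussian noise.

    means: (classes, dims) class-mean trajectories, flattened over neurons and time.
    Under white noise, ML classification is nearest-class-mean classification.
    """
    labels = rng.integers(means.shape[0], size=n_trials)
    obs = means[labels] + rng.normal(0, noise_sd, (n_trials, means.shape[1]))
    # Squared distance up to a per-trial constant: -2 x.m + ||m||^2
    scores = -2 * obs @ means.T + (means ** 2).sum(axis=1)
    return np.mean(scores.argmin(axis=1) == labels)

rng = np.random.default_rng(0)
n_classes, n_neurons, n_steps = 26, 2, 100   # 26 classes is an assumption

# Constant trajectories: each class holds one point in rate space for all time steps.
points = rng.uniform(-1, 1, (n_classes, n_neurons))
constant = np.tile(points, (1, n_steps))

# Time-varying trajectories: rates change at every step, within the same bounds.
varying = rng.uniform(-1, 1, (n_classes, n_neurons * n_steps))

acc_constant = ml_accuracy(constant, 1.5, 2000, rng)
acc_varying = ml_accuracy(varying, 1.5, 2000, rng)
print(acc_constant, acc_varying)
```

Both sets of trajectories respect the same firing-rate bounds, but the time-varying set spreads its class means across 200 dimensions instead of 2, so nearest neighbours sit much farther apart relative to the noise.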
a, Using the principle of maximizing the nearest-neighbour distance, we optimized for a set of pen trajectories that are theoretically easier to classify than the Latin alphabet (using standard assumptions of linear neural tuning to pen-tip velocity). b, For comparison, we also optimized a set of 26 straight lines that maximize the nearest-neighbour distance. c, Pairwise Euclidean distances between pen-tip trajectories were computed for each set, revealing a larger nearest-neighbour distance (but not mean distance) for the optimized alphabet compared to the Latin alphabet. Each circle represents a single movement and bar heights show the mean. d, Simulated classification accuracy as a function of the amount of artificial noise added. Results confirm that the optimized alphabet is indeed easier to classify than the Latin alphabet, and that the Latin alphabet is much easier to classify than straight lines, even when the lines have been optimized. e, Distance matrices for the Latin alphabet and optimized alphabets show the pairwise Euclidean distances between the pen trajectories. The distance matrices were sorted into seven clusters using single-linkage hierarchical clustering. The distance matrix for the optimized alphabet has no apparent structure; by contrast, the Latin alphabet has two large clusters of similar letters (letters that begin with a counter-clockwise curl, and letters that begin with a downstroke).
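The nearest-neighbour distance used in panels a to c is straightforward to compute for any set of pen-tip velocity trajectories. The sketch below evaluates it for 26 evenly spaced straight lines at unit speed, which is an assumed baseline and not the optimized line set from panel b. The function names and the choice of 50 time steps are hypothetical.

```python
import numpy as np

def pairwise_distances(trajs):
    """Euclidean distances between trajectories of shape (n_movements, n_steps, 2)."""
    flat = trajs.reshape(trajs.shape[0], -1)
    diff = flat[:, None, :] - flat[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def nearest_neighbour_distance(trajs):
    """Distance from each trajectory to its closest other trajectory."""
    d = pairwise_distances(trajs)
    np.fill_diagonal(d, np.inf)        # ignore self-distances
    return d.min(axis=1)

# 26 straight lines: constant unit-speed velocity in evenly spaced directions.
n_lines, n_steps = 26, 50
angles = 2 * np.pi * np.arange(n_lines) / n_lines
velocity = np.stack([np.cos(angles), np.sin(angles)], axis=1)    # (26, 2)
lines = np.repeat(velocity[:, None, :], n_steps, axis=1)         # (26, 50, 2)

nn_lines = nearest_neighbour_distance(lines)
mean_dist = pairwise_distances(lines)[np.triu_indices(n_lines, k=1)].mean()
print(nn_lines.min(), mean_dist)
```

For straight lines, the nearest neighbour is always the adjacent direction, so the distance is fixed by the angular spacing. Trajectories that change direction over time are not bound by this limit, which is the intuition behind the comparison in panel d.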
a, Magnetic resonance imaging (MRI)-derived brain anatomy of participant T5. Microelectrode array locations (blue squares) were determined by co-registration of postoperative CT images with preoperative MRI images. b, Example spike waveforms detected during a 10-s time window are plotted for each electrode (data were recorded on post-implant day 1,218). Each rectangular panel corresponds to a single electrode and each blue trace is a single spike waveform (2.1-ms duration). Spiking events were detected using a threshold of −4.5 × root mean square (RMS), thereby excluding almost all background activity. Electrodes with a mean threshold crossing rate of at least 2 Hz were considered to have ‘spiking activity’ and are outlined in red (note that this is a conservative estimate that is meant to include only spiking activity that could be from single neurons, as opposed to multiunit ‘hash’). The results show that many electrodes still record large spiking waveforms that are well above the noise floor (the y axis of each panel spans 330 μV, whereas the background activity has an average RMS value of only 6.4 μV). On this day, 92 electrodes out of 192 had a threshold crossing rate of at least 2 Hz.
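Threshold-crossing detection of the kind described in panel b can be sketched on simulated data. The sampling rate (30 kHz), the spike shape and the spike count are assumptions for illustration, and computing the RMS over the whole trace is one possible convention. Only the −4.5 × RMS threshold, the 10-s window, the 6.4-μV background level and the 2-Hz criterion come from the caption.

```python
import numpy as np

def threshold_crossings(voltage, multiplier=-4.5):
    """Sample indices where the trace first drops below multiplier * RMS."""
    rms = np.sqrt(np.mean(voltage ** 2))
    below = voltage < multiplier * rms
    # A crossing is a sample below threshold whose predecessor was not.
    return np.flatnonzero(below[1:] & ~below[:-1]) + 1

fs = 30_000                                   # assumed sampling rate in Hz
duration_s = 10
rng = np.random.default_rng(0)
voltage = rng.normal(0, 6.4, fs * duration_s) # background activity, RMS about 6.4 uV

# Add 50 toy spikes (3-sample troughs) at regular, well-separated times.
spike_starts = np.arange(50) * 6000 + 3000
for s in spike_starts:
    voltage[s:s + 3] += np.array([-50.0, -100.0, -50.0])

crossings = threshold_crossings(voltage)
rate_hz = len(crossings) / duration_s
has_spiking_activity = rate_hz >= 2.0          # criterion from the caption
print(rate_hz, has_spiking_activity)
```

Counting only the first sample of each excursion below threshold avoids counting one spike several times. Because the threshold sits far out in the tail of the background distribution, very few crossings arise from noise alone.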
This file contains the Supplementary Methods and Supplementary Note 1.
Copying sentences in real time with the handwriting brain-computer interface. In this video, participant T5 copies sentences displayed on a computer monitor with the handwriting brain-computer interface. When the red square on the monitor turns green, this cues T5 to begin copying the sentence.
Hand micromotion while using the handwriting brain-computer interface. Participant T5 is paralysed from the neck down (C4 ASIA C spinal cord injury) and only generates small micromotions of the hand when attempting to handwrite. T5 retains no useful hand function.
Freely answering questions in real time with the handwriting brain-computer interface. In this video, participant T5 answers questions that appear on a computer monitor using the handwriting brain-computer interface. T5 was instructed to take as much time as he wanted to formulate an answer, and then to write it as quickly as possible.
Side-by-side comparison between the handwriting brain-computer interface and the prior state of the art for intracortical brain-computer interfaces. In a prior study (Pandarinath et al., 2017), participant T5 achieved the highest typing speed ever reported with an intracortical brain-computer interface (39 correct characters per minute using a point-and-click typing system). Here, we show an example sentence typed by T5 using the point-and-click system (shown on the bottom) and the new handwriting brain-computer interface (shown on the top), which is more than twice as fast.
About this article
Cite this article
Willett, F.R., Avansino, D.T., Hochberg, L.R. et al. High-performance brain-to-text communication via handwriting. Nature 593, 249–254 (2021). https://doi.org/10.1038/s41586-021-03506-2