Fast hands-free writing by gaze direction

Ward, David J.; MacKay, David J. C.

doi:10.1038/418838a

Download PDF

Brief Communication
Published: 22 August 2002

Artificial intelligence

Fast hands-free writing by gaze direction

David J. Ward¹ &
David J. C. MacKay¹

Nature volume 418, page 838 (2002)Cite this article

2627 Accesses
158 Citations
6 Altmetric
Metrics details

Abstract

Here we describe a method for text entry based on inverse arithmetic coding that relies on gaze direction alone and which is faster and more accurate than using an on-screen keyboard. These benefits are derived from two innovations: the writing task is matched to the capabilities of the eye, and a language model is used to make predictable words and phrases easier to write.

Accelerating eye movement research via accurate and affordable smartphone eye tracking

Article Open access 11 September 2020

Universal and specific reading mechanisms across different writing systems

Article 22 February 2022

Assessing the allocation of attention during visual search using digit-tracking, a calibration-free alternative to eye tracking

Article Open access 09 February 2023

Main

For people who cannot use a standard keyboard or mouse, the direction of gaze is one of the few means by which they can convey information to a computer. Many systems for gaze-controlled text entry provide an on-screen keyboard with buttons that can be 'pressed' by staring at them. But eyes did not evolve to push buttons, and this method of writing is exhausting.

Moreover, on-screen keyboards are inefficient because typical text has considerable redundancy¹. Although a partial solution to this defect is to include word-completion buttons as alternative buttons alongside the keyboard, a language model's predictions can be better integrated into the writing process. By inverting an efficient method for text compression — arithmetic coding² — we have created an efficient method for text entry, which is also well matched to the eye's natural talent for search and navigation.

One way to write a piece of text is to delve into a theoretical 'library' that contains all possible books, and find the book that contains exactly the desired piece of text³; writing thus becomes a navigational task. In our idealized library, the 'books' are arranged alphabetically on one enormous shelf. As soon as the user looks at a part of the shelf, the view zooms in continuously on the point of gaze. So, to write a message that begins "hello", the user first steers towards the section of the shelf marked 'h', where all the books beginning with 'h' are found. Within this section are different sections for books beginning 'ha', 'hb', 'hc' and so on; the user enters the 'he' section, then the 'hel' section within it, and so forth.

To make the writing process efficient, we use a language model, which predicts the probability of each letter's occurrence in a given context, to allocate the shelf space for each letter of the alphabet (Fig. 1a). When the language model's predictions are accurate, many successive characters can be selected by a single gesture.

We previously evaluated this system, which we call 'Dasher', with a mouse as the steering device⁴. Novices rapidly learned to write and an expert could write at 34 words per minute; all users made fewer errors than when they were using a standard 'QWERTY' keyboard.

Figure 1b shows an evaluation of Dasher driven by an eye-tracker, compared with an on-screen keyboard. After an hour of practice, Dasher users could write at up to 25 words per minute, whereas on-screen keyboard users could manage only 15 words per minute. Moreover, the error rate with the on-screen keyboard was about five times that obtained with Dasher.

Users of both systems reported that the on-screen keyboard was more stressful to use than Dasher for two reasons. First, they often felt uncertain whether an error had been made in the current word (the word-completion feature works only if no error has been made); an error can be spotted only by looking away from the keyboard. Second, a decision has to be made after 'pressing' each character on whether to use word completion or to continue typing — looking to the word-completion area is a gamble as it is not guaranteed that the required word will be there, and finding the correct completion requires a switch to a new mental activity. By contrast, Dasher users can see simultaneously the last few characters they have written and the most probable options for the next few. Furthermore, Dasher makes no distinction between word completion and ordinary writing.

Dasher works in most languages — the language model can be trained on sample documents and adapts to the user's language as he or she writes. It can also be operated with other pointing devices, such as a touch screen or rollerball. Dasher is potentially an efficient, accurate and fun writing system not only for disabled computer users but also for users of mobile computers.

References

Shannon, C. E. Bell Syst. Tech. J. 27, 379–423, 623–656 (1948).
Article Google Scholar
Witten, I. H., Neal, R. M. & Cleary, J. G. Commun. Assoc. Comput. Machin. 30, 520–540 (1987).
Google Scholar
Borges, J. L. The Library of Babel (Godine, Boston, 1941).
Ward, D. J., Blackwell, A. F. & MacKay, D. J. C. Proc. 13th Annu. ACM Symp. User Interface Software Technol. 2000 129–137 (2000).
Ward, D. J. Dasher, version 1.6.4 (2001); http://www.inference.phy.cam.ac.uk/dasher
Cleary, J. G. & Witten, I. H. IEEE Trans. Commun. 32, 396–402 (1984).
Article Google Scholar
Teahan, W. J. Proc. NZ Comp. Sci. Res. Stud. Conf. (1995); http://citeseer.nj.nec.com/teahan95probability.html

Download references

Author information

Authors and Affiliations

Cavendish Laboratory, Cambridge, CB3 0HE, UK
David J. Ward & David J. C. MacKay

Authors

David J. Ward
View author publications
You can also search for this author in PubMed Google Scholar
David J. C. MacKay
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to David J. C. MacKay.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ward, D., MacKay, D. Fast hands-free writing by gaze direction. Nature 418, 838 (2002). https://doi.org/10.1038/418838a

Download citation

Issue Date: 22 August 2002
DOI: https://doi.org/10.1038/418838a

This article is cited by

Eardrum-inspired soft viscoelastic diaphragms for CNN-based speech recognition with audio visualization images
- Seok-Jin Park
- Hee-Beom Lee
- Gi-Woo Kim
Scientific Reports (2023)
A large quantitative analysis of written language challenges the idea that all languages are equally complex
- Alexander Koplenig
- Sascha Wolfer
- Peter Meyer
Scientific Reports (2023)
An intelligent MXene/MoS2 acoustic sensor with high accuracy for mechano-acoustic recognition
- Jingwen Chen
- Linlin Li
- Guozhen Shen
Nano Research (2023)
Classical and quantum compression for edge computing: the ubiquitous data dimensionality reduction
- Maryam Bagherian
- Sarah Chehade
- Ali Passian
Computing (2023)
Revisiting Topography-Based and Selection-Based Verbal Behavior
- Anna Ingeborg Petursdottir
- Einar T. Ingvarsson
The Analysis of Verbal Behavior (2023)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Fast hands-free writing by gaze direction

Abstract

Similar content being viewed by others

Accelerating eye movement research via accurate and affordable smartphone eye tracking

Universal and specific reading mechanisms across different writing systems

Assessing the allocation of attention during visual search using digit-tracking, a calibration-free alternative to eye tracking

Main

References

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Competing interests

Rights and permissions

About this article

Cite this article

This article is cited by

Eardrum-inspired soft viscoelastic diaphragms for CNN-based speech recognition with audio visualization images

A large quantitative analysis of written language challenges the idea that all languages are equally complex

An intelligent MXene/MoS2 acoustic sensor with high accuracy for mechano-acoustic recognition

Classical and quantum compression for edge computing: the ubiquitous data dimensionality reduction

Revisiting Topography-Based and Selection-Based Verbal Behavior

Comments

Search

Quick links

Abstract

Similar content being viewed by others

Main

References

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Competing interests

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links