The phylogeny of The Canterbury Tales


Geoffrey Chaucer's The Canterbury Tales survives in about 80 different manuscript versions [1]. We have used the techniques of evolutionary biology to produce what is, in effect, a phylogenetic tree showing the relationships between 58 extant fifteenth-century manuscripts of “The Wife of Bath's Prologue” from The Canterbury Tales. We found that many of the manuscripts fall into separate groups sharing distinct ancestors.


Manuscripts such as these were created by copying, directly or indirectly, from the original material (written, in the case of The Canterbury Tales, in the late fourteenth century). In the process of copying, the scribes made changes, deliberately or otherwise, and these changes were themselves copied. Textual scholars have developed a system for reconstructing the relationships between textual traditions by analysing the distribution of these shared changes, and have constructed family trees (stemmata) on the basis of the results, with the ultimate aim of establishing precisely what the author actually wrote. This analysis is carried out manually and is feasible only for a few manuscripts of short texts. The sheer quantity of information in a tradition the size of The Canterbury Tales defeats any system of manual analysis.

However, the principle of historical reconstruction is similar to the computerized techniques used by evolutionary biologists to reconstruct phylogenetic trees of different organisms using sequence data. We therefore applied phylogenetic techniques to The Canterbury Tales using the 850 lines of 58 surviving fifteenth-century manuscripts of “The Wife of Bath's Prologue”. We believe this to be the first full tradition of a major work to be analysed in this manner.
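The analogy with sequence data can be made concrete with a minimal sketch. It assumes that each manuscript has been collated into a list of variant codes, one per location in the text, so that a manuscript plays the role of an aligned sequence. The sigils (MsA to MsD), the readings and the function names are invented for illustration; they are not data or procedures from this study, and the simple proportion-of-differences distance is only one of several measures that could be supplied to a program such as SplitsTree.

```python
from itertools import combinations

# Toy collation (invented for illustration): one variant code per
# location in the text; None marks a location missing from a manuscript.
readings = {
    "MsA": [0, 0, 1, 0, 1, 0],
    "MsB": [0, 0, 1, 1, 1, 0],
    "MsC": [1, 1, 0, 0, 0, 0],
    "MsD": [1, 1, 0, None, 0, 1],
}


def distance(a, b):
    """Proportion of differing readings among locations attested in both."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("the two manuscripts share no attested locations")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(readings):
    """Return the sorted sigils and a symmetric dict of pairwise distances."""
    names = sorted(readings)
    matrix = {(n, n): 0.0 for n in names}
    for p, q in combinations(names, 2):
        matrix[(p, q)] = matrix[(q, p)] = distance(readings[p], readings[q])
    return names, matrix


names, matrix = distance_matrix(readings)
for p, q in combinations(names, 2):
    print(f"{p}-{q}: {matrix[(p, q)]:.3f}")
```

In this toy collation MsA and MsB differ at one location in six, whereas MsA and MsC differ at four, so the first pair would be expected to group together in any subsequent analysis.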

It may be inappropriate to impose a tree-like structure on such data sets, so we used the method of split decomposition implemented in the program SplitsTree [2], in addition to the cladistic analysis of PAUP [3]. Figure 1 shows a SplitsTree analysis of 44 of the 58 manuscripts. Very similar results were given by PAUP (not shown). Several manuscripts form groups (A, B, C/D, E and F), each descended from a single and distinct common ancestor. The remaining 14 manuscripts were removed from the analysis shown in Fig. 1, as they were likely to have been copied from more than one exemplar, either by deliberate conflation of readings or by changing the exemplar during the course of copying. These manuscripts were identified by comparing the trees generated from different regions of the text: their position in the analysis varied dramatically depending on which region was used. The central point is likely to represent the ancestor of the whole tradition. The manuscripts grouped as O are particularly important; their position near the centre suggests that they all descend from Chaucer's original, and may therefore contain crucial evidence about this original. However, most of them have been ignored by scholars.
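The idea behind screening out manuscripts copied from more than one exemplar can be illustrated with a simplified sketch. Rather than comparing full trees, as described above, it compares each manuscript's distances to every other manuscript in two regions of the text and sums how much those distances shift; this is a stand-in chosen for brevity, not the procedure used in this study. The sigils, readings, region boundaries and the name `instability` are all invented for illustration.

```python
# Toy collation (invented): 8 variant locations, split into two regions.
readings = {
    "MsA": [0, 0, 0, 0, 0, 0, 0, 0],
    "MsB": [0, 0, 0, 1, 0, 0, 0, 1],
    "MsC": [1, 1, 1, 0, 1, 1, 1, 0],
    "MsD": [1, 1, 1, 1, 1, 1, 1, 1],
    "MsX": [0, 0, 0, 0, 1, 1, 1, 0],  # agrees with MsA, then with MsC
}
regions = [(0, 4), (4, 8)]  # half-open index ranges, assumed for the example


def differences(a, b, start, stop):
    """Count differing readings between two manuscripts within one region."""
    return sum(x != y for x, y in zip(a[start:stop], b[start:stop]))


def instability(name, readings, regions):
    """Sum, over all other manuscripts, of the spread in distance across regions."""
    total = 0
    for other in readings:
        if other == name:
            continue
        per_region = [
            differences(readings[name], readings[other], start, stop)
            for start, stop in regions
        ]
        total += max(per_region) - min(per_region)
    return total


scores = {name: instability(name, readings, regions) for name in readings}
ranked = sorted(scores, key=scores.get, reverse=True)
for name in ranked:
    print(name, scores[name])
```

Here MsX, which was constructed to switch exemplar halfway through, scores 12, while the other four manuscripts score 3 each (a shift caused only by their changing distance to MsX). Any cut-off for exclusion on real data would have to be chosen by the analyst.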

Figure 1: SplitsTree analysis of 44 manuscripts of “The Wife of Bath's Prologue” from Chaucer's The Canterbury Tales [4].

The two- or three-character codes indicate individual manuscripts, whereas the large capitals indicate groups of manuscripts, which are coloured the same.

From this analysis and other evidence, we deduce that the ancestor of the whole tradition, Chaucer's own copy, was not a finished or fair copy, but a working draft containing (for example) Chaucer's own notes of passages to be deleted or added, and alternative drafts of sections. In time, this may lead editors to produce a radically different text of The Canterbury Tales. These results also demonstrate the power of applying phylogenetic techniques, and particularly split decomposition, to the study of large numbers of different versions of sizeable texts.


  1. Blake, N. F. The Textual Tradition of The Canterbury Tales (Edward Arnold, London, 1985).

  2. Huson, D. H. Bioinformatics 14, 68–73 (1998).

  3. Swofford, D. L. PAUP Version 3.1.1. (Smithsonian Institute, Washington DC, 1993).

  4. Robinson, P. M. W. in The Canterbury Tales Project: Occasional Papers Vol. II (eds Blake, N. F. & Robinson, P. M. W.) 69–132 (Office for Humanities Communication, London, 1997).

Cite this article

Barbrook, A., Howe, C., Blake, N. et al. The phylogeny of The Canterbury Tales. Nature 394, 839 (1998).
