Introduction

Genealogy is the discipline that traditionally deals with the reconstruction of family history. So far, the quest to trace ancestors and distant relatives has mainly relied on oral traditions, church records and other documents. The human genome constitutes an archive of our history and evolutionary past that can be explored for genealogical research. Genetic markers, like mitochondrial DNA (mtDNA) and Y chromosome polymorphisms, have been shown in historical investigations to be reliable tools for tracing matri- and patrilineal genealogies, respectively.1,2 Since hereditary family names are passed down in the same way as paternally inherited Y chromosomes, a surname should, within a genealogy, correlate with a Y chromosome haplotype.3,4

The island population of Tristan da Cunha is unique in having a well-documented genealogy that dates back to the first permanent settlement in the early 19th century.5 The current population of 278 individuals6 is thought to have descended from 15 ancestors, seven females and eight males, who arrived on the island at various times between 1816 and 1908. The male founders were all of western European ancestry, originating from Scotland, England, Holland, the USA and Italy. Today, there are seven family names in use (Glass, Green, Hagan, Lavarello, Repetto, Rogers and Swain), corresponding to the number of founding fathers with present-day male descendants. In addition to these known donors, there is also evidence for ‘hidden ancestors’ who supposedly contributed to the gene pool, but took their names with them upon leaving the island.7

In a previous study, the present-day mtDNA pool of the Tristan islanders was traced to five female ancestors.8 These data revealed inconsistencies between the number of mtDNA types and the written records. Although the historical documents mentioned two sister pairs among the founding females, the mtDNA data supported only one pair of sisters. In the present study, we analysed 13 Y chromosome polymorphisms in a sample from Tristan da Cunha together with the available genealogical information to (i) infer the Y chromosome haplotypes of the known male founders who brought their surnames to the island and (ii) test whether Y chromosome transmission is consistent with the documented patrilineal history of the island community.

Subjects and methods

DNA samples were prepared from peripheral blood by standard phenol–chloroform procedures from 76 males in a sample collected on the island of Tristan da Cunha in 1982. This study was approved by the Committee for Research on Human Subjects at the University of Witwatersrand, Johannesburg, South Africa. Out of respect for the sensitivities of the present inhabitants of Tristan da Cunha, the observed haplotypes were neither associated with the family names nor with the origins of the founding fathers in order to avoid identification. DNA samples were screened for 13 Y chromosome-specific polymorphisms. The seven bi-allelic Y chromosome markers used here (Figure 1) have previously been shown to define haplogroups frequently found in European populations.9,10,11 Haplogroup designation follows the nomenclature proposed by the Y Chromosome Consortium (YCC).12 The markers M1 (Y chromosome Alu insertion polymorphism ‘YAP’) and M9 were typed in all samples. The allelic states of the remaining polymorphisms were determined hierarchically based on the known Y chromosome genealogy (Figure 1).12 M1, p12f2 and SRY10831 were screened as described.13,9,14 M9,15 M170, M207, M21316 were typed using agarose gel-based PCR–RFLP assays. The protocols for PCR amplification and the detection of polymorphic markers are available from the authors upon request. The six microsatellite loci DYS19, DYS388, DYS390, DYS391, DYS392 and DYS39317 were amplified in two separate multiplex PCR reactions. The PCR conditions are available from the authors upon request. The fluorescently labelled PCR products were resolved on polyacrylamide gels in an automated ABI 377 DNA sequencer. The number of repeats at each locus was determined adopting the proposed nomenclature.18 Haplotypes were defined by both bi-allelic and microsatellite polymorphisms. Microsatellite haplotypes were constructed using the number of repeats (alleles) in the order DYS19-DYS388-DYS390-DYS391-DYS392-DYS393.
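The haplotype notation described above can be illustrated with a minimal sketch. It joins repeat counts in the stated locus order and lists the loci at which two haplotypes differ. The function names and all allele values are hypothetical and chosen only for illustration; they are not data from this study.

```python
# Locus order as stated in the text.
LOCI = ("DYS19", "DYS388", "DYS390", "DYS391", "DYS392", "DYS393")

def haplotype_string(alleles):
    """Join repeat counts in the fixed locus order, separated by hyphens."""
    return "-".join(str(alleles[locus]) for locus in LOCI)

def step_differences(a, b):
    """Return {locus: repeat difference (b - a)} for loci where a and b differ."""
    return {locus: b[locus] - a[locus] for locus in LOCI if a[locus] != b[locus]}

# Hypothetical allele values, for illustration only (not data from the study).
founder = {"DYS19": 14, "DYS388": 12, "DYS390": 24,
           "DYS391": 11, "DYS392": 13, "DYS393": 13}
variant = dict(founder, DYS391=10)  # one repeat fewer at DYS391

print(haplotype_string(founder))
print(step_differences(founder, variant))
```

Two haplotypes that differ by exactly one repeat at a single locus, as in this made-up pair, are what the text calls a single-step microsatellite mutation.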

Figure 1

Y chromosome phylogeny of the bi-allelic polymorphisms used in the study (black bars). The three Y chromosome haplogroups found in the sample from Tristan da Cunha are indicated (solid lines). Haplogroups are designated according to the YCC nomenclature.12

Results and discussion

We derived nine Y chromosome haplotypes, of which seven could unambiguously be traced to the seven men who are known to have introduced surnames to the island (Table 1, haplotypes shown in bold). These seven haplotypes segregated on either a haplogroup R-M207 (× SRY10831.2) or an I-M170 background. R-M207 (× SRY10831.2) is mainly equivalent to R-M173 (× M17),12 which is the dominant Y chromosome lineage in western European populations, where it reaches frequencies of 50% and above.10,11 Haplogroup I-M170 is virtually confined to Europe. In western Europeans it has been observed at moderate frequencies (e.g. 22% in Dutch and 8% in Italians).10,11 Hence, the presence of haplogroups R-M207 and I-M170 in the sample from Tristan da Cunha is consistent with the western European ancestry of the known founding fathers, as suggested by historical records.7

Table 1 Y chromosome haplotypes found in the seven families on Tristan da Cunha

Moreover, two additional haplotypes were discovered (Table 1, haplotypes in families 3 and 4 shown in italics). The first one was found in family 3, in four individuals spanning three generations (Figure 2a). This haplotype could have evolved from either of the two haplotypes already present in families 3 and 5 (Table 1) through a single-step microsatellite mutation at DYS391 or DYS392, respectively. Since the observed mutation rate at DYS392 is very low in European populations (0 in 415 meioses, 95% CI 7.15 × 10^−3),19 a mutation at DYS391 during transmission within family 3 seems the most likely event to have given rise to the new haplotype. Unfortunately, we do not have DNA from the parental generation in which the mutation is presumed to have occurred, making paternity testing using autosomal markers impossible. Assuming that the new haplotype originated within family 3, we estimated the mutation rate at the DYS391 locus to be 1 in 100 meioses (10 × 10^−3, 95% CI 0.2–54.5 × 10^−3, excluding inconsistent transmissions, discussed later in the text). The mutation rate at this locus determined in a random sample of father/son pairs was reported to be 4.82 × 10^−3 (95% CI 0.61–16.4 × 10^−3).19
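The text does not state how the confidence interval for the DYS391 estimate was obtained. As a sketch only, the following computes an exact binomial (Clopper-Pearson) interval for one mutation in 100 meioses. The choice of method is an assumption, and the helper names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _bisect(f, lo=0.0, hi=1.0, iters=100):
    """Root of a decreasing function f on [lo, hi] with f(lo) > 0 > f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Two-sided exact binomial confidence interval for x events in n trials."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p)))
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper

# One presumed mutation in 100 meioses, as given in the text.
lower, upper = clopper_pearson(1, 100)
print(f"rate = {1 / 100:.3f}, 95% CI = {lower:.5f} to {upper:.5f}")
```

Under this assumption the interval comes out at roughly 0.25 × 10^−3 to 54.5 × 10^−3, close to the reported 0.2–54.5 × 10^−3. Its width shows how little a single observed event constrains the rate.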

Figure 2

Partial male genealogies showing inconsistencies in the transmission of Y chromosome haplotypes in five of the seven families on Tristan da Cunha. To conceal the identities of the families, only partial sib-ships are shown without indicating the generations in the pedigrees. Not all the males from whom Y chromosome data were obtained are presented. Tested individuals are indicated by filled squares. (a) Partial genealogy of family 3. An additional haplotype was found in four individuals spanning three generations. This haplotype most likely evolved from the haplotype associated with family 3 due to a single-step microsatellite mutation at DYS391. Arrow indicates the individual in whom the mutation probably originated. (b) Partial genealogy of family 4. The additional haplotype has a non-island origin (see Table 1). (c) Partial genealogy of family 5. The additional haplotypes observed in two branches of the genealogy are identical to the haplotype of family 7. (d) Partial genealogy of family 6. The additional haplotype in one branch of the genealogy is identical to the haplotype associated with family 5. (e) Partial genealogy of family 7. The additional haplotype is identical to the haplotype associated with family 5.

The second additional haplotype was observed in family 4, in one individual who had no sons (Figure 2b). This haplotype can be distinguished from all the others in the Tristan sample due to a reversion at SRY10831 on a haplogroup R-M207 background (Figure 1 and Table 1). This mutation defines a lineage that is particularly frequent in eastern and northern Europeans (54% in Polish, 47% in Russians, 31% in Norwegians).9 Furthermore, the microsatellite allele constellation found in the Tristan individual (Table 1) represents the most common haplotype within this haplogroup.20 Based on genealogical information, we deduced that this new haplotype was introduced into family 4 in the early 1900s. It is well documented that many passenger ships, cargo vessels and whalers, some from Russia and Norway, used the island as a stopover port for trade and replenishing supplies.7 Perhaps the new lineage was brought to the island during such a visit. This finding provides evidence for the contribution of a hidden ancestor who left his genes but not his name on the island.

The Y chromosome data in conjunction with genealogical information enabled us to unequivocally identify four additional instances of non-paternity (in families 5–7; Table 1, haplotypes shown in regular font). In these cases, Y chromosomes were introduced that belonged to a different haplogroup from that associated with the family or surname. These ‘newcomers’, however, matched two other lineages already present on the island (Table 1 and Figures 2c–e). The ‘new’ chromosomes observed in two branches of the genealogy in family 5 are identical to the haplotype of family 7 (Figure 2c). The haplotype associated with family 5, on the other hand, was also found in families 6 (Figure 2d) and 7 (Figure 2e). There are two possible explanations for these findings. First, the introduced chromosomes could be the result of pre- or extramarital affairs involving men from the island community. There are indications in the records that would support this possibility.7 Second, the chromosomes could have been left on the island by other hidden ancestors who happened to have haplotypes identical to those of some of the islanders. In fact, the ‘new’ microsatellite haplotypes appear to be common in European populations; in the European Y-STR Haplotype Reference Database21 the frequency of the haplotype of family 5 has been reported as 1.5 × 10^−3 and that of family 7 as 3.9 × 10^−2. These frequencies are calculated from five microsatellite loci only, without considering DYS388 and the bi-allelic markers; the actual frequencies of the extended haplotypes used in this study would therefore be lower. It is difficult to assess the probability of an internal versus an external contribution of these chromosomes to the Tristan population. However, given that the two lineages were already present in the community (in families 5 and 7) and given the geographic remoteness of Tristan da Cunha (and thus the limited traffic to and from the island; Figure 3), it seems improbable that mysterious visitors contributed their chromosomes in all four instances.

Figure 3

Map showing the location of Tristan da Cunha in the South Atlantic and its relationship to neighbouring islands and continents. Tristan da Cunha is commonly referred to as the ‘remotest island in the world’.

We calculated the incidence of non-paternity in the male Tristan sample to be about 4% (five instances in 118 male births; illegitimate daughters cannot be detected by studying Y chromosome transmission). Although this rate falls within the observed range of estimates (between 1.3 and 30%),3,22 it is most likely an underestimate, given the high level of inbreeding7 and the consequently high extent of Y chromosome haplotype sharing on the island. Obviously, we could only discern those cases in which the Y chromosomes of the biological father and the name-giver were different. Interestingly, a study based on 27 autosomal serogenetic markers, including the highly informative Gm haplotype, did not indicate any inconsistencies in the genealogies of the Tristan da Cunha islanders.23
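The arithmetic behind the quoted incidence is a simple proportion, sketched below with the two counts given in the text. The variable names are illustrative.

```python
# Counts as given in the text.
non_paternity_instances = 5   # one in family 4 plus four in families 5-7
male_births = 118

rate = non_paternity_instances / male_births
print(f"Observed non-paternity rate: {rate:.1%}")  # about 4%
```

Because only cases in which the biological father and the name-giver carry different Y chromosome haplotypes can be detected, this proportion is best read as a lower bound.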

The unique Tristan genealogy, covering a period of almost 200 years, presented us with an ideal opportunity to test the accuracy of written records against information stored in DNA. The Y chromosome data shown here, together with those based on mtDNA,8 call into question the validity of some of the genealogical documentation. This study demonstrates that DNA is a powerful archival resource that can be used to test questions and to refine theories pertaining to human history and origins.