Main

The human Y chromosome can be used to reveal the paternal relationships among historical individuals by studying their present-day descendants, because most of the Y chromosome is passed down the male lineage.1 Successful cases include the determination of the paternity of a descendant of Thomas Jefferson2 and the inference of the Y chromosome haplotype of Jewish priests.3 When combined with stemma records, the Y chromosome can be used to study the relationships of individuals in even more ancient history. Recording stemmas has been a religious tradition of the Han Chinese, and some stemmas link contemporary individuals to ancestors who lived over 3000 years ago, although their authenticity requires careful validation. To explore the utility of the stemmas for evolutionary studies, we recently studied the Y chromosomes of the clans who claim to be descendants of Emperor CAO Cao (155–220 AD) as an example. Moreover, we examined the relationship of Emperor CAO to his claimed aristocratic ancestry.4

Emperor CAO is probably one of the most famous people in East Asia, because of the popularity of the novel Romance of the Three Kingdoms. Emperor CAO claimed that his clan was descended from Marquis CAO Can (?–190 BC), the second Prime Minister of the Han Dynasty, and further back from the ancient Dukes of CAO (eleventh century BC to 487 BC), and was therefore of aristocratic ancestry. Aristocratic ancestry was of great importance for justifying political privileges in early China. As Emperor CAO's grandfather was a eunuch, the Emperor's father should have been adopted from his grandfather's own clan (Marquis CAO's clan) according to the strict patriciate custom.5 However, Emperor CAO's opponents alleged that his father was adopted from among beggars and was of humble ancestry. This controversy has lasted for nearly 1800 years.

Here, we typed 100 Y chromosome single-nucleotide polymorphisms (Supplementary Table 1), as listed in the latest Y-chromosome phylogenetic tree,6 in 280 individuals from 79 CAO clans or clan clusters from different locations throughout China (Figure 1), and in 446 individuals from clans with other surnames. A clan cluster may consist of several simplex clans if its members carry different Y-chromosome haplotypes. Thus, we studied 111 simplex CAO clans overall (Supplementary Table 2). According to their stemma records, 15 of the CAO clans claim to be descendants of Emperor CAO. These 15 clans are distributed across different provinces and were unaware of each other's existence. Their Y chromosomes comprise six haplotypes (Table 1). Only one of these six haplotypes can be Emperor CAO's type. The other haplotypes found in the claimed clans might have been introduced by other sources such as adoption, taking the mother's surname, nonpaternity and so on.7 We therefore need to identify the most probable haplotype of Emperor CAO by examining the haplotype distribution among the clan groups.
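The splitting of clan clusters into simplex clans can be illustrated with a minimal sketch. The records, cluster names and the function name below are hypothetical and serve only to show the grouping rule: individuals from the same cluster who carry different haplotypes are counted as separate simplex clans.

```python
from collections import defaultdict

# Hypothetical records: (clan cluster, individual ID, Y-chromosome haplotype).
# These are illustrative placeholders, not data from the study.
samples = [
    ("cluster_A", "A1", "O2-M268"),
    ("cluster_A", "A2", "O2-M268"),
    ("cluster_A", "A3", "O3-002611"),
    ("cluster_B", "B1", "O3-002611"),
]


def split_into_simplex_clans(records):
    """Group individuals so that each (cluster, haplotype) pair is one simplex clan."""
    simplex = defaultdict(list)
    for cluster, individual, haplotype in records:
        simplex[(cluster, haplotype)].append(individual)
    return dict(simplex)


clans = split_into_simplex_clans(samples)
for (cluster, haplotype), members in clans.items():
    print(cluster, haplotype, members)
```

In this toy input, cluster_A splits into two simplex clans because its members carry two different haplotypes, whereas cluster_B remains a single simplex clan.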

Figure 1

Geographical locations and Y-chromosome genotypes of the CAO clans (a) and genealogical relationships among the six related clans (b).

Table 1 Fisher's exact tests between different clan groups

We classified the clans into three groups: clans claiming descent from Emperor CAO (claimed clans), CAO clans making no such claim or claiming another origin (unclaimed clans), and the general population with other surnames. We examined the frequency difference of each haplotype between each pair of groups.8, 9 There is no significant difference between the unclaimed CAO clans and the general population, indicating that the surname CAO may have multiple founders, as is the case for other surnames.10 Thus, these two groups were combined into one (referred to as the reference group hereafter) in the subsequent analysis. Interestingly, haplotype O2-M268 is the only one that is significantly enriched in the claimed clans while being rare in the reference group (Fisher's exact test, P=9.32 × 10⁻⁵), suggesting that it is the most likely haplotype of Emperor CAO (Table 1). The probability that a haplotype is the ancestor's type is not simply its frequency in the case group, but needs to be estimated from the frequency difference between the case group and the reference group. Here, we used the odds ratio (OR), which provides an estimate (with a confidence interval) of the relationship between two binary variables, to estimate this probability.11 The OR for haplotype O2-M268 in the claimed clans compared with the reference group is 12.72.
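For reference, the standard definition of the odds ratio for a 2 × 2 table can be written out. The symbols are introduced here for illustration only: a and b denote the numbers of claimed clans with and without the haplotype, and c and d the corresponding numbers in the reference group. The confidence interval shown is the common log-based (Woolf) approximation, which is an assumption; the interval method actually used is not stated in the text.

```latex
\[
\mathrm{OR} = \frac{a/b}{c/d} = \frac{ad}{bc},
\qquad
\text{95\% CI} = \exp\!\left( \ln \mathrm{OR} \pm 1.96
\sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} \right)
\]
```

An OR well above 1 means that the haplotype is more common in the claimed clans than in the reference group.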

Haplotype O3-002611 reaches the highest frequency among the other CAO clans, and is the only haplotype that appears in all five clans who claim to be direct descendants of the ancient CAO Dukes and CAO Can. To examine the probable haplotype of the CAO Dukes, we again compared clan groups: the claimed group comprised five clans of haplotype O3-002611 and no clans of other haplotypes, and the reference group comprised 79 clans of O3-002611 and 458 clans of other haplotypes. Thus, O3-002611 is most likely the haplotype of the ancient CAO Dukes (Fisher's exact test, P=1.968 × 10⁻⁴). Taking the two results together, Emperor CAO was unlikely to have been a descendant of CAO Can and the CAO Dukes, and his claim of aristocratic ancestry is therefore not supported by the genetic evidence.
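The mechanics of Fisher's exact test on a 2 × 2 table such as the one described above (5 and 0 clans in the claimed group, 79 and 458 in the reference group) can be sketched as follows. This is a minimal illustration, assuming a two-sided test that sums the hypergeometric probabilities of all tables no more likely than the observed one; the function name is hypothetical. It is meant to show how such a P value is obtained rather than to reproduce the reported figure, which depends on the exact tabulation and test settings used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1 = a + b          # size of the first group (e.g. claimed clans)
    col1 = a + c          # total carrying the haplotype of interest
    n = a + b + c + d     # grand total
    denom = comb(n, row1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables at least as unlikely as the observed one
    # (a small tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Counts taken from the text: claimed group 5 vs 0, reference group 79 vs 458.
p_value = fisher_exact_two_sided(5, 0, 79, 458)
print(f"P = {p_value:.3g}")
```

Because all five claimed clans carry the haplotype, the observed table is the most extreme one possible for these margins, so the sum reduces to a single hypergeometric term.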

Our results also establish the authenticity of the stemma records of Emperor CAO's descendants with O2-M268 and of CAO Can's descendants with O3-002611. Connected by ancestors 70–100 generations back, spanning over 2000 years, these samples are of great value for studying Y-chromosome evolution. This study offers a successful showcase of the utility of genetics in studying ancient history.