Abstract
Eight palindromes comprise one-quarter of the euchromatic DNA of the male-specific region of the human Y chromosome, the MSY1. They contain many testis-specific genes and typically exhibit 99.97% intra-palindromic (arm-to-arm) sequence identity1. This high degree of identity could be interpreted as evidence that the palindromes arose through duplication events that occurred about 100,000 years ago. Using comparative sequencing in great apes, we demonstrate here that at least six of these MSY palindromes predate the divergence of the human and chimpanzee lineages, which occurred about 5 million years ago. The arms of these palindromes must have subsequently engaged in gene conversion, driving the paired arms to evolve in concert. Indeed, analysis of MSY palindrome sequence variation in existing human populations provides evidence of recurrent arm-to-arm gene conversion in our species. We conclude that during recent evolution, an average of approximately 600 nucleotides per newborn male have undergone Y–Y gene conversion, which has had an important role in the evolution of multi-copy testis gene families in the MSY.
The human MSY palindromes, designated P1–P8, are surprisingly large, with arm lengths that range from 9 kilobases (kb; P7) to 1.45 megabases (Mb; P1) (see Table 2 and Figs 2, 3 and 5 of the accompanying manuscript1). The paired arms of each palindrome are separated by a non-duplicated spacer that measures 2–170 kb in length. Fifteen gene and transcript families have been identified in the palindrome arms (none in the spacers), and all seem to be expressed predominantly or exclusively in testes1. Similar to the palindrome arms in which they reside, these gene families are characterized by extremely low sequence divergence between the copies found in a single Y chromosome.
The DAZ gene family of the MSY resides exclusively in the arms of palindromes P1 and P2 (ref. 2). Near identity between DAZ copies in a single Y chromosome led some investigators to conclude, based on molecular clock reasoning, that DAZ gene amplification had occurred only within the last 200,000 years3. However, multiple Y-linked copies of DAZ also exist in apes and Old World monkeys3, 4, 5, 6. This suggests that palindromes P1 and P2, which contain the DAZ genes, might predate the divergence of humans from other primate lineages. This may be true for the other MSY palindromes as well. In that case, the near identity observed between palindrome arms could be the consequence of gene conversion—"the non-reciprocal transfer of information from one DNA duplex to another"7. Gene conversion sometimes involves transfer between repeated sequences on the same chromosome8.
To test the ancient origins/gene-conversion hypothesis, we looked for evidence that MSY palindromes were present in the common ancestor of humans and chimpanzees. Specifically, we searched for orthologues of the eight human palindromes in chimpanzees (Pan troglodytes), bonobos (pygmy chimpanzee, Pan paniscus) and gorillas (Gorilla gorilla). In each species, and for each palindrome, we attempted to amplify, by polymerase chain reaction (PCR), and sequence the two inner boundaries (between spacer and arms) and the two outer boundaries (between arms and surrounding sequences). We successfully amplified both inner boundaries in multiple palindromes (Table 1). In all of these cases, the PCR products were observed only when male genomic DNAs were used as templates, and never when using female genomic DNAs (data not shown). This implies that the PCR products were amplified from the male-specific regions of the great ape Y chromosomes. In all cases, the boundary sequences were essentially identical in humans and great apes (Fig. 1a; see also Supplementary Information). Only for P7 did we successfully amplify both outer boundaries (in chimpanzee and bonobo). These findings suggested that: (1) most palindromes found in the modern human MSY were already present, in the MSY, in the common ancestor of humans and chimpanzees; and (2) inner boundaries are more highly conserved than outer boundaries.
Figure 1: Sequence comparison of human and ape MSY palindromes.

a, Nucleotide sequences of inner boundaries of palindrome P6 in human and apes. Dots represent identity to human sequence. Full interspecific alignments of this and other palindromes' boundaries are in Supplementary Information. b, Overview of sequence divergence between human and chimpanzee palindromes, and between palindrome arms within each species. Each palindrome is shown to scale, folded about the centre of the spacer. For palindromes P1/P2 and P6, only the central portions are contained in sequenced chimpanzee BACs, and the palindromes are not perfectly centred within the BACs. Therefore, more sequence from one arm is available than from the other. For P1/P2 we include the 5' and 3' DAZ exons but exclude the central, intragenically duplicated regions of the gene24. The CDY1 genes are not in the portions of P1 shown1. For palindrome P7, the entire sequence of both arms is represented in sequenced chimpanzee BACs, as is extensive flanking, non-ampliconic sequence. Supplementary Table 8 provides confidence intervals, calculations and links to sequence alignments.
To enable detailed comparisons of human and chimpanzee palindromes, we screened a male chimpanzee genomic bacterial artificial chromosome (BAC) library for clones homologous to the inner boundaries of human palindromes P1–P8. We identified and then sequenced chimpanzee BACs corresponding to palindromes P1, P2, P6 and P7. (The BAC library provided only one- to twofold coverage, on average, of chimpanzee MSY sequences and thus was not expected to contain all boundaries of MSY palindromes.) Comparative sequence analysis confirmed the structural similarity of the human and chimpanzee palindromes and, by inference, their common ancestry (Fig. 1b; see Supplementary Information for complete sequence alignments). We observed 1.44% sequence divergence, on average, between orthologous palindrome arms in human and chimpanzee (Fig. 1b and Table 2). Such divergence between species probably reflects the simple accumulation of neutral mutations in the human and chimpanzee lineages after their separation. However, within each of the chimpanzee palindromes studied, we observed markedly little arm-to-arm divergence: 0.028%, on average, which is statistically indistinguishable from the 0.021% arm-to-arm divergence observed in the human MSY palindromes (Table 2; see also Supplementary Table 7). We conclude that the MSY palindromes predated separation of the human and chimpanzee lineages, and that, in both the human and chimpanzee lineages, the paired arms of the palindromes evolved in concert.
If gene conversion between palindrome arms was responsible for our findings, it might leave traces in the recent genealogy of the human MSY. In particular, we might find evidence that single nucleotide differences between the two arms of a human MSY palindrome had been eliminated by gene conversion. Examination of two CDY genes—one in each arm of palindrome P1—revealed a duplicated site of sequence variation that fulfilled this prediction. By sequencing this duplicated site in diverse, unrelated men, we identified some Y chromosomes with a C at this site in both arms of P1 (C/C chromosomes), other chromosomes with a C in one arm and a T in the second arm (C/T chromosomes), and other chromosomes with a T in both arms (T/T chromosomes; Fig. 2a). We confirmed these findings using a PCR/restriction-digestion assay (Supplementary Fig. 4). This single nucleotide substitution occurs at nucleotide 381 of the CDY coding region but does not alter the predicted amino acid sequence.
Figure 2: Site in CDY1 showing evidence of multiple independent gene conversion events.

This site, named CDY1 + 381, occurs in each arm of palindrome P1. a, Sequence traces for samples PD365, PD335 and PD207 with C/C, C/T and T/T chromosomes, respectively. b, Distribution of C/C, C/T and T/T chromosomes in the MSY genealogical tree, focusing on the cluster of related branches to which C/T and T/T chromosomes are confined. M92, M67, M12, M172, p12f: biallelic polymorphisms that define branch points in the part of the tree shown26–28. See Supplementary Fig. 1 for the full tree and inference of ancestral genotypes.
We then typed this nucleotide variant in 171 unrelated men chosen to represent the great diversity of Y chromosomes that other investigators have discovered in human populations. Specifically, these 171 Y chromosomes represented 42 distinct branches of a robust tree of human Y chromosome genealogy (Supplementary Fig. 1)9. In this sampling of the MSY genealogical tree, C/T chromosomes and T/T chromosomes were confined to a young cluster of five closely related branches (Fig. 2b; see also Supplementary Fig. 1). In the 37 other tested branches, only C/C chromosomes were observed. This distribution (Fig. 2b) suggested that the chromosome immediately ancestral to the five-branch cluster was C/T, and that this chromosome had arisen (from a C/C chromosome) by a C→T substitution in one arm of palindrome P1.
In three of this cluster's five branches, we observed T/T as well as C/T chromosomes (Fig. 2; see also Supplementary Fig. 1). This finding is readily explained by gene conversion in a C/T chromosome—the ancestral chromosome for this cluster—replacing the C in one arm of palindrome P1 with the T in the other arm. The data reveal at least three such gene-conversion events—one in each of the branches that have T/T chromosomes (Fig. 2b). In one of these branches, we also observed C/C chromosomes alongside C/T and T/T chromosomes (Fig. 2b). Here we surmise that gene conversion in a C/T chromosome replaced the T in one arm of P1 with the C in the other arm. Thus, during recent human history, gene conversion in C/T chromosomes has used either the C copy or the T copy as template. In addition, we investigated two other duplicated sites of sequence variation, and at both sites we found evidence of recurrent gene conversion during recent human history (Supplementary Figs 2 and 3).
How frequently does gene conversion occur in the MSY palindromes? Near uniformity of arm-to-arm sequence divergence in both human and chimpanzee palindromes (Table 2 in ref. 1 and Fig. 1b) suggests a steady-state balance between new mutations that create differences between arms, and gene-conversion events that erase these differences. Accordingly, we can calculate the rate of gene conversion needed to maintain the observed divergence in the face of new mutations. Let $\mu$ be the human MSY mutation rate, $1.6 \times 10^{-9}$ substitutions per nucleotide per year (see Methods). Let $d$ be the observed divergence between human MSY palindrome arms ($3 \times 10^{-4}$ substitutions per duplicated nucleotide), and let $c$ be the (unknown) rate of gene conversion (in both directions combined) per duplicated nucleotide per year. Differences between arms are introduced at a rate of $2\mu$ (as a mutation in either arm creates a difference between arms), and homogenized at a rate of $cd$. Thus, at steady state, $cd = 2\mu$. Then $c = 2\mu/d = (2 \times 1.6 \times 10^{-9})/(3 \times 10^{-4}) = 1.1 \times 10^{-5}$ gene conversions per duplicated nucleotide per year. For a 20-year human generation, this corresponds to a rate of $2.2 \times 10^{-4}$ conversions per duplicated nucleotide per generation, comparable to rates estimated directly in a mouse transgenic system10. Over the 5.4 Mb in human MSY palindromes ($2.7 \times 10^{6}$ duplicated nucleotides), then, an average of about 600 duplicated nucleotides have undergone arm-to-arm gene conversion for every son born in recent human evolution. Most of these conversions would have involved two identical DNA sequences, and thus their products would be unobservable. The inferred kinetics of gene conversion in MSY palindromes is especially striking because the MSY was previously viewed as recombinationally inert under normal circumstances: it was known previously as the non-recombining region, or NRY.
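The arithmetic behind this estimate can be retraced in a few lines. The sketch below is illustrative only: the numerical inputs are the values quoted in the text, while the function and variable names are our own and do not come from the paper.

```python
# Inputs quoted in the text; all names below are illustrative, not from the paper.
MU = 1.6e-9              # mutation rate, substitutions per nucleotide per year
D = 3e-4                 # observed arm-to-arm divergence per duplicated nucleotide
GENERATION_YEARS = 20    # generation time assumed in the text
DUPLICATED_NT = 2.7e6    # duplicated nucleotides (5.4 Mb of palindrome / 2)


def conversion_rate_per_year(mu, d):
    """Solve the steady-state condition c * d = 2 * mu for c."""
    return 2 * mu / d


c_year = conversion_rate_per_year(MU, D)
c_generation = c_year * GENERATION_YEARS
converted_per_son = c_generation * DUPLICATED_NT

print(f"c per year:        {c_year:.2e}")
print(f"c per generation:  {c_generation:.2e}")
print(f"converted nt/son:  {converted_per_son:.0f}")
```

Carrying unrounded intermediates gives roughly 576 converted nucleotides per son; rounding c to 1.1 × 10^-5 first, as in the text, gives roughly 594. Both agree with the stated figure of about 600.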
At present, we do not know whether gene conversion in MSY palindromes occurs during meiosis, mitosis, or both. It may involve homology-directed double-strand break repair, as in gene conversion between homologous chromosomes or sister chromatids11. An interesting observation is that human–chimpanzee divergence is significantly reduced in MSY palindrome arms as compared with other MSY sequences examined (Table 2). This reduction is evident even when comparing Alu and other interspersed repeat sequences that are presumed to be of little functional consequence (Supplementary Table 1). Thus, the reduced rate of evolution in palindrome arms does not seem to be due to selective constraints. A weak directional bias in gene conversion, favouring restoration of the original sequence, might account for these observations.
Our finding of abundant gene conversion in MSY palindromes raises questions about the molecular-clock dating of other segmental duplications in the human genome12. Some of these were interpreted as being of recent origin based on low copy-to-copy divergence13. In other cases, however, analysis by Southern blots14, 15 or quantitative PCR16 indicated that these duplications exist in great apes as well as in humans. Thus, these duplications might well represent conserved genomic organizations subject to gene conversion and concerted evolution. In the case of human X-chromosomal colour vision genes, 2 kb of comparative sequence data confirm concerted evolution17, 18. Our current findings, taken together with these previous results, raise the possibility that gene conversion in primate genomes could be much more pervasive than previously thought.
Finally, we note a strong association between gene conversion and MSY testis genes. In humans, all genes in MSY palindromes seem to be expressed predominantly or exclusively in testes, and most MSY genes with this expression pattern occur in palindromes1. Given the abundance of gene conversion in palindromes, we infer that Y–Y gene conversion has accompanied and shaped the evolution of multi-copy testis gene families in the MSY. Perhaps some selective advantage stemmed from the palindromic duplication of MSY testis genes during human evolution. If so, has Y–Y gene conversion had a role in that advantage? Has it allowed genes in palindromes to resist, or at least retard, the evolutionary decay that is a hallmark of Y chromosome evolution19? This could explain the observation, as reported in the accompanying paper, that intact testis-specific genes tend to be located in palindrome arms whereas non-functional copies of these genes seem to be distributed randomly (see Table 4 in ref. 1). A full understanding of the functional and evolutionary significance of our findings will require further study in primates and other mammals.
Methods
Estimating the MSY mutation rate
We estimated the MSY mutation rate in the human lineage based on the data and analysis in ref. 20, and an estimate of 5.5 million years ago for the most recent common ancestor of humans and chimpanzees21. The result is $1.6 \times 10^{-9}$ substitutions per nucleotide per year (Supplementary Fig. 5).
PCR amplification and sequencing of palindrome boundaries
Supplementary Table 2 lists the PCR primers and conditions used to amplify palindrome boundaries. Supplementary Table 3 provides GenBank accession numbers for the chimpanzee, bonobo and gorilla sequences obtained.
Identification and sequencing of chimpanzee BACs
We screened high-density filters from the RPCI-43 male chimpanzee BAC library22 (BACPAC resources) using hybridization probes designed to detect sequences (1) near the inner boundaries of palindromes P1–P6 and P8; (2) near P7; and (3) from a non-ampliconic region of the human MSY. STS content and BAC-end sequences confirmed that, among the candidate BACs identified by hybridization, six contained the central portions of orthologues to human MSY palindromes. The BACs were sequenced as previously described2. Supplementary Table 4 provides descriptions of the sequenced BACs and their GenBank accession numbers.
Sequence analysis
Sequences were aligned with CLUSTAL W using default parameters23. In a few cases, the resulting alignments were adjusted manually. All alignments are provided as Supplementary Information.
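For readers who wish to see how a divergence figure can be derived from a pairwise alignment, the following is a minimal sketch. It is not the procedure used in this study: the treatment of gaps (columns containing a gap are skipped) is an assumption, the function name is hypothetical, and the two short sequences are toy data invented for illustration.

```python
def percent_divergence(seq_a, seq_b, gap="-"):
    """Percent of mismatched positions between two aligned sequences.

    Assumption: alignment columns with a gap in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared


# Toy aligned arms (hypothetical data): one mismatch in nine ungapped columns.
arm_a = "ACGTACGTAC"
arm_b = "ACGTTCGT-C"
print(f"{percent_divergence(arm_a, arm_b):.2f}% divergence")
```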
Typing nucleotide variants in palindrome arms
The sites studied were CDY1 + 381 (Fig. 2 and Supplementary Fig. 1), CDY1 - 84 (Supplementary Fig. 2), and sY586 (Supplementary Fig. 3). sY586 was genotyped as previously described24. PCR primers and conditions for amplifying CDY1 + 381 (sY1313) and CDY1 - 84 (sY1314) have been deposited in GenBank (accession numbers G73596 and G73597, respectively). When typing CDY1 + 381 by sequencing, 'primer A' in GenBank G73596 served as the sequencing primer. CDY1 - 84 was typed by sequencing using 'primer B' in GenBank G73597.
For the samples that showed evidence of gene conversion (Fig. 2 and Supplementary Figs 1–3), we excluded the possibility of deletion of one copy of the variant site as discussed in Supplementary Note 1.
Steady-state balance between mutations and gene-conversion
To show that the combined action of mutation and gene conversion results in a steady-state level of arm-to-arm divergence, we use the following recursion: $d_{n+1} = (1 - c_g) d_n + 2\mu_g$, where $d_n$ is the sequence divergence between repeat copies at generation $n$, $\mu_g$ is the mutation rate per nucleotide per generation, and $c_g$ is the gene conversion rate per duplicated nucleotide per generation. We presume that $d_0 = 0$, corresponding to no differences between sequence copies immediately after the initial duplication event. However, as $1 - c_g < 1$, $\lim_{n \to \infty} d_n = 2\mu_g / c_g$, for any value of $d_0$ small enough to support $c_g$. Because $\mu_g$ and $d$ are very small, mutations almost never occur at sites that already differ between the two palindrome arms, and this possibility can be ignored. As shown in Supplementary Note 2, our analysis is a special case of Ohta's analysis25.
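The convergence of this recursion can be checked numerically. The sketch below is illustrative only: the per-generation mutation rate is obtained by multiplying the per-year rate by the 20-year generation time used in the main text (our assumption about how to convert units), the conversion rate is the rounded per-generation value quoted there, and the function name and number of generations are arbitrary choices.

```python
def iterate_divergence(mu_g, c_g, d0=0.0, generations=100_000):
    """Iterate d_{n+1} = (1 - c_g) * d_n + 2 * mu_g and return the final d."""
    d = d0
    for _ in range(generations):
        d = (1 - c_g) * d + 2 * mu_g
    return d


# Assumed per-generation parameters (derived from values quoted in the text).
MU_G = 1.6e-9 * 20   # mutation rate per nucleotide per 20-year generation
C_G = 2.2e-4         # conversion rate per duplicated nucleotide per generation

equilibrium = 2 * MU_G / C_G                      # closed-form limit of the recursion
from_zero = iterate_divergence(MU_G, C_G)         # start with identical arms
from_high = iterate_divergence(MU_G, C_G, d0=1e-3)  # start above equilibrium

print(f"closed form: {equilibrium:.3e}")
print(f"from d0 = 0: {from_zero:.3e}")
print(f"from d0 = 1e-3: {from_high:.3e}")
```

Both starting points settle on the same value, close to the observed divergence of about 3 × 10^-4, which illustrates why the steady state does not depend on the divergence present immediately after duplication.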


