Main

In humans and fruit flies, the X chromosome has several thousand genes, whereas its former homolog, the Y chromosome, contains only a few dozen3,4. This heteromorphism in gene content results from the loss of function of most genes on the Y chromosome. The degeneration of the Y is a consequence of its lack of sexual recombination (it is clonally transmitted from father to son)1. In nonrecombining genomes, selection at one site interferes with the action of selection at linked sites, reducing the efficacy of natural selection5. Two types of natural selection can interfere with the removal of deleterious mutations on a nonrecombining Y chromosome1. Models based solely on negative selection posit that the constant removal of deleterious mutations by purifying selection decreases the effective number of Y chromosomes. This, in turn, reduces the overall efficacy of purifying selection by increasing the influence of genetic drift1. Thus, deleterious mutations can accumulate on the Y solely as a consequence of other deleterious mutations (the background selection and Muller's ratchet models1). An alternative view is that positive selection on the Y chromosome has a key role in its degeneration2. This model posits that recurrent fixations of strongly beneficial mutations (selective sweeps) drag along linked deleterious mutations.

Unfortunately, old Y chromosomes, such as those of humans, retain few traces of the processes that led to their degeneration. The best hope for distinguishing between positive and negative selection as causes of degeneration is to survey patterns of variability on young Y chromosomes that are still in the process of degeneration. Although both positive and negative selection models might account for reduced variability on an evolving Y chromosome1, the selective sweep model predicts a marked excess of low-frequency mutations relative to neutral expectations6,7, whereas negative selection models produce a less severe distortion in the frequency spectrum of mutations8,9,10.

The newly formed neo-Y chromosome of D. miranda is at an intermediate stage of transition from an ordinary autosome into a degenerate Y chromosome11,12,13. Several types of deleterious mutations are accumulating on the neo-Y, including nonsense or frameshift mutations at some protein-coding genes11,13. In addition, there is a general increase in the rate of amino-acid substitution at neo-Y-linked genes14 as well as an accumulation of repetitive and transposable element–derived DNA11,13,15. Thus, the D. miranda neo-Y provides a useful snapshot of the early stages of Y chromosome degeneration.

To determine what mode of selection is driving the accumulation of deleterious mutations on the neo-Y, I resequenced 37 kb in a population sample of 12 D. miranda neo-Y chromosomes. Levels of neutral variability on the neo-Y were much lower than in highly recombining regions (Table 1 and Supplementary Fig. 1 online). Diversity (θW) on the neo-Y is 0.021% per silent site, roughly 30 times lower than levels of diversity based on 15 neo-X- and X-linked genes (0.57% per site; refs. 12,16), or about one-tenth of the expected neutral value after correcting for the relative numbers of X and Y chromosomes. This is in close agreement with a previous comparison of 5 kb of homologous regions on the neo-X and neo-Y chromosomes, which found a reduction in variability relative to the neutral expectation, π/π0, of 0.10 (95% confidence interval = 0.03–0.37; ref. 12).
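
The two ratios quoted above can be checked with a few lines of arithmetic. This is only a sketch: the factor of one-third for the neutral expectation on the neo-Y is an assumption here (one Y chromosome for every three X chromosomes under an equal sex ratio), and the variable names are illustrative.

```python
# Silent-site diversity values quoted in the text, converted from percent.
theta_neo_y = 0.021 / 100
theta_x = 0.57 / 100

# Raw fold reduction of neo-Y diversity relative to X-linked loci.
fold_reduction = theta_x / theta_neo_y

# Assumption: with an equal sex ratio there is one Y for every three X
# chromosomes, so the neutral expectation for the neo-Y is theta_x / 3.
expected_neutral_y = theta_x / 3
relative_to_neutral = theta_neo_y / expected_neutral_y

print(f"fold reduction: {fold_reduction:.1f}")
print(f"observed / neutral expectation: {relative_to_neutral:.2f}")
```

The raw ratio comes out at about 27-fold, which the text rounds to 30, and the corrected ratio is about 0.11, in line with the one-tenth quoted.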

Table 1: Genomic regions investigated on the neo-Y chromosome

Can both positive and negative selection models account for the observed reduction in variability on the neo-Y chromosome? The reduction in neutral variability due to deleterious mutations on the neo-Y depends on the number of Y chromosomes Ne, the rate of deleterious mutations Ud and their associated selection coefficient sd8,10. The expected Ne for the neo-Y in the absence of selection can be estimated independently from average levels of nucleotide variability at other loci and is 600,000. Relatively little is known about the rate of deleterious mutations and their associated selection coefficients17. To investigate whether deleterious mutations can, in principle, explain both the reduction in variability and the distortion in the frequency spectrum of segregating mutations, I allowed the two parameters Ud and sd to vary over a wide grid of fixed values. The reduction in neutral variability (π/π0) due to deleterious mutations can be approximated by the equation

in which Tij is the mean time to the most recent common ancestor of two individuals with i and j deleterious mutations and fi and fj are the frequencies of individuals with i and j deleterious mutations given by a Poisson distribution with mean Ud/sd (refs. 10,18). For a given value of Ne, I computed possible combinations of Ud and sd that are consistent with the observed reduction in variability on the neo-Y chromosome (0.03 ≤ π/π0 ≤ 0.37).
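
One form consistent with the definitions above can be written as follows. It is a sketch under two assumptions that are not stated in the text: that Tij is measured in generations, and that the neutral pairwise coalescence time for a haploid population of size Ne is Ne generations.

```latex
\frac{\pi}{\pi_0} \approx \frac{1}{N_e} \sum_{i} \sum_{j} f_i \, f_j \, T_{ij},
\qquad
f_i = \frac{(U_d/s_d)^{i} \, e^{-U_d/s_d}}{i!}
```

As a consistency check under these assumptions, setting Ud = 0 gives f0 = 1 and T00 = Ne, so the ratio reduces to 1, as expected when there is no selection.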

The shaded area in Figure 1a indicates parameter combinations of deleterious mutation models (combinations of Ud and sd for a given Ne) that are compatible with the observed reduction in diversity on the neo-Y. Parameter combinations for which an equilibrium can be established between recurrent deleterious mutations and purifying selection (background selection1) are indicated to the top right of Figure 1a. Under these parameters, all chromosomes in the population ultimately derive from the class of chromosomes free of deleterious mutations (the f0 class). The f0 class is stable under background selection, but as sd becomes weaker (moving to the left in Fig. 1a), this class can be lost by random drift (a process called Muller's ratchet1). Parameter combinations for which Muller's ratchet is operating are indicated to the bottom left of Figure 1a.
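
The distinction between the two regimes can be sketched in code. The class frequencies follow the Poisson distribution with mean Ud/sd given earlier, and the thresholds on Ne × f0 × sd (above 10 for background selection, below 1 for Muller's ratchet) are those stated in the legend of Figure 2. The function names and the example parameter values are illustrative assumptions, not taken from the analysis itself.

```python
import math


def class_frequencies(ud, sd, max_class):
    """Poisson frequencies f_0 .. f_max_class with mean ud / sd."""
    mean = ud / sd
    freqs = [math.exp(-mean)]
    for i in range(1, max_class + 1):
        # f_i = f_(i-1) * mean / i avoids computing large factorials.
        freqs.append(freqs[-1] * mean / i)
    return freqs


def regime(ne, ud, sd):
    """Classify a parameter combination by the size of the mutation-free class."""
    f0 = math.exp(-ud / sd)
    scaled = ne * f0 * sd
    if scaled > 10:
        return "background selection"
    if scaled < 1:
        return "Muller's ratchet"
    return "intermediate"


# Hypothetical parameter values, chosen only to show both regimes.
print(regime(600_000, ud=0.01, sd=0.01))
print(regime(600_000, ud=0.1, sd=0.001))
```

With Ne = 600,000, the first combination leaves a large mutation-free class and falls under background selection, whereas the second leaves essentially no such class and falls under Muller's ratchet.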

Figure 1: Parameter values for deleterious (a) and beneficial (b) selection models to explain the reduction in variability on the neo-Y chromosome of D. miranda.

(a) Parameters for the deleterious mutation rate Ud and their associated selection coefficient sd that are compatible with a reduction in diversity of 0.03–0.37. The dark gray area indicates values in the parameter space where the class of chromosomes containing the smallest number of deleterious mutations is stable (background selection), and the light gray area indicates parameters where Muller's ratchet is operating. (b) Estimates of the time since the last selective sweep TS on the neo-Y chromosome that are compatible with the observed reduction in variability of the neo-Y chromosome (0.03–0.46 Ne generations ago). Shown is the proportion of simulations that gave a value of |θπ,obs − θπ,sim| < δ, where θπ,obs is 3.44 and δ is 0.5.

Beneficial mutation models can also account for the observed reduction in diversity on the neo-Y (Fig. 1b). For a nonrecombining chromosome, such as the neo-Y, a selective sweep can be modeled as a population bottleneck to size one. Such a 'catastrophic sweep' causes all lineages in the population to coalesce suddenly and wipes out variability completely7,19. Under the catastrophic sweep model considered, the reduction in variability on the neo-Y depends solely on the time TS since the last fixation of a beneficial mutation. I ran coalescent simulations to obtain values of TS that are compatible with the observed diversity reduction. A recent selective sweep between 0.03 and 0.46 Ne generations ago is consistent with the observed reduction in variability; a very recent selective sweep about 0.07 Ne generations ago is most compatible with observed levels of diversity on the neo-Y (Fig. 1b).
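
The range of TS values quoted above can be recovered from a simple expectation. The sketch below is not the simulation procedure used here; it assumes that time is measured in units of Ne generations, so that two lineages coalesce at rate 1, and that a pair of lineages therefore coalesces at the smaller of an exponential waiting time and TS. Under those assumptions the expected reduction is π/π0 = 1 − exp(−TS). The function names are illustrative.

```python
import math


def expected_reduction(ts):
    """Expected pi/pi0 when all lineages are forced to coalesce at time ts.

    Assumes ts is in units of Ne generations (pairwise coalescence rate 1),
    so that E[min(Exp(1), ts)] = 1 - exp(-ts).
    """
    return 1.0 - math.exp(-ts)


def sweep_time_for(reduction):
    """Invert the relation above: the ts that gives a stated pi/pi0."""
    return -math.log(1.0 - reduction)


for ts in (0.03, 0.07, 0.46):
    print(f"TS = {ts:.2f} Ne generations -> pi/pi0 = {expected_reduction(ts):.3f}")

# Observed pairwise diversity (3.44) over the neutral expectation (53.3),
# both as reported in the Figure 1 legend and the Methods.
print(f"TS for observed diversity: {sweep_time_for(3.44 / 53.3):.3f}")
```

Under these assumptions, sweep times of 0.03 and 0.46 correspond to reductions of about 0.03 and 0.37, matching the confidence interval for π/π0, and the observed diversity corresponds to a sweep time of about 0.07.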

Although both models can, in principle, explain the diversity reduction, their effects on the frequencies of linked neutral mutations differ. To summarize the frequency spectrum of mutations, I used Tajima's D statistic20. D is expected to be close to zero in a constant-sized population with no selection (the standard neutral model) and negative when there is an excess of rare mutations. Most of the mutations (17 of 19) observed on the D. miranda neo-Y chromosome occur just once in the sample (Supplementary Fig. 1 online), resulting in a strongly negative value of D (−1.98). Such an excess of low-frequency variants is highly unlikely by chance (P < 0.01) under the standard neutral model. I used coalescent simulations to generate samples under both deleterious and beneficial selection models to obtain empirical critical distributions for the test statistic.
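
For readers who want to reproduce the observed value, Tajima's D can be computed from the sample size, the number of segregating sites and the mean pairwise difference, using the standard constants from Tajima's original definition. The sketch below takes n = 12 and 19 segregating sites from the text and θπ = 3.44 from the legend of Figure 1; the function name is illustrative.

```python
import math


def tajimas_d(n, seg_sites, pi):
    """Tajima's D from sample size n, segregating sites S and pairwise diversity pi."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    theta_w = seg_sites / a1  # Watterson's estimator
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - theta_w) / math.sqrt(variance)


print(f"D = {tajimas_d(12, 19, 3.44):.2f}")
```

With these inputs the function returns a value of about −1.98, in agreement with the figure reported above.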

Figure 2 shows the average values of D obtained from coalescent simulations for both positive and negative selection models. Deleterious mutation models can substantially reduce the mean value of D (Fig. 2a), with deleterious mutations of intermediate strength having the largest effect (Ne × sd ≈ 10–100; Fig. 3). Despite reducing the mean value of D (to as low as −1.3 for some combinations of Ud and sd), deleterious mutations rarely result in extreme values of D. For all combinations of Ud and sd, at most 5% of the simulations gave a value of D as negative as that observed in the actual data (Fig. 4a). This means that each of these parameter combinations, despite being compatible with the observed diversity reduction on the neo-Y, can be rejected at the 5% significance level based on the observed value of D. Therefore, deleterious mutation models (background selection and Muller's ratchet) are unlikely to account for both the reduction in variability and the frequency spectrum of mutations on the neo-Y chromosome.

Figure 2: Mean value of Tajima's D under deleterious (a) and beneficial (b) selection models for a given reduction in diversity.

(a) Each point shows the average value of Tajima's D, obtained from 1,000 coalescent simulations, for given parameter values of Ud and sd (see Fig. 1). The dark gray points indicate parameters of background selection (Ne × f0 × sd > 10); the white points indicate parameters of Muller's ratchet (Ne × f0 × sd < 1); and the light gray points indicate parameters in between. (b) Mean value of Tajima's D for the positive selection model, based on 10,000 coalescent simulations, for different values of TS.

Figure 3: Mean value of Tajima's D under the deleterious mutation model versus the strength of selection (Ne × sd).

Deleterious mutations of intermediate effect (Ne × sd = 10–100) cause the strongest distortion in the frequency spectrum of linked neutral mutations.

Figure 4: Probability P of obtaining, in the simulations, a Tajima's D as negative as or more negative than that observed in the data under deleterious (a) and beneficial (b) selection models.

(a) For all parameter combinations of Ud and sd, at most 5% of all simulations give a Tajima's D as negative as that observed in the data. (b) A wide range of TS values for the selective sweep model are compatible with the Tajima's D of the data; up to 40% of the simulations, for a sweep 0.05 Ne generations ago, give a Tajima's D as negative as that observed on the neo-Y chromosome.

Selective sweeps on the neo-Y chromosome, in contrast, are expected to both reduce neutral variability and strongly distort the frequency distribution of neutral mutations7. The average value of D in selective sweep simulations was as low as −1.9 (Fig. 2b). The most likely value of TS based on the reduction in variability is close to the most likely estimate based on the frequency distribution of mutations on the neo-Y chromosome (0.07 versus 0.05 Ne generations; Figs. 1b and 4b). Unlike deleterious selection models, a large fraction of the selective sweep simulations (up to 40%) are compatible with the observed frequency distribution of mutations on the neo-Y chromosome (Fig. 4b). This suggests that a recent selective sweep shaped patterns of variability on the neo-Y chromosome of D. miranda.

Estimates of the rate and effect of deleterious mutations in Drosophila species suggest that beneficial mutations probably drag at least some deleterious alleles to fixation on the neo-Y of D. miranda21 (Supplementary Note online). In particular, a large number of weakly deleterious mutations can be dragged to fixation by a beneficial allele. For example, a single beneficial mutation of 1–2% selective advantage could drag more than 100 transposable elements (sd = 10−4) to fixation, consistent with the observed accumulation and fixation of a large number of transposable elements on the neo-Y chromosome of D. miranda11,13,15.

What could cause a selective sweep on the neo-Y? Apart from recent suggestions of a high rate of beneficial mutations in Drosophila species, the Y chromosome is susceptible to certain types of adaptive mutations. Perhaps most importantly, the fact that the Y chromosome is male-limited may promote sex-specific adaptation. Because the Y is never transmitted through females, it is a haven for alleles that benefit males but harm females. The neo-Y may be experiencing recurrent selective sweeps that fix such sexually antagonistic alleles22. A related hypothesis is that genes with sex-related function undergo adaptive specialization on newly formed X and Y chromosomes13. In particular, sex-related genes might evolve male-specific functions on the neo-Y13.

Is positive selection driving the degeneration of Y chromosomes? Here, I showed that a recent selective sweep has occurred on the D. miranda neo-Y. Recurrent selective sweeps on the neo-Y could lead to considerable degeneration by fixing linked deleterious mutations2. Unfortunately, on a nonrecombining chromosome, the current genealogy only contains information dating back to the last selective sweep event. Thus, the history of selection, both positive and negative, before this event cannot be inferred. In this regard, it will be of considerable interest to investigate Y and neo-Y chromosomes of other species. The approach presented here can be used to establish whether positive selection is frequent enough to have a general role in the process of Y chromosome degeneration. If so, Y chromosome degeneration may reflect adaptation at a few loci at the expense of most other genes on this chromosome.

Methods

Data collection, D. miranda lines and DNA isolation.

I collected 36,648 bp of neo-Y-linked sequence per line (Table 1). The sequences of CycB, robo-1, eve and eng were published previously (5,166 bp in total12). In addition, I collected 31,482 bp of nucleotide sequence information, mostly from noncoding regions. For the newly collected data set, I isolated sequence data from five different genomic regions (containing the genes mle, robo, dpn, exu1 and zip) from a genomic library constructed from D. miranda12. I used allele-specific primers to amplify the neo-Y-linked regions from male genomic DNA by PCR and then directly sequenced both strands using the ABI BigDye chemistry (Perkin-Elmer) on an ABI 377 automated sequencer. Details on primer sequences and sequencing strategy are available on request. The population survey used twelve D. miranda lines described elsewhere12. I isolated genomic DNA from individual male flies using the PUREGENE DNA extraction kit (Gentra). Because there is no recombination on the neo-Y chromosome (all loci share a common genealogy), data can be pooled among the neo-Y regions analyzed.

Deleterious mutation models in the coalescent framework.

To model the dynamics of deleterious mutations in a statistical framework, I used the structured coalescent8,10,18,23. A Wright-Fisher model of a randomly mating, haploid population of Ne breeding individuals is considered. Alleles are classified as coming from different genetic backgrounds, defined by the number of deleterious mutations they carry. The basic model considered here assumes that all deleterious mutations have an identical selection coefficient, sd. The simulation algorithm of the structured coalescent used was described in detail previously8,10. The first step in the algorithm is to generate a transition matrix Qij of the probabilities that a chromosome with i deleterious mutations in a given generation has an ancestor with j deleterious mutations in the previous generation8. Next, samples of chromosomes are generated randomly from the equilibrium distribution of deleterious mutations24. After this preliminary step, the genealogy for a sample of 12 chromosomes with no recombination was generated in a population subject to deleterious mutations. For each generation, the number of deleterious mutations in the ancestor of each individual is obtained randomly by using the probabilities in the transition matrix Qij as expected values. Coalescence is allowed to occur between individuals having the same number of deleterious mutations with probability ki(ki − 1)/(2Nefi), where ki is the number of lineages with i deleterious mutations present in the sample at a given generation and Nefi is the deterministic equilibrium size of class i after selection10. Once the most recent common ancestor of the sample is reached, neutral mutations are distributed over the tree by sampling from a Poisson distribution with mean (π0t/2), where t is the length of each branch in units of 2 Ne generations. I investigated 29,609 neutral sites on the neo-Y chromosome (Table 1); this corresponds to a neutral mutation rate μ of 4.4 × 10−5 per generation for the region investigated (29,609 sites × (1.5 × 10−9 per site per generation)) or a neutral π0 (2Neμ) of 53.3 for the neo-Y chromosome.

The average estimate of θX (3Nμ) for a total of 15 X-linked and neo-X-linked loci in D. miranda is 0.57% per site12,16. Assuming a neutral mutation rate μ of 1.5 × 10−9 per generation, the effective number of breeding individuals N is estimated to be 1.2 million, corresponding to an effective population size Ne of the neo-Y of 600,000. Although this estimate is somewhat crude, the reduction in variability is almost independent of population size for strongly deleterious mutations (background selection8) and depends only slightly on the effective population size for weakly selected mutations10. Sexual selection in males can reduce the population size of the neo-Y relative to the neo-X chromosome. I investigated possible effects of sexual selection using the maximum value by which variation on the neo-Y can be reduced (one-ninth) or using the most plausible value of sexual selection in Drosophila populations (one-fifth; Supplementary Methods online). Ignoring the effects of sexual selection does not affect any of the conclusions drawn (Supplementary Figs. 2–7 online).
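
The population-size arithmetic can be retraced as follows. The sketch rests on three assumptions that reproduce the reported numbers but are not spelled out in full in the text: that θX equals 3Nμ, that the neo-Y has half as many copies as there are breeding individuals (Ne = N/2), and that the neutral π0 for the region equals 2Neμ, with μ taken as the per-generation rate for the whole region. Variable names are illustrative.

```python
MU_PER_SITE = 1.5e-9     # neutral mutation rate per site per generation
THETA_X = 0.57 / 100     # average diversity at X- and neo-X-linked loci
NEUTRAL_SITES = 29_609   # neutral sites surveyed on the neo-Y

# Assumption: theta_X = 3 * N * mu, so N = theta_X / (3 * mu).
n_breeding = THETA_X / (3 * MU_PER_SITE)

# The text rounds N to 1.2 million; the neo-Y is assumed to have N / 2 copies.
ne_neo_y = 1_200_000 / 2

# Mutation rate for the whole surveyed region, then pi0 = 2 * Ne * mu.
mu_region = NEUTRAL_SITES * MU_PER_SITE
pi0 = 2 * ne_neo_y * mu_region

print(f"N    = {n_breeding:,.0f}")
print(f"mu   = {mu_region:.2e} per generation for the region")
print(f"pi0  = {pi0:.1f}")
```

Under these readings the estimate of N is close to 1.27 million, which the text reports as 1.2 million, and π0 comes out at 53.3.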

I ran 1,000 coalescent simulations for each parameter combination of Ud and sd (a total of 388 combinations) and a population size of 600,000 neo-Y chromosomes. I considered the entire range of possible values of Ud and sd that could explain the reduction in variability; mutations with a smaller sd are nearly neutral (Ne × sd = 1.2) and do not cause sufficient reduction in Ne, whereas stronger mutations (sd > 0.1) would require biologically unreasonable mutation rates to account for the reduction in diversity (and would not distort the frequency distribution of neutral mutations).

Positive selection model in the coalescent.

I investigated the likelihood of the data under a catastrophic selective sweep by varying TS, the time since the last selective sweep, over values compatible with the observed reduction in variability on the neo-Y chromosome. I carried out coalescent simulations for a sample of 12 chromosomes according to the standard neutral model with constant population size23 until the accumulated time exceeded TS; at this time point all lineages were forced to coalesce. This truncates the genealogy at the time of the sweep, resulting in a star-shaped genealogy. This method assumes that selection is strong relative to mutation, such that no variant arising during the process of fixation of the selected allele is likely to be sampled7. Neutral mutations are placed on the tree by sampling from a Poisson distribution with mean (π0t/2). Values of TS were varied over a range of points spaced every 0.01 Ne generations; 10,000 coalescent simulations were run for each value of TS.
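
The procedure above can be sketched as a small simulation. This is a minimal illustration and not the code used for the analysis: it assumes that time is measured in units of Ne generations, so that k lineages coalesce at rate k(k − 1)/2 and a branch of length t carries a Poisson number of mutations with mean π0 × t/2, and all function names are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def sweep_sample(rng, n, theta, ts):
    """Derived-allele counts of the segregating sites in one simulated sample."""
    sizes = [1] * n     # sampled chromosomes descending from each lineage
    start = [0.0] * n   # time at which each lineage's current branch began
    counts, t = [], 0.0
    while len(sizes) > 1:
        k = len(sizes)
        t_next = t + rng.expovariate(k * (k - 1) / 2.0)
        if t_next >= ts:
            # The sweep: every remaining lineage coalesces at time ts.
            for size, t0 in zip(sizes, start):
                counts += [size] * poisson(rng, theta * (ts - t0) / 2.0)
            return counts
        t = t_next
        i, j = sorted(rng.sample(range(k), 2))
        for idx in (i, j):
            counts += [sizes[idx]] * poisson(rng, theta * (t - start[idx]) / 2.0)
        merged = sizes[i] + sizes[j]
        del sizes[j], start[j]   # j > i, so index i is unaffected
        sizes[i], start[i] = merged, t
    return counts


def tajimas_d(n, s, pi):
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    c1 = (n + 1) / (3.0 * (n - 1)) - 1.0 / a1
    c2 = (2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
          - (n + 2) / (a1 * n) + a2 / a1 ** 2)
    var = (c1 / a1) * s + (c2 / (a1 ** 2 + a2)) * s * (s - 1)
    return (pi - s / a1) / math.sqrt(var)


def simulate(ts, n=12, theta=53.3, reps=2000, seed=1):
    """Mean pairwise diversity and mean Tajima's D over reps simulated samples."""
    rng = random.Random(seed)
    pis, ds = [], []
    for _ in range(reps):
        counts = sweep_sample(rng, n, theta, ts)
        pi = sum(k * (n - k) for k in counts) / (n * (n - 1) / 2.0)
        pis.append(pi)
        if counts:  # D is undefined without segregating sites
            ds.append(tajimas_d(n, len(counts), pi))
    return sum(pis) / len(pis), sum(ds) / len(ds)


mean_pi, mean_d = simulate(0.07)
print(f"TS = 0.07: mean pi = {mean_pi:.2f}, mean D = {mean_d:.2f}")
```

The sketch uses fewer replicates than the 10,000 run per value of TS here, purely to keep the example fast; the seed and replicate count are arbitrary choices.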

Summary statistics and hypothesis testing.

I used Tajima's D20 to summarize the frequency spectrum of mutations. D considers the normalized difference between two estimates of the population mutation rate (θπ, based on the mean numbers of pairwise differences between alleles, and θW, based on the numbers of segregating sites) and is expected to be negative when there is an excess of rare mutations. For both deleterious and beneficial selection models, I obtained empirical cut-off points for the test statistic to obtain the 5% significance level. For each value of Ud and sd (deleterious mutation models) or each value of TS (positive selection models), a P value for the data was estimated as the proportion of simulated samples with a value of Tajima's D identical to or more extreme than that observed in the data. The average Tajima's D over several neo-X- and X-linked loci is close to zero13; thus, demographic explanations (for example, a recent population expansion in D. miranda) cannot explain the strong distortion in the frequency spectrum on the neo-Y.
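
The empirical P value described above amounts to a simple proportion. A minimal sketch, assuming that "more extreme" means more negative (as in the legend of Figure 4) and using a hypothetical function name and made-up input values:

```python
def empirical_p_value(simulated_d, observed_d):
    """Proportion of simulated D values as negative as or more negative than observed."""
    if not simulated_d:
        raise ValueError("at least one simulated value is required")
    hits = sum(1 for d in simulated_d if d <= observed_d)
    return hits / len(simulated_d)


# Made-up values for illustration only; real input would be the D values
# from the coalescent simulations for one parameter combination.
example = [-2.10, -1.00, 0.30, -1.98]
print(empirical_p_value(example, observed_d=-1.98))
```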

Note: Supplementary information is available on the Nature Genetics website.