Introduction

Newer parts of sex chromosomes, generated by fusion or translocation events with an autosome, offer unique opportunities to study the dynamics of sex chromosome evolution in systems in which degeneration and dosage compensation have not yet been fully established. Neo-sex chromosomes have been extensively studied as models of chromosome evolution in different organismal groups, such as plants (Nicolas et al., 2005), insects (Bachtrog, 2005; Flores et al., 2008) and mammals (Zhou et al., 2008). Sexually antagonistic selection caused by conflicts between the sexes has been proposed as a major agent in the evolution of sex chromosomes (Rice, 1987), and sex-linked genes have in turn been predicted to play an important role in the speciation process, as shown in a recent study of a neo-sex chromosome in sticklebacks (Gasterosteus aculeatus) (Kitano et al., 2009).

The occurrence of neo-sex chromosomes implies a certain level of plasticity in the organization of the genome, and is of particular interest in groups such as birds, in which sex chromosomes show a remarkable stability over evolutionary time, and in which overall genomic and chromosomal organizations are highly conserved (Griffin et al., 2007; Nanda et al., 2008). Limited structural variation is apparent in the comparison between the two model avian genomes, with only two major interchromosomal rearrangements distinguishing the chicken (Gallus gallus) and the zebra finch (Taeniopygia guttata) genome organization (Itoh and Arnold, 2005). First, a fission of chromosome 1 is evident in the zebra finch (as well as in other passerines, for example, Dawson et al., 2007; Stapley et al., 2008). Secondly, chicken chromosome 4 (Gga4) has resulted from a fusion of ancestral chromosomes 4 and 10 (Griffin et al., 2007). In the zebra finch genome, Gga4 is represented by two different chromosomes: Tgu4 and Tgu4a according to the zebra finch nomenclature (Itoh and Arnold, 2005; Völker et al., 2010).

Recently, we have found indications that parts of the orthologue to Tgu4a could be sex-linked in two species of warblers (family Sylviidae): the common whitethroat (Sylvia communis) and the great reed warbler (Acrocephalus arundinaceus). A molecular marker (G61) located at the beginning of chromosome 4a (position 0.9 Mb on Tgu4a) is sex-linked in the great reed warbler (Dawson et al., 2007), and a large region of this chromosome (approximately 5 Mb), presenting high levels of sex-biased expression, was identified in the common whitethroat (S Naurin et al., unpublished data). No signs of sex linkage of the orthologous parts of chromosome 4a were found in the chicken or in the zebra finch (Stapley et al., 2008).

The aim of this study was to investigate the hypothesis of occurrence of a neo-sex chromosome in birds and to assess when it would have arisen in evolutionary time. We have gathered extensive molecular data in five bird species that are representative of independent branches of Passerida, according to two alternative phylogenies (Barker et al., 2004; Alström et al., 2006). Within the Sylvioidea (sensu Alström et al. (2006)), we selected the skylark (Alauda arvensis) in addition to the great reed warbler and the common whitethroat, and, as representatives of branches outside Sylvioidea, we selected the goldcrest (Regulus regulus) and the blue tit (Cyanistes caeruleus).

We designed primers to amplify and sequence a total of 31 loci (introns and exons) distributed over chromosome 4a to determine the extent of sex linkage and the association of these markers to the Z and W chromosomes in the five passerine species. In one species, the great reed warbler, we evaluated linkage between markers on chromosome 4a and markers located on the Z chromosome, by recombination-based linkage analyses in an extended pedigree (Hansson et al., 2005). The cytochrome b and myoglobin intron II sequence data sets of Alström et al. (2006) were re-analysed, with the addition of cytochrome b sequences of a few species, to date the origin of the neo-sex chromosome in relation to the approximately 150 MY old ancestral sex chromosomes (Handley et al., 2004; Nam and Ellegren, 2008). In addition, we have explored the gene content of chromosome 4a as a putative driver underlying the neo-sex chromosome maintenance.

Materials and methods

Marker selection and primer design

A search for orthologues of annotated genes within Tgu4a of the zebra finch genome (build taeGut3.2.4 at Ensembl; www.ensembl.org) was performed in the chicken genome (build WASHUC2.1). Sequences classified as orthologues by Ensembl were aligned using the Geneious Align and MUSCLE Align options in Geneious 5.0.3 (Drummond et al., 2010). Degenerate primers were designed using the primer design module in Geneious based on Primer3 (Rozen and Skaletsky, 2000) to either amplify introns or coding sequences (Supplementary Tables S1 and S2). A graphical representation of the location of the 31 loci on the zebra finch chromosome 4a (Tgu4a) and on the chicken chromosome 4 (Gga4) was obtained using MapChart 2.2 (Voorrips, 2002).

Phylogenetics

The data sets of Alström et al. (2006) were re-analysed with the addition of cytochrome b sequences of species present in this study, which were not included in the initial data set (S. communis and C. caeruleus; Supplementary Table S3). Cytochrome b (1038 bp) and myoglobin intron II (653–711 bp) sequences were aligned using the Geneious Align in Geneious 5.0.3 (Drummond et al., 2010), and then concatenated according to taxon. Phylogenies of the concatenated data set were estimated using MrBayes v. 3.1.2 (Huelsenbeck and Ronquist, 2001; Ronquist and Huelsenbeck, 2003), running four Markov chains for 10 million generations in two parallel replicates, with chain heating parameter set to 0.15. Trees were sampled at intervals of 1000 generations, and posterior probabilities were calculated from 7500 trees after excluding 2 500 000 generations as burn-in. Following Alström et al. (2006), the cytochrome b partition was analysed under a general time-reversible model assuming rate variation across sites following a discrete gamma distribution (G) with four rate categories and an estimated proportion of invariant sites, and the myoglobin intron II partition was analysed under a general time-reversible-G without invariant sites. The average standard deviations of split frequencies were, for the last 5 000 000 generations, stable at <0.01, and in addition Tracer v. 1.5.0 (Rambaut, 2007) was used to manually inspect plots of the likelihood scores, to ensure that they had reached stationarity.
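The sampling arithmetic for a single MrBayes run can be checked with a minimal sketch. The function name is hypothetical, and the sketch assumes that the tree recorded at generation 0 is not counted; it is not part of the original analysis.

```python
def retained_samples(total_generations: int,
                     sample_interval: int,
                     burnin_generations: int) -> int:
    """Number of post-burn-in samples kept from one MCMC run.

    Assumes one sample every `sample_interval` generations and ignores
    the initial state at generation 0.
    """
    if sample_interval <= 0:
        raise ValueError("sample_interval must be positive")
    if not 0 <= burnin_generations <= total_generations:
        raise ValueError("burn-in must lie within the run length")
    return (total_generations - burnin_generations) // sample_interval


# Settings described above: 10 million generations, sampled every 1000,
# with the first 2 500 000 generations discarded as burn-in.
kept_per_run = retained_samples(10_000_000, 1000, 2_500_000)
print(kept_per_run)
```

Under these assumptions the result corresponds to the 7500 trees used to calculate the posterior probabilities.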

Samples

DNA from individuals from the five species included in this study (see Table 1 for sample size) was obtained from the following sources: the Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sweden (great reed warbler, common whitethroat and blue tit); P Zehtindjiev, Institute of Zoology, Bulgarian Academy of Sciences, Sofia, Bulgaria (skylark); and the Ottenby Bird Observatory, Degerhamn, Sweden (goldcrest). Individual samples were sexed both morphologically and through the amplification of Z and W genes (using either the P2/P8 primer pair according to Griffiths et al. (1998) or the TGZ-002 primer pair according to D Dawson, University of Sheffield).

Table 1 Pattern of sex linkage of loci on chromosome 4a in great reed warbler, common whitethroat, skylark, goldcrest and blue tit

Marker and sequence analysis

Amplification of the 31 loci was carried out in 10 μl volume reactions with 25–30 ng DNA, using the Qiagen Multiplex kit (Qiagen, Hilden, Germany). Polymerase chain reaction conditions were as follows: pre-heating at 95 °C for 15 min, 35 cycles at 94 °C for 30 s, annealing temperature (Ta) for 45 s, 72 °C for 1 min and a final extension at 72 °C for 10 min (locus-specific Ta is given in Supplementary Table S2). Amplification was followed by direct sequencing using BigDye Terminator Sequencing Kit (Applied Biosystems, Foster City, CA, USA) according to the manufacturer's recommendations. Zebra finch samples were used as positive controls for amplification. The identity of the sequences was confirmed through BLAST search. Variation among sequences of male (ZZ) and female (ZW) samples of the five species was identified using Geneious (Drummond et al., 2010).

Sex linkage

The assessment of sex linkage of the selected markers was performed through the comparison of the sequencing products of male and female samples of each species. Although some previous indication of sex linkage had been obtained for two of the species analysed here (the great reed warbler (Dawson et al., 2007) and the common whitethroat (Naurin et al., unpublished)), no indication was obtained regarding the number of chromosome 4a markers involved nor the prevalence of the sex-linked area in the different species. Therefore, two starting hypotheses were considered for the analysis of each locus: either the locus would be autosomal, as in zebra finch, or it would exhibit sex-specific patterns indicating sex linkage. According to the first hypothesis, no sex-specific polymorphism should be found in the comparison of male and female sequences. Males and females could either be homozygous or heterozygous at polymorphic sites, and heterozygous individuals should be identified in both sexes. Conversely, in the case of sex linkage (for example, the physical association to a sex chromosome), two possible scenarios should be taken into account: the exclusive association with one of the sex chromosomes or the amplification of alleles linked to both Z and W chromosomes. In the case of association to both sex chromosomes, females should be ‘heterozygous’ at distinct positions representing sequence variation between the Z and W chromosomes, whereas the males should be homozygous (ZZ) at these sites. In the case of association to the Z chromosome only, or if only the Z copy was amplified, males should additionally be either homozygous or heterozygous at variable sites, whereas females should be hemizygous at variable positions detected in males (this single sequence corresponding to the single Z associated copy). 
The specific polymorphic sites that allowed for the assessment of sex linkage and association with either the Z or W chromosome were determined per species and per locus. Absence of informative polymorphic sites was indicated as ‘m’ (monomorphic; Table 1).
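The decision rules described above can be summarized in a short sketch. The data representation (each individual as a pair of alleles read from the chromatogram, with a single peak written as two identical alleles), the function names and the 'ambiguous' label are illustrative assumptions and do not reproduce the scoring actually performed in Geneious.

```python
from typing import List, Sequence, Tuple

Genotype = Tuple[str, str]  # ("A", "G") = double peak, ("A", "A") = single peak


def is_het(genotype: Genotype) -> bool:
    return genotype[0] != genotype[1]


def classify_site(males: Sequence[Genotype], females: Sequence[Genotype]) -> str:
    """Classify one variable position from male (ZZ) and female (ZW) reads."""
    male_alleles = {a for g in males for a in g}
    female_alleles = {a for g in females for a in g}
    if len(male_alleles | female_alleles) < 2:
        return "m"  # monomorphic: no informative polymorphism
    male_het = any(is_het(g) for g in males)
    female_het = [is_het(g) for g in females]
    # Z-W difference: every female 'heterozygous', males invariant.
    if all(female_het) and not male_het and len(male_alleles) == 1:
        return "W"
    # Z-linked: males variable, every female shows a single sequence.
    if not any(female_het) and len(male_alleles) > 1:
        return "Z"
    # Heterozygous individuals in both sexes: autosomal pattern.
    if male_het and any(female_het):
        return "autosomal"
    return "ambiguous"


def classify_locus(sites: List[Tuple[Sequence[Genotype], Sequence[Genotype]]]) -> str:
    """Combine per-site calls into one call per locus and species."""
    calls = {classify_site(m, f) for m, f in sites}
    if "autosomal" in calls:
        return "autosomal"
    linked = calls & {"Z", "W"}
    if linked == {"Z", "W"}:
        return "Z+W"
    if linked:
        return linked.pop()
    if calls <= {"m"}:
        return "m"
    return "ambiguous"


# Hypothetical reads for one locus: one Z-W difference and one Z polymorphism.
w_site = ([("A", "A"), ("A", "A")], [("A", "G"), ("A", "G")])
z_site = ([("C", "T"), ("C", "C")], [("T", "T"), ("C", "C")])
print(classify_locus([w_site, z_site]))
```

With so few individuals a single site can show a sex-linked pattern by chance, which is why the calls are evaluated jointly over many loci.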

We sequenced a small number of individuals at each of the 31 loci in each of the five species (Table 1). Thus, at each SNP/indel at each locus in each species, the statistical support of sex linkage is weak; however, this is not the case over a chromosome region holding several loci and polymorphisms. The likelihood that a W-linked pattern would appear by chance at a subset of autosomal loci equals the likelihood that all females are heterozygous and all males are homozygous (given that one female is known to be heterozygous, as we only evaluate polymorphic sites) at a subset of loci in the supposed sex-linked region. This likelihood, L(W), is described by the following expression:

L(W)=([(2pq)^(N−1) × (1−2pq)^N]^KW) × Nloci!/(KW!(Nloci−KW)!),

where p and q are the allele frequencies, N the sample sizes and KW the number of loci with W-linked pattern (KW=14 and Nloci=17 in great reed warbler, 16 and 18 in common whitethroat, and 16 and 19 in skylark; Table 1). We also calculated the likelihood that a Z-linked pattern would appear by chance over a set of autosomal loci. This equals the likelihood that all females are homozygous (given that at least one male is heterozygous) at all loci (with a Z-linked polymorphism) in the supposed sex-linked region. This likelihood, L(Z), is described by the following expression:

L(Z)=((1−2pq)^N)^Nloci,  

where p and q are the allele frequencies, and Nloci the number of loci with Z-linked pattern (and at least one Z-linked polymorphism; that is, the Z(0) cases in Table 1 are not counted; Nloci=10 in great reed warbler, 15 in common whitethroat and 12 in skylark; Table 1). These likelihoods (L(W) and L(Z)) depend on the minor allele frequency, which is unknown for most loci. Therefore, we estimated these likelihoods using two different minor allele frequencies: a very low value (P=0.1) and a more realistic value (that is, P=0.30; measured as the mean allele frequency at four loci located on chromosome 4a that were genotyped in a larger number of great reed warblers; see below). The likelihood functions follow standard probability statistics and were designed in Excel (Microsoft, Redmond, WA, USA), which allowed for simple handling of specific parameters for the different species and allele frequencies.
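As an illustration, the two expressions can be written as short Python functions rather than spreadsheet formulas. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the exponent N−1 in L(W) is read as the number of females minus one and the exponent N as the number of males (following the verbal description), and the sample size of three females used in the example is an assumption, as the actual sample sizes are given only in Table 1.

```python
from math import comb


def likelihood_w(p: float, n_females: int, n_males: int,
                 k_w: int, n_loci: int) -> float:
    """L(W): chance that k_w of n_loci autosomal loci show a W-linked pattern."""
    het = 2 * p * (1 - p)  # expected heterozygosity, 2pq
    # One female is heterozygous by construction; the other females must be
    # heterozygous and every male homozygous.
    per_locus = het ** (n_females - 1) * (1 - het) ** n_males
    return per_locus ** k_w * comb(n_loci, k_w)


def likelihood_z(p: float, n_females: int, n_loci: int) -> float:
    """L(Z): chance that all females are homozygous at all n_loci autosomal loci."""
    het = 2 * p * (1 - p)
    return ((1 - het) ** n_females) ** n_loci


# Assumed sample size: three females; 10 loci as in the great reed warbler.
print(likelihood_z(0.1, 3, 10))  # about 2.6e-03
print(likelihood_z(0.3, 3, 10))  # about 8.0e-08
```

With the assumed sample size of three females, likelihood_z returns values close to the L(Z) estimates reported in the Results for the three species.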

Genotyping of additional great reed warbler samples

The inheritance patterns at a subsample of the loci isolated in this study were evaluated using families of a Swedish population of the great reed warbler, comprised within a total sample of 373 individuals collected during the breeding seasons between 1987 and 1997 (for example, Hasselquist, 1998; Bensch et al., 1998).

For simplicity of scoring by standard fragment analysis and the possibility of targeting specific within-locus polymorphic sites, primers (Supplementary Table S4) were designed to amplify indel areas in five selected loci (DIAPH2, RPS6KA6, THOC2, SMARCA1 and GABRE), distributed over chromosome 4a (at respectively, 1.6, 4.7, 10.2, 11.6 and 19.2 Mb). At one locus, RPS6KA6, there was indication of amplification of both a Z (holding the target polymorphism) and a W fragment, and primers were designed outside the conserved area between the gametologues to exclude amplification of the W fragment. All reverse primers were fluorescently labelled and products were amplified using the Qiagen Multiplex kit. Polymerase chain reaction conditions were as follows: pre-heating at 95 °C for 15 min, 35 cycles at 94 °C for 30 s, 56 °C for 45 s, 72 °C for 45 s and a final extension at 72 °C for 10 min. Alleles were detected in an ABI 3730 capillary sequencer (Applied Biosystems) and analyzed with GENEMAPPER 3.0 (Applied Biosystems).

To estimate allelic frequencies at four of these loci (DIAPH2, RPS6KA6, SMARCA1 and GABRE), 27 unrelated great reed warbler individuals (12 males and 15 females) from the above-mentioned Swedish population were genotyped.

Linkage analysis in the great reed warbler

To evaluate the inheritance pattern and linkage to the Z chromosome of 4a markers, 373 great reed warblers (within an extended pedigree of a total of 402 individuals) were genotyped for loci DIAPH2 and RPS6KA6. The same samples had previously been genotyped for several sex-linked markers: Aar1, VLDLR9, BRM12, BRM15, CHD1Z20, Ase50, 304A, 313AI, 051A, 306A, 092A and G61 (Hansson et al., 2005; Åkesson et al., 2007). Of these, VLDLR9, BRM12, BRM15 and CHD1Z20 are intron markers of known Z-linked genes in chicken (Hansson et al., 2005), Aar1 and Ase50 are microsatellites that BLAST to chromosome Z (Dawson et al., 2007), G61 is a microsatellite that BLASTs to chromosome 4a (Dawson et al., 2007), whereas 304A, 313AI, 051A, 306A and 092A are anonymous amplified fragment length polymorphism markers (Åkesson et al., 2007). We also genotyped three other chromosome 4a markers that did not show any signs of sex linkage, THOC2, SMARCA1 and GABRE, and tested linkage between these and the Z-chromosome loci listed above, as well as to other loci (Ase61, Ase15, Aar3 and 292G) located on Tgu4/4a and/or LG7 (that is, the chromosome homologous to chromosome 4/4a in the great reed warbler; Hansson et al., 2005; Åkesson et al., 2007; Dawson et al., 2007).

Two-point recombination analysis was performed to calculate recombination fractions between all pairs of markers using CRI-MAP 2.4 (Green et al., 1996), and a logarithm of odds (LOD) score >3 was used to assess significant linkage (that is, a significantly lower recombination rate than the 0.5 expected for unlinked markers). One of the autosomal chromosome 4a markers, GABRE, could not be assigned to any other marker at this threshold (most likely due to its isolated location in the telomeric end of chromosome 4a, and thus high expected recombination rate to all other loci included in the analysis, see, for example, Stapley et al. (2008)), and was included in the linkage analysis at its expected chromosome 4a location. We determined the most parsimonious ordering of markers within each linkage group (the order with the highest likelihood support) with the options FLIPSN and FIXED in CRI-MAP 2.4 (Green et al., 1996).
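The meaning of the LOD threshold can be illustrated with a textbook simplification. The sketch assumes phase-known, fully informative meioses in which recombinants can be counted directly; CRI-MAP instead evaluates likelihoods over the whole pedigree, so the function below (with a hypothetical name and hypothetical counts) is not a substitute for that analysis.

```python
from math import log10
from typing import Optional


def two_point_lod(recombinants: int, meioses: int,
                  theta: Optional[float] = None) -> float:
    """LOD score for linkage at recombination fraction theta versus 0.5.

    If theta is None, the maximum-likelihood estimate recombinants/meioses
    (capped at 0.5) is used.
    """
    if meioses <= 0 or not 0 <= recombinants <= meioses:
        raise ValueError("need 0 <= recombinants <= meioses and meioses > 0")
    if theta is None:
        theta = min(recombinants / meioses, 0.5)
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    if theta == 0.0 and recombinants > 0:
        return float("-inf")  # complete linkage is excluded by any recombinant
    log_lik = 0.0
    if recombinants:
        log_lik += recombinants * log10(theta)
    if non_recombinants:
        log_lik += non_recombinants * log10(1 - theta)
    # Subtract the log-likelihood under free recombination (theta = 0.5).
    return log_lik - meioses * log10(0.5)


# Hypothetical counts: 2 recombinants in 20 informative meioses.
print(two_point_lod(2, 20))
```

In this simplified setting, 2 recombinants in 20 meioses (a recombination fraction of 0.1) already exceeds the threshold of 3, whereas 5 recombinants in 10 meioses gives a LOD of 0.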

Neo-sex chromosome age estimation

To date the evolution of the neo-sex chromosome, the cytochrome b and myoglobin data sets were extended with more representatives of Picathartidae, core and basal Corvoidea, Old World and New World suboscines and a New Zealand wren of Acanthisittidae (Supplementary Table S3). Assuming that Acanthisittidae split from other passerines when New Zealand rifted from West Antarctica 85–82 million years ago (MYA) (Yan, 1993), and using that as a calibration point, the age of the Sylvioidea clade sensu Alström et al. (2006) was estimated with BEAST v. 1.5.4 (Drummond and Rambaut, 2007). Sylvioidea was constrained as monophyletic, as were oscines and all non-New Zealand taxa. The prior distribution for the calibration point was uniform over 85–82 MYA. Cytochrome b and myoglobin were analysed using substitution models as above, unlinked relaxed clock models (uncorrelated log-normal distribution; Drummond et al., 2006) and a tree prior following the Yule process. Trees were sampled every 5000 generations and the runs covered 100 million generations, of which the first 25% were discarded as burn-in. The results were inspected using Tracer v. 1.5.0 (Rambaut, 2007), ensuring stationarity and effective sample sizes of >200.

In addition, the mean age for Sylvioidea (Alström et al., 2006) was calculated with TreeAnnotator v. 1.5.4 (Rambaut, 2007) in maximum clade credibility trees based on 100 maximum-likelihood trees each from the non-parametric rate smoothing and penalized likelihood analyses of Barker et al. (2004), based on molecular dating of genes RAG-1 and RAG-2, and calibrated to a split from Acanthisittidae 83 MYA. This resulted in an estimated mean age of the Sylvioidea of 38.8 MYA for the non-parametric rate smoothing method and 39.5 MYA for the penalized-likelihood method.

Results

Sex linkage

Our results indicated the presence of a neo-sex chromosome and determined the extent of the sex-linked region. The pattern of sex linkage, based on the presence of sex-specific polymorphisms for the 31 loci with orthologues on Tgu4a (Supplementary Tables S1 and S2) in five different bird species, including warblers and closely related passerines, is shown in Figure 1 and Table 1. The 19 markers located on the first part of chromosome 4a (0.9–9.5 Mb) showed evidence of sex linkage in the great reed warbler, the common whitethroat and the skylark (Figure 1 and Table 1), but not in the species outside the Sylvioidea branch (Figure 2).

Figure 1

Loci used to evaluate sex linkage and their location on zebra finch chromosome 4a (Tgu4a) and chicken chromosome 4 (Gga4). The area covered by the markers for which sex linkage was observed in the great reed warbler, the common whitethroat and the skylark is highlighted in light grey. Scale bar units in megabases (Mb). Introns are shown in black, and coding sequences in grey. Loci implicated in sex determination are highlighted (rectangles).

Figure 2

Phylogenetic relationships between passerine lineages based on myoglobin intron II and cytochrome b sequences from a total of 84 species (cf. Alström et al., 2006; Supplementary Table S3). Collapsed nodes are shown and numbers are posterior probabilities (for model details, see Materials and Methods). Species included in this study are indicated in bold and common name initials are placed to the right of each respective branch: great reed warbler (GRW), common whitethroat (CW), skylark (SL), goldcrest (GC) and blue tit (BT). Grey circles (in full) highlight branches in which sex linkage of chromosome 4a markers was observed; white circles indicate branches in which representative species showed no sex linkage.

Sex linkage and support for the presence of a neo-Z chromosome were assessed by the occurrence of both homozygous and heterozygous sequences in males (corresponding to amplification of the two Z-linked alleles) and the presence of a single sequence at male-specific variable positions in females (corresponding to the single Z-linked allele in females). This pattern was seen in 37 of the 54 locus/species combinations analysed (10 in the great reed warbler, 15 in the common whitethroat and 12 in the skylark). Furthermore, in 45 locus/species combinations, all females were heterozygous at positions in which no variability was observed in males (Table 1). The consistency of the female-specific patterns, the sequence identity and the large number of loci for which the presence of an additional copy was observed point towards the presence of a neo-W chromosome in the great reed warbler, the common whitethroat and the skylark, in addition to the neo-Z chromosome (Table 1).

In 16 locus/species combinations, there were no male-specific variable positions over the analysable sequence area for which males and females could be directly compared (despite their presence in other areas along the Z sequence). In those cases (indicated as Z(m) in Table 1), Z linkage could not be directly assessed from direct sequencing, although, in all cases, the Z sequence in males was also identified in females, in addition to the female-specific W sequence.

At locus P2RY4, common whitethroat females exhibited sequences with no heterozygous sites, but with a high number of fixed polymorphisms (36) when compared with the sequences obtained from males for the same locus (Table 1). When directly comparing the male and female sequences, very weak Z-specific peaks were identifiable in parts of the female sequence, but the very low peak intensity did not allow for the assessment of specific polymorphisms. The observed pattern strongly suggests that the W copy of P2RY4 was being preferentially amplified in common whitethroat females, making a total of 46 locus/species combinations supporting the presence of a neo-W chromosome. A similar pattern, suggesting exclusive W amplification, was observed at locus TAF9B, in the great reed warbler. In this case, however, we could only obtain a readable sequence from one female sample. The high level of ambiguities and the impossibility of extracting information from the remaining females prevented further confirmation of this interpretation, so this locus was excluded from the analysis in the great reed warbler (Table 1).

Even with our small sample of individuals per locus and species, the likelihood that a W-linked pattern would appear by chance at a certain number of loci (KW=14, 16 and 16 in great reed warbler, common whitethroat and skylark, respectively; Table 1) among a set of autosomal loci over the first 10 Mb of chromosome 4a (Nloci=15, 18 and 19 in the three species; Table 1) was extremely low in all three species and for the two minor allele frequencies evaluated: 2.5 × 10^−24 ≤ L(W) ≤ 4.6 × 10^−15. This allows for the rejection with a high confidence level of the null hypothesis of an autosomal location. Likewise, the likelihood (L(Z)) that a consistent Z-linked pattern over the first 10 Mb of chromosome 4a would appear by chance at a large set of autosomal loci (Nloci=10, 15 and 12 in the three species; Table 1) was extremely low. Assuming a low minor allele frequency (P=0.1), L(Z) was equal to 2.6 × 10^−3 for the great reed warbler, 1.3 × 10^−4 for the common whitethroat and 7.9 × 10^−4 for the skylark. Considering a more realistic allele frequency, estimated as the mean allele frequency at four loci in the Swedish great reed warbler population (P=0.30), L(Z) for the three species was as low as 8.0 × 10^−8, 2.3 × 10^−11 and 3.0 × 10^−9, respectively.

Loci located in the distal part of chromosome 4a (10.2–20.6 Mb) exhibited no indication of sex linkage (Figure 1 and Table 1). The same was true for the markers covering the full length of chromosome 4a in the goldcrest and in the blue tit: the complete lack of sex-linked patterns suggests that the neo-sex chromosome is absent in these two species (Table 1).

Sex-linked inheritance and linkage analysis in the great reed warbler

Further support for the presence of the neo-sex chromosome came from the analysis of segregating alleles in the great reed warbler pedigree (Table 2): all the 186 genotyped females showed a single allele for the two loci (DIAPH2 and RPS6KA6) in which Z-linked indels were targeted, whereas males (n=187) were either homozygous or heterozygous (in accordance with the presence of two Z-linked copies). Moreover, the female offspring showed discernible Z-chromosome inheritance patterns for the two loci located within the sex-linked area in all cases where the genotypes of the parents allowed such confirmation (the inheritance pattern of two families is exemplified in Table 2). In contrast, for the three loci (THOC2, SMARCA1 and GABRE) located outside the sex-linked area, heterozygous genotypes were found among both males and females with no clear pattern distinguishing the two sexes, and the expected autosomal inheritance pattern was observed (Table 2).

Table 2 Allele screening results for the five selected loci in two great reed warbler families

The two-point recombination analysis, used to assess the rate of recombination between sex-linked 4a markers and Z-chromosome markers, showed a significantly reduced rate of recombination between RPS6KA6 and several markers mapped to the Z linkage group (the recombination fraction, rf, was 0.19 to VLDLR9, 0.17 to BRM12, 0.19 to BRM15, 0.06 to CHD1Z20, 0.10 to 304A, 0.17 to 313AI and 0.13 to 306A; range of LOD: 3.60–8.77). As VLDLR9, BRM12, BRM15 and CHD1Z20 are intron markers of known Z-linked genes in chicken (Hansson et al., 2005), the tight linkage between these loci and RPS6KA6 provides strong support for the association of our sex-linked 4a marker to the Z chromosome. The other sex-linked 4a marker (DIAPH2) that we genotyped was significantly linked to G61 (rf=0.06; LOD=11.0) and showed a tendency to linkage to marker 092A (rf=0.26; LOD=1.9).

There was no support for linkage between any sex-linked marker (including RPS6KA6 and DIAPH2) and autosomal markers on chromosomes 4 and 4a (including THOC2, SMARCA1 and GABRE) (LOD ≈0 in all cases). Furthermore, the autosomal 4a markers (THOC2, SMARCA1 and GABRE) were not significantly linked to loci located on chromosome 4 (Ase61, Ase15, Aar3 and 292G). Finally, the two-point recombination analysis supported linkage between THOC2 and SMARCA1 (rf=0.22; LOD=5.24), but not between GABRE and these two loci (rf=0.37–0.48; LOD ≤0.22), most likely due to the isolated location of GABRE at the telomeric end of chromosome 4a (Figure 1 and Supplementary Table S1), and thus high expected recombination rate to the other loci on the linkage group (for example, Stapley et al., 2008).

The most parsimonious order of loci on linkage group Z was identical to the order of the published great reed warbler linkage map (Åkesson et al., 2007), with the inclusion of RPS6KA6 between Ase50 and 304A, and DIAPH2 between 092A and G61 (Figure 3). This order of loci suggests a fusion event between the distal end of the ancestral Z chromosome and the distal end of the first part of 4a (the neo-sex chromosome), leading to a new enlarged sex chromosome in great reed warblers with a total size of 156.1 cM (Figure 3). The linkage group of the autosomal part of chromosome 4a, including markers THOC2, SMARCA1 and GABRE, spanned 91.6 cM (Figure 3).

Figure 3

A best-order linkage map of the great reed warbler neo-sex chromosome. Markers included in the analysis are shown in the left panel of chromosomes and their position on zebra finch chromosomes Z and 4a is indicated in megabases (Mb). Z-chromosome markers are highlighted in dark grey and chromosome 4a loci analysed here are shown in bold (the sex-linked area suggested by our marker analysis is highlighted in light grey). The most parsimonious maps for independent linkage groups Z+4a-1 and 4a-2 are indicated for the great reed warbler. Cumulative genetic distances are given in centiMorgans (cM).

Neo-sex chromosome age estimation

A large part of chromosome 4a seems to be affected by the transition into a sex chromosome and the extension of the area is apparently the same in all species for which the presence of this neo-sex chromosome was detected (Table 1). The most parsimonious interpretation, according to our observations and the proposed phylogenetic relations within Passerida (Alström et al., 2006), is that the neo-sex chromosome should have arisen at the base of Sylvioidea (Figure 2). According to our dating estimates, this would place the origin of the neo-sex chromosome at 42.2 MYA (47.4–37.6 MYA, 95% highest posterior density).

Discussion

In this study, we report evidence of a neo-sex chromosome in passerines. On the basis of the previous evidence of sex linkage of an autosomal marker (Dawson et al., 2007) and sex-biased expression (S Naurin et al., unpublished), we targeted markers located on zebra finch autosome 4a as candidates taking part in a newly established association to a sex chromosome. Our evaluation of 31 orthologues of markers located on chromosome 4a revealed that a large area of this chromosome is sex-linked in the great reed warbler, the common whitethroat and the skylark. The linkage patterns are consistent in all Sylvioidea passerines analysed here and the extent of the sex-linked region of chromosome 4a is apparently common to all species. None of the markers was shown to be sex-linked in the goldcrest and the blue tit, and the chromosome is autosomal in the zebra finch (Stapley et al., 2008), thus suggesting that the neo-sex chromosome might have arisen at the base of the Sylvioidea branch of the avian phylogeny, at approximately 47.4–37.6 MYA, making it substantially younger than the approximately 150 MY old ancestral avian Z and W chromosomes (Nam and Ellegren, 2008).

In this study, we have not used cytogenetic tools to visualize the physical association of the sex-linked part of chromosome 4a to the sex chromosomes. Nevertheless, although it stops short of unequivocally identifying the neo-sex chromosome, our marker-based approach provides strong support for such an association. The linkage analysis performed in the great reed warbler pedigree corroborated the sex chromosome location of the sex-linked 4a markers (RPS6KA6 and DIAPH2). In particular, the highly significant linkage between RPS6KA6 and several genes known to be Z linked in chicken (VLDLR9, BRM12, BRM15 and CHD1Z20; Hansson et al., 2005) provides strong support for the association of the sex-linked part of chromosome 4a and the Z chromosome. On the basis of the inverted position of the 4a markers on the linkage group, we propose that the new enlarged sex chromosome was formed by a fusion between the distal end of the ancestral Z chromosome and the distal end of the first part of 4a (the neo-sex chromosome) (see Figure 3).

The sequence patterns observed point towards an association of the sex-linked loci to both Z and W chromosomes in the three Sylvioidea species. The linkage to both ancestral sex chromosomes is an unusual mechanism in neo-sex chromosome formation, with most fusion events occurring in association to only one member of the pair, and resulting in the constitution of additional independent (neo-) sex chromosomes (Bachtrog, 2005; Zhou et al., 2008; Howell et al., 2009). In birds, sex-linked loci have been reported in association to the Z chromosome (Ellegren, 2000) and very rarely exclusively to the W chromosome (Küpper et al., 2007). Besides our own study of chromosome 4a in Sylvioidea, linkage of autosomal markers (mapped to chicken autosomes 3 and 5) to both Z and W chromosomes has been indicated in a recent study of Raso larks (Alauda razae; Brooke et al., 2010).

Interestingly, Tgu4a, a microchromosome in the zebra finch genome, results from a number of fusion and fission events involving the ancestral avian chromosome 4 (Griffin et al., 2007). No other chromosome in the avian genome has this pattern of repeated rearrangements (Völker et al., 2010); thus, the gene content of chromosome 4a might provide some insight regarding the presence of the neo-sex chromosome in the Sylvioidea lineage. It has been postulated that genomic rearrangements underlying neo-sex chromosome formation may offer selective advantages in one of the sexes, and thus result in sexual antagonisms, if they physically connect genes with sex-specific functions to sex determination genes harboured on the sex chromosomes (Rice, 1987). Support for the hypothesis of gene content as a driving force for the establishment of sex chromosome/autosome associations comes, for instance, from studies of stickleback sex chromosomes (Ross et al., 2009). In two different stickleback species (Gasterosteus wheatlandi and Pungitius pungitius), independent chromosomal fusion events have resulted in linkage of the sex determination locus and linkage group 12. This has been interpreted as evidence that certain chromosomes, harbouring sexually antagonistic genes, might be predisposed to becoming a sex chromosome (Ross et al., 2009). In addition, the question of sexual antagonism has been extensively addressed in the perspective of how selection maintains genes with differential benefit to males and females in sex linkage, and how this process contributes to sex chromosome evolution. Among other examples, the recent acquisition of ‘male beneficial’ genes by the mammalian X (Zhang et al., 2010), the abundance of testis-specific genes on the silkworm Z (Arunkumar et al., 2009), the close linkage of sexually selected colour genes to sex-determination loci in cichlids (Lande et al., 2001) and the presence of loci coding for male courtship display traits, involved in behavioural isolation, in the stickleback neo-sex chromosome (Kitano et al., 2009), all support sexually antagonistic gene content as an important factor in the evolutionary history of both ancestral and more recently derived sex chromosome systems.

In birds, chromosome 4a contains the SOX3 gene, considered a possible ancestral precursor of the male determination gene SRY. In mammals, SOX3 is located on the X chromosome and is involved in both gonad and brain development (Graves, 1998). According to our data, however, this gene is located outside the area of chromosome 4a involved in the neo-sex chromosome formation in Sylvioidea, as evidenced by the absence of sex linkage for SOX3 (Figure 1 and Table 1). This result seems to argue against the hypothesis of a sexually antagonistic basis driving the observed chromosomal rearrangements. However, SOX3 is not the only gene that could be influential in this transition. Interestingly, another gene that plays an important role in sexual commitment is also present on chromosome 4a. The androgen receptor (AR) gene is located at around 6.4 Mb on Tgu4a and our data show that it is sex-linked in Sylvioidea (Figure 1 and Table 1). This gene has been implicated in sex differentiation in different animal groups (de Waal et al., 2008; Wang et al., 2009) and is particularly involved in male sexual development. An appealing feature of the linkage of AR with a sex chromosome in birds is the possibility of an association with DMRT1. Although no final consensus has been reached regarding sex determination in birds, it has recently been demonstrated that the expression level of the Z-linked gene DMRT1 is essential for male sex determination (Smith et al., 2009). In mammals, normal activation of the androgen receptor was shown to require interaction with DMRT1 (Kim et al., 2007), and in the medaka (Oryzias latipes), androgen receptor binding sites have been identified in the promoter of the master male-determination gene dmrt1bY (Herpin et al., 2010). The functional significance of this association will demand further investigation.

The report of a neo-sex chromosome in Sylvioidea constitutes a rare account of an interchromosomal rearrangement involving sex chromosomes in birds, and represents an exciting starting point for studying the role of sexually antagonistic selection on genome evolution in this lineage. The present results open new possibilities for investigating the evolutionary trajectories of newly sex-linked genes in a previously unexplored vertebrate context: a female heterogametic system.

Data archiving

Sequence data have been submitted to GenBank under accession numbers HQ415821–HQ415968 (Supplementary Table S5).