Introduction

It has long been established that chromosome rearrangements between species can cause or reinforce reproductive isolation (Noor et al., 2001; Rieseberg, 2001; Delneri et al., 2003). Hybrids arising from parents with subtly different karyotypes can be compromised in their ability to reproduce, and this can function as an evolutionary barrier that ultimately leads to speciation. Despite this, molecular evolutionary research often focuses on the role of DNA, RNA and proteins (Kimura and Ohta, 1974), disregarding whole chromosomes and homologous synteny blocks. It nevertheless remains crucial to describe accurately the comparative molecular cytogenetics between key species as a forerunner for understanding the role of chromosome changes in the evolution of particular phylogenetic groups. Most studies have done this using zoo-FISH with chromosome paints and/or individual bacterial artificial chromosome (BAC) clones, generally focusing on mammalian genomes (for example, Wienberg, 2004). The precise locations of evolutionary breakpoint regions can be determined by zoo-FISH with BACs; however, such studies are extensive and laborious. The growing ability to visualize assembled genomes through modern sequence analysis tools and freely available online data sets makes it possible to pinpoint homologous synteny blocks and evolutionary breakpoint regions precisely, quickly and cheaply (Larkin et al., 2009). Again, however, such studies have focused on mammals, and birds are relatively understudied in this regard.

The defining characteristics of birds include feathers, having (or having lost) the ability to fly, oviparity, nesting/brooding, high body temperature, longevity, high blood glucose levels and a small genome (one-third the size of that of mammals). Avian karyotypes are also characteristic, with nearly two-thirds of birds having a diploid chromosome number of around 80 and the vast majority having a large number of microchromosomes (Christidis, 1990). Between 2004 and 2010 the chicken genome was the only avian genome to have been completely characterized and karyotyped (Hillier et al., 2004; Masabanda et al., 2004), but this nevertheless allowed cross-species chromosome painting to numerous other birds (reviewed in Griffin et al., 2007) and BAC mapping to build physical maps of a few others. These include turkey, duck and zebra finch (Griffin et al., 2008; Skinner et al., 2009a; Volker et al., 2010). From these studies, a number of key messages emerge. First, chicken chromosomes 1–3 and 5–10+Z are representative of the ancestral pattern. Second, homoplasy is commonplace, with the ancestral chromosomes 4 and 10 fusing on at least three occasions (in chicken, goose and African collared dove), and with fusions of the same smaller macrochromosomes and fissions at the centromeres of chromosomes 1 and 2 occurring convergently in several separate lineages (Griffin et al., 2007). Third, whole-chromosome blocks represent ancient conservation of synteny, with chromosomes 1–5+Z being present and intact in turtles (Matsuda et al., 2005) and chromosome 4 similarly intact in humans (Chowdhary and Raudsepp, 2000; Hillier et al., 2004). Finally, despite interchromosomal conservation, intrachromosomal changes are common; for example, 114 tentative rearrangements, both inversions and translocations, were noted between chicken and zebra finch (Volker et al., 2010).
The overall picture of avian genome organization is of a successful and conserved karyotypic pattern, with changes, when they occur, tending to recur much more than in mammals. Rare, nonrandom patterns in any evolutionary system, we would assume, usually occur for a reason and therefore we contend that birds are an appropriate group of animals on which to study the mechanistic basis of chromosome evolution in contrast to mammals, in which changes are more commonplace and apparently more random.

Chromosome fission at centromeric loci has previously been reported as being a factor of evolutionary change (for example, Perry et al., 2004). However, other regions of the genome can be subject to frequent breakage (that is, hotspots). This is commonly termed ‘breakpoint reuse.’ Where breakpoints are determined to be between the same base pairs, identity by descent is inferred. Where breakpoints are not identical but within the same genomic region then breakpoint reuse is implicated (Mlynarski et al., 2010). Determination of the precise evolutionary breakpoint regions is therefore a useful tool in distinguishing homoplasy and hemiplasy (Avise and Robinson, 2008) where apparently convergent changes are in fact identical by descent.

Recently we (Volker et al., 2010; Warren et al., 2010) generated comparative maps by aligning whole chromosomal sequences between chicken (Gallus gallus) and zebra finch (Taeniopygia guttata), and subsequently verifying rearrangements by fluorescence in situ hybridization (FISH). This showed an apparent combination of multiple inversions and translocations within chromosomes. Understanding which rearrangements have occurred to produce the current patterns of chromosomal synteny requires alignments with more species. With the publication of the turkey (Meleagris gallopavo) genome (Dalloul et al., 2010), we can identify some lineage-specific rearrangements and provide evidence for regions of recurrent breakpoints. In this study, therefore, we have used whole-chromosomal sequence alignments among turkey, chicken and zebra finch, combined with our previously published physical maps (Griffin et al., 2008; Volker et al., 2010), to generate comparative genomic maps among the three avian species. Using zebra finch as an outgroup, we also tested the hypothesis that there are genomic regions within which breakpoints recur during avian evolution.

Materials and methods

To visualize large-scale intrachromosomal rearrangements, we aligned whole-chromosome sequences of chicken macrochromosomes 1–10 and their turkey and zebra finch orthologs using the program GenAlyzer (Choudhuri et al., 2004) with default settings. The Z-chromosome assembly for turkey was not complete enough to align properly. The chicken–zebra finch alignments were already available from our previous study (Volker et al., 2010). Subsequently, to aid visualization, the GenAlyzer output matches (of 100+ base pairs) were combined into contiguous blocks using a custom script. This script combined direct or inverted matches where there was a consecutive run of at least five matches. If a distance of 40 kb occurred with no matches, a new block was called. Blocks of at least 250 kb were plotted, to remove spurious matches caused by repetitive content and to focus on the larger rearrangements. The chromosomes were manually segmented based on these charts, and the segments numbered and ordered relative to turkey.
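The block-building step described above can be illustrated with a minimal sketch. The original custom script is not reproduced here, so the function name, the input layout (start, end and orientation of each match on one reference chromosome) and the choice to measure gaps on the reference coordinate only are all assumptions made for illustration; only the three thresholds (runs of at least five matches, a 40 kb gap and a 250 kb minimum block size) come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical input layout: (start, end, orientation) on the reference
# chromosome, with orientation "+" for direct and "-" for inverted matches.
Match = Tuple[int, int, str]


@dataclass
class Block:
    start: int
    end: int
    orientation: str
    n_matches: int

    @property
    def length(self) -> int:
        return self.end - self.start


def merge_matches(matches: List[Match],
                  max_gap: int = 40_000,      # 40 kb with no matches starts a new block
                  min_run: int = 5,           # at least five consecutive matches
                  min_block: int = 250_000    # only blocks of at least 250 kb are kept
                  ) -> List[Block]:
    """Group short alignment matches into contiguous synteny blocks."""
    blocks: List[Block] = []
    current = None
    for start, end, orient in sorted(matches):
        same_run = (current is not None
                    and orient == current.orientation
                    and start - current.end < max_gap)
        if same_run:
            # Extend the open block with this match.
            current.end = max(current.end, end)
            current.n_matches += 1
        else:
            # Orientation changed or the gap is too large: close and restart.
            if current is not None:
                blocks.append(current)
            current = Block(start, end, orient, 1)
    if current is not None:
        blocks.append(current)
    # Discard short runs and small blocks, which are likely repeat-driven.
    return [b for b in blocks
            if b.n_matches >= min_run and b.length >= min_block]
```

In this sketch a change of orientation always closes the current block, so a direct run and an adjacent inverted run are reported as two separate blocks even when they lie closer together than 40 kb.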

The Multiple Genomes Rearrangement tool on the GRIMM web server (http://grimm.ucsd.edu/MGR/) (Bourque and Pevzner, 2002) was used to calculate optimal rearrangement pathways between each species, and to reconstruct a likely potential chicken–turkey ancestor, in the manner of Mlynarski et al. (2010) (see also Supplementary Table 3). The series of possible rearrangements from the chicken–turkey ancestor to each species was considered, and for each rearrangement, the segment ends flanking the breakpoints were noted. Within each lineage, the number of times a segment end was involved in a rearrangement was counted. The positions of the segments within each species genome were noted, to allow correlation with our previously published FISH-based physical mapping data (Griffin et al., 2008; Volker et al., 2010). These data were acquired by cross-species FISH of chicken BACs onto turkey chromosomes, and same-species FISH of orthologous zebra finch BACs onto zebra finch chromosomes.

Unmasked chromosome sequences for chicken were downloaded from Ensembl (ftp://ftp.ensembl.org/pub/release-63/fasta/gallus_gallus/dna/). RepeatMasker (Smit et al., 1996–2010; http://www.repeatmasker.org) was run on each chromosome using default settings. For all breakpoints and segments, the number of repeats wholly located within the breakpoint region or segment was counted, and expressed as the number of base pairs per megabase of sequence for each of eight classes of repeat. We used the chicken data for this analysis due to the higher quality of the sequence assembly compared with other genomes.
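The repeat-density measure can be expressed as a short calculation. The sketch below assumes half-open coordinates, non-overlapping regions, and repeat annotations already parsed into (start, end, class) tuples; parsing of RepeatMasker output is not shown, and the function name and class labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Region = Tuple[int, int]        # (start, end), half-open, in base pairs
Repeat = Tuple[int, int, str]   # (start, end, repeat class)


def repeat_density(regions: List[Region], repeats: List[Repeat]) -> Dict[str, float]:
    """Base pairs of repeat per megabase of region sequence, for each repeat class.

    Only repeats lying wholly inside one of the regions are counted.
    """
    total_len = sum(end - start for start, end in regions)
    bp_per_class: Dict[str, int] = defaultdict(int)
    for r_start, r_end, r_class in repeats:
        inside = any(start <= r_start and r_end <= end for start, end in regions)
        if inside:
            bp_per_class[r_class] += r_end - r_start
    return {c: bp * 1_000_000 / total_len for c, bp in bp_per_class.items()}


# Hypothetical coordinates: two regions totalling 1 Mb and four annotated repeats,
# one of which straddles a region boundary and is therefore ignored.
example_regions = [(0, 500_000), (1_000_000, 1_500_000)]
example_repeats = [
    (100, 400, "LINE"),
    (499_900, 500_100, "LINE"),
    (1_000_000, 1_000_200, "LTR"),
    (1_200_000, 1_200_700, "LINE"),
]
density = repeat_density(example_regions, example_repeats)
```

Running the same function once on the breakpoint regions and once on the segments gives the two columns that are compared for each repeat class.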

Results

Alignment of orthologous chromosome sequences

Whole-chromosome alignments of draft genome sequences confirmed previous results demonstrating a high degree of conserved synteny in the macrochromosomes of all three species, with only two interchromosomal rearrangements distinguishing the chicken and zebra finch genomes (Itoh and Arnold, 2005), two distinguishing chicken and turkey (Griffin et al., 2008) and two distinguishing zebra finch and turkey. However, alignment of whole-chromosome sequences of orthologous chromosomes 1–10 in all three species, to visualize large-scale intrachromosomal rearrangements, identified a large number of differences. The predicted rearrangements agreed with marker order from existing FISH data where overlapping BACs were available (Supplementary Table 1).

To better visualize the intrachromosomal differences identified from the GenAlyzer data, we grouped the short alignments into contiguous blocks and color-coded them as direct (blue) or inverted (red) matches (Figure 1 and Supplementary Figures S1–S9). Side-by-side alignments of >250 kb blocks between each pair of species were then plotted to allow each chromosome to be divided into segments separated by breakpoints (Figure 2, Supplementary Table 1). The segments were manually numbered, and an optimal series of rearrangements transforming one species order into the others was calculated using the Multiple Genomes Rearrangement tool on the GRIMM web server (http://grimm.ucsd.edu/MGR/; Bourque and Pevzner, 2002). This produced potential chicken–turkey ancestral segment orders, and the pathways from this ancestor to each modern species (see Supplementary Table 3 for details). Figure 3 shows an example pathway of rearrangements for chromosome 9, with segments ordered from the perspective of a possible ancestral Neoavian organization. It should be noted that with the current data, the Neoavian state must be considered speculative only.

Figure 1

Example alignment between chicken chromosome 9 and orthologous turkey chromosome 11. (a) Shows the raw GenAlyzer output and (b) shows the same after grouping into synteny blocks. Only synteny blocks of at least 250 kb are plotted.

Figure 2

Side-by-side alignments between chicken chromosome 9, turkey chromosome 11 and zebra finch chromosome 9. Colored arrows show the division of the chromosomes into segments, oriented with respect to turkey.

Figure 3

Example of a potential series of inversions from a Neoaves common ancestor leading to the organization seen in chicken chromosome 9, turkey chromosome 11 and zebra finch chromosome 9. Inverted segments are indicated with dotted arrows. Note that in this figure, segment order is presented with respect to the hypothetical Neoaves common ancestor. Note also that rearrangements shown in the path to zebra finch could have occurred in either the path from the avian common ancestor to zebra finch or from the avian common ancestor to the chicken–turkey ancestor, depending on the true ancestral organization.

Evidence for regions prone to breakpoints

In a series of inversions from one genomic arrangement to another, the breakpoint regions between segments do not remain consistent; the ends of each segment were therefore used as a proxy for the actual sequence within the breakpoint. Based on the potential sequence of rearrangements, the end point of each segment was scored for the number of times it was involved in a breakpoint in each lineage (examples for chromosome 9 in Table 1, full data in Supplementary Table 2). The distance between segments was calculated for each species. The median sizes of the regions within which the breakpoints occurred were 70 kb for chicken, 102 kb for zebra finch and 125 kb for turkey.

Table 1 Breakpoints in orthologous chromosomes GGA9, MGA11 and TGU9. Breakpoints are numbered according to their order in turkey

In chicken chromosomes 1–10, and their turkey and zebra finch orthologs, 366 segment ends were identified, of which 318 were involved in rearrangements. The optimal pathways from the chicken–turkey ancestor suggested that 32 breakpoint regions (10.1%) recurred in different lineages, whereas 114 breakpoint regions (35.8%) recurred in either the same or different lineage.
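As a quick arithmetic check, the reported percentages are reproduced when the 318 segment ends involved in rearrangements are taken as the denominator, rather than all 366 segment ends. The helper below is purely illustrative; the variable names are not from the original analysis.

```python
# Counts as reported in the text.
segment_ends_total = 366
segment_ends_rearranged = 318
recur_different_lineages = 32
recur_any_lineage = 114


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


cross_lineage_pct = percent(recur_different_lineages, segment_ends_rearranged)
any_lineage_pct = percent(recur_any_lineage, segment_ends_rearranged)
```

With 366 as the denominator the first value would be 8.7% rather than 10.1%, which is why the 318 rearranged segment ends appear to be the intended base.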

We have hypothesized that breakpoint regions would be enriched for repetitive elements. To test this, we used RepeatMasker to identify classes of repetitive sequences in chromosomes 1–10, and compared the numbers of repeats in breakpoint regions versus segments (Table 2). There is an enrichment for all the gross classes of repeat (see Supplementary Table 4 for detailed breakdown), and this enrichment is significant (Wilcoxon signed-rank test, W=36, n=8, P=0.01).
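The reported statistic can be interpreted with a small worked example. With n=8 paired classes, the largest possible sum of positive ranks is 8×9/2=36, so W=36 is what would be obtained if every class were denser in breakpoint regions than in segments, assuming W denotes the sum of positive ranks. The sketch below implements an exact two-sided signed-rank test in plain Python; the density values are invented for illustration and are not the values in Table 2.

```python
from itertools import product
from typing import List, Tuple


def signed_rank_exact(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W+, p), where W+ is the sum of ranks of the positive differences.
    Zero differences are dropped and tied absolute differences get average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda k: abs(diffs[k]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        tie_end = pos
        while (tie_end + 1 < len(order)
               and abs(diffs[order[tie_end + 1]]) == abs(diffs[order[pos]])):
            tie_end += 1
        average_rank = (pos + tie_end) / 2 + 1
        for k in order[pos:tie_end + 1]:
            ranks[k] = average_rank
        pos = tie_end + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    centre = sum(ranks) / 2
    observed = abs(w_plus - centre)

    # Enumerate every assignment of signs (2**n cases; fine for small n).
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - centre) >= observed:
            hits += 1
    return w_plus, hits / 2 ** len(ranks)


# Hypothetical densities (bp per Mb) for eight repeat classes.
breakpoint_density = [120, 340, 560, 90, 75, 410, 230, 150]
segment_density = [100, 300, 500, 80, 70, 350, 200, 140]
w_plus, p_value = signed_rank_exact(breakpoint_density, segment_density)
```

Under these assumptions the exact two-sided probability for eight concordant pairs is 2/256, approximately 0.008, which agrees with the reported P=0.01 after rounding.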

Table 2 Repeat density in breakpoint regions versus segments for various repeat classes as measured by base pairs of repeat per megabase of sequence

Discussion

This paper is the first to describe the comparative genomics among three bird species and thereby provide the basis through which we might define the chromosomal rearrangements and syntenies that have occurred during avian evolution. Recently, in a similar analysis of only two species, we suggested that inversions and intrachromosomal translocations were commonplace. Having more than two species, however, allows us to make initial inferences about the direction of change. Contrary to the suppositions in our previous analysis of chicken and zebra finch alone (Volker et al., 2010), we now believe that all of the observed rearrangements could be explained by a series of inversions and that there is no need to invoke intrachromosomal translocations as a mechanism. This agrees with recent findings on the nature of complex rearrangements in other species (Schubert and Lysak, 2011). The reason for the conservation of interchromosomal but not intrachromosomal synteny (and apparent paucity of translocations of any kind) is not clear. It may, however, imply that there is a selective advantage to birds in keeping certain blocks of synteny together in the interphase nuclei. Our studies on interphase nuclei in birds (Skinner et al., 2009b) and pigs (Quilter et al., 2002; Foster et al., 2005) provide proof of principle about how future studies in this area might proceed.

Our data show that the median breakpoint region sizes are lowest in the chicken (70 kb) and highest in the turkey (125 kb). This may reflect the different methods used in the sequencing of the three genomes. The chicken and zebra finch genomes were both sequenced from a combination of plasmid, fosmid and BAC-end read pairs, and assembled with PCAP (Hillier et al., 2004; Warren et al., 2010). The turkey genome was generated by Roche 454 (Roche, Branford, CT, USA) and Illumina (San Diego, CA, USA) sequencing from 3 kb and 20 kb libraries and assembled with Celera Assembler 5.3 (Dalloul et al., 2010). Poorer alignments between the turkey assembly and the others could manifest as a higher apparent breakpoint region size.

With our current resolution of 250 kb minimum size for a segment, we cannot state unequivocally that exact breakpoints have been reused; however, our data provide evidence that there are regions of avian genomes that seem prone to breakage. Of the segment ends participating in breakpoint regions, about one-third appear to occur in more than one rearrangement. This broadly agrees with our previous observations (Volker et al., 2010); whereas there are few rearrangements at the interchromosomal level, rearrangements within chromosomes are far more frequent. Especially notable is the finding that more than a third of the breakpoint regions found in this study appear to have been reused in the same or in a different lineage. Furthermore, 10% of all the breakpoint regions seem to be reused in different lineages. We must note that the order of rearrangements is a supposition, based on the most parsimonious way to transform the three genomes; the actual sequence of rearrangements may have been slightly different. However, even if a longer sequence of rearrangements had occurred than we suggest here, breakpoint region reuse would still be implicated.

It has been suggested that the small number of interchromosomal rearrangements is a consequence of the small numbers of interspersed repeats, segmental duplications and pseudogenes in avian genomes, which provide few substrates for non-allelic homologous recombination (Burt et al., 1999; Burt, 2002). Our analysis of the sequence composition of these breakpoint regions reflects this, showing especial enrichment for pseudogenes, long terminal repeats, DNA transposons and LINEs. We would expect that breakpoint regions also harbor other small inversions beyond the detection of this study. A recent study (Braun et al., 2011) examining microinversions (inversions <50 kb) in non-coding DNA from a range of bird species suggested that these are more common than previously suspected, that most appeared lineage-specific and that there was evidence for hotspots of microinversion. This is consistent with our hypothesis: regions prone to breakage cause apparent breakpoint reuse at large-scale (>250 kb) resolution while harboring many smaller independent microinversions. We predict that, as with repetitive content, these breakpoint regions will be found to overlap regions of non-allelic homologous recombination, segmental duplications and copy number variation. Our previous finding that regions of copy number variation and chromosomal rearrangement between chicken and zebra finch are associated with elevated recombination rates already supports this (Volker et al., 2010); such studies extended to further avian genomes will be of interest.

We expect there to be a significant overlap between the chromosomal segments described here and the homologous synteny blocks described by Larkin et al. (2009). Larkin et al. (2009) suggested that homologous synteny blocks and evolutionary breakpoint regions are subject to different evolutionary pressures. As further avian genomes are published, especially those with atypical karyotypes (for example, Falconiformes and Psittaciformes; de Oliveira et al., 2005, 2010; Nanda et al., 2007; Nishida et al., 2008; Nie et al., 2009), we will be able to test the hypothesis that blocks of ordered genes have been preserved through evolution, and that these are reflected in the syntenic blocks seen here. Further species that have been of interest in zoo-FISH studies, and would thus be likely candidates for study in addition to the Falconiformes and Psittaciformes mentioned above, include other Galliformes (Shibusawa et al., 2004a, 2004b) and a range of species from orders such as Anseriformes and Passeriformes (Guttenbach et al., 2003). Comparisons with Paleognathous birds, whose chromosomes are thought to resemble closely the ancestral avian karyotype (Shetty et al., 1999; Nishida-Umehara et al., 2007), should provide a useful outgroup for understanding the chromosomal evolution of the Neoaves.

This analysis of the macrochromosomes of chicken, turkey and zebra finch has provided evidence in support of our hypothesis that there are regions of bird genomes that are prone to breakage, and thus facilitate chromosomal rearrangements. As more bird genomes are sequenced, the relationship between karyotypic rearrangements, evolutionarily conserved synteny blocks and their mechanisms will become clearer. It is increasingly apparent that much of the structural variation in bird genomes is only visible with high-resolution (that is, sequence level) comparisons. Thus, it remains to be seen whether the 36% breakpoint reuse and 10% cross-lineage breakpoint reuse values found here are comparable across other bird orders.

Data archiving

Synteny block data and RepeatMasker data have been deposited in the Dryad repository: doi:10.5061/dryad.8h7k64hh.