Abstract
Both genome content and deployment contribute to phenotypic differences between species1, 2, 3, 4, 5. Sex is the most important difference between individuals in a species and has long been posited to be rapidly evolving. Indeed, in the Drosophila genus, traits such as sperm length, genitalia, and gonad size are the most obvious differences between species6. Comparative analysis of sex-biased expression should deepen our understanding of the relationship between genome content and deployment during evolution. Using existing7, 8 and newly assembled genomes9, we designed species-specific microarrays to examine sex-biased expression of orthologues and species-restricted genes in D. melanogaster, D. simulans, D. yakuba, D. ananassae, D. pseudoobscura, D. virilis and D. mojavensis. We show that averaged sex-biased expression changes accumulate monotonically over time within the genus. However, different genes contribute to expression variance within species groups compared to between groups. We observed greater turnover of species-restricted genes with male-biased expression, indicating that gene formation and extinction may play a significant part in species differences. Genes with male-biased expression also show the greatest expression and DNA sequence divergence. This higher divergence and turnover of genes with male-biased expression may be due to high transcription rates in the male germline, greater functional pleiotropy of genes expressed in females, and/or sexual competition.
There are numerous case studies demonstrating that orthologues with sex-biased function diverge more rapidly than genes with non-biased function10. To determine systematically the relative contributions of gene content and expression divergence to sexual differences, we sampled sex-biased expression within the Drosophila genus using species-specific microarrays designed for the closely related D. melanogaster, D. simulans and D. yakuba group (common ancestor, 10–13 million years ago), and for the more distantly related D. ananassae, D. pseudoobscura, D. virilis and D. mojavensis (common ancestor, 40–65 million years ago) (Supplementary Table 1). The species-specific platform eliminated confounding effects of sequence divergence on hybridization and allowed us to assay the expression of lineage-restricted genes.
Previous work has demonstrated that sex-biased expression in D. melanogaster adults is substantial, primarily owing to gametogenesis10. This seems to be characteristic of the genus (Fig. 1, and Supplementary Fig. 1). Generally, we observed greater male-biased expression (7–14% of the transcriptome) relative to female-biased expression (3–9% of the transcriptome), at a significance value of P < 0.01 (Mann–Whitney, false-discovery-rate-corrected). The exceptions were D. pseudoobscura (16% female- and male-biased expression) and D. mojavensis (12% female- and male-biased expression). Additionally, the magnitude of male-biased expression was generally greater than that of female-biased expression: the average log2 female:male expression ratio was -1.2 for genes with male-biased expression and 0.8 for genes with female-biased expression. This indicates that there were more genes approaching male-specific expression than female-specific expression. The genes that showed sex-biased expression in each species are listed in Supplementary Information (Supplementary Tables 3–16).
Figure 1: Sex-biased expression in Drosophila species.

a–g, Sex-biased female:male expression ratio (log2) versus average expression intensity (log2) plots for each Drosophila species. Expression intensities are arbitrary, where zero represents the minimum value. Values for genes with significant (P < 0.01, false-discovery-rate-corrected Mann–Whitney test) female-biased, male-biased and non-biased expression are shown. The per cent of genes with female-biased or male-biased expression is inset in each panel. D. melanogaster, D. mel; D. simulans, D. sim; D. yakuba, D. yak; D. ananassae, D. ana; D. pseudoobscura, D. pse; D. virilis, D. vir; and D. mojavensis, D. moj.
To examine expression divergence over time, we parsed the genes with orthologues in every species and constructed a pairwise matrix of log2 female:male expression ratios. We compared expression within species (two strains of D. simulans), between species within the closely related melanogaster subgroup, and between all seven species (Fig. 2a–c). Similar pairwise matrices for quadruplicate replicates within each species were also plotted as a baseline measurement of technical noise and biological variability (Supplementary Fig. 2). All expression ratio plots were linear and showed increasing expression divergence with inferred genetic distance.
Figure 2: Expression divergence among common orthologues.

Female:male expression ratios (log2) for orthologue pairs, plotted against each other: a, two different D. simulans strains; b, the melanogaster subgroup (D. melanogaster, D. simulans and D. yakuba); c, all seven Drosophila species. All the density (grey for high, black for low) scatter plots include every 1:1 pair of common orthologues for which both have an expression value. In b and c, the species A and B designation is arbitrary, but A is assigned to the species in the pair most closely related to D. melanogaster. d, Neighbour-joining trees with branch lengths inferred using sequence distance (genomic mutation distance11) and the expression distance (1 − Pearson's r) for all pairs of species except the pairs between the D. ananassae outlier and other species (Supplementary Fig. 3). e, Expression distance values plotted against estimated divergence time11 for all possible species pairs and replicates within species. Quadruplicate replicates within each species were used at a time of 0 million years.
There was an especially clear relationship between sequence and expression divergence. Neighbour-joining trees built from expression divergence (from the pairwise expression ratios between each species; 1 − Pearson's r; Supplementary Fig. 3) or from sequence divergence9, 11 have the same topology (Fig. 2d). Expression divergence tightly correlated with time (Fig. 2e, r² = 0.96), which may provide a useful tool in molecular phylogenetics.
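The expression distance used for the trees (1 − Pearson's r over pairwise log2 female:male ratios) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary layout keyed by orthologue ID, and the example values are all hypothetical, and only orthologues with a value in both species are compared, as described for the scatter plots.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def expression_distance(ratios_a, ratios_b):
    """Return 1 - Pearson's r over orthologues measured in both species.

    ratios_a, ratios_b: dicts mapping a (hypothetical) orthologue ID to
    the log2 female:male expression ratio in that species.
    """
    shared = sorted(set(ratios_a) & set(ratios_b))
    x = [ratios_a[gene] for gene in shared]
    y = [ratios_b[gene] for gene in shared]
    return 1.0 - pearson_r(x, y)


# Made-up log2 female:male ratios, for illustration only.
species_a = {"g1": -2.0, "g2": 0.0, "g3": 1.0, "g4": 0.5}
species_b = {"g1": -4.0, "g2": 0.0, "g3": 2.0, "g4": 1.0, "g5": 3.0}
species_c = {gene: -value for gene, value in species_a.items()}
```

In this sketch a distance of 0 means the two sets of ratios are perfectly linearly related, and the distance grows towards 2 as the sex bias of orthologues becomes anticorrelated. A matrix of such pairwise distances is the kind of input a neighbour-joining procedure would take.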
Although the whole-genome trends in expression divergence were clear, at the gene level, the magnitude of expression divergence was modest. Only 384 orthologue pairs (0.3%) showed significant female-biased expression in one species and significant male-biased expression in another. Switches between highly female-biased expression and highly male-biased expression were never observed (Fig. 2c). Extensive (20%) categorical changes in sex-bias class, especially for genes with male-biased expression, were previously reported between D. melanogaster and D. simulans12, 13. We observed a categorical change in sex-biased expression in 12% of the orthologues between these two species, but the changes were dominated by low-magnitude changes between modest sex-biased expression and non-sex-biased categories. These values are highly sensitive to arbitrary significance-level cut-offs; however, it was clear in exploratory plots of expression ratios that genes with male-biased expression showed greater expression divergence (Fig. 2b, c). Plots of expression ratio standard deviations against average expression ratio (Fig. 3a) also showed a clear excess of variable expression among orthologues with male-biased expression (P < 10⁻⁸, chi-squared test). Thus, male-biased expression contributes heavily to overall expression divergence.
Figure 3: Expression divergence within and between species and groups.

a, Average female:male expression ratios for common orthologues plotted against expression divergence (expression ratio standard deviations between 7 species) for the same orthologues. b, Expression ratio standard deviations among members of the melanogaster subgroup (D. melanogaster, D. simulans and D. yakuba) plotted against standard deviations among the other four species (D. ananassae, D. pseudoobscura, D. virilis and D. mojavensis). c, K-means clustering (K = 10, species-order fixed) of expression ratios where s.d. > 0.5. Female-biased (red), male-biased (blue) and non-biased (black) expression is indicated. d, Examples of gene clusters that are indicated on the Eisengram (c). Species (x axis) and log2 female:male expression ratio (y axis) of common orthologues are shown.
To determine if particular types of genes show greater or lesser expression divergence we analysed Gene Ontology14 (GO) terms. Unsurprisingly, genes annotated as 'unknown function' are significantly over-represented (P < 10⁻⁸, Fisher's exact test) among genes with variable expression. Genes with 'transcriptional regulation' annotations were under-represented in the same gene set (P < 10⁻⁴, Fisher's exact test), suggesting that genes involved in transcription regulation are under constraint. Similar constrained expression of transcriptional regulators was observed in a study of metamorphosis in the melanogaster subgroup5.
Just as changes in DNA sequence can have consequences ranging from deleterious to neutral to advantageous15, changes in gene expression should have variable effects, owing to underlying mutations in transcription factors, cis-regulatory sites and post-transcriptional regulators, and the resulting variance will be subject to drift and selection2, 3, 5, 13, 16, 17, 18. We were able to distinguish expression differences between species well enough to show a linear relationship with time at the full-transcriptome level, but does this apply to individual genes?
To determine if there is a common set of orthologues that can tolerate variable expression (that can be thought of as the thematic equivalent of a synonymous codon substitution), we asked if expression divergence between orthologues within the melanogaster subgroup correlates with the expression divergence between more distantly related species. We found no significant correlation between orthologue expression divergence between groups of species (r² = 0.08, Fig. 3b). The genes with greater expression divergence in the melanogaster subgroup differ from those with greater expression divergence in the remaining species. Thus, although overall expression divergence shows a clock-like behaviour (reflecting mutation accumulation in a neutral model, or an adaptive speed limit in a selection model), different individual genes contribute to this global expression divergence in different amounts. This suggests that there is not a common set of genes that tolerate large drifts in sex-biased expression ratios.
To analyse further the orthologues with the most divergent expression, we selected orthologues with the greatest expression divergence (s.d. > 0.5) and subjected them to cluster analysis with species-order fixed (Fig. 3c). Strikingly, even those genes with the most variable expression were organized into well-defined clusters. Each of the clusters was subsequently analysed to look for patterns of change. We observed three distinct cluster types revealing expression divergence between lineages, aberrant expression in a single species, and unpatterned variability (Fig. 3c, d). For example, cluster 'A' shows higher male-biased expression in just the melanogaster subgroup (D. melanogaster, D. simulans and D. yakuba); cluster 'B' shows increased male-biased expression in D. pseudoobscura only; and cluster 'C' shows no evidence for a phylogenetic trend. Briefly, among the 5% of common orthologues with the most variable expression, 52% exhibited lineage-specific, 22% species-specific and only 25% unpatterned expression variability.
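The per-orthologue divergence measure and the s.d. > 0.5 cut-off can be sketched as follows. This is an illustrative reconstruction, not the authors' pipeline: the function names and the example ratios are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
from statistics import stdev


def expression_divergence(ratios_across_species):
    """Standard deviation of one orthologue's log2 female:male ratios.

    Assumption: sample s.d. (n - 1 denominator); the source does not
    state which estimator was used.
    """
    return stdev(ratios_across_species)


def most_divergent(orthologues, threshold=0.5):
    """Return IDs of orthologues whose expression-ratio s.d. exceeds threshold."""
    return [
        gene
        for gene, ratios in orthologues.items()
        if expression_divergence(ratios) > threshold
    ]


# Made-up ratios for seven species in a fixed species order.
orthologues = {
    "stable": [0.1, 0.0, -0.1, 0.1, 0.0, -0.1, 0.0],
    "lineage_shift": [-2.0, -2.1, -1.9, 0.0, 0.1, -0.1, 0.0],
}
selected = most_divergent(orthologues)
```

Here the hypothetical `lineage_shift` orthologue, which is male-biased in the first three species only, passes the filter, while `stable` does not. The selected rows, kept in a fixed species order, are what would then be passed to a clustering step.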
Having only a few sequenced genomes seriously hinders the study of genes that are species- or lineage-specific (species-restricted). We took advantage of the species-specific array design to determine the contribution of common orthologues and species-restricted genes to overall sex-biased expression patterns (Fig. 4a, b). Female-biased expression was over-represented (P < 10⁻², chi-squared test) among common orthologues in four of the seven species, whereas male-biased expression was always under-represented. The pattern was reversed among the species-restricted genes. Female-biased expression of species-restricted genes was less prevalent in all species except D. virilis, and male-biased expression was more prevalent in each of the species examined. Female-biased expression was also under-represented among paralogues (Supplementary Fig. 4). Similar results were obtained using TBLASTN methods to detect genes that had diverged to obscure orthology (Supplementary Fig. 5). These results suggest that genes with male-biased expression have higher effective birth and extinction rates.
Figure 4: Relationship between sex-biased expression, gene content and sequence divergence.

Gene content and expression of common orthologues (a) and species-restricted genes (b). The percentages of all genes (black) and of genes with female-biased (red) or male-biased (blue) expression are shown. Significant differences (P < 10⁻², chi-squared test) between sex-biased classes and total genes are indicated (asterisks). See Supplementary Fig. 5 for paralogues. Average KA/KS ratios within the melanogaster subgroup for common orthologues with high or low expression-ratio s.d. (c) and for all common orthologues or species-restricted genes (d). Bars are colour-coded as in a, with the addition of grey bars, which represent non-biased expression. Significant differences (P < 10⁻², Mann–Whitney test) between common orthologues with constrained expression and variable expression, or between common orthologues and species-restricted genes are indicated (asterisks).
We also asked if sex-bias and expression divergence correlate with sequence divergence among orthologues. If similar selective pressure acts on both protein-coding capacity and expression at a given locus, then they should correlate. However, protein-coding capacity and expression divergence need not be tightly coupled. For example, high expression divergence can result from changes in upstream transcription factors or the cis-regulatory sites that they bind19.
Synonymous (KS) and non-synonymous (KA) substitution rates in protein-coding genes were used to examine sequence divergence20. Multiple substitutions occur at a given site between distantly related species (for example, D. melanogaster and D. mojavensis), making KA/KS ratios much less reliable; therefore, KA/KS ratios were used only within the melanogaster subgroup (Fig. 4c, d). Genes with male-biased expression were expected to show higher KA/KS ratios10. Indeed, common orthologues with male-biased expression had KA/KS values within the melanogaster subgroup (0.129) more than two times those of common orthologues with female-biased expression (0.061). Interestingly, common orthologues with non-biased expression showed intermediate KA/KS values. We observed a strong correlation between expression and sequence divergence among the genes showing the greatest expression divergence (Fig. 4c), as has also been seen in mammals21. Additionally, species-restricted genes had higher sequence divergence than common orthologues for all expression categories (Fig. 4d), as has been seen in vertebrates22. Perhaps expression divergence, gene turnover, sex-bias and sequence divergence of individual genes are often coupled to the same selective forces.
The contrasting divergence and turnover patterns of genes with male-biased expression relative to those with female-biased expression are somewhat surprising. Reproduction is the function of a couple, not an individual; therefore co-evolution of reproductive traits is expected to occur. For example, selection for sperm tail length in Drosophila males is coupled to selection for length of the seminal receptacle in females23. There are a number of possible explanations. There may be greater de novo generation of genes with male-biased expression as a result of simple sequence requirements for core promoter generation24 and extremely high levels of RNA polymerase in spermatocytes25. This combination might result in excessive transcription of intragenic regions26. A few of these new genes with male-biased expression might be functional, but most of these 'de novo' genes would be expected to rapidly degenerate. Alternatively, genes required for oogenesis may be more constrained because of pleiotropy or the under-representation of paralogues with partially overlapping functions. Many D. melanogaster genes required for female fertility are also required for organismal viability27, and genes with clear multiple functions, such as those encoding ribosomal proteins, are overexpressed in ovaries relative to testes28. Finally, male–male competition might be particularly strong29. The addition of more sequenced genomes will provide ample opportunities to explore these questions further.
Methods Summary
Flies
Species were grown on standard media (Tucson Drosophila Stock Center). We isolated messenger RNA from adult females and males grown at 22 °C (5–7 days post eclosion), and labelled and hybridized using standard methods.
Arrays
Oligonucleotide arrays of 50-mers (NimbleGen Systems) were designed against draft assemblies and ab initio annotations we contributed to the gene model reconciliation9 (Supplementary Table 2). A D. melanogaster 60-mer expression array (NimbleGen Design ID 2005-10-17_Dmel4_60mer_exp) designed on the basis of Flybase annotation V4.2 was used for D. melanogaster hybridizations. For this report, we remapped all array elements to current consensus gene models9. Our expression results and conclusions were similar using the original models. Because low-magnitude expression divergence is difficult to distinguish from noise, we performed at least quadruplicate replicates for each species, and only channels passing a stringent quality control regimen were used in the final analysis (72 channels total). Full platform descriptions and data are available at the GEO under accession GSE6640.
For each gene, log2 intensities for female and male expression were compared by non-parametric two-sample Mann–Whitney tests to generate the significance of sex-biased expression (P < 0.01), and the ratio for each gene was calculated as the average probeset intensity in female channels divided by the intensity in male channels. Common orthologues are present in all 7 species; species-restricted genes are present in at least one species, but absent in ≥1 species.
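The ratio calculation and the sex-bias call can be illustrated with a short sketch. It rests on several assumptions that are not stated in the text: intensities are averaged on the linear scale before taking log2, the false-discovery-rate correction is the Benjamini–Hochberg procedure, and the per-gene P values are taken as given (the Mann–Whitney test itself is not reimplemented here). All function names are hypothetical.

```python
import math


def log2_sex_ratio(female_intensities, male_intensities):
    """log2 of mean female intensity over mean male intensity.

    Assumption: means are taken on the linear intensity scale.
    """
    female_mean = sum(female_intensities) / len(female_intensities)
    male_mean = sum(male_intensities) / len(male_intensities)
    return math.log2(female_mean / male_mean)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: the source says only 'false-discovery-rate-corrected';
    the choice of this procedure is illustrative.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify(log2_ratio, adjusted_p, alpha=0.01):
    """Label a gene as female-biased, male-biased or non-biased."""
    if adjusted_p >= alpha:
        return "non-biased"
    return "female-biased" if log2_ratio > 0 else "male-biased"
```

Under these assumptions, a gene with a ratio of -1.2 and a corrected P value below 0.01 would be called male-biased, and one with a ratio of 0.8 would be called female-biased.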
