Genetic incompatibilities are widespread within species

Journal name: Nature
Volume: 504
Pages: 135–137
DOI: 10.1038/nature12678

The importance of epistasis (non-additive interactions between alleles) in shaping population fitness has long been a controversial topic, hampered in part by lack of empirical evidence (refs 1–4). Traditionally, epistasis is inferred on the basis of non-independence of genotypic values between loci for a given trait. However, epistasis for fitness should also have a genomic footprint (refs 5–7). To capture this signal, we have developed a simple approach that relies on detecting genotype ratio distortion as a sign of epistasis, and we apply this method to a large panel of Drosophila melanogaster recombinant inbred lines (refs 8, 9). Here we confirm experimentally that instances of genotype ratio distortion represent loci with epistatic fitness effects; we conservatively estimate that any two haploid genomes in this study are expected to harbour 1.15 pairs of epistatically interacting alleles. This observation has important implications for speciation genetics, as it indicates that the raw material to drive reproductive isolation is segregating contemporaneously within species and does not necessarily require, as proposed by the Dobzhansky–Muller model, the emergence of incompatible mutations independently derived and fixed in allopatry. The relevance of our result extends beyond speciation, as it demonstrates that epistasis is widespread but that it may often go undetected owing to lack of statistical power or lack of genome-wide scope of the experiments.
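The idea of detecting genotype ratio distortion (GRD) between two loci can be illustrated with a minimal sketch. It assumes fully inbred lines, so that each line carries one of four homozygous two-locus genotypes, and it uses a plain chi-square test of independence on the resulting 2×2 table. The function name, the table layout and the toy counts are hypothetical, and the test shown here is a generic stand-in rather than the authors' exact statistic.

```python
from math import erfc, sqrt


def grd_chi_square(counts):
    """Chi-square test of independence on a 2x2 table of genotype counts.

    counts = [[n_AB, n_Ab], [n_aB, n_ab]] gives the number of inbred lines
    carrying each homozygous two-locus genotype (hypothetical layout).
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    (n_AB, n_Ab), (n_aB, n_ab) = counts
    total = n_AB + n_Ab + n_aB + n_ab
    row = [n_AB + n_Ab, n_aB + n_ab]  # allele totals at locus 1
    col = [n_AB + n_aB, n_Ab + n_ab]  # allele totals at locus 2
    if min(row + col) == 0:
        raise ValueError("each allele must be observed at least once")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the two loci segregate independently.
            expected = row[i] * col[j] / total
            chi2 += (counts[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Toy counts (not data from the study): the ab class is under-represented.
    chi2, p = grd_chi_square([[120, 100], [110, 20]])
    print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

In a genome-wide scan, a test of this kind would be applied to every pair of unlinked loci, so the resulting P values would need a multiple-testing correction before any pair is called significant.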

Figures

  1. Figure 1: Locus pairs showing significant GRD across the DSPR lines of Drosophila.

    The outer circle represents the chromosome arms. Each link represents a locus pair showing significant two-locus GRD. Yellow, blue and red links correspond to RIL panels A-2, B-1 and B-2, respectively (5% FDR-corrected P < 0.05).

  2. Figure 2: From missing genotypes to epistasis.

    a, GRD signature between all genotyped loci on chromosomes (Chr.) 2R and 3R in RIL panel B-2. P, P value. b, Average productivity of each genotypic class recovered from 318 F2 single-pair matings (progeny counts are F3). As predicted from the GRD signal (in a), haplotypes tagged by single nucleotide polymorphisms (SNPs) at positions 2R:4806926 and 3R:5870973 show strong negative epistasis for the aa;bb genotypes, P = 5.51121 × 10⁻⁹ (LRT, ref. 29; indicated by the red bar). c, GRD between loci on chromosomes 3L and X in RIL panel A-2. d, Average productivity of each genotypic class recovered from 401 F2 single-pair matings. Haplotypes tagged by SNPs at positions 3L:11510853 and X:16483812 show strong negative epistasis for the minor alleles on each haplotype (aa;bb), P = 8.25 × 10⁻⁵ (LRT, ref. 29; indicated by the red bar).

  3. Figure 3: Model for unlinked loci with segregating pairs of incompatible alleles.

    The dendrograms on the left and the right represent the genealogies of two haplotypes segregating within a species. The blue dot and the red rectangle indicate the origins of incompatible mutations on each respective genealogy. On the left, derived blue alleles are incompatible with derived red alleles on the right. These genealogies yield the individuals shown in the centre, wherein each line segment corresponds to a chromosome and each coloured square indicates the derived incompatible allele. Importantly, these incompatible allele pairs are polymorphic in this sample of individuals; thus, individuals who inherit both incompatible alleles have lower fitness than those who carry only one or neither.

  4. Extended Data Fig. 1: Description of the DSPR and validation scheme.

    a, Geographic distribution of the DSPR founding strains (orange, panel A; red, panel B). b, Construction of the recombinant inbred lines. For each panel, all founder strains were crossed in a round-robin design (line 1 × line 2, line 2 × line 3, …, line 8 × line 1) to produce F1s, and the F1s were then allowed to mate freely to produce an F2 population. In each of panels A and B, these F2 populations were split into two independent populations to create panels A1, A2 and B1, B2. Each was allowed to recombine freely for 50 generations in a very large population. After 50 generations, for each replicate panel, about 400 isofemale lines were inbred for 25 generations to create the 4 panels of RILs used in this study. c, Crossing scheme used to validate epistatic effects. A pair of founders segregating incompatible alleles was selected and crossed to produce F1s; we then intercrossed the F1 progeny to produce a large F2 population, segregating all possible allelic combinations between alleles at loci 1 and 2. We then intercrossed a large number of F2 pairs, counted the progeny each pair produced, and later genotyped the F2s at sites near the predicted interacting loci.

  5. Extended Data Fig. 2: Principal component analysis of all three DSPR RIL panels.

    Green, panel A-2; blue, panel B-1; and red, panel B-2. No evidence of population structure is shown.

  6. Extended Data Fig. 3: D′ distribution for significant GRD.

    Data are plotted across DSPR panels. On the x axis, D′ is a measure of the disequilibrium between interacting alleles. The red curve corresponds to a smooth curve fit using non-parametric density estimation. An outlier box-plot is presented above the histogram: the lozenge represents the mean and 95% CI, the edges of the rectangle represent the 25th and 75th percentiles, the vertical bar within the rectangle represents the median, the dots are possible outliers and the red bracket represents the shortest interval that contains 50% of the data.

  7. Extended Data Fig. 4: Epistasis plot for each validated instance of GRD.

    On the y axes are the productivity measurements that correspond to each genotypic class across both chromosomes. The x axes correspond to the genotypes on one of the chromosomes; the other genotype is represented by the colour indicated inside the plot (for example, genotype AA;bb in panel a is found in the lower left corner, where AA is read from the x axis and bb from the blue colour). a, GRD between chromosomes 2R and 3R (tagged by SNPs 2R:4806926, on the x axis, and 3R:5870973, coloured lines) shows strong negative epistasis due to the low fitness of the aa;bb genotype. The additive-by-additive genetic effect is equal to −13.75 (in the sense of refs 5 and 29). b, GRD between chromosomes 3L and X (tagged by SNPs 3L:11510853, on the x axis, and X:16483812, coloured lines) also shows negative epistasis. Here the additive-by-additive genetic effect equals −5.94.

  8. Extended Data Fig. 5: The accumulation of post-zygotic reproductive isolation through time (note log scale on axes).

    Approximate divergence times of commonly studied Drosophila species are indicated by green circles, and the red circle indicates a reasonable expectation for divergence times of stocks used to found the DSPR (~10,000 years). The horizontal red area marks a very approximate ‘speciation threshold’ and shows that many commonly studied species pairs substantially exceed this threshold.
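The additive-by-additive effect quoted in the caption of Extended Data Fig. 4 can be illustrated with a small sketch. It assumes one common unweighted parametrization, in which the interaction term is a contrast among the four double-homozygote class means divided by four; this is in the spirit of ref. 29 but is not guaranteed to be the exact estimator used in the study. The function name, the dictionary layout and the numbers are hypothetical toy values, not measurements from the paper.

```python
def additive_by_additive(means):
    """Additive-by-additive epistasis from double-homozygote class means.

    means maps (i, j) to the mean phenotype of the genotype carrying
    i copies of allele A at locus 1 and j copies of allele B at locus 2,
    so (2, 2) is AA;BB and (0, 0) is aa;bb (hypothetical layout).
    Assumed unweighted contrast: (AABB - AAbb - aaBB + aabb) / 4.
    """
    return (means[(2, 2)] - means[(2, 0)] - means[(0, 2)] + means[(0, 0)]) / 4.0


if __name__ == "__main__":
    # Toy productivity means: only the aa;bb class is depressed.
    toy = {(2, 2): 60.0, (2, 0): 60.0, (0, 2): 60.0, (0, 0): 20.0}
    print(additive_by_additive(toy))  # negative value: negative epistasis

    # Purely additive toy values give no interaction term.
    additive = {(2, 2): 14.0, (2, 0): 12.0, (0, 2): 12.0, (0, 0): 10.0}
    print(additive_by_additive(additive))
```

Under this convention, a depressed aa;bb class yields a negative value, which matches the sign of the effects reported in the caption.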

Tables

  1. Extended Data Table 1: List of all significant inter-chromosomal GRD identified in the DSPR
  2. Extended Data Table 2: List of significant inter-chromosomal GRD in the Arabidopsis MAGIC panel and maize NAM panel

References

  1. Presgraves, D. C. The molecular evolutionary basis of species formation. Nature Rev. Genet. 11, 175–180 (2010)
  2. Coyne, J. A. & Orr, H. A. Speciation (Sinauer Associates, 2004)
  3. Cutter, A. D. The polymorphic prelude to Bateson–Dobzhansky–Muller incompatibilities. Trends Ecol. Evol. 27, 209–218 (2012)
  4. Carlborg, O. & Haley, C. S. Epistasis: too often neglected in complex trait studies? Nature Rev. Genet. 5, 618–625 (2004)
  5. Phillips, P. C. Epistasis—the essential role of gene interactions in the structure and evolution of genetic systems. Nature Rev. Genet. 9, 855–867 (2008)
  6. Bomblies, K. et al. Autoimmune response as a mechanism for a Dobzhansky-Muller-type incompatibility syndrome in plants. PLoS Biol. 5, e236 (2007)
  7. Payseur, B. A. & Hoekstra, H. E. Signatures of reproductive isolation in patterns of single nucleotide diversity across inbred strains of house mice. Genetics 171, 1905–1916 (2005)
  8. King, E. G., Macdonald, S. J. & Long, A. D. Properties and power of the Drosophila Synthetic Population Resource for the routine dissection of complex traits. Genetics 191, 935–949 (2012)
  9. King, E. G. et al. Genetic dissection of a model complex trait using the Drosophila Synthetic Population Resource. Genome Res. 22, 1558–1566 (2012)
  10. Zuk, O. et al. The mystery of missing heritability: genetic interactions create phantom heritability. Proc. Natl Acad. Sci. USA 109, 1193–1198 (2012)
  11. Bikard, D. et al. Divergent evolution of duplicate genes leads to genetic incompatibilities within A. thaliana. Science 323, 623–626 (2009)
  12. Palopoli, M. F. & Wu, C. I. Genetics of hybrid male sterility between Drosophila sibling species: a complex web of epistasis is revealed in interspecific studies. Genetics 138, 329–341 (1994)
  13. Wu, C. I., Johnson, N. A. & Palopoli, M. F. Haldane’s rule and its legacy: why are there so many sterile males? Trends Ecol. Evol. 11, 281–284 (1996)
  14. Wasbrough, E. R. et al. The Drosophila melanogaster sperm proteome-II (DmSP-II). J. Proteomics 73, 2171–2185 (2010)
  15. Dockendorff, T. C., Robertson, S. E., Faulkner, D. L. & Jongens, T. A. Genetic characterization of the 44D–45B region of the Drosophila melanogaster genome based on an F2 lethal screen. Mol. Gen. Genet. 263, 137–143 (2000)
  16. Netzel-Arnett, S. et al. The glycosylphosphatidylinositol-anchored serine protease PRSS21 (testisin) imparts murine epididymal sperm cell maturation and fertilizing ability. Biol. Reprod. 81, 921–932 (2009)
  17. Kasai, S. & Tomita, T. Male specific expression of a cytochrome P450 (Cyp312a1) in Drosophila melanogaster. Biochem. Biophys. Res. Commun. 300, 894–900 (2003)
  18. Meiklejohn, C. D., Montooth, K. L. & Rand, D. M. Positive and negative selection on the mitochondrial genome. Trends Genet. 23, 259–263 (2007)
  19. Kover, P. X. et al. Multiparent advanced generation inter-cross to fine-map quantitative traits in Arabidopsis thaliana. PLoS Genet. 5, e1000551 (2009)
  20. McMullen, M. D. et al. Genetic properties of the maize nested association mapping population. Science 325, 737–740 (2009)
  21. Hill, W. G., Goddard, M. E. & Visscher, P. M. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet. 4, e1000008 (2008)
  22. Dobzhansky, T. Genetics and the Origin of Species (Columbia Univ. Press, 1937)
  23. Orr, H. A. & Turelli, M. The evolution of postzygotic isolation: accumulating Dobzhansky-Muller incompatibilities. Evolution 55, 1085–1094 (2001)
  24. Presgraves, D. C. & Stephan, W. Pervasive adaptive evolution among interactors of the Drosophila hybrid inviability gene, Nup96. Mol. Biol. Evol. 24, 306–314 (2007)
  25. Tao, Y. et al. Genetic dissection of hybrid incompatibilities between Drosophila simulans and D. mauritiana. I. Differential accumulation of hybrid male sterility effects on the X and autosomes. Genetics 164, 1383–1397 (2003)
  26. Fitzpatrick, B. M. Hybrid dysfunction: population genetic and quantitative genetic perspectives. Am. Nat. 171, 491–498 (2008)
  27. Demuth, J. P. & Wade, M. J. On the theoretical and empirical framework for studying genetic interactions within and among species. Am. Nat. 165, 524–536 (2005)
  28. Reed, L. K. & Markow, T. A. Early events in speciation: polymorphism for hybrid male sterility in Drosophila. Proc. Natl Acad. Sci. USA 101, 9009–9012 (2004)
  29. Cheverud, J. M. & Routman, E. J. Epistasis and its contribution to genetic variance components. Genetics 139, 1455–1461 (1995)
  30. King, E. G. et al. Genetic dissection of a model complex trait using the Drosophila Synthetic Population Resource. Genome Res. 22, 1558–1566 (2012)
  31. King, E. G., Macdonald, S. J. & Long, A. D. Properties and power of the Drosophila Synthetic Population Resource for the routine dissection of complex traits. Genetics 191, 935–949 (2012)
  32. Baird, N. A. et al. Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS ONE 3, e3376 (2008)
  33. Ackermann, M. & Beyer, A. Systematic detection of epistatic interactions based on allele pair frequencies. PLoS Genet. 8, e1002463 (2012)
  34. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300 (1995)
  35. Phillips, P. C. Epistasis—the essential role of gene interactions in the structure and evolution of genetic systems. Nature Rev. Genet. 9, 855–867 (2008)
  36. Cheverud, J. M. & Routman, E. J. Epistasis and its contribution to genetic variance components. Genetics 139, 1455–1461 (1995)
  37. Adams, M. D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000)
  38. Lunter, G. & Goodson, M. Stampy: A statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Res. 21, 936–939 (2011)
  39. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009)
  40. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genet. 43, 491–498 (2011)
  41. Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6, 80–82 (2012)
  42. McMullen, M. D. et al. Genetic properties of the maize nested association mapping population. Science 325, 737–740 (2009)
  43. Kover, P. X. et al. A multiparent advanced generation inter-cross to fine-map quantitative traits in Arabidopsis thaliana. PLoS Genet. 5, e1000551 (2009)

Author information

Affiliations

  1. Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138, USA

    • Russell B. Corbett-Detig,
    • Jun Zhou,
    • Daniel L. Hartl &
    • Julien F. Ayroles
  2. Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853, USA

    • Andrew G. Clark &
    • Julien F. Ayroles
  3. Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA

    • Andrew G. Clark
  4. Harvard Society of Fellows, Harvard University, Cambridge, Massachusetts 02138, USA

    • Julien F. Ayroles

Contributions

J.F.A. conceived the idea of the project; R.B.C.-D. and J.F.A. conceived and designed experiments and analyses. R.B.C.-D. and J.F.A. conducted bioinformatics and statistical analyses; R.B.C.-D., J.F.A. and J.Z. performed experiments; J.Z. carried out molecular work; A.G.C. and D.L.H. gave analytical and conceptual advice throughout the project.

Competing financial interests

The authors declare no competing financial interests.

All code used and generated for this study is available upon request.
