  • Review Article

Reverse engineering the genotype–phenotype map with natural genetic variation

Abstract

The genetic variation that occurs naturally in a population is a powerful resource for studying how genotype affects phenotype. Each allele is a perturbation of the biological system, and genetic crosses, through the processes of recombination and segregation, randomize the distribution of these alleles among the progeny of a cross. The randomized genetic perturbations affect traits directly and indirectly, and the similarities and differences between traits in their responses to common perturbations allow inferences about whether variation in a trait is a cause of a phenotype (such as disease) or whether the trait variation is, instead, an effect of that phenotype. It is then possible to use this information about causes and effects to build models of probabilistic 'causal networks'. These networks are beginning to define the outlines of the 'genotype–phenotype map'.
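The logic of using randomized alleles as causal anchors can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the method of any particular study: it assumes a single biallelic locus, two hypothetical traits (`trait_a`, `trait_b`) linked by a linear chain genotype → trait A → trait B, Gaussian noise, and arbitrary effect sizes. If the chain holds, the genotype should carry no information about trait B once trait A is accounted for, whereas it should still be associated with trait A after accounting for trait B.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing z out of both."""
    design = np.column_stack([np.ones_like(z), z])

    def residual(v):
        coef, *_ = np.linalg.lstsq(design, v, rcond=None)
        return v - design @ coef

    return float(np.corrcoef(residual(x), residual(y))[0, 1])


def simulate_cross(n=5000, seed=0):
    """Simulate progeny of a cross (all parameters are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    # Segregation randomizes which of two alleles each individual inherits.
    genotype = rng.integers(0, 2, size=n).astype(float)
    # Assumed causal chain: genotype -> trait A -> trait B.
    trait_a = 1.0 * genotype + rng.normal(0.0, 1.0, size=n)
    trait_b = 0.8 * trait_a + rng.normal(0.0, 1.0, size=n)
    return genotype, trait_a, trait_b


genotype, trait_a, trait_b = simulate_cross()

# Under the chain, genotype and trait B are independent given trait A ...
r_gb_given_a = partial_corr(genotype, trait_b, trait_a)
# ... but genotype and trait A remain associated given trait B.
r_ga_given_b = partial_corr(genotype, trait_a, trait_b)

print(f"corr(genotype, B | A) = {r_gb_given_a:.3f}")
print(f"corr(genotype, A | B) = {r_ga_given_b:.3f}")
```

The asymmetry between the two partial correlations is what lets the randomized genotype orient the edge between the traits. In real data, measurement error in the upstream trait weakens this asymmetry, so a result like this would be treated as evidence rather than proof.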

Figure 1: From genetic randomization to causal network.
Figure 2: Measurement error can confuse causal inference.

Acknowledgements

I thank the Jane Coffin Childs Memorial Fund for Medical Research and New York University for support, and L. Chen for discussion.

Ethics declarations

Competing interests

The author declares no competing financial interests.

Additional information

Reprints and permissions information is available at http://www.nature.com/reprints.

Correspondence should be addressed to the author (mrockman@nyu.edu).

About this article

Cite this article

Rockman, M. Reverse engineering the genotype–phenotype map with natural genetic variation. Nature 456, 738–744 (2008). https://doi.org/10.1038/nature07633

  • DOI: https://doi.org/10.1038/nature07633
