Review Article

Reverse engineering the genotype–phenotype map with natural genetic variation

Nature volume 456, pages 738–744 (11 December 2008)

Abstract

The genetic variation that occurs naturally in a population is a powerful resource for studying how genotype affects phenotype. Each allele is a perturbation of the biological system, and genetic crosses, through the processes of recombination and segregation, randomize the distribution of these alleles among the progeny of a cross. The randomized genetic perturbations affect traits directly and indirectly, and the similarities and differences between traits in their responses to common perturbations allow inferences about whether variation in a trait is a cause of a phenotype (such as disease) or whether the trait variation is, instead, an effect of that phenotype. It is then possible to use this information about causes and effects to build models of probabilistic 'causal networks'. These networks are beginning to define the outlines of the 'genotype–phenotype map'.
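
The reasoning in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the article's method: it assumes one biallelic locus randomized by a cross, linear effects with Gaussian noise, and a simple partial-correlation comparison to decide whether a trait is upstream or downstream of a phenotype. All function names, effect sizes, and the decision rule are hypothetical choices made for illustration.

```python
import math
import random


def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate_cross(n, trait_is_cause, seed=0):
    """Simulate n progeny with a randomly segregating allele (assumed model).

    If trait_is_cause is True:  genotype -> trait -> phenotype.
    Otherwise:                  genotype -> phenotype -> trait.
    """
    rng = random.Random(seed)
    genotype, trait, phenotype = [], [], []
    for _ in range(n):
        g = rng.choice((0, 1))  # segregation randomizes the allele
        if trait_is_cause:
            t = g + rng.gauss(0, 1)
            p = t + rng.gauss(0, 1)
        else:
            p = g + rng.gauss(0, 1)
            t = p + rng.gauss(0, 1)
        genotype.append(g)
        trait.append(t)
        phenotype.append(p)
    return genotype, trait, phenotype


def infer_direction(genotype, trait, phenotype):
    """Pick the ordering under which the genotype is screened off.

    If the trait mediates the allele's effect, conditioning on the trait
    should remove the genotype-phenotype association, and vice versa.
    """
    g_p_given_t = abs(partial_corr(genotype, phenotype, trait))
    g_t_given_p = abs(partial_corr(genotype, trait, phenotype))
    if g_p_given_t < g_t_given_p:
        return "trait -> phenotype"
    return "phenotype -> trait"


if __name__ == "__main__":
    for trait_is_cause in (True, False):
        data = simulate_cross(2000, trait_is_cause)
        print(trait_is_cause, infer_direction(*data))
```

Because the allele is assigned at random, it cannot itself be caused by either trait, which is what lets the comparison of conditional associations orient the edge. Real analyses would also need to handle confounding, measurement error, linkage, and multiple loci, none of which this sketch addresses.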

Acknowledgements

I thank the Jane Coffin Childs Memorial Fund for Medical Research and New York University for support, and L. Chen for discussion.

Author information

Affiliations

  1. Center for Genomics and Systems Biology, Department of Biology, New York University, 100 Washington Square East, New York, New York 10003, USA.

    • Matthew V. Rockman

Competing interests

The author declares no competing financial interests.

Reprints and permissions information is available at http://www.nature.com/reprints.

Correspondence should be addressed to the author (mrockman@nyu.edu).

About this article

DOI

https://doi.org/10.1038/nature07633
