Abstract

Genetic variation influences gene expression, and this variation in gene expression can be efficiently mapped to specific genomic regions and variants. Here we have used gene expression profiling of Epstein-Barr virus–transformed lymphoblastoid cell lines of all 270 individuals genotyped in the HapMap Consortium to elucidate the detailed features of genetic variation underlying gene expression variation. We find that gene expression is heritable and that differentiation between populations is in agreement with earlier small-scale studies. A detailed association analysis of over 2.2 million common SNPs per population (5% frequency in HapMap) with gene expression identified at least 1,348 genes with association signals in cis and at least 180 in trans. Replication in at least one independent population was achieved for 37% of cis signals and 15% of trans signals, respectively. Our results strongly support an abundance of cis-regulatory variation in the human genome. Detection of trans effects is limited but suggests that regulatory variation may be the key primary effect contributing to phenotypic variation in humans. We also explore several methodologies that improve the current state of analysis of gene expression variation.
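As a rough illustration of the kind of SNP-to-expression association test summarized above, the sketch below correlates genotypes (coded as 0, 1 or 2 copies of an allele) with expression values and assesses significance by permuting expression values across individuals. This is not the authors' pipeline: the function names, the toy data, the genotype coding and the single-SNP permutation scheme are all assumptions made for illustration.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences.

    For a simple linear regression of y on x, r**2 is the fraction of
    variance in y explained by x.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no association can be estimated
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(genotypes, expression, n_perm=1000, seed=0):
    """Empirical p-value for one SNP-gene pair (hypothetical helper).

    Expression values are shuffled relative to genotypes, and the
    permuted |r| is compared with the observed |r|.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(genotypes, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (invented): expression rises with allele count.
toy_genotypes = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
toy_expression = [1.0 + 0.5 * g + 0.01 * i for i, g in enumerate(toy_genotypes)]
toy_p = permutation_pvalue(toy_genotypes, toy_expression)
```

In a genome-wide setting the threshold would normally be set per gene across all tested SNPs rather than per SNP, which this single-pair sketch does not attempt.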

Accessions

Gene Expression Omnibus

Acknowledgements

We thank the HapMap Consortium for data availability; M. Smith for assistance with software development; and M. Gibbs, J. Orwick and C. Gerringer for technical support. Funding was provided by the Wellcome Trust (to E.T.D. and P.D.), the US National Institutes of Health ENDGAME (to E.T.D. and S.T.), Cancer Research UK (to S.T.), and the Medical Research Council (to M.D.). S.T. is a Royal Society Wolfson Research Merit Award holder.

Author information

Affiliations

  1. The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    • Barbara E Stranger
    • Alexandra C Nica
    • Matthew S Forrest
    • Antigone Dimas
    • Christine P Bird
    • Claude Beazley
    • Catherine E Ingle
    • Stephen Montgomery
    • Panos Deloukas
    • Emmanouil T Dermitzakis
  2. Department of Oncology, University of Cambridge, Cancer Research UK Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK.

    • Mark Dunning
    • Simon Tavaré
  3. European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

    • Paul Flicek
  4. Computer Science Department, Gates Building 1A, Stanford University, Stanford, California 94305-9010, USA.

    • Daphne Koller

Authors

  1. Barbara E Stranger
  2. Alexandra C Nica
  3. Matthew S Forrest
  4. Antigone Dimas
  5. Christine P Bird
  6. Claude Beazley
  7. Catherine E Ingle
  8. Mark Dunning
  9. Paul Flicek
  10. Daphne Koller
  11. Stephen Montgomery
  12. Simon Tavaré
  13. Panos Deloukas
  14. Emmanouil T Dermitzakis

Contributions

B.E.S. performed the majority of the analysis, coordinated the efforts on the project, performed part of the experimental work, and wrote part of the manuscript. E.T.D. and P.D. helped with the analysis, wrote part of the manuscript, and led the project. S.T. and M.D. performed the normalization and helped with statistical analysis. A.C.N., A.D., C.P.B., P.F. and S.M. performed specific parts of the analysis. M.S.F. helped with the analysis and performed part of the experimental work. C.E.I. performed most of the experimental work. C.B. wrote some of the scripts and performed part of the analysis. D.K. provided advice on the permutation analysis.

Corresponding authors

Correspondence to Panos Deloukas or Emmanouil T Dermitzakis.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figs. 1–6, Supplementary Table 1, and Supplementary Methods.

  2. Supplementary Table 2

    Number and source category of SNPs used in trans analysis.

  3. Supplementary Table 3

    Significant cis (1 Mb) associations, linear regression, individual population analysis, 0.001 permutation threshold.

  4. Supplementary Table 4

    Significant cis (1 Mb) associations, linear regression, multiple population analysis, 0.001 permutation threshold.

  5. Supplementary Table 5

    Significant cis (1 Mb) associations, Spearman rank correlation, individual population analysis, 0.001 permutation threshold.

  6. Supplementary Table 6

    Significant trans associations, linear regression, individual population analysis, 0.001 permutation threshold.

About this article

DOI

https://doi.org/10.1038/ng2142
