Gene–environment interactions in human diseases

Key Points

  • Complex diseases are thought to involve the interaction between environmental and lifestyle factors, and inherited susceptibility.

  • The increasing number of disease-associated alleles of both high and low penetrance that have been described allows us to assess whether allele penetrance is modified by environmental factors.

  • There are many models that describe the precise nature of the risks associated with combinations of genetic and environmental risk factors. This introduces an additional element of multiple comparisons into the already large matrix of potential genetic and environmental risk factors.

  • All the main epidemiological study designs can be used to detect gene–environment interactions. The optimal study design depends on the interaction that is being examined.

  • The sample sizes that are required to detect gene–environment or gene–gene interactions are much larger than those necessary to detect genetic or environmental factors in isolation.

  • Many studies that have been carried out do not have adequate sample sizes to address gene–environment interactions.

  • Creating common databases of results and pooling results across consortia could mitigate the problem of sample size. Pre-planned pooling will be more efficient than post-hoc pooling, because the increasing use of haplotype-tagging SNPs might mean that different research groups choose different gene variants.

  • Finding the common variants associated with risk of common diseases will be just the beginning of applying knowledge of gene variation to human disease. Dissecting the interaction of genes with environment will be necessary to assess their public-health and clinical relevance, and will present many challenges.
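The sample-size point above can be illustrated with a minimal sketch. It uses Woolf's large-sample approximation, in which the variance of a log odds ratio is the sum of the reciprocal cell counts, and compares the standard error of an exposure main effect with that of a multiplicative interaction (the ratio of the exposure odds ratios in carriers and non-carriers). The cell counts and function names are hypothetical and are not taken from the article.

```python
import math

def log_or_variance(a, b, c, d):
    """Woolf variance of a log odds ratio for a 2x2 table.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    return 1 / a + 1 / b + 1 / c + 1 / d

def interaction_variance(carriers, noncarriers):
    """Variance of log(OR_carriers / OR_noncarriers).

    The two genotype strata are independent, so their variances add.
    """
    return log_or_variance(*carriers) + log_or_variance(*noncarriers)

# Hypothetical counts (a, b, c, d); 20% of subjects carry the variant.
noncarriers = (200, 200, 300, 300)
carriers = (50, 50, 75, 75)

# Collapsing over genotype gives the table for the exposure main effect.
pooled = tuple(x + y for x, y in zip(carriers, noncarriers))

se_main = math.sqrt(log_or_variance(*pooled))
se_interaction = math.sqrt(interaction_variance(carriers, noncarriers))
ratio = se_interaction / se_main

print(f"SE of log OR (main effect):  {se_main:.3f}")
print(f"SE of log interaction ratio: {se_interaction:.3f}")
print(f"SE ratio:                    {ratio:.2f}")
```

Under these assumed counts the interaction estimate is considerably less precise than the main effect from the same subjects, and the gap widens as the variant becomes rarer, because the smallest genotype stratum dominates the variance.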


Studies of gene–environment interactions aim to describe how genetic and environmental factors jointly influence the risk of developing a human disease. Gene–environment interactions can be described by using several models, which take into account the various ways in which genetic effects can be modified by environmental exposures, the number of levels of these exposures and the model on which the genetic effects are based. Choice of study design, sample size and genotyping technology influence the analysis and interpretation of observed gene–environment interactions. Current systems for reporting epidemiological studies make it difficult to assess whether the observed interactions are reproducible, so suggestions are made for improvements in this area.
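As a rough illustration of how the choice of model changes what counts as an interaction, the sketch below computes two standard epidemiological measures from three odds ratios defined relative to unexposed non-carriers: the multiplicative interaction ratio and the relative excess risk due to interaction (RERI) on the additive scale. The function name and the input values are hypothetical and do not come from the article.

```python
def interaction_measures(or_g, or_e, or_ge):
    """Return (multiplicative ratio, RERI) for a gene-environment pair.

    or_g:  odds ratio for genotype alone
    or_e:  odds ratio for exposure alone
    or_ge: odds ratio for genotype and exposure together
    All three are relative to subjects with neither factor.
    """
    multiplicative = or_ge / (or_g * or_e)  # 1.0 means no multiplicative interaction
    reri = or_ge - or_g - or_e + 1.0        # 0.0 means no additive interaction
    return multiplicative, reri

# Hypothetical joint effect that is exactly multiplicative (2 x 3 = 6).
mult_a, reri_a = interaction_measures(or_g=2.0, or_e=3.0, or_ge=6.0)

# Hypothetical joint effect that is exactly additive (2 + 3 - 1 = 4).
mult_b, reri_b = interaction_measures(or_g=2.0, or_e=3.0, or_ge=4.0)

print(mult_a, reri_a)
print(mult_b, reri_b)
```

The same data can therefore show no interaction on one scale and a clear interaction on the other, which is one reason the scale should be stated whenever an interaction is reported.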


Figure 1: Models of gene–environment interactions.
Figure 2: Number of cases needed to detect a range of multiplicative interactions, according to allele prevalence.




I thank my colleagues at the Harvard School of Public Health and the Channing Laboratory for helpful discussions, and P. Kraft for reviewing the manuscript.


Ethics declarations

Competing interests

The author declares no competing financial interests.

Related links









Alzheimer disease

multiple sclerosis

Parkinson disease


sickle-cell anaemia

xeroderma pigmentosum



Human Genome Project

National Cancer Institute Breast and Prostate Cancer and Hormone-Related Gene Variants Cohort Consortium

National Cancer Institute Cancer Genome Anatomy Project SNP500Cancer Database

UK Biobank

US National Institute of Environmental Health Sciences Environmental Genome Project



The study of drug responses that are related to inherited genetic differences.


A discipline that seeks to explain the extent to which factors that people are exposed to (environmental or genetic) influence their risk of disease, by means of population-based investigations.


Studies of specific genes in which variation might influence the risk of a specific disease, usually because the gene is part of a biological pathway that is plausibly related to the disease.


A molecular marker of a biological function or external exposure.


An approach to gene mapping that looks for associations between a particular phenotype and allelic variation in a population.


An attempt to distinguish between more likely and less likely interactions on the basis of knowledge of biological mechanisms, before an interaction is observed.


Variance. A statistic that quantifies the dispersion of data about the mean. In quantitative genetics, the phenotypic variance (Vp) is the observed variation of a trait in a population. Vp can be partitioned into components that are due to genetic variance (Vg), environmental variance (Ve), and gene–environment correlations and interactions.
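The partition of phenotypic variance described above is often written in the following form. This is a conventional textbook expression rather than one given in the article, and the covariance and interaction symbols are notation assumed here.

```latex
V_P = V_G + V_E + 2\,\mathrm{Cov}(G, E) + V_{G \times E}
```

Here $\mathrm{Cov}(G, E)$ captures gene–environment correlation and $V_{G \times E}$ captures gene–environment interaction; when both are zero the expression reduces to $V_P = V_G + V_E$.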


Penetrance. The frequency with which individuals who carry a given gene variant show the manifestations associated with that variant. If the penetrance of a disease allele is 100%, all individuals carrying that allele will express the associated disorder.


Population stratification. The presence of multiple subgroups with different allele frequencies within a population. The different underlying allele frequencies in sampled subgroups might be independent of the disease within each group, and they can lead to erroneous conclusions of linkage disequilibrium or disease relevance.


Haplotype-tagging SNP. One of a small subset of SNPs that is needed to uniquely identify a complete haplotype.


Linkage disequilibrium (LD). A measure of whether alleles at two loci co-exist in a population in a non-random fashion. Alleles that are in LD are found together on the same haplotype more often than would be expected by chance.
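To make the definition of LD concrete, the following minimal sketch computes two commonly used summary statistics for a pair of biallelic loci: the coefficient D (the observed haplotype frequency minus the frequency expected if the alleles were independent) and the squared correlation r². The function name and the frequencies are hypothetical and are not taken from the article.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # zero when the alleles associate at random
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared

# Hypothetical frequencies: the A-B haplotype is more common than chance predicts.
d, r2 = ld_stats(p_ab=0.3, p_a=0.4, p_b=0.5)

# The same allele frequencies with random association (0.4 * 0.5 = 0.2).
d_eq, r2_eq = ld_stats(p_ab=0.2, p_a=0.4, p_b=0.5)

print(f"D = {d:.3f}, r^2 = {r2:.3f}")
print(f"D = {d_eq:.3f}, r^2 = {r2_eq:.3f}")
```

A high r² between two SNPs means that genotyping one of them captures most of the information in the other, which is the idea behind selecting haplotype-tagging SNPs.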


The study of the complex interactions that occur at all levels of biological information — from whole-genome sequence interactions to developmental and biochemical networks — and their functional relationship to the phenotypes of organisms.


Cite this article

Hunter, D. Gene–environment interactions in human diseases. Nat Rev Genet 6, 287–298 (2005).
