Abstract
Case–control studies have many advantages for identifying disease-related genes, but are limited in their ability to detect gene–environment interactions. The prospective cohort design provides a valuable complement to case–control studies. Although it has disadvantages in duration and cost, it has important strengths in characterizing exposures and risk factors before disease onset, which reduces important biases that are common in case–control studies. This and other strengths of prospective cohort studies make them invaluable for understanding gene–environment interactions in complex human disease.
Relevant articles
Open Access articles citing this article.
- dbTMM: an integrated database of large-scale cohort, genome and clinical data for the Tohoku Medical Megabank Project. Human Genome Variation (Open Access), 10 December 2021
- Genome-wide analysis of 53,400 people with irritable bowel syndrome highlights shared genetic pathways with mood and anxiety disorders. Nature Genetics (Open Access), 05 November 2021
- Urologische Forschung in Deutschland. Der Urologe (Open Access), 28 April 2020
Acknowledgements
The authors express appreciation to M. Boehnke, E. Boerwinkle, B. Foxman, M. Khoury, L. Kuller, J. Ordovas and B. Psaty for their critical review and comments on this manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Related links
DATABASES
OMIM
FURTHER INFORMATION
Ethics and Governance Framework of the UK Biobank
Incidence and Prevalence Database
National Health Infrastructure Initiative
National Institute of Environmental Health Sciences
NHGRI Ethical, Legal and Social Issues
NHGRI Expert Panel Recommendations for a population-based cohort
NIH Genes and Environment Initiative
Responses to NHGRI Request for Information
SEER Cancer Statistics Review, 1975-2002
Glossary
- Exposure
A putative cause or characteristic determinant of a health outcome of interest.
- Risk factor
An attribute or exposure that increases the probability of disease or other outcome; used by some to mean causal factor or 'determinant' and by others to mean 'risk marker'.
- Cohort
Originally defined as a group of people born during a particular period (a 'birth cohort'); now broadened to include any designated group of people who are followed or traced over time.
- Risk marker
An attribute or exposure that is associated with an increase in the probability of a specified outcome, but is not necessarily a causal factor.
- Population stratification
The presence of different allele frequencies in cases and controls that is attributable to diversity in the background population and is unrelated to outcome status.
- Ancestry informative (ancestral) marker
A locus with several polymorphisms that exhibit substantially different frequencies between ancestral populations. For example, the Duffy null allele has a frequency of almost 100% in sub-Saharan Africans, but occurs infrequently in other populations.
- Incidence
The number of new cases of disease that develop during a period of time.
- Odds ratio (or relative odds)
The odds of disease in the individuals exposed to an environmental factor or genetic variant divided by the odds in unexposed individuals; or the odds of exposure in the cases divided by the odds in the controls (the two are algebraically equivalent). If the odds ratio is significantly greater than one, then the environmental factor or genetic variant is associated with the disease.
- Study power
The probability of rejecting the null hypothesis of no association in a study if it is in fact false, or of detecting a difference between two groups if it does in fact exist.
- Type I error rate
The probability of rejecting the null hypothesis of no association in a study if it is in fact true, or of detecting a difference between two groups when no difference exists.
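The glossary's statement that the two forms of the odds ratio are algebraically equivalent can be checked with a small sketch. The 2×2 counts below are hypothetical, chosen only for illustration, and the function names are not from the article. The confidence interval uses Woolf's log-based approximation, which is an assumption added here as one common way to judge whether an odds ratio is "significantly greater than one"; the article does not specify a method.

```python
import math


def odds_ratio_by_exposure(a, b, c, d):
    """Odds of disease in the exposed divided by odds of disease in the unexposed.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a / b) / (c / d)


def odds_ratio_by_outcome(a, b, c, d):
    """Odds of exposure in cases divided by odds of exposure in controls."""
    return (a / c) / (b / d)


def woolf_interval(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval on the odds ratio (Woolf's method).

    Assumption: a large-sample approximation that requires all four cells
    to be non-zero; it is not a method described in the article.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts: 40 exposed cases, 60 exposed controls,
# 20 unexposed cases, 80 unexposed controls.
a, b, c, d = 40, 60, 20, 80

or_exposure = odds_ratio_by_exposure(a, b, c, d)
or_outcome = odds_ratio_by_outcome(a, b, c, d)
lower, upper = woolf_interval(a, b, c, d)

print(f"OR (disease odds by exposure): {or_exposure:.3f}")
print(f"OR (exposure odds by outcome): {or_outcome:.3f}")
print(f"Approximate 95% CI: {lower:.2f} to {upper:.2f}")
```

Both functions reduce to the cross-product ratio ad/bc, which is why they agree. With these hypothetical counts the lower bound of the interval lies above one, so the exposure would be called associated with the disease under this approximation.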
About this article
Cite this article
Manolio, T., Bailey-Wilson, J. & Collins, F. Genes, environment and the value of prospective cohort studies. Nat Rev Genet 7, 812–820 (2006). https://doi.org/10.1038/nrg1919
DOI: https://doi.org/10.1038/nrg1919
This article is cited by
- Predictive genomic tools in disease stratification and targeted prevention: a recent update in personalized therapy advancements. EPMA Journal (2022)
- Artificial intelligence powered statistical genetics in biobanks. Journal of Human Genetics (2021)
- Using Genetic Marginal Effects to Study Gene-Environment Interactions with GWAS Data. Behavior Genetics (2021)
- Admixture/fine-mapping in Brazilians reveals a West African associated potential regulatory variant (rs114066381) with a strong female-specific effect on body mass and fat mass indexes. International Journal of Obesity (2021)
- Genome-wide analysis of 53,400 people with irritable bowel syndrome highlights shared genetic pathways with mood and anxiety disorders. Nature Genetics (2021)