Family studies have demonstrated genetic influences on environmental exposure: the phenomenon of gene–environment correlation (rGE). A few molecular genetic studies have confirmed the results, but the identification of rGE in studies that measure genes and environments faces several challenges. Using examples from studies in psychology and psychiatry, we integrate the behavioral and molecular genetic literatures on rGE, describe challenges in identifying rGE and discuss the implications of molecular genetic findings of rGE for future research on gene–environment interplay and for attempts to prevent disease by reducing environmental risk exposure. Genes affect environments indirectly, via behavior and personality characteristics. Associations between individual genetic variants and behaviors are typically small in magnitude, and downstream effects on environmental risk are further attenuated by behavioral mediation. Genotype–environment associations are most likely to be detected when the environment is behaviorally modifiable and highly specified and a plausible mechanism links gene and behavior. rGEs play an important causal role in psychiatric illness. Although research efforts should concentrate on elucidating the genetic underpinnings of behavior rather than the environment itself, the identification of rGE may suggest targets for environmental intervention even in highly heritable disease. Prevention efforts must address the possibility of confounding between rGE and gene–environment interaction (G × E).
Since the 1960s, personality psychologists have emphasized the role of the person in producing his/her environment.1, 2, 3 Rather than viewing the person as someone whose behavior was shaped solely by situational contingencies, these researchers demonstrated how people's personalities and behaviors influenced the way others responded to them and influenced the choices people made about how, where and with whom they spent their time.
By the late 1970s, behavioral geneticists had amassed a large body of research on twins and adoptees that attested to the importance of genetic influences on individual differences in personality, cognitive abilities and liability to disease.4 Behavioral geneticists realized that genetic factors influencing an individual's exposure to particular environments could make those environments themselves heritable. This phenomenon is referred to as gene–environment correlation, or rGE.5, 6 By the mid-1990s, however, this literature had descended into misunderstanding and polemics. Social scientists on one side of the debate argued that behavioral geneticists were on an absurd quest (with potentially dangerous social and political implications) to identify genes for divorce, poverty, political affiliation and religious observance.7 Scientists on the other side of the debate continued to insist that classic sociological methods of enquiry (in which putative measures of the environment were treated as causal) may be confounded by heritable behaviors influencing both exposure to the environment and outcome.8
Although behavioral geneticists never intended to instigate a search for ‘divorce genes,’ technological advances following the mapping of the human genome, including ever denser maps of polymorphic DNA markers, falling genotyping costs and growing statistical sophistication, have made it cheaper and quicker than ever before to identify associations between specific genetic variants and measures of the environment. In large part, studies of gene–environment interplay have been motivated by the search for gene–environment interactions, following recent demonstrations of genetic sensitivity to environmental effects on human phenotypes.9 Most of these studies have not identified rGEs, although a few have. Do the results of these molecular genetic studies confirm or contradict the evidence for rGE from twin and adoption studies? And to what extent should social scientists, geneticists and clinicians still be concerned about the possibility of rGE in their investigations?
In this paper, we review mechanisms of rGE and the evidence for rGEs from the psychology and psychiatry literatures, discuss challenges in identifying rGEs in studies of psychiatric disorder, and discuss the implications of rGEs for understanding genetic and environmental risk processes in psychopathology. This paper goes beyond existing reviews of the literature on gene–environment interplay5, 9, 10, 11, 12 by integrating recent findings from molecular genetics research into an existing conceptual framework established by behavioral genetic studies. Our point is not simply that ignoring rGEs results in misleading conclusions about how experiences shape behavior. Others have made this point adequately. Rather, we argue that the molecular genetic work, though still sparse, underscores the importance of identifying behaviors and personality characteristics that bring about particular environmental experiences, and that this has implications for the design of studies that measure both genes and environments.
rGE and G × E: definitions
rGEs reflect genetic differences in exposure to particular environments. Gene × environment interactions (G × E) refer to genetic differences in susceptibility to particular environments.5 We illustrate this distinction with a hypothetical example. The personality trait of neuroticism is characterized by the tendency to experience negative emotions like anger and anxiety. As a result, individuals who score high on neuroticism tend to experience conflict in their relationships with romantic partners, friends and colleagues. Imagine that molecular geneticists have identified two variants of a gene and this gene is associated with neuroticism: people who have variant A of the gene tend to score high on measures of neuroticism and people who have variant B of the gene tend to score low on measures of neuroticism. Because variant A is more common among individuals who are highly neurotic and because highly neurotic people tend to have conflictual interpersonal relationships, the A variant is likely to be correlated with the experience of interpersonal life stressors (i.e., a rGE). Now imagine that our gene is unassociated with neuroticism, and that stressful life events occur at equally high rates among individuals who carry the A and B variants. Nevertheless, stressful life events have a more adverse effect on carriers of the A variant than on carriers of the B variant, increasing risk for depression more steeply among the former than the latter. This state of affairs would reflect a G × E, wherein individuals who carried the A variant would be especially susceptible to the experience of stressful life events.
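The distinction can be made concrete with a small simulation. The sketch below is only an illustration of the hypothetical example above: the function name `simulate` and every probability in it are invented for this purpose and are not estimates from any study. In the rGE scenario, variant A raises exposure to stress (via neuroticism) but stress is equally harmful for everyone; in the G × E scenario, exposure is the same for both variants but stress is more harmful for carriers of variant A.

```python
import random


def simulate(scenario, n=100_000, seed=1):
    """Simulate n people under 'rGE' or 'GxE'.

    Returns (exposure_gap, effect_gap):
      exposure_gap = P(stress | A) - P(stress | B)
      effect_gap   = effect of stress on depression risk in A minus that in B
    All probabilities below are invented for illustration.
    """
    rng = random.Random(seed)
    # counts[genotype][stress] = [people in cell, depressed people in cell]
    counts = {g: {s: [0, 0] for s in (0, 1)} for g in "AB"}
    for _ in range(n):
        g = "A" if rng.random() < 0.5 else "B"
        if scenario == "rGE":
            # Variant A raises the chance of high neuroticism, which in turn
            # raises exposure to stress; stress harms both genotypes equally.
            high_neuroticism = rng.random() < (0.7 if g == "A" else 0.3)
            stress = int(rng.random() < (0.6 if high_neuroticism else 0.2))
            p_depression = 0.3 if stress else 0.1
        else:
            # Exposure is independent of genotype, but stress is more
            # harmful for carriers of variant A.
            stress = int(rng.random() < 0.4)
            p_depression = (0.4 if g == "A" else 0.15) if stress else 0.1
        cell = counts[g][stress]
        cell[0] += 1
        cell[1] += int(rng.random() < p_depression)

    def exposure(g):
        exposed, unexposed = counts[g][1][0], counts[g][0][0]
        return exposed / (exposed + unexposed)

    def stress_effect(g):
        risk = {s: counts[g][s][1] / counts[g][s][0] for s in (0, 1)}
        return risk[1] - risk[0]

    return exposure("A") - exposure("B"), stress_effect("A") - stress_effect("B")


for name in ("rGE", "GxE"):
    exposure_gap, effect_gap = simulate(name)
    print(f"{name}: exposure gap = {exposure_gap:.3f}, effect gap = {effect_gap:.3f}")
```

Under these assumed values, the rGE scenario should show a clear exposure gap with an effect gap near zero, and the G × E scenario the reverse pattern.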
rGEs can arise by both causal and non-causal mechanisms. Of principal interest are the causal mechanisms, which indicate genetic control over environmental exposure. Genetic variants influence environmental exposure indirectly via behavior. Three causal mechanisms giving rise to rGEs have been described.6, 13 Passive rGE refers to the association between the genotype a child inherits from her parents and the environment in which the child is raised. For example, because parents who have histories of antisocial behavior (which is moderately heritable)14 are at elevated risk of abusing their children, maltreatment may be a marker for genetic risk that parents transmit to children rather than a causal risk factor for children's conduct problems.15 Evocative (or reactive) rGE refers to the association between an individual's genetically influenced behavior and the reaction of those in the individual's environment to that behavior. For example, the association between marital conflict and depression may reflect the tensions that arise when engaging with a depressed spouse rather than a causal effect of marital conflict on risk for depression. Finally, selective (or active) rGE refers to the association between an individual's (genetically influenced) traits or behaviors and the environmental niches selected by the individual. For example, individuals who are characteristically extroverted may seek out very different social environments than those who are shy and withdrawn.
Non-causal mechanisms include evolutionary processes and behavioral ‘contamination’ of the environmental measure. Evolutionary processes, such as genetic drift and selection, can cause allele frequencies to differ between populations. For example, exposure to malaria-bearing mosquitoes over many generations may have caused the higher allele frequency among certain ethnic groups for the sickle hemoglobin (HbS) allele, a recessive mutation that causes sickle-cell anemia, but confers resistance against malaria.16 In this way, HbS genotype has become associated with the malarial environment.
Behavioral contamination can give rise to rGEs when person-specific factors influence perceptions and, hence, reports of the environment. Studies of psychiatric phenotypes may be especially prone to such biases, particularly case–control studies that rely on retrospective reports of the environment. Behavioral contamination can also produce rGE by biasing the sample selection. An example is provided by a study of late-onset Alzheimer's disease. The sample comprised older adults (60 years and older) who reported retrospectively on the fat content of their diet at three points in their adult lives. The researchers detected a significant rGE among the healthy controls: those who possessed the apolipoprotein E (APOE) ε4 genotype (the genotype associated with higher disease susceptibility) were more likely than those without the APOE ε4 genotype to eat a diet low in fat.17, 18 However, given that at least some other studies show that a high-fat diet increases risk of Alzheimer's disease among those with the APOE ε4 genotype,19 this rGE may have arisen because individuals with the APOE ε4 genotype who ate a low-fat diet preferentially survived to age 60.
rGE: evidence from the behavioral genetic literature
Twin and adoption studies have provided much of the evidence for rGEs by demonstrating that putative environmental measures are heritable.9, 20, 21 For example, studies of adult twins have shown that desirable and undesirable life events are moderately heritable22, 23 as are specific life events and life circumstances, including divorce,24, 25 the propensity to marry,26 marital quality27 and social support.28, 29, 30 Studies in which researchers have measured child-specific aspects of the environment have also shown that putative environmental factors, such as parental discipline or warmth, are moderately heritable12 (for reviews, see Plomin and Bergeman20). Television viewing, peer group orientations and social attitudes have all been shown to be moderately heritable.20, 31, 32 There is also a growing literature on the genetic factors influencing behaviors that constitute a risk to health, such as the consumption of alcohol, tobacco and illegal drugs, and risk-taking behaviors.33, 34, 35, 36 Like parental discipline, these health-related behaviors are genetically influenced, but are thought to have environmentally mediated effects on disease.
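To illustrate how a measure of the environment can be assigned a heritability, the sketch below applies Falconer's classic approximation to twin correlations. This is an assumption made for illustration: the studies cited above generally fitted formal biometric models rather than this shortcut, and the twin correlations used here are hypothetical.

```python
def falconer(r_mz, r_dz):
    """Falconer's approximation from twin correlations.

    r_mz, r_dz: correlations for monozygotic and dizygotic twin pairs.
    Returns (h2, c2, e2): additive genetic, shared-environment and
    non-shared-environment (plus measurement error) shares of variance.
    """
    h2 = 2 * (r_mz - r_dz)
    c2 = 2 * r_dz - r_mz
    e2 = 1 - r_mz
    return h2, c2, e2


# Hypothetical twin correlations for a self-reported count of life events.
h2, c2, e2 = falconer(r_mz=0.40, r_dz=0.25)
print(f"h2 = {h2:.2f}, c2 = {c2:.2f}, e2 = {e2:.2f}")
```

With these invented inputs the estimates are h2 = 0.30, c2 = 0.10 and e2 = 0.60, that is, a 'moderately heritable' environmental measure in the sense used above.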
To the extent that researchers have attempted to determine why genes and environments are correlated, most evidence has pointed to the intervening effects of personality and behavioral characteristics. For example, parental negativity and harsh physical discipline are moderately heritable and studies of twins37 and adoptees38, 39 have shown that much of this heritability reflects genetic influences on variation in children's aggressive, disruptive behaviors that elicit negative responses from adults. Saudino and Plomin40 showed that virtually all of the heritability in Home Observation for Measurement of the Environment (HOME) scores at 24 months of age was accounted for by toddler cognitive abilities and temperament.
Similarly, much of the heritability of marital status and quality can be accounted for by genetic factors that influence individual differences in personality. Jockin and McGue24 reported that 30–42% of divorce heritability could be attributed to the genetic factors affecting individual differences in personality in one spouse. Johnson et al.26 reported that approximately 80% of the phenotypic correlation between personality and the propensity to marry was accounted for by common genetic factors. As a final example, Spotts et al.41 reported that about half of the heritability of wives' marital satisfaction could be attributed to genetic factors influencing individual differences in wives' personalities, particularly aggressiveness and optimism.
Reporting on life events more broadly, Saudino et al.23 demonstrated that all the genetic influence on controllable desirable and undesirable life events could be explained by genetic factors influencing individual differences in personality factors, such as neuroticism, extraversion and openness to experience, although this finding was specific to women. Similarly, in a study of twin children, Thapar et al.42 reported strong genetic correlations between measures of total life events (but not uncontrollable life events) and depressive symptomatology. Overall, the evidence from twin and adoption studies suggests that rGEs are mediated by heritable personality and behavioral characteristics.
Because genetic influences on behavior mediate the heritability of environmental exposure, environments less amenable to behavioral modification tend to be less heritable. For example, negative life events that are beyond the control of the individual (e.g., the death of a loved one, losing one's home in a natural disaster) are not heritable, whereas negative life events that may be dependent on an individual's behavior (e.g., getting a divorce, getting fired from a job) are heritable.43, 44 Similarly, personal life events (i.e., events that occur directly to an individual) are more highly heritable than network life events (i.e., events that occur to someone within an individual's social network, thus affecting the individual indirectly).22
rGE: evidence from the molecular genetic literature
Excepting genetic associations with substance use (which psychologists and psychiatrists tend to think of as an outcome rather than a predictor of disease), the first report of a measured rGE was published only very recently, and came from the Collaborative Study on the Genetics of Alcoholism (COGA). This group reported that a single-nucleotide polymorphism in intron 7 of the γ-aminobutyric acid A α2 receptor (rs279871; GABRA2) was associated with alcohol dependence45 and marital status.46 Individuals who had the high-risk GABRA2 variant (i.e., the variant associated with alcohol dependence) were less likely to be married, in part because they were at higher risk for antisocial personality disorder and were less likely to be motivated by a desire to please others.46 Thus, these results are consistent with the findings from twin and adoption studies in showing that the influence of genes on environments is behaviorally mediated.
Further evidence of rGE from molecular genetic research comes from a study of 207 adults who reported retrospectively on the parenting they experienced in their families of origin.47 Individuals who were homozygous for the A allele in exon 8 (E8) of the dopamine D2 receptor (DRD2) gene reported significantly more paternal rejection, parental overprotection and paternal overprotection compared with individuals who were heterozygous or homozygous for the G allele. Individuals who were homozygous for the A allele in DRD2 (E8) and who were Ser385-positive for the Pro385Ser variant of the GABAA α6 receptor (GABRA6) gene reported the highest levels of paternal rejection. These individuals were also more persistent as measured by the Temperament and Character Inventory (TCI).48 Although persistence was associated with parental rejection, controlling for temperamental characteristics did not alter the significance of the association between DRD2 (E8) (or the interaction between DRD2 (E8) and GABRA6) and the parenting subscales. However, the authors did not report how much the association between the gene variants and the parental rejection measures was reduced when they controlled for temperamental characteristics. In any case, the findings suggest that the genotype–environment association was, at best, only partially mediated by temperament, as measured in adulthood by the TCI, and that other behavioral measures might mediate the association between genotype and environment more fully. Because the environmental variables were measures of the perceived environment, the relevant behavioral mediators might involve the individual's perception or recollection of the parenting they received and/or behaviors that evoked or elicited particular parenting practices. 
Indeed, in a study of twins, Krueger et al.49 showed that virtually all of the heritable variation in retrospective reports of family cohesion and family socioeconomic status was shared by personality variables (positive emotionality, negative emotionality and constraint).
Finally, Burt50 conducted a study of 132 college men who participated in several group activities in the lab and were later asked to judge how much they liked the other group members. Men who were heterozygous or homozygous for the G allele of the G1438A polymorphism of the serotonin receptor 2A (5HT2A) gene engaged in more rule-breaking behaviors and were better-liked by their peers than men who were homozygous for the A allele. The association between the G1438A polymorphism and peer relationships was mediated in part by its effects on men's rule-breaking behavior.
Although the research base is still small, the findings from these molecular genetic studies are consistent with at least two aspects of the behavioral genetic literature. First, these studies confirm the existence of rGEs in research that actually measures genetic variants as well as environments. Second, they provide preliminary support for the finding that correlations between genes and environments are mediated by behavioral and personality characteristics, although only the Burt50 study produced strong evidence of mediation. Below, we elaborate on the challenges that molecular geneticists face as they seek to identify and interpret rGEs, and we suggest directions for research in this area.
Genotype–environment associations: challenges in identifying them
Studies that combine measured genes and measured environments are relatively new to psychology and psychiatry and the two published accounts of significant rGEs46, 47 as well as the unpublished data by Burt50 are promising. In contrast, other studies have measured both genes and environments as predictors of psychopathology, but have not detected statistically significant rGEs. These include investigations of (a) the serotonin transporter gene-linked polymorphic region (5HTTLPR) or brain-derived neurotrophic factor (BDNF) variants and stressful life events as risk factors for depression51, 52, 53, 54, 55, 56, 57 and alcohol consumption,58 (b) monoamine oxidase A (MAOA) and adverse life events as risk factors for antisocial behavior59, 60, 61, 62, 63, 64, 65 and (c) catechol-O-methyltransferase (COMT) and cannabis use as risk factors for schizophrenia66 or COMT and low birth weight as risk factors for antisocial behavior.67 Despite these null findings, it is possible that more rGEs will be detected as growing numbers of psychologists and psychiatrists integrate molecular genetic techniques with careful measurement of the environment. This research effort faces several challenges, however, that must be addressed if researchers hope to successfully identify and interpret correlations between genetic variants and measures of the environment.
First, sample sizes must be large enough to detect genetic effects of a realistic size. Because rGEs must be behaviorally mediated, they are most likely to be detected when associations between genetic variants and behavior are well replicated and effect sizes are large. However, as reviewed by Kendler,68 associations between genes and complex behaviors are typically small in magnitude. On average, genetic variants increase the odds of a given disorder by a factor of approximately 1.30.68 Thus, although there may be sizable correlations between behaviors and environments (e.g., a parent's antisocial behavior and their abuse of a child), correlations between genetic variants and environments are likely to be smaller because genetic variants are only weakly predictive of the behaviors that are thought to mediate the gene–environment association. For example, the MAOA genotype is only weakly and inconsistently predictive of antisocial behavior,59, 69, 70 although antisocial behavior is a relatively strong predictor of family violence. Indeed, rGEs may be small because the association between the genetic variant and the behavior that mediates the rGE is itself moderated, either behaviorally, or by other genes through the phenomenon of epistasis. Despite the likelihood of small effect sizes, studies that have tested for rGE and G × E have typically employed samples ranging in size from around 100 participants to many thousands of participants, with most including between 200 and 800 participants. Consequently, many studies are underpowered. One of the most notable features of meta-analyses of the behavioral genetic literature is that the effect size of the initial report is rarely replicated, although in many cases meta-analyses indicate that an association of smaller magnitude is reliably present. Larger sample sizes may be required to detect these associations and, consequently, gene–environment associations.
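The attenuation argument can be sketched numerically. The example below assumes a simple linear path model with standardized variables in which behavior fully mediates the gene–environment association, so that the gene–environment correlation is the product of the gene–behavior and behavior–environment correlations. It then uses the standard Fisher z approximation to estimate the sample size needed to detect a correlation with a two-sided test. The function names and the correlations of 0.10 and 0.40 are hypothetical choices for illustration, not values taken from the studies reviewed.

```python
from math import atanh, ceil
from statistics import NormalDist


def mediated_correlation(r_gene_behavior, r_behavior_env):
    """Gene-environment correlation under full mediation in a linear model."""
    return r_gene_behavior * r_behavior_env


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate n needed to detect correlation r (two-sided, Fisher z)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


r_gb = 0.10  # hypothetical gene-behavior correlation
r_be = 0.40  # hypothetical behavior-environment correlation
r_ge = mediated_correlation(r_gb, r_be)

print(f"implied gene-environment correlation: {r_ge:.3f}")
print(f"n to detect gene-behavior link:       {n_for_correlation(r_gb)}")
print(f"n to detect gene-environment link:    {n_for_correlation(r_ge)}")
```

Under these assumptions, detecting the gene–behavior association with 80% power requires a sample near the upper end of the 200 to 800 range noted above, whereas detecting the attenuated gene–environment association requires several thousand participants.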
Second, the environment must be highly specified and well measured. It is possible that the extensive literature on personality can inform researchers' choices about how to measure the environment. Meta-analyses are beginning to identify small but consistent associations between specific genetic variants and personality characteristics.71 For example, a number of meta-analyses have identified an association between 5-HTTLPR and the personality trait of neuroticism.72, 73, 74, 75 Not only is neuroticism a strong correlate of depression (for a review, see Klein et al.76), but individuals who score high on neuroticism are prone to experience stressful life events, particularly those of an interpersonal nature (e.g., Headey and Wearing,77 Kendler et al.,78 Van Os and Jones79). Thus, we might expect 5-HTTLPR to be associated with interpersonal stressful life events, but not uncontrollable or network stressful life events. Although one study that genotyped the serotonin promoter variant also distinguished between personal and network stressful life events, this group did not report whether 5-HTTLPR was correlated with either type of stressful life event.80 Other studies that have measured stressful life events and 5-HTTLPR have not distinguished between personal and network (or controllable and uncontrollable) stressful life events.
In addition, confidence in findings of rGE will be increased if researchers are able to demonstrate convergent validity. For example, if the association between GABRA2 and marital status is mediated by antisocial personality symptoms, then GABRA2 should also predict marital conflict. If rule-breaking behaviors mediate the association between 5HT2A variants and peer relations, then 5HT2A variants should also be associated with suspensions from school or contact with police.
Third, researchers must balance the risk of false-positive results against the risk of false-negatives. Until recently, researchers have largely relied on candidate gene methods to test for gene-behavior and gene–environment associations. These studies have a high risk of false-negative results as they measure only a tiny proportion of the variation in the genome. Genome-wide association studies that measure essentially all the common genetic variation are now feasible, although they require very large sample sizes to overcome the problems associated with testing many thousands of genetic variants.81
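The cost of testing many variants can be illustrated with a Bonferroni correction, used here as one simple (and conservative) way to control false positives; it is an assumption of this sketch rather than a method prescribed by the studies cited. The number of variants (500,000) and the target correlation (0.04) are hypothetical, and the sample-size formula is the standard Fisher z approximation.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_required(r, alpha, power=0.80):
    """Approximate n needed to detect correlation r at a two-sided alpha."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


n_variants = 500_000  # hypothetical number of variants tested
alpha = 0.05          # nominal per-test significance level
r = 0.04              # hypothetical gene-environment correlation

# Expected number of false positives if every variant is truly null and
# each is tested at the nominal level without correction.
expected_false_positives = alpha * n_variants

# Bonferroni-corrected per-test threshold.
bonferroni_alpha = alpha / n_variants

n_single = n_required(r, alpha)
n_genome_wide = n_required(r, bonferroni_alpha)

print(f"expected false positives without correction: {expected_false_positives:.0f}")
print(f"Bonferroni threshold: {bonferroni_alpha:.1e}")
print(f"n for one candidate variant: {n_single}")
print(f"n at the corrected threshold: {n_genome_wide}")
```

The comparison of the two sample sizes shows why a genome-wide scan needs a sample several times larger than a single candidate-gene test to detect an effect of the same size.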
In spite of these challenges, the prospects for successfully finding gene–environment associations have never been brighter. The scale of genetic studies has increased enormously owing to exponentially falling genotyping costs. Methodological developments have been fast-paced, driven by the greatly increased availability of resequencing data from projects such as the HapMap (http://www.hapmap.org/) and ever-cheaper and more powerful computational resources. Moreover, bioinformatic analyses and high-throughput functional studies are starting to fill in the many blank spaces in our knowledge of gene function, allowing better identification of candidates with potential influence on behavior and a better understanding of the biological pathways from genes to behaviors.
The clearest example of how understanding the biological pathway from gene to behavior helps to identify rGEs relates to a functional polymorphism in the gene for mitochondrial aldehyde dehydrogenase (ALDH2), an enzyme that metabolizes an ethanol byproduct, acetaldehyde, into acetate. Homozygotes for the mutant ALDH2*2 allele have negligible ALDH2 activity, and experience an unpleasant flushing reaction after alcohol intake as a result of acetaldehyde accumulation. Heterozygotes have reduced ALDH2 activity and experience less severe flushing. The ALDH2*2 allele is common in East Asian populations, in whom it has a well-established protective effect associated with about a 10-fold reduction in risk of alcoholism.82 The protective effect is thought to be a direct consequence of the flushing reaction and associated nausea, drowsiness and headache that discourage drinking. In a recent meta-analysis,83 researchers found clear evidence of a correlation between ALDH2 genotype and alcohol exposure (alcohol exposure being considered an environmental risk factor for cancer in much the same way that some psychiatrists view cannabis use as an environmental risk factor for schizophrenia). Alcohol intake increased as a function of the number of ALDH2*1 alleles: ALDH2*1*1 homozygotes were more likely than ALDH2*1*2 heterozygotes to be heavy drinkers and none of the ALDH2*2*2 homozygotes were heavy drinkers. In summary, the ALDH2 polymorphism as well as other genetic variants (e.g., alcohol dehydrogenase; ADH1B) are associated with alcohol consumption and other alcohol-related phenotypes (e.g., fetal alcohol syndrome (FAS))84 because they have functional influences on ethanol metabolism, the intermediate products of which are potentially toxic.
Knowing the function of a gene and its effects on downstream biological function increases the likelihood that researchers will correctly identify which behaviors (and, consequently, which environments) will be associated with the gene. The research conducted by the COGA group on GABRA2 illustrates this concept. Edenberg et al.45 found that GABRA2 was not only strongly linked to alcohol dependence, but also to brain oscillations in the beta frequency range (13–28 Hz). Beta rhythms reflect a balance between excitatory and inhibitory networks of nerve cells and this balance is thought to be regulated by the GABAA receptor.85 Alcohol-dependent groups have been shown to have increased power in the beta frequency, particularly in the parietal and frontal regions of the brain.86 Taken together, these findings suggest that specific GABAA receptor gene variants (e.g., GABRA2) are associated with central nervous system (CNS) hyper-excitability, as reflected in brain oscillation patterns and that these patterns of CNS activity may underlie the behavioral disinhibition and impulsivity that characterizes alcohol dependence and other externalizing spectrum disorders. Understanding the role of GABAA in neuronal activity provided a framework for COGA researchers to effectively hypothesize that GABRA2 would be correlated with personality characteristics related to impulse control (e.g., antisocial personality characteristics), which in turn would be correlated with a measure of interpersonal functioning as reflected in marital status.
A developmental viewpoint may help us identify plausible rGEs, because rGEs are likely to arise when the environment and a heritable behavior have transactional (i.e., reciprocal) influences on each other over time. As an example, young adolescents who have psychotic experiences are at elevated risk of using cannabis later in adolescence, which in turn prefigures schizophreniform illness.87 Therefore, it is likely that heritable influences on cannabis use correlate with genetic risk factors for schizophreniform illness, even though the only published study that bears on this question did not find such an association.66
A developmental viewpoint may also inform our expectations about when we might detect rGEs and G × E. That is, genes and environments can have time-dependent influences on the course of psychopathology. For example, although a number of researchers have reported that 5-HTTLPR genotype moderates the influence of stressful life events on risk for depression in adolescents and young adults,51, 52, 55 this finding has not been replicated in two samples of older adults, possibly because psychosocial stressors are more weakly associated with repeat-episode depression.56, 80 As another example, twin studies find that the heritability of phenotypes like intelligence increases as people age.88 This observation is often interpreted to mean that active and evocative rGEs play a greater and greater role in accounting for phenotypic variation as people age. The type of rGE may also change developmentally: it has been hypothesized that a transition from passive to evocative and active forms of rGE occurs between infancy and adolescence, as children take on a more active role in constructing their environments.13 In molecular genetic terms, genetic associations with early cognitive abilities may be mediated by parental influences over children's early experiences; these give way to associations mediated by children's behavior in creating their own learning environment. Overall, these observations suggest that rGEs and G × E (and particular types of rGEs) may be more evident at some points in the life course than others.
In conclusion, efforts to identify rGEs face a number of challenges. First, rGEs are likely to be behaviorally mediated, but associations between genes and behaviors are typically small in magnitude. This implies that (a) sample sizes will need to be large in order for researchers to detect rGE of small effect and (b) researchers must use reliable and valid environmental and behavioral measures in order to optimize the chances of detecting rGE and understanding how rGE is mediated by personality and behavioral characteristics. Second, researchers must think hard about which measures of the environment associate most specifically with the relevant heritable behaviors and personality variables. The personality literature may provide a guide for helping researchers identify relevant environmental experiences. Third, understanding the biological pathways from gene variants to behaviors may help researchers identify and interpret rGE. Genetic variants influencing behavior must be understood in relation to the neural systems they perturb.89 The growing use of endophenotypes in molecular genetic studies of psychiatric disorder90 may help by bridging the gap between neural systems and behavior, not least by potentially facilitating the development of useful animal models. This in turn increases the likelihood that researchers will identify significant associations between genes and the behaviors that bring about environmental experiences.
rGE: implications for G × E studies
The relatively weak associations between gene variants and behaviors that modify the environment68 suggest that many researchers who are interested in gene × environment interactions will have low power to detect associations between measured genes and measured environments, even when they truly exist. This is important for the design and analysis of G × E studies that rely on a strong assumption of independence between genetic and environmental factors, as rGE does not have to reach statistical significance to profoundly affect the interpretation of G × E estimates.91, 92 For example, ‘case-only’ designs (i.e., those that estimate genetic and environmental risks for disease among individuals who are already affected by the disease) efficiently estimate gene × environment interactions in disease risk, but are sensitive to the existence of even small rGEs, resulting in inflated type I error rates.91, 93 That said, with a single exception,94 researchers interested in genetic and environmental influences on psychiatric phenotypes have not to our knowledge utilized the case-only design to estimate gene × environment interactions.
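The sensitivity of the case-only design to rGE can be shown with a short calculation. The sketch below assumes a binary genotype, a binary exposure and a multiplicative model of disease risk; the function names and all parameter values are hypothetical. It computes the genotype–exposure odds ratio among cases, which is the quantity a case-only analysis interprets as the interaction. When there is no true interaction (`rr_int = 1`), this odds ratio equals 1 only if genotype and exposure are independent in the population.

```python
def odds_ratio(p11, p10, p01, p00):
    """Odds ratio from the four cells of a 2 x 2 table."""
    return (p11 * p00) / (p10 * p01)


def case_only_or(p_g, p_e_given_g0, p_e_given_g1,
                 rr_g, rr_e, rr_int, base_risk=0.01):
    """Genotype-exposure odds ratio among cases.

    Disease risk is base_risk * rr_g**G * rr_e**E * rr_int**(G*E),
    where G and E are 0/1 indicators for risk genotype and exposure.
    """
    weight = {}
    for g in (0, 1):
        p_genotype = p_g if g else 1 - p_g
        p_exposed = p_e_given_g1 if g else p_e_given_g0
        for e in (0, 1):
            p_cell = p_genotype * (p_exposed if e else 1 - p_exposed)
            risk = base_risk * rr_g ** g * rr_e ** e * rr_int ** (g * e)
            weight[(g, e)] = p_cell * risk  # proportional to P(g, e | case)
    return odds_ratio(weight[(1, 1)], weight[(1, 0)],
                      weight[(0, 1)], weight[(0, 0)])


# No rGE, no interaction: the case-only odds ratio is 1.
independent = case_only_or(0.3, 0.30, 0.30, rr_g=1.3, rr_e=2.0, rr_int=1.0)

# Small rGE (exposure in 34% of carriers vs 30% of non-carriers), still no
# true interaction: the case-only odds ratio departs from 1.
with_rge = case_only_or(0.3, 0.30, 0.34, rr_g=1.3, rr_e=2.0, rr_int=1.0)

print(f"case-only OR without rGE: {independent:.3f}")
print(f"case-only OR with rGE:    {with_rge:.3f}")
```

In this model the case-only odds ratio is the product of the population genotype–exposure odds ratio and the interaction term, so any rGE is carried directly into the interaction estimate.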
More usual in psychological and psychiatric research is the case–control design, which is more robust to rGE, but only when stratified analyses are conducted (in which the odds of disease are calculated separately for each genetic subgroup as a function of environmental exposure). This robustness does not extend to log-linear analysis, in which the gene × environment interaction is tested under the assumption that the genetic and environmental risks are statistically independent.95 Typically, however, researchers utilizing the case–control design have used log-linear analyses to test for G × E effects on, for example, Alzheimer's disease (e.g., Jarvik et al.,96 Mayeux et al.97), suggesting that the results may be biased by any non-independence of genetic and environmental risk factors.
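The robustness of the stratified analysis can be sketched in the same way. The example below assumes a logistic model of disease with a binary genotype and exposure, and computes the exposure–disease odds ratio expected within each genotype stratum under case–control sampling. All numerical values and function names are hypothetical. The ratio of the two stratum-specific odds ratios recovers the interaction term regardless of how strongly genotype and exposure are correlated in the population.

```python
def disease_prob(g, e, base_odds, or_g, or_e, or_int):
    """P(disease | g, e) under a logistic model with an optional G x E term."""
    odds = base_odds * or_g**g * or_e**e * or_int ** (g * e)
    return odds / (1 + odds)


def stratum_odds_ratios(pop, **model):
    """Expected exposure-disease odds ratio within each genotype stratum.

    Cases are weighted by P(g, e) * P(D | g, e) and controls by
    P(g, e) * (1 - P(D | g, e)); sampling fractions for cases and
    controls cancel in the odds ratio, so they are omitted.
    """
    result = {}
    for g in (0, 1):
        case = {e: pop[(g, e)] * disease_prob(g, e, **model) for e in (0, 1)}
        ctrl = {e: pop[(g, e)] * (1 - disease_prob(g, e, **model)) for e in (0, 1)}
        result[g] = (case[1] * ctrl[0]) / (case[0] * ctrl[1])
    return result


# Hypothetical population with rGE: carriers (g = 1) are more often exposed (e = 1).
population = {(0, 0): 0.50, (0, 1): 0.20, (1, 0): 0.20, (1, 1): 0.10}

if __name__ == "__main__":
    # No true interaction is simulated here (or_int = 1.0).
    ors = stratum_odds_ratios(
        population, base_odds=0.05, or_g=1.5, or_e=2.0, or_int=1.0
    )
    print(f"exposure OR in non-carriers: {ors[0]:.3f}")
    print(f"exposure OR in carriers:     {ors[1]:.3f}")
    print(f"ratio (G x E estimate):      {ors[1] / ors[0]:.3f}")
```

The population cell probabilities cancel within each stratum, which is the sense in which the stratified comparison does not depend on gene–environment independence.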
Moreover, case–control and other designs that rely on retrospective recall of the environment are likely to give rise to artefactual gene–environment associations arising from behavioral ‘contamination’ of the reported environment. Retrospective recall of past events may be influenced by individual differences in current mood, personality, or mental health,98 although such effects are not found consistently.99 As discussed above, retrospective recall of past events may also reflect the degree to which past environments were elicited by an individual's behavior.100
Many studies of gene × environment effects on psychiatric outcomes have utilized cohort (longitudinal) designs instead of the case–control design. Cohort studies offer certain methodological advantages, including the possibility of measuring environments prospectively, thus avoiding problems with retrospective recall of the environment. Cohort studies are also less prone than some case–control studies to non-causal rGE arising from certain types of selection bias. That is, unlike subjects in some case–control studies, affected and unaffected cohort study subjects are not matched from the study's inception on environmental measures that must be recalled retrospectively (e.g., drinking or smoking during pregnancy). The main obstacle for cohort studies is that finding and replicating gene–environment interactions necessitates large sample sizes. Cohort studies of psychiatric illnesses with low base rates (e.g., schizophrenia, autism) are likely to have difficulties in identifying sufficient numbers of affected subjects.101 A second problem is that subjects may have to be followed up for an extremely long time to measure environmental risk factors over the duration of their action.
Finally, it bears noting that depending on the methods used to estimate heritability, correlations between genetic factors and shared or non-shared environments may or may not be included in the total heritability estimate.102, 103 This has implications for molecular genetics because endophenotypes or phenotypic end points may be selected for molecular genetic studies based on their heritability, which may reflect the presence of rGE rather than direct effects of genotype.
In conclusion, even small rGEs may result in inflated type I error rates when researchers use case-only designs and when case–control designs test for G × E using log-linear models. Long-term cohort studies are probably least likely to be confounded by rGE. Case–control studies should conduct stratified analyses and, wherever possible, not rely on retrospective recall of environments.
rGE: implications for disease prevention
The interplay between genetic and environmental influences on disease means that even highly heritable diseases are preventable by environmental interventions. For example, phenylketonuria (PKU) is an inherited disorder characterized by an inability to metabolize the amino acid phenylalanine. The accumulation of phenylalanine in the brain and body tissues causes mental retardation, but can be minimized by a diet low in proteins containing phenylalanine.
The target population of a prevention or intervention effort may depend on whether disease risk is influenced by gene × environment interactions – as is the case for PKU – or by rGEs, or by both mechanisms. The existence of G × E suggests that a genetic subpopulation is at elevated environmental risk, and that interventions are likely to be most effective in that subpopulation. It has been argued that there may be significant public health benefits in using genetic information to stratify the allocation of environmental interventions that prevent disease,104 although this viewpoint is far from universally held (e.g., Willett105).
In contrast, rGE does not imply that particular interventions will be differentially effective across subpopulations and suggests two alternative possibilities. First, rGE may reflect the pleiotropic effects of genotype on the disease process and the environment, meaning that environmental interventions would be ineffective in disease prevention. For example, it is a plausible hypothesis that a functional polymorphism in COMT directly influences both cannabis use and risk for psychosis.106 If cannabis use and psychosis were only correlated because both were directly influenced by COMT function, then it is evident that prevention of cannabis use would have no effect on rates of psychosis.
Second, it is possible that disease risk is determined by the environmental exposure, and environmental exposure, in turn, mediates the association between the genotype and the disease. The latter possibility suggests that preventative efforts can focus on reducing environmental risk exposure without regard to genotype. If, for example, the association between COMT and risk for psychosis were mediated entirely by cannabis use, the appropriate intervention would be to reduce cannabis use in the entire population rather than target prevention efforts at individuals with the COMT genotype.
An example where the distinction between rGE and G × E may be important is drinking alcohol during pregnancy, which is a risk factor for FAS and related disorders of lesser severity.107 Risk for FAS is associated with the ADH1B genotype (also called the ADH2 genotype), but the causal process is unclear because there are plausible G × E and rGE mechanisms, and results have been inconsistent across studies.84 For example, a fetus that inherits the genetic variant that metabolizes ethanol less efficiently (ADH1B*1) has been shown to be more susceptible to the effects of alcohol exposure in the intrauterine environment than a fetus with the fast-metabolizing ADH1B*3 genotype.108, 109, 110 If this is so, then a gene × environment interaction increases risk for FAS and children whose mothers possess the slow-metabolizing ADH1B*1 allele might benefit disproportionately from interventions aimed at reducing maternal alcohol consumption.
Alternatively, the association between alcohol exposure in utero and fetal alcohol spectrum disorders may reflect a passive rGE. It has also been shown that the ADH1B*3 allele in infants and mothers is associated with fetal alcohol spectrum problems in infants.111 The same study showed that mothers who carried the ADH1B*3 allele drank more heavily than mothers who carried the ADH1B*1 allele, although the association failed to reach statistical significance, possibly because of small cell sizes in the high alcohol exposure group. If ADH1B*3 is associated with increased alcohol consumption, which leads in turn to fetal alcohol spectrum problems, then interventions would be appropriately aimed at all pregnant women who drink heavily, regardless of genotype, because the key mechanism concerns alcohol consumption and not genetic effects on neonatal outcomes.
These findings suggest that researchers who measure genes and environments should test for both rGE and gene × environment interaction. Understanding which form of gene–environment interplay results in a particular behavior or disorder is informative from a prevention and intervention standpoint.
Findings from molecular genetic studies are consistent with findings from the behavioral genetic literature in suggesting that rGEs are mediated by personality and behavioral characteristics. This has several implications for researchers who are interested in identifying rGEs for the purposes of better understanding health and behavioral development. First, studies that measure genes and environments will probably need to be large in order for power to be sufficiently high to detect rGE. Second, environments may need to be better specified and better measured than they are currently. Third, the use of endophenotypes may provide researchers with a bridge between genes and behaviors that potentially modify the environment, thus helping narrow hypotheses about which genes and which environments should be correlated. Finally, there are at least two ways in which advancing our understanding of rGEs may lead to improvements in public mental health. First, the mediation of genetic risk by exposure to environmental risk is likely to play an important role in causing psychiatric illness. rGEs may suggest targets of environmental intervention for heritable disease. Second, depending on the design, the existence of even small rGEs may inflate type I error rates in studies of gene × environment interaction.91, 95 Efforts to target interventions at specific genetic subgroups will be ineffective if they are based on spurious reports of G × E. By using highly reliable and valid measures of behavior and environment, by developing more specific measures of the environment, and by developing a better understanding of the biological pathways from genes to behavior, researchers may be more successful in identifying and interpreting rGEs.
We thank Michael Rutter for his helpful comments on an earlier draft of this paper. This work was supported by Grant R01 HD050691 from NICHD to Sara Jaffee and by Grant P50 HL81012 to Tom Price. Neither author has any financial interests to declare related to the material reported in the paper.