Epistasis: too often neglected in complex trait studies?
Author: Örjan Carlborg
618 | AUGUST 2004 | VOLUME 5 www.nature.com/reviews/genetics PERSPECTIVES
PERSPECTIVES NATURE REVIEWS | GENETICS VOLUME 5 | AUGUST 2004 | 619

OPINION

Örjan Carlborg and Chris S. Haley

Interactions among loci or between genes and environmental factors make a substantial contribution to variation in complex traits such as disease susceptibility. Nonetheless, many studies that attempt to identify the genetic basis of complex traits ignore the possibility that loci interact. We argue that epistasis should be accounted for in complex trait studies; we critically assess current study designs for detecting epistasis and discuss how these might be adapted for use in additional populations, including humans.

In its broadest sense, epistasis implies that the effect of a particular genotype on the phenotype depends on the genetic background. In its simplest form, this refers to an interaction between a pair of loci, in which the phenotypic effect of one locus depends on the genotype at the second locus (BOX 1a). More generally, the effect of one locus might depend on the genotype at several or many loci. In the case of QUANTITATIVE TRAITS, epistasis describes the general situation in which the phenotype of a given genotype cannot be predicted by the sum of its component single-locus effects [1] (BOX 1b–e).

Extensive work on the control of qualitative genetic variation has highlighted the biological importance of epistasis at a 'locus-by-locus' level. On the basis of this work, several classic genotype–phenotype patterns that are caused by epistasis, such as comb type in chickens, coat colour in various animals, the BOMBAY PHENOTYPE in the ABO blood-group system in humans and kernel colour in wheat, have been characterized (see BOX 1a). Modifier genes in the human and mouse have provided further evidence of the importance of epistasis: the genetic background often influences the phenotypes of the susceptible genotype of MAJOR GENES, for example, by affecting the PENETRANCE of the gene [2]. Complex traits are also regulated by epistasis, as shown by the CANDIDATE GENE studies in which interactions between individual candidate genes are evaluated. One such example is the interaction between the D-allele of the angiotensin I converting enzyme (ACE) gene and the C-allele of the angiotensin II type 1 receptor (AGTR1) gene [3]. The risk of myocardial infarction is significantly increased by the ACE D-allele in patients who carry that particular AGTR1 allele.

In the case of quantitative genetic variation, several or many genes of largely unknown function combine with environmental influences to control trait variation. This is the case for many complex traits that are of medical relevance in humans or of economic importance in plants and livestock. By combining quantitative genetic theory and molecular information on genetic marker maps, we can identify the individual genomic loci with the largest effects on quantitative traits (also known as quantitative trait loci or QTLs) and start to examine the genetic control of these traits. We therefore have the means to address the next important challenge in quantitative genetics: defining the interactions that occur among the genes that underlie these traits.

The best source of information on the importance of epistasis in the regulation of complex traits comes from studies on model and other experimentally amenable organisms. Even so, most studies of model organisms have ignored epistasis; indeed, a recent review points out that epistasis is a hidden complexity in the regulation of complex traits that in general is not unravelled in QTL-mapping studies [4]. Similar opinions are expressed in other recent reviews [5–8], leading to the speculation that epistasis could be a factor that contributes to the failure to replicate the results of many human ASSOCIATION STUDIES [9], and could be one cause of QTL effects that diminish or disappear if they are isolated on fixed genetic backgrounds in experimental organisms. However, recent developments in QTL-mapping methodologies have allowed us to detect not only epistasis between QTLs with individual effects, but also novel epistatic QTLs that primarily mediate their effects on the traits through interactions with other genes (BOX 1).

The extent to which epistasis is involved in regulating complex traits is not known, and so we cannot assume that epistasis will be found for every trait in every population. However, we argue that epistasis has been overlooked for too long and that it now needs to be routinely explored in complex trait studies. Here, we use examples from the literature to show that much can be gained by considering epistasis in QTL-mapping studies. We explain how information about gene interactions will aid our understanding of complex traits, and provide an overview of the results obtained in several successful studies in model organisms. We discuss how the principles and challenges gleaned from these studies could be adopted for carrying out similar research in model species and in natural, including human, populations.

Overview of QTL mapping

Methods for detecting, or mapping, QTLs have been developed for a wide range of populations. This section, together with FIG. 1 and BOX 2, briefly addresses the principles of individual and epistatic QTL mapping as well as the challenges that they pose. For a more thorough review of QTL-mapping methodologies, we refer readers to REF. 6.

Individual QTLs. For illustration, we will consider a simple but widely used study design in which an F2 population is derived from a cross between two different inbred lines (FIG. 1a). In their simplest form, QTL-mapping approaches work by contrasting the mean effects on the phenotype of alternative F2 genotypes (for example, QQ versus Qq versus qq, where Q and q are alternative marker alleles derived from lines 1 and 2, respectively). However complicated the statistical analysis, most experiments that aim to partition the genetic variation in a population have focused on detecting the genetic (that is, additive and dominance) effects of individual QTLs, irrespective of interactions [10,11] (FIG. 1b). This strategy has been successful for detecting QTLs with large effects on the quantitative traits [12,13], and, in several instances, causal mutations for QTLs have been identified in the coding [14] as well as the regulatory [15] regions of genes. These studies have therefore focused on the average genetic effect of the genotypes of a QTL, ignoring the possibility that these effects might be influenced by genetic background, either by other individual loci or by all other loci.

Epistatic QTLs. Epistatic QTL-mapping methods are more flexible than those for individual QTLs as they simultaneously consider the mean effects of multi-locus genotypes on the phenotype (FIG. 1). The use of the methodology poses more technical challenges and demands more from the data than individual QTL mapping (BOX 2). For these reasons, epistatic QTL mapping is not yet a standard tool in complex trait studies. Epistasis between pairs of QTLs in which one or both QTLs have detectable individual effects has been reported [16–20], but the extent to which epistasis controls variation in quantitative traits has been poorly explored.
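The contrast between the two approaches can be illustrated with a minimal sketch. It is not the authors' method: real QTL mapping works on marker maps and interval models, whereas this example uses a hypothetical, perfectly balanced toy data set (two individuals per two-locus class) in which locus E affects the trait only in QQ individuals. The function names and all trait values are assumptions made for illustration. The sketch partitions the individuals by the genotype at one locus and by the two-locus genotype, and compares the residual variation left by each partition.

```python
from collections import defaultdict


def within_class_ss(records, key):
    """Residual sum of squares after fitting one mean per genotype class."""
    classes = defaultdict(list)
    for rec in records:
        classes[key(rec)].append(rec["y"])
    ss = 0.0
    for values in classes.values():
        mean = sum(values) / len(values)
        ss += sum((v - mean) ** 2 for v in values)
    return ss, len(classes)


def toy_f2():
    """Hypothetical data: locus E changes the trait only in QQ individuals."""
    records = []
    for g1 in ("QQ", "Qq", "qq"):
        for g2 in ("EE", "Ee", "ee"):
            if g1 == "QQ":
                mean = {"EE": 16.0, "Ee": 13.0, "ee": 10.0}[g2]
            else:
                mean = 10.0
            for noise in (-1.0, 1.0):  # two individuals per class
                records.append({"g1": g1, "g2": g2, "y": mean + noise})
    return records


def f_improvement(ss_small, k_small, ss_full, k_full, n):
    """F ratio for the gain of the larger partition over the smaller one."""
    gain = (ss_small - ss_full) / (k_full - k_small)
    return gain / (ss_full / (n - k_full))


records = toy_f2()
# Individual QTL view: one mean per genotype at locus 1 (3 classes).
ss_one, k_one = within_class_ss(records, lambda r: r["g1"])
# Epistatic view: one mean per two-locus genotype (9 classes).
ss_two, k_two = within_class_ss(records, lambda r: (r["g1"], r["g2"]))
f_ratio = f_improvement(ss_one, k_one, ss_two, k_two, len(records))

if __name__ == "__main__":
    print("one-locus residual SS:", ss_one)
    print("two-locus residual SS:", ss_two)
    print("F ratio for the two-locus partition:", f_ratio)
```

In this toy case the two-locus partition leaves much less unexplained variation, at the price of six extra fitted means. Note that the comparison measures the joint contribution of the second locus and the interaction, not the interaction alone.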
There are several methods for mapping epistatic QTLs in human [21] and experimental populations [22–25]; some of the most recent methods are based on simultaneous scans and randomization tests that detect QTLs that do not have individual effects [23,24]. Such approaches have led to the identification of many statistically reliable, novel epistatic QTLs.

Insights from model organisms

Epistatic QTL-mapping studies in model organisms have detected many new interactions and have therefore concluded that epistasis makes a large contribution to the genetic regulation of complex traits. Epistatic QTLs without individual effects have been found in various organisms, such as birds [26,27], mammals [28–32], Drosophila melanogaster [33] and plants [18,34]. However, other similar studies have reported only low levels of epistasis or no epistasis at all, despite being thorough and involving large sample sizes [35–37]. This clearly indicates the complexity with which multifactorial traits are regulated; no single mode of inheritance can be expected to be the rule in all populations and traits. Thorough genetic studies of complex traits therefore need to be flexible and need to accommodate various modes of inheritance, as we cannot currently define a specific mode of inheritance before

Box 1 | Defining epistasis

Mendelian traits
The term 'epistasis' was initially used in the context of Mendelian inheritance; environmental effects are relatively unimportant for Mendelian traits, so individuals can be clearly assigned to one of a limited number of classes according to their phenotype. Here, epistasis was used to describe the situation in which the actions of one locus mask the allelic effects of another locus, in the same way that completely dominant alleles mask the effects of the recessive allele at the same locus.
A clear example of this can be seen in a, in which the dominant allele (I) at the KIT locus, which confers white-coat colour in the pig, is dominant over all alleles at the MC1R locus (E), which confer a darker coat colour. The effects of the various alleles at the E locus can only be determined in individuals with the recessive genotype ii at the I locus. This example was classically termed 'dominant epistasis', which gives a segregation ratio of 12:3:1 for white:black:brown, respectively.

Complex traits
For complex traits, epistasis describes any interaction between two or more loci, such that the phenotype of any genotype cannot be predicted simply by summing the effects of individual loci. A hypothetical example of two loci with no epistasis for a complex trait is shown in b. Here, the three lines for the effects of the three genotypes at locus 1 run in parallel, indicating that the phenotypic effect is not influenced by the genotype at locus 2. Examples of epistasis for complex traits are shown in c–e. The first common pattern (c) is similar to the Mendelian dominant epistasis shown in a, in which one locus suppresses the allelic effects of a second locus in a dominant way. In this example of growth in chickens, among-genotype variation for locus 2 is only expressed in the presence of the homozygous LL genotype at locus 1 (REF. 26). Such epistasis often leads to individual QUANTITATIVE TRAIT LOCI (QTLs) having small average differences among genotypes and therefore not being detected unless epistasis is incorporated into the analysis. The second epistatic pair (d) is an example of co-adaptive epistasis, in which genotypes that are homozygous for alleles of the two loci that originate from the same line (that is, JJ with JJ, or LL with LL) show enhanced performance.
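The 12:3:1 ratio quoted for dominant epistasis in panel a can be checked by enumerating the offspring of a cross between two double heterozygotes. The sketch below rests on assumptions that the box does not state explicitly: the two loci are unlinked, each has only two alleles, and the dark allele E (black) is dominant over e (brown). The function names are hypothetical.

```python
from collections import Counter
from itertools import product


def coat_colour(kit, mc1r):
    """Phenotype under dominant epistasis: any I allele masks the MC1R locus."""
    if "I" in kit:
        return "white"
    # Assumption: E (black) is dominant over e (brown) in ii individuals.
    return "black" if "E" in mc1r else "brown"


def f2_phenotype_counts():
    """Count phenotypes among the 16 gamete combinations of IiEe x IiEe."""
    gametes = [i + e for i, e in product("Ii", "Ee")]  # IE, Ie, iE, ie
    counts = Counter()
    for maternal, paternal in product(gametes, repeat=2):
        kit = maternal[0] + paternal[0]
        mc1r = maternal[1] + paternal[1]
        counts[coat_colour(kit, mc1r)] += 1
    return counts


if __name__ == "__main__":
    print(dict(f2_phenotype_counts()))
```

Twelve of the sixteen equally likely combinations carry at least one I allele, which is where the 12 in the ratio comes from; the remaining four ii combinations split 3:1 at the MC1R locus.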
This type of gene interaction is particularly interesting as the loci have no significant individual effect (for example, the average effect of JJ, JL and LL do not differ) and it therefore cannot be detected without a SIMULTANEOUS SCAN for multiple QTLs [26]. The third epistatic example (e) shows dominance-by-dominance epistasis, in which the double heterozygote (LS, LS) deviates from the phenotype that is expected from the phenotypes of the other heterozygotes (–, LS or SL, –). The figure shows an example of a negative dominance-by-dominance interaction, which causes the double heterozygote to have a lower phenotype than expected [31]. Images in panel a are reproduced with permission from REF. 56 © (2001) Macmillan Magazines Ltd. and REF. 57 © (1998) Genetics Society of America.

[Box 1 figure panels]
a | Dominant epistasis (Mendelian): Extension genotype (MC1R; EE, Ee, ee) by Dominant white genotype (KIT; II, Ii, ii).
b | Two additive (non-epistatic) loci: phenotype against genotype at locus 1 (AA, AB, BB), one line per genotype at locus 2 (AA, AB, BB).
c | Dominant epistasis (complex): growth in chicken, 8–46 days of age (grams), against genotype at locus 1 (JJ, JL, LL), one line per genotype at locus 2 (JJ, JL, LL).
d | Co-adaptive epistasis: hatch-weight in chicken (grams) against genotype at locus 1 (JJ, JL, LL), one line per genotype at locus 2 (JJ, JL, LL).
e | Dominance-by-dominance epistasis: maternal performance for offspring survival in mice against genotype at locus 1 (SS, SL, LL), one line per genotype at locus 2 (SS, SL, LL).

results of mapping single QTLs and epistatic QTL pairs that affect growth differences between Junglefowl and White Leghorn chickens [26]. Four QTLs were detected by their individual effects, and four additional QTLs were found as part of an epistatic QTL pair.
Two of the QTLs that were significant only as part of an epistatic QTL pair had near-significant individual effects, whereas the other two were novel epistatic loci that showed only minor individual effects. These last two loci could have been detected only through the simultaneous search approach. Similar results have been found in other studies: for example, in their analysis of functionally related physiological traits in 95 D. melanogaster RECOMBINANT INBRED LINES, Montooth et al. [33] identified both epistatic QTL pairs without individual effects and QTL pairs with near-significant individual effects.

the analysis is done. The routine use of epistatic QTL-mapping methodologies will help to explore whether there are particular traits, population types or species for which epistatic regulation is more important than for others. Here, we focus on the results of the first genome-wide scans for interacting QTLs without individual effects [26–33]. These studies indicate that the more widespread use of QTL-mapping methods that are based on simultaneous scans for the joint effect of pairs of epistatic QTLs can give further insight into the genetics that underlie complex traits.

Population size and statistical power. As expected, the power to detect epistasis varies with the size of the population and the precision with which the analysed phenotypes are measured. So, studies of smaller populations in which single measurements are taken from each individual [29,30] detect less epistasis than studies of large populations [26] or of populations for which several measures per genotype are considered [33]. In summary, although the results indicate that epistatic effects are often large enough to merit a full genome scan for epistasis regardless of the population size [26–30,33], the epistatic studies (or meta-analyses) of populations are at their most powerful if they use good quality data from 500 or more F2 individuals.
Potential to detect novel epistatic QTLs. Simultaneous mapping of QTLs using an epistatic model can detect loci that mainly affect the quantitative trait through epistatic interactions with another locus, in addition to those QTLs that are detected through their average individual effects [26–30]. An example of this is shown in FIG. 2, which summarizes the

[Figure 1 panels: a, the F0, F1 and F2 generations of the cross; b, individual QTL mapping versus epistatic QTL mapping; c, test statistic against genomic location (M) for individual QTLs and for QTL pairs, showing the marginal effects and the epistatic pair effect.]

Figure 1 | Principles of individual quantitative trait loci and epistatic quantitative trait locus mapping. Quantitative trait locus (QTL) mapping aims to partition the total genetic variation for a trait into the effects of individual loci. a | The F2 design for QTL mapping uses a three-generation pedigree, and requires two strains that differ in the trait of interest and a linkage map of polymorphic molecular markers. The hybrid F1 generation derived from the parental lines is intercrossed to produce F2 individuals; here, one full sibling family of n individuals that differ in the proportion of markers that they have inherited from the parental lines is shown. F2 individuals are analysed statistically to determine whether there is a difference in phenotype between marker genotype classes. If there is, then the QTL alleles (here, Q, q and E, e) are linked to the marker. b | In principle, individual QTL mapping searches for the partitioning of the F2 individuals according to the QTL genotype with the smallest phenotypic variation within the genotype classes and the largest differences among the genotype classes.
Epistatic QTL mapping partitions the individuals according to the genotypes at multiple loci (for example, Q and E), which ensures a better fit if the phenotype depends on the genotypes of both loci. c | An individual QTL scan results in a statistical support curve for the individual effects of a QTL at each tested location in the genome (two green boxes). A simultaneous scan for QTL pairs results in a statistical support surface for all potential combinations of QTL locations in the genome (centre panel). QTLs with large individual effects will be seen as high peaks in the individual QTL support curve and as ridges in the QTL-pair support surface. A stepwise approach for detecting epistatic QTLs (for example, on the basis of FORWARD SELECTION) will only detect epistasis along these ridges. QTLs with mainly epistatic effects will usually appear only as small peaks in the individual QTL support curve, but as large peaks in the support surface for QTL pairs. Images of pigs in part b are reproduced with permission from REF. 57 © (1998) Genetics Society of America.

complex traits. We also highlight the problems in interpreting and applying QTL results that could arise by ignoring epistasis.

Most strategies for epistatic QTL mapping use quantitative genetic models, with the detected epistasis being an indication of a genotype–phenotype dependency between loci that cannot be explained by individual QTL effects. However, epistasis that is detected in this way is not always biologically relevant. For example, some types of statistical epistasis result from the scale that is used to model QTLs [39] rather than from biological gene interactions [40]. It is therefore necessary to evaluate all combinations of loci using other post-mapping analyses to further explore whether the results can be explained by biological gene interactions.
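The point about scale can be made concrete with a small sketch. It uses hypothetical genotypic values, not data from any study cited here: two loci whose effects multiply on the measured scale show non-zero interaction deviations from an additive fit, but those deviations vanish once the trait is log-transformed. The function names, the locus multipliers and the use of equally weighted cell means are all assumptions made for illustration.

```python
import math


def interaction_deviations(cell_means):
    """Deviation of each two-locus cell from the additive row + column fit."""
    n_rows = len(cell_means)
    n_cols = len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in cell_means]
    col_means = [
        sum(cell_means[i][j] for i in range(n_rows)) / n_rows
        for j in range(n_cols)
    ]
    return [
        [cell_means[i][j] - row_means[i] - col_means[j] + grand
         for j in range(n_cols)]
        for i in range(n_rows)
    ]


def largest_deviation(deviations):
    """Largest absolute interaction deviation in the table."""
    return max(abs(d) for row in deviations for d in row)


# Hypothetical multipliers for the three genotypes at each locus.
LOCUS_1 = [1.0, 2.0, 4.0]
LOCUS_2 = [1.0, 3.0, 9.0]

# Genotypic values when the two loci act multiplicatively.
raw_scale = [[a * b for b in LOCUS_2] for a in LOCUS_1]
log_scale = [[math.log(value) for value in row] for row in raw_scale]

if __name__ == "__main__":
    print("raw scale:", largest_deviation(interaction_deviations(raw_scale)))
    print("log scale:", largest_deviation(interaction_deviations(log_scale)))
```

Whether the raw or the transformed scale is the biologically meaningful one cannot be decided by this arithmetic, which is why post-mapping analyses of the kind described above are still needed.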
One way to link statistical estimates of epistasis to their biological meaning is to plot two-locus genotype–phenotype patterns for epistatic pairs and to connect these to experimentally determined gene-interaction patterns [26,27,31,41]. Several frequently occurring epistatic patterns have been identified by this approach (BOX 1), and, in some cases, these resemble Mendelian epistatic relationships ('dominant epistasis' in BOX 1a,c); such patterns can then be used to identify the underlying molecular mechanisms. Another means of exploring the functional relationships that underlie epistasis is to use gene regulatory networks [42] to infer the genetic regulatory structures, such as positive- or negative-feedback loops, that can generate the detected epistatic patterns. Functional relationships among loci have also been used to reconstruct the genetic pathways that are involved in regulating the complex trait. By creating networks in which QTLs are nodes and connections are epistatic

predicting phenotypic variation from simultaneously considering multiple-locus genotypes, relative to predicting it from the sum of single-locus genotypes. Despite the success of some epistatic QTL studies, the estimates of epistasis do not, in general, have a direct relationship to biological mechanisms of gene interactions [1,5]. In this section, we illustrate how knowledge about statistical epistasis can be used to infer the biological mechanisms that underlie

Potential to attribute a larger proportion of the genetic variance to QTLs. By estimating the consequences of both significant individual and interaction effects, it has been possible to better explain the total phenotypic variation in terms of individual loci and combinations of loci.
The proportion of the total genetic variance in F2 or recombinant backcross populations that results from epistasis was estimated in 5 available comparable studies: the variance ranged from 0 to 81% with a mean of 38% for the 18 traits studied. The results are summarized in TABLE 1 (REFS 26–30). Given that the average phenotypic variances associated with correctly identified individual QTLs can be overestimated if smaller numbers of progeny are analysed [38], there is a risk that the variance that results from epistasis for the smaller populations shown in TABLE 1 (REFS 28–30) has been overestimated; by contrast, the estimates from the larger populations are expected to be more reliable [26].

Epistatic versus biological interactions

The term epistasis has two related, but distinct, meanings in genetics [1]. It was originally coined to describe the action by which one Mendelian locus alters the allelic effects at another locus, similar to the way dominance alters the allelic effects within a locus. In quantitative genetics, epistasis relates to the improvement in

Box 2 | The challenges of epistatic analyses

Methods for detecting epistasis can easily be derived from methods that have been developed for detecting individual QTL effects in experimental [58–60] and human populations [61–67]. The main obstacle to the more widespread use of these methods is therefore not the theoretical adaptation of QTL-mapping methods to accommodate epistasis, but, rather, the practical limitations of performing the analyses on experimental data. The principal limitations are outlined below.

The computational demand of thorough epistatic QTL analyses is generally high. The number of potential QTL combinations in a multiple-QTL model increases rapidly with the number of QTLs that are considered simultaneously. So, it quickly becomes computationally intractable to evaluate all of these potential combinations of QTLs.
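How quickly the search space grows can be seen by counting the unordered combinations of tested genome positions. The sketch below assumes a hypothetical scan grid of 2,500 positions; that number is an illustration only and is not taken from any of the cited studies, and the function name is likewise an assumption.

```python
import math


def n_qtl_combinations(n_positions, n_qtls):
    """Number of unordered sets of n_qtls distinct positions to evaluate."""
    return math.comb(n_positions, n_qtls)


if __name__ == "__main__":
    n_positions = 2500  # hypothetical number of tested genome positions
    for n_qtls in (1, 2, 3):
        print(n_qtls, "QTL model:", n_qtl_combinations(n_positions, n_qtls))
```

Each additional locus in the model multiplies the number of candidate combinations by roughly the number of tested positions, which is the growth the paragraph above refers to.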
By using parallel computing and efficient algorithms for epistatic QTL mapping [68–70], the computational demands of the epistatic QTL analyses can be significantly reduced.

The standard procedure during significance testing in QTL mapping is to use stringent significance thresholds, which are derived, for example, through RANDOMIZATION TESTS, to correct for the multiple testing that is performed during genome scanning [23,24,71]. Owing to a markedly higher number of tests that are carried out in multi-dimensional scans for epistatic QTLs, the use of multiple-testing corrected thresholds will cause only large epistatic effects to be detectable. To detect more subtle epistasis, alternative testing approaches are needed.

Many data sets are not suitable for evaluations of epistasis. When searching for epistatic QTLs, the epistatic effects in the genetic models are derived from the genotype means of multi-locus genotypes instead of single-locus genotypes, as in individual QTL analyses. So, the power to detect QTLs depends on the number of individuals in the genotype classes on which the analysis is based. This means that considerably larger population sizes are needed to obtain the same power to detect an epistatic effect as an individual effect in the analyses.

Computationally improved QTL-mapping algorithms will become a powerful tool for detecting epistatic QTLs in experimental populations, which can be designed for epistatic QTL mapping. The obstacles to obtaining high power in epistatic QTL mapping are, however, high in human populations, and will hamper the use of traditional approaches to detect epistasis unless alternative routes to identifying and validating epistatic QTLs are found.
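A randomization test of the kind mentioned in the box can be sketched as follows. This is a simplified illustration and not the procedure of the cited studies: the test statistic (between-class sum of squares), the toy genotypes and phenotypes, and every function name are assumptions. Phenotypes are shuffled relative to genotypes, the largest statistic across the whole scan is recorded for each shuffle, and an upper quantile of those maxima serves as the genome-wide threshold.

```python
import random


def between_class_ss(labels, phenotypes):
    """Between-class sum of squares for one set of genotype-class labels."""
    grand = sum(phenotypes) / len(phenotypes)
    classes = {}
    for label, y in zip(labels, phenotypes):
        classes.setdefault(label, []).append(y)
    return sum(len(v) * (sum(v) / len(v) - grand) ** 2
               for v in classes.values())


def max_statistic(label_sets, phenotypes):
    """Largest statistic over all tested positions (or position pairs)."""
    return max(between_class_ss(labels, phenotypes) for labels in label_sets)


def permutation_threshold(label_sets, phenotypes, n_perm=200, alpha=0.05,
                          seed=0):
    """Empirical upper-alpha quantile of the scan-wide maximum under shuffling."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype link
        null_max.append(max_statistic(label_sets, shuffled))
    null_max.sort()
    cutoff = min(n_perm - 1, int((1 - alpha) * n_perm))
    return null_max[cutoff]


def marker_pairs(markers):
    """Two-locus class labels for every marker pair, for a pairwise scan."""
    return [list(zip(markers[i], markers[j]))
            for i in range(len(markers))
            for j in range(i + 1, len(markers))]


def toy_data(n_markers=10, seed=1):
    """Hypothetical data: marker 0 fully determines the phenotype."""
    rng = random.Random(seed)
    causal = [0] * 20 + [1] * 20 + [2] * 20
    markers = [causal] + [[rng.choice((0, 1, 2)) for _ in causal]
                          for _ in range(n_markers - 1)]
    phenotypes = [10.0 + 2.0 * g for g in causal]
    return markers, phenotypes


if __name__ == "__main__":
    markers, phenotypes = toy_data()
    print("single-marker threshold:",
          permutation_threshold(markers, phenotypes))
    print("pairwise-scan threshold:",
          permutation_threshold(marker_pairs(markers), phenotypes))
```

Passing the output of `marker_pairs` instead of the single markers applies the same procedure to a two-dimensional scan. Because the maximum is then taken over many more tests with more classes each, the resulting threshold is typically higher, which is the loss of sensitivity to subtle epistasis that the box describes.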
[Figure 2 plot: genome-wide F-profile; F-value against genomic location (cM) and chromosome number, with the 5% genome-wide marginal significance threshold; asterisks mark the 5% genome-wide interacting QTLs.]

Figure 2 | Comparing the results of searches for epistatic and individual quantitative trait loci. Genome-wide F-PROFILE support and 5% genome-wide significance level from the mapping of quantitative trait loci (QTLs) by their individual (additive and dominance) effects on growth from 8 to 46 days of age in an intercross between Junglefowl and White Leghorn F2 chickens [26]. The best estimated locations for the epistatic QTLs that were significant at a 5% genome-wide level in a scan for interacting QTL pairs are indicated by asterisks. cM, centiMorgan.

improved bioinformatic tools, will help to address some of these technical problems. In this section, we consider some options for incorporating these new technologies in a joint framework for quantitative genetic analysis.

Model organisms. Epistatic QTLs with large enough effects can be detected even in small experiments [29,30], so more knowledge about epistasis could be obtained from many previously collected experimental data sets by re-analysing them with an epistatic QTL-mapping method. An even more rewarding strategy, one that has been used successfully for individual QTL effects,
would be the joint analysis of data from similar or identical crosses carried out in different laboratories or at different times [43]. New analyses of already-collected data would be a cost-efficient way to obtain more experimental evidence for the

interactions, large networks that contain many QTLs have been created for several traits, including adiposity (build up of fat) and tail length in mice [41] and growth in chickens [27], whereas several smaller networks have been reported for maternal performance for offspring survival [31].

Interactions among loci result in the genetic effects of alleles at one locus differing in magnitude or direction depending on the genotype at another locus. There is a risk that individual loci will remain undetected in cases in which epistasis is ignored, and that the estimated effects of detected QTLs could be severely biased. Overestimation of individual QTL effects leads not only to erroneous interpretations of the relative importance of detected QTLs, but also to problems with confirming QTL effects in further crosses and to lower economic gain if attempts are made to use the QTLs (for example, in MARKER-ASSISTED SELECTION; MAS). Epistasis therefore needs to be considered when choosing the validation strategy for detected QTLs, as the nature of the interaction can guide the researcher to the appropriate genetic background to obtain maximal power for replication. For example, improving the resolution of mapping a QTL through recombinant-progeny testing is only possible if the chosen parents have a genetic background that allows expression of the phenotype in the progeny. The same applies to MAS, in which the economic gain of introducing QTL alleles to new lines can be improved by using knowledge about important interactive effects that are mediated by specific alleles at other genomic loci.
Guidelines for effective analysis

As discussed in the previous section, large and powerful studies are needed to thoroughly examine the genetic basis of complex traits, as smaller studies only allow detection of larger epistatic QTLs. How can this knowledge be applied to improve understanding of epistasis in model organisms and how should epistasis be approached in inherently less powerful designs such as in human or other natural populations? In theory, the analytical principles described above should apply equally well to all populations; however, in practical terms, its application is hindered by technical limitations and the nature of the experimental data (see BOX 2). New technologies, such as high-throughput, high-density genotyping, more affordable expression analysis and

Table 1 | Epistatic genetic variation explained by quantitative trait loci*

                                            Proportion of variance that
                                            results from epistasis
Species   Genotyped    Number of traits     Mean    Maximum   Minimum    Reference
          population
Chicken   F2           5                    46%     79%       0%         26
Chicken   F2           3                    26%     31%       19%        27
Mouse     BC1          4                    27%     33%       19%        28
Mouse     F2           5                    49%     81%       16%        29
Mouse     RBC          1                    16%     16%       16%        30

*Estimates are based on the 5 available comparable studies, which cover 18 traits [26–30]. BC1, first-generation backcross; RBC, reciprocal backcross.

Glossary

ASSOCIATION STUDIES
A set of methods that is used to identify correlations between genetic polymorphisms and expression of phenotypes, such as diseases, in populations.

BOMBAY PHENOTYPE
A rare ABO blood group (Oh) in which the genotype at a locus other than the ABO gene locus makes the individuals seem to have blood type 'O' even if the 'A' or 'B' enzymes are present.

CANDIDATE GENES
Genes in which functional variation is thought to affect the trait under consideration, often on the basis of their physiological role or their effects in other species.

F-PROFILE
A plot of the statistical support (measured by an F-test) for quantitative trait loci at regular intervals throughout the genome.
FALSE DISCOVERY RATE (FDR)
The proportion of false-positive test results out of all positive (significant) tests (note that the FDR is conceptually different to the significance level).

FIRST-ORDER GENETIC INTERACTIONS
Interactions between pairs of genes or quantitative trait loci.

FORWARD SELECTION
A statistical procedure in which a multi-dimensional genome scan is reduced to a series of sequential one-dimensional genome scans.

HAPLOTYPE
The allelic configuration of multiple genetic markers that are present on a single chromosome of a given individual.

MAJOR GENE
A gene that is part of a polygenic or oligogenic system but for which alternative alleles have a large influence on the phenotype.

MARKER-ASSISTED SELECTION (MAS)
Genetic markers are used to indirectly select for specific alleles at closely linked trait loci by directly selecting for the marker.

PENETRANCE
The proportion of individuals with a specific genotype who manifest the genotype at the phenotypic level. For example, if all individuals with a specific disease genotype show the disease phenotype, then the disease is said to be 'completely penetrant'.

QUANTITATIVE TRAIT
A continuously distributed measurable trait for which variation depends on a single gene or on the cumulative action of many genes and the environment. Common examples include height, weight and blood pressure.

QUANTITATIVE TRAIT LOCUS (QTL)
Genetic loci or chromosomal regions that contribute to variability in complex quantitative traits, as identified by statistical analysis. Quantitative traits are typically affected by several genes and by the environment.

RANDOMIZATION TEST
A statistical test in which statistical significance is judged by comparison to a distribution that is generated by repeated random permutations of the actual data.
RECOMBINANT INBRED LINE
A population of fully homozygous individuals that is obtained by repeated selfing from an F1 hybrid, and that comprises 50% of each parental genome in different combinations.

SIMULTANEOUS SCAN
A multi-dimensional genome scan in which several gene locations are selected simultaneously.

VARIANCE-COMPONENT APPROACH
Quantitative trait locus (QTL) analysis method, suited to complex family structures, in which variance that is attributable to a QTL is estimated rather than the mean effects of alternative genotypes.

not a ready-made solution, which means that the full implementation of the strategy requires further research, for example into how to use information from the different sources to avoid biasing the results towards previously well-studied systems and away from potential new findings.

The rapid development of high-throughput techniques provides many potential sources of external information for the testing procedure. Genome-sequencing projects provide information on most genes that are located in the regions of QTLs. As few regions in the genome are expected to lack potential candidate genes, functional information about dependencies between the

relationships [47]. The derivation of haplotypes across the genome opens new opportunities for mapping genes that underlie complex traits in natural populations [48]. It should be possible to perform genome scans for direct associations between haplotypes at one location or combinations of haplotypes at two locations with trait variation. However, testing for interactions between multiple haplotypes in two locations is probably intractable. As an alternative, the haplotypes could be used to reconstruct the genetic relationship between individuals, both at individual loci and for combinations of loci.
QTL effects can then be predicted by using the VARIANCE-COMPONENT APPROACH to estimate the proportion of the genetic variance that results from the effects of individual loci and from the interactions between them 49. The sample sizes of these populations will probably need to be considerably larger than those of the experimental populations to achieve similar power, owing to a lower signal-to-noise ratio in the phenotypic measurements 50,51. The designs might therefore not be cost-effective for detecting novel epistatic patterns until the large-scale collection of haplotype data becomes feasible.
Integrated framework for detecting epistatic QTLs. Most QTL-mapping studies are based on stringent significance thresholds that are derived to control the rate of false positives in the study. It has been argued 52 that we ought to focus on optimizing our procedures for eliminating and controlling false positives instead of imposing these stringent criteria. A stepwise approach in which a FALSE DISCOVERY RATE (FDR) 53 calculation is used to control the rate of false positives in each step is efficient at removing false positives 52. Furthermore, the combination of information from many data sources has improved the range and quality of conclusions that can be drawn, for example, in studies that are based on gene-expression analysis 54. Here, we propose a stepwise approach to build confidence in epistatic QTLs using many independent data sources. A multi-dimensional genome scan is used to identify a set of potentially interacting QTLs, and, in subsequent steps, external information is used to increase or decrease confidence in each of the QTLs in the initial set (FIG. 3). This strategy will allow the detection of interacting QTLs of smaller effects and strong external support as well as novel QTLs with larger effects but less external support.
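The FDR-based selection step can be illustrated with the Benjamini-Hochberg step-up procedure of ref. 53. What follows is only a minimal sketch: the function name, the ten p-values and the two FDR levels are hypothetical choices made for illustration, and nothing here is taken from the authors' own software.

```python
def benjamini_hochberg(p_values, fdr_level):
    """Return the indices of the tests retained at the given FDR level.

    Benjamini-Hochberg step-up rule: sort the m p-values, find the largest
    rank k with p_(k) <= (k / m) * fdr_level, and keep the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr_level:
            cutoff = rank  # remember the largest rank that passes
    return sorted(order[:cutoff])


# Hypothetical p-values for ten candidate QTL pairs from a genome scan.
scan_p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                 0.06, 0.074, 0.205, 0.212, 0.216]

# A strict level keeps few pairs; a lenient level passes a larger set on
# to later steps, where external information could be weighed in.
strict_set = benjamini_hochberg(scan_p_values, fdr_level=0.05)
lenient_set = benjamini_hochberg(scan_p_values, fdr_level=0.20)
print(strict_set)
print(lenient_set)
```

With these made-up values the strict level retains two pairs and the lenient level seven, which shows how a relaxed FDR enlarges the initial set that is then re-ranked using external evidence.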
What we present here is the outline of a strategy that, if fully implemented, exploits the ability of a multi-dimensional scan to detect novel epistatic QTLs without demanding large populations or effects. It is
role of epistasis in the regulation of complex traits. Future study designs should, however, consider exploring epistasis more thoroughly. Collecting information on epistatic patterns and networks from many organisms and on a wide range of traits could be a valuable approach to understand more about what to look for in natural populations and potentially also to find new ways to model gene interactions. A thorough exploration of FIRST-ORDER GENETIC INTERACTIONS needs to be based on reasonably large populations, and new studies should aim for at least replicating the population sizes of the largest successful studies (of 850 F2 individuals 26 or 100 unique genotypes with multiple phenotypic measurements 33). It will also be necessary to further explore the contribution of higher-order epistasis, for which even larger populations are needed. For example, 4,000 F2 individuals are needed to study three-way interactions in an F2 population, based on having the same number of individuals in all three-locus genotype classes as in a study of two-way interactions using 1,000 F2 individuals. However, before initiating such studies, it would be useful to explore large individual studies that have already been carried out or meta- or joint analyses of compatible data sets.
Natural populations. There is reason to assume that the importance and abundance of gene interactions in natural populations are of the same magnitude as those found in model organisms.
Several studies of epistasis among major genes or candidate genes have found epistasis in the expression of complex traits of medical importance in humans, including type I diabetes 44, type II diabetes 45 and inflammatory bowel disease 46. However, owing to lack of power, evaluations of epistasis are not included as a standard procedure in the genetic analysis of complex traits in natural populations. Could the algorithms that have successfully been used in experimental populations be adopted for analyses of more complex populations as well? In theory, the answer is yes, but in practice, it is difficult to collect data sets of sufficient size to obtain the full benefits of this methodology. However, current efforts in humans and other species to generate dense genetic maps using many polymorphic markers, such as SNPs, are encouraging; the aim of such projects is to use the maps to reconstruct common genome-wide HAPLOTYPES. In future, it should then, at least theoretically, be possible to obtain sufficient population sizes by sampling individuals from the general population and using high-density SNP maps to reconstruct haplotypic
[Figure 3 panel labels: all QTLs from unconditional search sorted by their p-values (high FDR, intermediate FDR); selected set of QTLs from unconditional search expected to contain a reasonable portion of QTLs; external information on QTL regions; testing strategy combining external and QTL information; set of QTLs sorted according to joint confidence (low-confidence QTLs, high-confidence QTLs).]
Figure 3 | Proposed framework for acquiring confidence in quantitative trait loci. The strategy involves integrating quantitative trait locus (QTL) mapping and external information. First, a (multi-dimensional) genome scan is performed to estimate the contribution of all possible combinations of genomic regions to the expression of the studied trait(s).
From the complete results, a set of potentially biologically interesting QTLs is selected on the basis of a false discovery rate calculation (FDR). The FDR that is used does not need to be set at a particularly high level to identify potentially interesting regions 52. The second step of the procedure aims to separate the set of potential QTLs into high- and low-confidence QTLs using external information. The classification of a QTL as a high-confidence QTL could be on the basis of very high significance in the QTL-mapping experiment, strong external evidence for an interaction or a combination of the two.
26. Carlborg, Ö. et al. A global search reveals epistatic interaction between QTLs for early growth in the chicken. Genome Res. 13, 413–421 (2003).
27. Carlborg, Ö., Burt, D., Hocking, P. & Haley, C. S. Simultaneous mapping of epistatic QTL in chickens reveals clusters of QTL pairs with similar genetic effects on growth. Genet. Res. (in the press).
28. Kim, J. H. et al. Genetic analysis of a new mouse model for non-insulin-dependent diabetes. Genomics 74, 273–286 (2001).
29. Shimomura, K. et al. Genome-wide epistatic interaction analysis reveals complex genetic determinants of circadian behavior in mice. Genome Res. 11, 959–980 (2001).
30. Sugiyama, F. et al. Concordance of murine quantitative trait loci for salt-induced hypertension with rat and human loci. Genomics 71, 70–77 (2001).
31. Peripato, A. C. et al. Quantitative trait loci for maternal performance for offspring survival in mice. Genetics 162, 1341–1353 (2002).
32. Ways, J. A., Cicila, G. T., Garrett, M. R. & Koch, L. G. A genome scan for loci associated with aerobic running capacity in rats. Genomics 80, 13–20 (2002).
33. Montooth, K. L., Marden, J. H. & Clark, A. G. Mapping determinants of variation in energy metabolism, respiration and flight in Drosophila. Genetics 165, 623–635 (2003).
34. Eshed, Y. & Zamir, D. Less-than-additive epistatic interactions of quantitative trait loci in tomato. Genetics 143, 1807–1817 (1996).
35. Flint, J., DeFries, J. C. & Henderson, N. D. Little epistasis for anxiety-related measures in the DeFries strains of laboratory mice. Mamm. Genome 15, 77–82 (2004).
36. Smith Richards, B. K. et al. QTL analysis of self-selected macronutrient diet intake: fat, carbohydrate, and total kilocalories. Physiol. Genomics 11, 205–217.
37. Zeng, Z.-B. et al. Genetic architecture of a morphological shape difference between two Drosophila species. Genetics 154, 299–310 (2000).
38. Beavis, W. D. in Molecular Dissection of Complex Traits (ed. Paterson, A. H.) 145–162 (CRC, New York, 1998).
39. Cordell, H. J. et al. Statistical modeling of interlocus interactions in a complex disease: rejection of the multiplicative model of epistasis in type 1 diabetes. Genetics 158, 357–367 (2001).
40. Risch, N., Ghosh, S. & Todd, J. A. Statistical evaluation of multiple-locus linkage data in experimental species and its relevance to human studies: application to nonobese diabetic (NOD) mouse and human insulin-dependent diabetes mellitus (IDDM). Am. J. Hum. Genet. 53, 702–714 (1993).
41. Cheverud, J. M. et al. Genetic architecture of adiposity in the cross of LG/J and SM/J inbred mice. Mamm. Genome 12, 3–12 (2001).
42. Omholt, S. W., Plahte, E., Oyehaug, L. & Xiang, K. Gene regulatory networks generating the phenomena of additivity, dominance and epistasis. Genetics 155, 969–980 (2000).
43. Walling, G. A. et al. Combined analyses of data from QTL mapping studies: chromosome 4 effects on porcine growth and fatness. Genetics 155, 1369–1378 (2000).
44. Cordell, H. J., Todd, J. A., Bennett, S. T., Kawaguchi, Y. & Farrall, M. Two-locus maximum lod score analysis of a multifactorial trait: joint consideration of IDDM2 and IDDM4 with IDDM1 in type 1 diabetes. Am. J. Hum. Genet. 57, 920–934 (1995).
45. Cox, N. J. et al. Loci on chromosomes 2 (NIDDM1) and 15 interact to increase susceptibility to diabetes in Mexican Americans. Nature Genet. 21, 213–215 (1999).
46. Cho, J. H. et al. Identification of novel susceptibility loci for inflammatory bowel disease on chromosomes 1p, 3q, and 4q: evidence for epistasis between 1p and IBD1. Proc. Natl Acad. Sci. USA 95, 7502–7507 (1998).
47. The International HapMap Consortium. The International HapMap Project. Nature 426, 789–796 (2003).
48. Van Den Oord, E. J. & Neale, B. M. Will haplotype maps be useful for finding genes? Mol. Psychiatry 9, 227–236 (2004).
49. Blangero, J. & Almasy, L. Multipoint oligogenic linkage analysis of quantitative traits. Genet. Epidemiol. 14, 959–964 (1997).
50. Eaves, L. J. Effect of genetic architecture on the power of human linkage studies to resolve the contribution of quantitative trait loci. Heredity 72, 175–192 (1994).
51. Mitchell, B. D., Ghosh, S., Schneider, J. L., Birznieks, G. & Blangero, J. Power of variance component linkage analysis to detect epistasis. Genet. Epidemiol. 14, 1017–1022 (1997).
52. van den Oord, E. J. & Sullivan, P. F. False discoveries and models for gene discovery. Trends Genet. 19, 537–542 (2003).
53. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. 57, 289–300 (1995).
54. Drawid, A. & Gerstein, M. A Bayesian system integrating expression data with sequence patterns for localizing proteins: comprehensive application to the yeast genome. J. Mol. Biol. 301, 1059–1075 (2000).
genes needs to be added before the information can be used. Experimental evidence on gene dependencies can be extracted from the literature or databases on reported epistasis between QTLs or candidate genes, from knowledge of biochemical pathways and from studies on protein–protein interactions.
Further experimental information can be obtained by studying gene expression, which provides evidence of regulatory relationships between groups of genes in the specific data set. Other bioinformatic evidence could come from, for example, mining literature databases for co-occurrences of gene names in published articles 55. To further develop and evaluate this approach, we need to implement it for model organisms in which we can carry out large-scale studies and identify convincing evidence of epistasis with limited external evidence. These will then provide a means of learning what information is most valuable and how it can be applied most effectively in studies of natural populations such as those of our own species. Ironically, the extra restrictions that this procedure imposes on the type of loci involved could, in some circumstances, make it easier to identify candidate loci that underlie pairs of interacting QTLs than it is to identify a candidate gene that underlies a QTL with no interactions.
Future prospects
New technologies are continually evolving to give us more information about isolated components of biological systems. One of the most challenging tasks will be to integrate the information in a biological model that can predict the function of the entire biological system. Genetics is central in biological modelling, as it provides the framework in which all other components act. It is possible to identify the influence of individual genetic components on variation in a system, and we are now starting to supplement that knowledge with information about the interplay between genes and how they jointly affect the system. The identification of epistatic interactions between genes and/or QTLs is a valuable starting point for a more thorough understanding of these genetic networks. Our aim is to develop analytical frameworks that integrate information from many sources.
It is with this in mind that we have proposed a general strategy for the detection of (epistatic) QTLs, in which information from many disciplines is integrated in one framework. It is to be hoped that this will provide more information on the nature of gene interactions, because unless we consider epistasis, we will not be able to fully understand the control of complex traits.
Örjan Carlborg is at the Linnaeus Centre for Bioinformatics, Uppsala University, BMC, Box 598, SE-751 24 Uppsala, Sweden. Chris S. Haley is at the Department of Genetics and Biometry, Roslin Institute, Roslin, Midlothian EH25 9PS, UK.
Correspondence to Ö.C. e-mail: orjan.carlborg@lcb.uu.se
doi:10.1038/nrg1407
1. Phillips, P. C. The language of gene interaction. Genetics 149, 1167–1171 (1998).
2. Hummel, K. P. The inheritance and expression of Disorganization, an unusual mutation in the mouse. J. Exp. Zool. 137, 389–423 (1958).
3. Tiret, L. et al. Synergistic effects of angiotensin-converting enzyme and angiotensin-II type 1 receptor gene polymorphisms on risk of myocardial infarction. Lancet 344, 910–913 (1994).
4. Flint, J. & Mott, R. Finding the molecular basis of quantitative traits: successes and pitfalls. Nature Rev. Genet. 2, 437–445 (2001).
5. Barton, N. H. & Keightley, P. D. Understanding quantitative genetic variation. Nature Rev. Genet. 3, 11–21 (2002).
6. Doerge, R. W. Mapping and analysis of quantitative trait loci in experimental populations. Nature Rev. Genet. 3, 43–52 (2002).
7. Jansen, R. C. Studying complex biological systems using multifactorial perturbation. Nature Rev. Genet. 4, 145–151 (2003).
8. Hoh, J. & Ott, J. Mathematical multi-locus approaches to localizing complex human trait genes. Nature Rev. Genet. 4, 701–709 (2003).
9. Hirschhorn, J. N., Lohmueller, K., Byrne, E. & Hirschhorn, K. A comprehensive review of genetic association studies. Genet. Med. 4, 45–61 (2002).
10. Lander, E. S. & Botstein, D. Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121, 185–199 (1989).
11. Haseman, J. K. & Elston, R. C. The investigation of linkage between a quantitative trait and a marker locus. Behav. Genet. 2, 3–19 (1972).
12. Andersson, L. & Georges, M. Domestic-animal genomics: deciphering the genetics of complex traits. Nature Rev. Genet. 5, 202–212 (2004).
13. Mackay, T. F. Quantitative trait loci in Drosophila. Nature Rev. Genet. 1, 11–20 (2001).
14. Grobet, L. et al. A deletion in the bovine myostatin gene causes the double-muscled phenotype in cattle. Nature Genet. 17, 71–74 (1997).
15. van Laere, A. S. et al. A regulatory mutation in IGF2 causes a major QTL effect on muscle growth in the pig. Nature 425, 832–836 (2003).
16. Fijneman, R. J., de Vries, S. S., Jansen, R. C. & Demant, P. Complex interactions of new quantitative trait loci, Sluc1, Sluc2, Sluc3, and Sluc4, that influence the susceptibility to lung cancer in the mouse. Nature Genet. 14, 465–467 (1996).
17. Long, A. D., Mullaney, S. L., Mackay, T. F. & Langley, C. H. Genetic interactions between naturally occurring alleles at quantitative trait loci and mutant alleles at candidate loci affecting bristle number in Drosophila melanogaster. Genetics 144, 1497–1510 (1996).
18. Li, Z., Pinson, S. R., Park, W. D., Paterson, A. H. & Stansel, J. W. Epistasis for three grain yield components in rice (Oryza sativa L.). Genetics 145, 453–465 (1997).
19. Shook, D. R. & Johnson, T. E. Quantitative trait loci affecting survival and fertility-related traits in Caenorhabditis elegans show genotype-environment interactions, pleiotropy and epistasis. Genetics 153, 1233–1243 (1999).
20. Morel, L. et al. Genetic reconstitution of systemic lupus erythematosus immunopathology with polycongenic murine strains. Proc. Natl Acad. Sci. USA 97, 6670–6675 (2000).
21. Stern, M. P. et al. Evidence for linkage of regions on chromosomes 6 and 11 to plasma glucose concentrations in Mexican Americans. Genome Res. 6, 724–734 (1996).
22. Kao, C.-H., Zeng, Z.-B. & Teasdale, R. Multiple interval mapping for quantitative trait loci. Genetics 152, 1203–1216 (1999).
23. Sen, S. & Churchill, G. A. A statistical framework for quantitative trait mapping. Genetics 159, 371–387 (2001).
24. Carlborg, Ö. & Andersson, L. The use of randomisation testing for detection of multiple epistatic QTL. Genet. Res. 79, 175–184 (2002).
25. Yi, N., Xu, S. & Allison, D. B. Bayesian model choice and search strategies for mapping interacting quantitative trait loci. Genetics 165, 867–883 (2003).
66. Carlborg, Ö., Andersson, L. & Kinghorn, B. The use of a genetic algorithm for simultaneous mapping of multiple interacting quantitative trait loci. Genetics 155, 2003–2010 (2000).
67. Ljungberg, K., Holmgren, S. & Carlborg, Ö. Efficient algorithms for quantitative trait loci mapping problems. J. Comput. Biol. 9, 793–804 (2002).
68. Ljungberg, K., Holmgren, S. & Carlborg, Ö. Simultaneous search for multiple QTL using the global optimization algorithm DIRECT. Bioinformatics 25 Mar 2004 (doi:10.1093/bioinformatics/bth175).
69. Churchill, G. A. & Doerge, R. W. Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971 (1994).
Acknowledgements
We are grateful to the Biotechnology and Biological Sciences Research Council and to the Knut and Alice Wallenberg Foundation for their support. We thank B. Hill, G. Plastow, J. Cheverud and two anonymous referees for some valuable comments on the manuscript and A. Peripato for helpful assistance with providing data.
Competing interests statement
The authors declare that they have no competing financial interests.
Online links
DATABASES
The following terms in this article are linked online to:
Entrez: http://www.ncbi.nih.gov/Entrez
ACE | AGTR1 | KIT | MC1R
OMIM: http://www.ncbi.nlm.nih.gov/Omim
type I diabetes | type II diabetes
FURTHER INFORMATION
Örjan Carlborg's web page: http://www.orjancarlborg.com
The Roslin Institute: http://www.ri.bbsrc.ac.uk
Access to this links box is available online.
55. Jenssen, T. K., Laegreid, A., Komorowski, J. & Hovig, E. A literature network of human genes for high-throughput analysis of gene expression. Nature Genet. 28, 21–28 (2001).
56. Andersson, L. Genetic dissection of phenotypic diversity in farm animals. Nature Rev. Genet. 2, 130–138 (2001).
57. Kijas, J. M. et al. Melanocortin receptor 1 (MC1R) mutations and coat color in pigs. Genetics 150, 1177–1185 (1998).
56. Haley, C. S. & Knott, S. A. A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324 (1992).
57. Cheverud, J. M. & Routman, E. J. Epistasis and its contribution to genetic variance components. Genetics 139, 1455–1461 (1995).
58. Chase, K., Adler, F. R. & Lark, K. G. Epistat: a computer program for identifying and testing interactions between pairs of quantitative trait loci. Theor. Appl. Genet. 94, 724–730 (1997).
59. Schaid, D. J. General score tests for associations of genetic markers with disease using cases and their parents. Genet. Epidemiol. 13, 423–449 (1996).
60. Umbach, D. M. & Weinberg, C. R. Designing and analysing case–control studies to exploit independence of genotype and exposure. Stat. Med. 16, 1731–1743 (1997).
61. Rabinowitz, D. A transmission disequilibrium test for quantitative trait loci. Hum. Hered. 47, 342–350 (1997).
62. Almasy, L. & Blangero, J. Multipoint quantitative-trait linkage analysis in general pedigrees. Am. J. Hum. Genet. 62, 1198–1211 (1998).
63. George, V., Tiwari, H. K., Shu, Y., Zhu, X. & Elston, R. C. Linkage and association analyses of alcoholism using a regression-based transmission/disequilibrium test. Genet. Epidemiol. 17 (Suppl. 1), S157–S161 (1999).
64. Lunetta, K. L., Faraone, S. V., Biederman, J. & Laird, N. M. Family-based tests of association and linkage that use unaffected sibs, covariates, and interactions. Am. J. Hum. Genet. 66, 605–614 (2000).
65. Liu, Y., Tritchler, D. & Bull, S. B. A unified framework for transmission-disequilibrium test analysis of discrete and continuous traits. Genet. Epidemiol. 22, 26–40 (2002). "