The role of genetic interactions in yeast quantitative traits

Joshua Bloom, Iulia Kotenko, Meru Sadhu, Sebastian Treusch, Frank W. Albert, Leonid Kruglyak Department of Human Genetics, University of California, Los Angeles, Los Angeles, California, USA Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, California, USA Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544 Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, California, USA

any trait studied here.The remaining strain repeatability variance ranged from 0.05% to 21.4%, with a median of 8.8% (Fig. 1).Three-way interactions may account for some of the remaining effect of strain, but are unlikely to explain most of this remaining variance for most traits (Supplementary Table 2).This leaves higher-order interactions, other effects of strain, or experimental effects confounded with strain as the potential sources of the remaining strain repeatability variance.
Next, we sought to identify the individual genomic regions underlying these genome-wide estimates.We used a forward-search QTL mapping approach that controls for other QTL 9 (Methods) to detect 797 genome-wide-significant additive QTL, with a median of 42.5 per trait (range 17-56).We calculated the variance captured by these detected QTL with a random effect model that uses a genetic relationship matrix (GRM) constructed only from genotypes at the peak markers for each significant additive QTL.These loci captured a median of 92% of the additive genetic variance (Fig. 2a).The number of detected QTL per trait increased approximately four-fold relative to that in our previous study of a subset of 1008 segregants from this panel 6 , but the variance captured by significant QTL only increased by 5%, because most detected loci generally have very small effect sizes (Fig. 4).These observations suggest that many additional undetected loci for these traits likely exist in this cross, but that their individual and collective effects are very small.The increased panel size also increases mapping resolution.The 180 loci that explain 1% or more of phenotypic variance have a median 95% confidence interval of 10.3kb, compared to 31.2kb with 1,008 segregants; these confidence intervals span approximately 5 genes in the yeast genome.
Detection of additive QTL that account for nearly all of the additive genetic variance allowed us to further partition the variance contributed by QTL-QTL interactions (Methods).Briefly, we compared estimates of interaction variance captured by pairs of markers selected by three different criteria: all pairs of markers across the genome, the subset of pairs in which one marker is the peak of an additive QTL, and the subset where both markers are additive QTL peaks.As noted above, across the traits examined, the amount of phenotypic variance captured by interactions between all marker pairs had a median of 9.2%.The amount of phenotypic variance captured by interactions between significant additive QTL and the rest of the genome had approximately the same median (9.4%), whereas it dropped to 4.5% for interactions only between significant additive QTL (Fig. 3 and Supplementary Fig. 3).These results suggest that in most pairwise interactions, at least one of the loci has a significant additive effect, as can be confirmed by directly mapping QTL-QTL interactions (see below).
We detected specific genome-wide significant QTL-QTL interactions for each trait using a statistically powerful approach that takes into account all the additive genetic variance (Methods).One can test for interactions either between all pairs of markers (full scan), or only between pairs where one marker corresponds to a significant additive QTL (marginal scan).In principle, the former can detect a wider range of interactions, but the latter can have higher power due to a reduced search space.Here, the two approaches yielded similar results, detecting 205 and 266 QTL-QTL interactions, respectively, at an FDR of 10%, with 172 interactions detected by both approaches.In the full scan, 153 of the QTL-QTL interactions correspond to cases where both interacting loci are also significant additive QTL, 36 correspond to cases where one of the loci is a significant additive QTL, and only 16 correspond to cases where neither locus is a significant additive QTL (Supplementary Fig. 4).The interactions detected in the full and marginal scans captured a median of 3.2% and 3.4% of phenotypic variance, respectively (Fig. 3).These numbers correspond to about 40% of the total pairwise interaction variance estimates (Fig. 2b), and greatly exceed expectations from background linkage effects 10 (Supplementary Fig. 5).
The detected QTL-QTL interactions generally have very small effect sizes, with a median variance explained of 0.31% (max 3.3%), compared to 0.38% (max 26%) for additive QTL (Fig. 4).Further, for a given effect size, the power to detect an additive QTL is higher than the power to detect a QTL-QTL interaction.The combination of these two factors leads to a higher detection rate for additive QTL in our study, and consequently a greater fraction of additive variance explained by detected QTL.The remainder of the interaction variance is likely due to many more pairs with even smaller effect sizes.
We have used a very large yeast cross with 4,390 segregants to study quantitative trait variation in greater detail.Across 20 traits, we find that additive genetic effects and pairwise genetic interactions explain 43.7% and 9.2% of phenotypic variance, respectively, in agreement with previous estimates based on a smaller dataset 11 .We detected a median of 42.5 significant additive QTL per trait.On average, these QTL captured 92% of the estimated additive heritability.Loci that explain at least 1% of phenotypic variance of loci typically spanned no more than 10 kb.We further estimate that roughly half of the pairwise interaction variance is contributed by interactions among significant additive QTL, and that nearly all of the interaction variance is contributed by interactions between significant additive QTL and the genome.Twolocus interactions in which neither locus has an additive effect are rare and do not contribute much to phenotypic variance.We detected about 13 QTL-QTL interactions per trait; these capture 3.2% of phenotypic variance or 40% of total pairwise interaction variance.We previously discussed the factors that may lead to greater "missing" 12 heritability in human genome-wide association studies (GWAS) than in a yeast cross 6 .Here we have focused on better delineating the contributions of pairwise interactions to phenotypic variance.The larger cross enabled us to obtain an accurate genome-wide estimate of these contributions, and revealed that they are substantially smaller than those of additive effects for every trait examined.Further, few interactions arise from locus pairs without detectable additive effects.
Although accurate estimates of the contributions of higher-order interactions require even larger sample sizes, the preliminary estimates obtained here (Supplementary Table 2) suggest that such interactions contribute even less than pairwise ones.As previously noted, the contributions of interactions to phenotypic variance in outbred populations are expected to be smaller than in a cross 1,2 .These results suggest that genetic interactions are unlikely to be a major contributor to heritability of most quantitative traits, and that they are most effectively detected by starting with the set of loci with additive effects.Combined with the recent observation of a small contribution of dominance to human trait variation 13 , this suggests that heritability not captured by genome-wide additive models arises primarily from additive effects of variants untagged by current genotyping technologies.

Construction of segregant panel and sequencing libraries
The BYxRM segregants were constructed as described previously 6 .Before, we chose one segregant from a panel of dissected tetrads.Here we added segregants from this panel of tetrads that were not previously genotyped to assemble a new panel of 4390 segregants.A Biomek FX liquid handling robot (Beckman Coulter) was used to re-array segregants that had not been previously genotyped to 1 ml of yeast peptone dextrose (YPD) in 2-ml deep-well 96well plates (Thermo Scientific).Plates were sealed with Breathe-Easy gas-permeable membranes (Sigma-Aldrich), and yeast were grown 2 days at 30°C without shaking.DNA was extracted using 96-well DNeasy Blood & Tissue kits (Qiagen).DNA concentrations were determined using the Quant-iT dsDNA High-Sensitivity DNA quantification kit (Invitrogen) and the Bio-Tek Synergy 2 plate reader.DNA was diluted to 0.2 ng per microliter.Per sample, 5 µl of 0.2 ng per µl DNA was added to 4 µl of 5X Nextera HWM buffer (Illumina), 6 µl of water and 5 µl of 1/35 diluted Nextera enzyme.The transposition reaction was performed for 5 minutes at 55°C.Illumina sequencing adapters and custom indices were added by PCR directly after the tagmentation reaction without additional sample purification.10 µl of fragmented DNA was combined with 0.5 µl each of 10 µM index primers (one of N701-N712 plus one of 96 custom indices), 5 µl of 10X Ex Taq buffer, 0.375 µl Ex Taq polymerase (Takara), 4 µl of 2.5 mM dNTPs, and 29.625 µl of water, and amplified with 20 cycles of PCR.1152-plex libraries were run on two single end lanes of a rapid-run flow cell of a HiSeq 2500 (Illumina).

Power calculations
We calculated statistical power (1-β) for sample sizes of 100, 1000, and 4000 segregants in R using the 'power.t.test' function 14 .Power was calculated over a range of effect sizes, where effect size was calculated as the percent phenotypic variance explained by a single QTL or QTL-QTL interaction.To correct for multiple testing genome-wide significance thresholds (α) of p<6.9x10 -4 and p<2.5x10 -5 were used for additive and interacting QTL, respectively.These thresholds were chosen based on a familywise error rate (FWER) < 5% for the additive scan and false discovery rate (FDR)<10% for the interaction scan.

Determining segregant genotypes
Fastq files for the 3,552 previously unsequenced segregants now sequenced here were demultiplexed using fastq-multx 15 and aligned to the SacCer3 version of the reference genome using bwa 16 .The 3,552 new segregants were sequenced with an average coverage of approximately 2X.The 1,056 previously sequenced segregants were realigned to SacCer3.BAM 17 files for all 4,608 segregants were merged into one BAM file and variants were called as described previously.An additional filter was used to remove regions with strong mapping bias towards the reference genome 18 .Of 39,741 high confidence SNPs at which BY and RM differ, 28,220 unique SNPs were retained for downstream analysis.As described previously, a hidden Markov model was used to infer the segregant genotypes 6 .Segregants were removed if they had fewer than 25 or greater than 105 recombination breakpoints, fewer than 35,000 markers with genotype calls, or if the segregant genotype was correlated with another segregant with a Pearson correlation greater than 0.9.4,390 segregants passed these filters and were used for mapping.

Segregant phenotyping
All 4,390 segregants were phenotyped together, including the 1,056 previously characterized and sequenced segregants.Phenotyping was performed as described previously 6 .Briefly, segregants were pinned to agar plates from liquid stocks and then imaged for end-point growth at 48 hours.Colony radii were calculated using functions in the EBImage R package 19 .Endpoint growth measurements were filtered and normalized as previously described.
Note: Segregant genotypes and phenotype files are too large for submission here, but are available upon request by reviewers and will be made available through a supplementary website at publication.

Calculating variance components
Custom R code was used to estimate variance components and map additive QTL as well as QTL-QTL interactions.A repeated measures mixed model 7 was used to estimate variance components.The model be written as: where y is a vector of length m that contains phenotypes for n segregants including replicate measurements such that m = n * [number of replicates]. is a vector of estimated fixed effect coefficients.X is a matrix of fixed effects (here  is the overall mean, and X is a 1  vector of ones unless otherwise specified).Z is an m x n incidence matrix that maps m total measures to n total segregants.a are the additive genetic effects, i are the pairwise genetic interaction effects and p are effects due to residual strain repeatability.The residual error is denoted by e.The distributions of these effects are assumed to be normal with mean zero and variancecovariance as follows: The variance structure of the phenotypes is Here, A is the additive relatedness matrix, the fraction of genome shared between pairs of segregants.A was calculated using the 'A.mat' function in the rrBLUP R package 20 .  2 is the additive genetic variance captured by markers. ∘  is the Hadamard (entrywise) product of A, which can be interpreted as the fraction of pairs of markers shared between pairs of segregants.  2 is the interaction genetic variance captured by all pairwise combinations of markers.  and   are nxn and mxm identity matrices,   2 is the residual effect of strain not captured by the additive and interaction genetic variance terms, and   2 is the error variance.Variance components were estimated using AI-REML 21 and custom R code.Standard errors of variance component estimates were calculated as the square root of the diagonal of the Fisher information matrix from the iteration at convergence of the AI-REML algorithm.
An additional term for three-way interactions, using the Hadamard cube of A, is included in a model in Supplemetary Table 2.

Mapping additive QTL
Additive QTL were mapped using a forward stepwise procedure.For each chromosome and trait the above model was fit, replacing the term for polygenic additive effects with   where   ~(0,  _ 2   ). _ 2 is the additive genetic variance from all chromosomes excluding the chromosome of interest, and   is calculated as above, excluding markers from the target chromosome.The segregant best linear unbiased predictor (BLUP) residuals (  =  −   ) for each chromosome were calculated by subtracting the BLUPs for the effects of the rest of the genome and pairwise interactions from the phenotypes, where   = ( _ 2   +   2  ∘ ) ′  −1 ( − ) and  =   2    ′ +   2  ∘  ′ +   2    ′ +   2   .Replicate values per strain were averaged.These averaged BLUP residuals for each chromosome were then used as the starting point for scans for additive QTL on the chromosome of interest.Using BLUP residuals increases power to detect QTL by controlling for genetic contributions from the remainder of the genome 22 .We tested for linkage at each marker on the given chromosome by calculating (-n(ln(1-r 2 )/2ln(10))), where r is the Pearson correlation coefficient between the segregant genotypes at the marker and segregant BLUP residuals for n segregants.FWER thresholds were determined from empirical null distributions determined by recomputing the linkage statistic chromosome wide from 1,000 permutations of BLUP residual phenotypes to strain assignments and recording the maximum value 23 .The most significant marker was extracted from each QTL significant at a 5% FWER threshold.These peak markers were added to the model as fixed effects and residuals were recomputed.Additional linkage scans were performed on these residuals (using 5% FWER thresholds that were recomputed after each round of QTL addition) until no additional significant QTL were detected on that chromosome.Confidence intervals were calculated as 1.5 LOD drop using the lodint function in R/QTL 24 .

Mapping QTL-QTL interactions
We increased power and computational efficiency by searching for interactions using the segregant BLUP residuals from the additive polygenic model as phenotypes.Specifically, we calculated   for each trait as   =  −   where   = (  2 ) ′  −1 ( − ) and  =   2  ′ +   2    ′ +   2   .Replicate values per strain were averaged.For the full two-dimensional scan, LOD scores for interactions were computed for all pairs of makers as (-n(ln(1-r 2 )/2ln(10))), where n is the number of segregants with phenotypes, r is the Pearson correlation coefficient between the product of segregant genotypes at pairs of markers separated by at least 50 markers and the BLUP residuals.FDR at different LOD thresholds was calculated by dividing (the average number of peaks obtained from 5 permutations of the data that scramble segregant identities) by (the number of peaks observed in the real data).We also tested for interactions between each locus with significant additive effects (identified as described in the preceding section) and the rest of the genome in the same manner as for the full twodimensional scan.We refer to this as the marginal scan.FDR was calculated as above.
Results from the BLUP residual approach were compared to a simpler two locus interaction model from 'scantwo' in R/QTL 24 that compares the likelihood ratio of a model that includes an interaction term to a model without this term.From the BLUP residual approach we detected 205 QTL-QTL in the full scan and 266 in the marginal scan.Using the same FDR procedure, 73 QTL-QTL were detected using R/QTL with the full two-dimensional scan and 112 were detected in the marginal scan.All of the R/QTL QTL-QTL interactions were also detected as statistically significant in our BLUP residual models.

Fraction of variance captured by marker subsets
To estimate the fraction of additive variance captured by significant additive QTL, we fit the model  =  +  +  + , where a was calculated from the relatedness of segregants only at the genome-wide significant QTL peak markers for the given trait (A QTL ) such that ~(0,  _ 2   ) , ~(0,   2   )  ~(0,   2   ), and compared it to the same model but with a calculated using the relatedness at all markers in the genome (A) as described above, such that ~(0,   2 ).
We partitioned the interaction variance in a similar manner.Starting with  =  +  +  +  + , where ~(0,   2 ) and ~(0,   2  ∘ ), we replaced  ∘  with various subsets of marker combinations.We fit a model with ~ (0,    *   2   ∘   ) to capture the fraction of variance due to all pairwise interactions between significant additive QTL.We fit a model with ~ (0,    *  2   ∘ ) to capture the fraction of variance due to all pairwise interactions between significant additive QTL and the genome.We fit models with ~(0,   2 ), where   2 is the fraction of phenotypic variance captured by significant QTL-QTL interactions and QQ is the relatedness matrix calculated from an nxq matrix where n is the number of segregants and each column corresponds to the product of the genotypes at the peak markers for genome wide significant interacting QTL-QTL.The median fraction of interaction variance explained by significant QTL-QTL interactions was calculated as the median of   2 /  2 for the given trait.
To estimate the fraction of variance explained by non-specific background linkage effects, N markers or pairs of markers were chosen per trait, where N was the observed number of QTL or QTL-QTL for that trait.GRMs were calculated as above, but for the random marker subsets instead of QTL peak markers.Replicates were averaged for each strain and the repeatability term was excluded from the model to make this analysis more tractable.Variance components were estimated for each of the models listed above for 50 random draws of N markers for each trait.The median fraction of variance explained from these simulations is plotted in Supplementary Fig. 5.
The individual QTL and QTL-QTL interaction effect sizes shown in Fig. 4 were computed using the anova function in R with a trait specific multiple regression linear model with all the trait specific significant QTL peak markers and the product of QTL-QTL pair peak markers as fixed effects

of additive and pairwise interaction genetic architectures
We simulated phenotypes from a range of genetic architectures to test whether the mixed model will appropriately partition variance into additive and interaction components given our experimental design and our observed genotype data.Specifically, we simulated all combinations of either 0, 1, 5, 10, 50, or 500 QTL and/or QTL-QTL interactions.We set the broad-sense heritability (defined for these simulations as additive plus pairwise interaction variance) to 0.75 and varied the additive heritability to range from 0 to 0.75 in increments of 0.15 for all unique combinations of QTL and QTL-QTL interactions.QTL were given equal effects, but the sign of their effect was chosen at random.The positions of additive QTL were chosen randomly for each simulation.The positions of QTL-QTL interactions were chosen from the set of all combinations of additive QTL, but if the target number of QTL-QTL interactions was greater than the set of all combinations of additive QTL, then additional QTL-QTL interaction positions were chosen where neither position had a marginal additive effect.The summed effects of the additive loci were scaled to have the target additive variance and the summed effects of the interacting loci were scaled to have the target interaction variance and these were added to create vector g.Error variance was added from a normal distribution with mean 0 and SD=(1-H2)/H2*var(g)) .Additive and interacting variance components were estimated with GRMs constructed from all the markers, as described above (Supplementary Table 1).

Figure 2 .
Figure 2. Additive and interaction variance captured by detected loci a) Total variance captured by detected QTL for each trait is plotted against the whole-genome estimate of additive genetic variance.Error bars show +/-s.e.The diagonal line represents (variance captured by detected QTL = additive genetic variance) and is shown as a visual guide.b) Total variance captured by detected QTL-QTL interactions from the marginal scan for each trait is plotted against the whole-genome estimate of interaction variance.Error bars show +/-s.e.The diagonal line represents (variance captured by detected QTL-QTL interactions = interaction genetic variance) and is shown as a visual guide.

Figure 3 .
Figure 3. Phenotype variance captured by different variance component models of twoway interactionsThe average fraction of phenotypic variance captured by different variance component models of two-way interactions across traits.The bar heights represent variance estimated with all markers (genome x genome) (grey), significant additive QTL by all markers (QTL x genome) (blue), additive QTL by additive QTL (QTL x QTL) (green), significant QTL-QTL detected from the marginal scan (orange), and significant QTL-QTL from the exhaustive two-dimensional scan purple).Error bars show +/-s.e.

Figure 4 .
Figure 4. Distribution of genetic effects and power to detect them A density plot of the fraction of phenotypic variance (X-axis) explained by individual significant QTL (blue area) and QTL-QTL interactions (red area) across all traits.The curves correspond to the statistical power at a genome-significance threshold (right Y-axis) for QTL (blue) and QTL-QTL interactions (red).

Figure S3 .
Figure S3.For each trait, juxtaposed barplots of the two-way interaction variance captured with all markers (genome x genome) (grey), significant additive QTL by all markers (QTL x genome) (blue), additive QTL by additive QTL (QTL x QTL) (green), significant QTL-QTL interactions detected from the marginal scan (orange), and significant QTL-QTL interactions from the exhaustive twodimensional scan (purple) are shown.Error bars show +/-s.e.

Figure S4 .
Figure S4.The fraction of phenotypic variance explained by individual significant QTL-QTL interactions from the exhaustive two-dimensional scan aggregated across all traits and grouped by whether 0, 1, or 2 of the interacting partners of the QTL-QTL interaction also have significant additive effects.