Phenotypic covariance across the entire spectrum of relatedness for 86 billion pairs of individuals

Attributing the similarity between individuals to genetic and non-genetic factors is central to genetic analyses. In this paper we use the genomic relationship (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}π) among 417,060 individuals to investigate the phenotypic covariance between pairs of individuals for 32 traits across the spectrum of relatedness, from unrelated pairs through to identical twins. We find linear relationships between phenotypic covariance and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}π that agree with the SNP-based heritability (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat h_{SNP}^2$$\end{document}h^SNP2) in unrelated pairs (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi \, < \, 0.02$$\end{document}π<0.02), and with pedigree-estimated heritability in close relatives (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi \, > \, 0.05$$\end{document}π>0.05). The covariance increases faster than \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi \hat h_{SNP}^2$$\end{document}πh^SNP2 in distant relatives (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0.02 \, > \, \pi \, > \, 0.05$$\end{document}0.02>π>0.05), and we attribute this to imperfect linkage disequilibrium between causal variants and the common variants used to construct \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}π. We also examine the effect of assortative mating on heritability estimates from different experimental designs. We find that full-sib identity-by-descent regression estimates for height (0.66 s.e. 0.07) are consistent with estimates from close relatives (0.82 s.e. 0.04) after accounting for the effect of assortative mating.


Supplementary Tables
Supplementary Table 1. Trait name (and abbreviation), number of records analysed, UK Biobank field identifiers (f.id) and analysis particulars for 32 quantitative and ordered categorical traits.

SUPPLEMENTARY NOTE 1: Effect of incomplete linkage disequilibrium on phenotypic covariance
Simulation was used to investigate the effect of incomplete linkage disequilibrium between causal variants and common variants included in the genomic relationship matrix (GRM) on the phenotypic covariance across the relatedness spectrum. Specifically, we wanted to investigate if the observed increased phenotypic covariance in distant relative pairs (0.02 < π < 0.05) could be recapitulated through incomplete linkage disequilibrium by selecting causal variants with lower minor allele frequency than common (HapMap3) variants used to construct the GRM.

Method overview
Phenotypes were simulated under a simple A + E model (i.e. phenotype = additive genetic effect + environmental deviation) for the 417,060 individuals from the UK Biobank included in our study. For each replicate, we calculated the average covariance between pairs of individuals in each of the 54 relationship bins from the GRM. Simulations used the GRM generated and described in the main text. We conducted 100 replicate simulations of four scenarios; where scenarios differed in the causal variant distribution for allele frequency and imputation accuracy. The final scenario was when causal variants were sampled from genotyped SNPs. The simulated heritability of the trait was 0.8.

Selection of causal variants
For each replicate, we selected 10,000 causal variants from a pool of variants defined by four scenarios. Scenarios 1, 3 and 4 used imputed data. Briefly, all UK Biobank genotypes were cleaned and imputed to the Haplotype Reference Consortium (HRC) 6 and UK10K 7 reference panels by Bycroft et al. 3 . Genotype probabilities were converted to hard-call genotypes using PLINK2 8 (--hard-call 0.1) for bi-allelic variants with info score > 0.3. Basic QC applied to imputed variants including a Hardy-Weinberg equilibrium test p-value > 1x10 -5 and minor allele count > 100 (as assessed in the unrelated European sample). The first scenario selected causal variants from the ~1.1M HapMap3 variants used to create the GRM. As described in the main text, all these variants had minor allele frequency (MAF) > 0.01. The second scenario selected causal variants from the 558,730 autosomal genotyped variants supplied by the UK Biobank 3 that passed QC filters with MAF > 0.01. QC filters applied to the genotyped variants included a Hardy-Weinberg equilibrium test p-value > 1x10 -5 , missing genotype rate < 0.05 and minor allele count > 100 (as assessed in the unrelated European sample). The third scenario selected causal variance from imputed variants passing basic QC filters, while the final scenario applied an additional QC filter to imputed variants passing basic QC to include only those variants with high imputation accuracy (info score > 0.95). A summary of the scenarios used to select causal variants is provide in Supplementary Table 6.

Simulation of phenotypes
Phenotypes (P) were simulated using GCTA (v1.9) 9 as the sum of genetic effects (A) and environmental deviations (E) for each individual. For each simulation, allele substitution effects were generated for the 10,000 causal variants from N(0,σ 2 where M is the number of causal loci (M = 10,000), pj is the minor allele frequency of the j-th causal variant and σ 2 a is the additive genetic variance (σ 2 a = 0.8). Additive genetic effects were then the sum of simulated effect for each locus, dependent on the genotype of the individual. Environmental deviations were generated from N(0,σ 2 e), where σ 2 e is the environmental variance (σ 2 e = 0.2). Thus, the expectation for the heritability (h 2 ) of the simulated trait is h 2 = σ 2 a/(σ 2 a+σ 2 e) = 0.8.

Parameter estimates
Heritability was estimated using a weighted linear regression of the average phenotypic covariance on the mid-point of the genomic relationship bins in either unrelated (π < 0.02) or relative (π > 0.05) pairs.

Results -distribution of MAF and imputation quality
The minor allele frequency (MAF) and imputation quality distribution for causal variants differed markedly across the scenarios. All HapMap3 variants used to construct the GRM (scenario 1) were common (MAF > 0.01) and almost all variants had high imputation quality (info score > 0.95, Supplementary Figure 9). Genotyped variants (scenario 2) were also all common (MAF > 0.01) but had more than twice the number of low frequency variants (0.01 < MAF < 0.1) compared to the HM3 variants. In contrast, less than half the imputed variants passing basic QC filters (scenario 3) were common with imputation quality generally being low-moderate for most of these variants. The final scenario (scenario 4), where imputed variants were selected to have high imputation accuracy, resulted in about 15% of variants being rare and is somewhat intermediate between scenarios 1 and 3 in terms of MAF frequency distribution and imputation accuracy. Note that although genotyped SNP in scenario 2 were not imputed, up to 5% missing values (per variant) were tolerated in this scenario.
Supplementary Figure 9. The minor allele frequency distribution (left) and imputation accuracy (right) for 10,000 randomly selected causal variants in four scenarios: (i) imputed HapMap3 variants, (ii) common genotyped SNP, (iii) all imputed QC'd variants and (iv) imputed QC'd variants with high imputation accuracy (info score > 0.95).

Results -parameter estimates
The heritability estimates were not significantly different from the simulated value in unrelated (π < 0.02) and relative (π > 0.05) pairs when causal variants were selected from the HapMap3 variants used to construct the GRM (scenario 1, 1 $ = 3.8, p = 0.05; Supplementary Table S7). This implies little loss in information between causal variants and the relationships estimated in the GRM. Thus, the phenotypic covariance is directly proportion to ℎ 0 !"# $ for any pair of individuals under this scenario.
In contrast, the phenotypic covariance showed two distinct linear sections when the causal variants were not included in the GRM (scenarios 2-4). There was a smooth transition between the two linear sections in distantly related individuals (0.02 < π < 0.05, Supplementary Figure 10). Although the slope in relatives (π > 0.05) and intercepts were in accordance with the expected values, the slope in unrelated individuals was significantly lower than the expected value of 0.8 in all cases (scenario 2: slope = 0.442 s.e. 0.005; scenario 3: slope = 0.259 s.e. 0.001; scenario 4: slope = 0.610 s.e. 0.002). A lower slope indicates that the covariance between unrelated pairs is lower than that expected from their estimated genomic relationship. This could be an artefact of technical errors (i.e. poor imputation quality or missing genotypes) but the lower than expected covariance in unrelated individuals is consistently observed across all our simulations of high and low imputation accuracy, and genotype variants (scenarios 2-4).

Supplementary
$ is a measure of LD and the maximum squared correlation between two loci, 6 and 7 are the allele frequencies at loci A and B respectively, and we assign alleles such that 6 < 7 . This implies, on average, causal variants would expect a loss of about half the simulated heritability in unrelated individuals due to incomplete LD (i.e. and ℎ $ is the simulated trait heritability. We observed a slope of 0.442 (s.e. 0.005), which is of a similar magnitude to our crude approximation, and suggest that incomplete LD is a key factor determining the two distinct linear sections of phenotypic covariance in relative (π > 0.05) and unrelated (π < 0.02) pairs. Figure 10. Phenotypic covariance for 100 replicate simulations, where the simulated heritability is 0.8 and four different scenarios were used to select causal variants. Shown in red and blue are the regression lines through unrelated (π < 0.02) or relative (π > 0.05) pairs. Boxes represent the median and interquartile range (IQR = Q3 -Q1), and whiskers are Q3±1.5*IQR from 100 replicate simulations. Points indicate outlier replicates.

SUPPLEMENTARY NOTE 2: Phenotypic covariance due to epistatic interactions
The presence of epistasis (i.e. non-additive interaction of alleles at different loci) could increase the phenotypic covariance between pairs, particularly in close relatives. This note (i) uses simulation to describe the effect of epistatic variance on the phenotypic covariance in relatives, (ii) tests 14 phenotypes in the UK Biobank for epistatic variance, and (iii) determines the power of our test to detect epistasis. The expectation, following 12 , is that the phenotypic covariance will increase proportional to square of the genomic relationship ( $ ).

Simulation Overview
We simulated phenotypes as the sum of genetic (G) and environmental (E) effects using real genotypes from our sample of 417,060 individuals from the UK Biobank. Simulated genetic effects were solely due to epistatic (or interacting) loci. There were 100 replicate simulations where each replicate selected a different set of 5,000 pairs of unlinked loci (i.e. 10,000 causal variants in total) from the 1,123,347 HapMap3 variants used to create the GRM. We calculated the phenotypic covariance between pairs of individuals in the 54 relationship bins from the GRM described in the main text. The broad-sense heritably (H 2 ) of the simulated trait was 0.8.

Simulated phenotypes
Phenotypes (P) were simulated as the sum of genetic effects (G) and environmental deviations (E) for each individual. Genetic effects are the sum of simulated epistatic effects, dependent on the (observed) genotype for an individual. Epistatic effects are generated from a unit normal distribution and : = ; ∘ @ , where : is the genetic effect of individual j, and are the standardised genotypes for individual j at the first and second locus respectively of an interacting pair, and is the epistatic effect of the corresponding pair of loci. In other words, if we considered two biallelic loci (A/a and B/b), then the genetic effects are: Genetic effects for all individuals (G) were scaled to be N(0,1) and environmental deviations sampled from a normal distribution with mean 0 and variance (1 − $ )/ $ .

Simulation Results
The covariance was approximately zero between unrelated individuals (Supplementary Figure 11, left panel). This is consistent with simulation conditions where epistatic variance is not generated between unlinked loci. For relatives, phenotypic covariance increased in relationship bins in proportion to the expected value of $ $ (Supplementary Figure 11, right panel).

Testing UK Biobank phenotypes for epistasis
We tested the subset of 14 phenotypes in the UK Biobank with moderate-high SNPbased heritability (ℎ 0 !"# $ > 0.1) and > 400 records for additive-by-additive genetic variance. We used a bin-based approach where we fitted a weighted linear regression in R 13 (where weights were equal to Nk pairs per bin) of the phenotypic covariance per bin on the linear ( < ) and quadratic ( < $ ) functions of the average genomic relationship. Coefficients for the quadratic term indicate the additive-by-additive variance for all traits was not significantly different from zero (p > 0.05/14; Supplementary Table 8), though the additive-by-additive term was nominally significant for both height and body fat percentage. Table 8. Additive-by-additive variance estimates for 14 traits using a weighted linear regression in the UK Biobank. Shown is the coefficient, standard error (s.e.) and two-sided p-value for the quadratic ( $ ) term (where t-value ~ tdf=4).  Figure  12. Contrary to expectations under additive-by-additive effects (i.e. Supplementary  Figure 11), a negative coefficient for $ was estimated for height. This seems driven predominately by monozygotic (MZ) twin pairs having lower covariance than expected under a linear model, and is likely to be an artefact of assortative mating. That is, assortative mating inflates the genetic variance in all close relative pairs except MZ twins. Caution is also needed for the interpretation of the (nominally) significant result for body fat percentage. An alternate model, for example, that included common environmental effects for full-sibs ( ~ 0.5) and monozygotic twins ( ~ 1.0) could also feasibly explain the non-linear increased covariance for this trait. Figure 12. Phenotypic covariance in close relatives for height (left) and body fat percentage (right) modelled with a quadratic function for the genomic relationship ( ). Red points show the observed data and 95% confidence intervals from the fitted model are indicated in grey.

Power to detect epistasis
We assess the power of our analysis to detect epistatic variation. We provide an analytical solution and verified these calculations via simulation.

Power to detect epistasis -analytical solution
The model fitted to the data was: = + where is the vector of average covariance per bin, is a the design matrix fitting an overall mean, linear and quadratic terms for the genomic relationship; is the vector of residual errors distributed ( , = $ ), and is diag(1/N) where N is the number of pairs per bin. The sampling variance of given by [ > 5 ] 51 , and the non-centrality parameter for the quadratic term is (b3/[ > 5 ] @,@ 51 ) 2 , with b3 = H 2 . Power was calculated in R using an F-distribution with 1,4 degrees of freedom and a type-1 error rate of 0.05.

Power to detect epistasis -simulation
Our simulations followed the model outlined in sections 2.1.1 and 2.1.2. That is, we simulated phenotypes as the sum of genetic (G) and environmental (E) effects using real genotypes from the UK Biobank, where simulated genetic effects were solely due to 5,000 pairs of unlinked loci with additive-by-additive genetic effects. We calculated the average phenotypic covariance in 7 relationship bins for close relatives (

SUPPLEMENTARY NOTE 3: Expectations from full-sib IBD regression analysis under assortative mating
This section summarises known theoretical results from the literature and derives new theoretical results for full-sib IBD regression, validated by simulation. The main finding is that for full-sib IBD regression, and by implication for its generalisation in the RDR method 14 , under assortative mating the expectation of the estimate of heritability is neither the population value under random mating nor the population (equilibrium) value under assortative mating.

Theory -change in genetic & phenotypic variance due to assortative mating
Assortative mating increases the genetic variance in a population, compared to a random mating or base population, and therefore also the phenotypic variance. After several generations of assortative mating equilibrium conditions are reached where the genetic and phenotypic variance remain constant (but increased compared to a random mating or base population). This section of the supplementary note sets out the expected variances after 1 generation of assortative mating, and at equilibrium conditions; and how to convert from base (random mating) population to equilibrium variances (Supplementary Table 9). Theory is reproduced from Table 10.6 (pg. 176) of Falconer and Mackay 15 and Lynch and Walsh 12 (pg. 153-158). Note that y1 and y2 are the phenotypes of full-sibs 1 and 2 respectively.

Theory -expected values from full-sib IBD regression
The phenotype (y) can be modelled as: where 1 is a vector of 1's, μ is the overall mean, a is vector additive genetic effects distributed N(0,Aσ 2 a), c is the common environment for pairs of individuals from the same family distributed N(0,Eσ 2 c), e is the vector of residuals from the model distributed N(0,Iσ 2 e), A is a genetic relationship matrix with 1's on the diagonal and proportion of alleles identity-by-descent (IBD) on the off-diagonals for full-sibs, E is a family environment matrix with 1's on the diagonal and 1's on the off-diagonal for fullsib pairs, I is an identity matrix, σ 2 a is the additive genetic variance, σ 2 c is the variance attributed to family effects and σ 2 e is the residual variance. Thus var(y) = Aσ 2 a + Eσ 2 c + Iσ 2 e in the general case or, specifically for full-sibs, cov(y1,y2) = σ D $ + 9σ , $ [ii] where cov(y1,y2) is the phenotypic covariance for full-sib pairs and 9 is the mean IBD sharing for full-sibs (i.e. 9 = 0.5). The variance components can be estimated using restricted maximum likelihood (REML), e.g. in GCTA 16 , with the heritability obtained from the full-sib regression (h 2 FS) defined as σ , $ / (σ , $ + σ D $ + σ = $ ). The proportion of phenotypic variance explained by family environment ( Next, consider an equivalent model based on the cross-product of full-sib pair phenotypes and simple linear regression: where zi = y1y2 with y1 and y2 being the phenotypes of sib 1 and 2 from full-sib pair i, and μ * is the regression intercept, b is the regression slope, E is the genetic relationship (proportion of alleles IBD) for full-sib pair i and εi is the model residual. and where σ 2 a(RM) and σ 2 P(RM) are as defined above, or eq. (5) / eq. (7) in aforementioned reference.
We simulate assortative mating below to verify that the expected value of J , $ , the REML estimator of σ , $ , is the base or random mating genetic variance and thus, in the presence of assortative mating, that J D $ is inflated by the increase in genetic variance created by assortment (i.e. the variance created by gametic phase disequilibrium). We note that h 2 FS is then σ 2 a(RM) / (σ 2 a(EQ) + σ 2 c + σ 2 e); where the numerator is the random mating genetic variance and the denominator is the phenotypic variance in the current population. Thus, h 2 FS estimates neither the random mating (h 2 RM) nor equilibrium (h 2 EQ) heritability.

Simulation parameters
A total of 500 replicate simulations were conducted for one generation of either random or assortative mating. Parameters for the simulation are given in Supplementary Table 10

Simulation results -individual model
Supplementary Table 11 shows the mean (and standard error of the mean, s.e.m.) for 500 replicate simulations of the individual model with either random or assortative mating and analysed with REML in GCTA. The expected value for the full-sib covariance from the simulations is not significantly different from the expected value defined in Supplementary Table 9 [cov(y1,y2) = ½σ 2 a(RM)(1 + rhRM 2 ) = 0.350 or 0.497], either under random or assortative mating schemes. Assortative mating increases the phenotypic covariance between full-sib pairs.
The estimated variance attributed to additive genetic effects ( J , $ ) under random or assortative mating was not significantly different from the simulated genetic variance in the base population ( ,(./) $ = 0.7). The variance attributed to family effects ( J D $ ) is ~ 0 under random mating, and approximately equal to the increase in genetic variance due to assortment when r = 0.6 (i.e. D $ = σ 2 a1 -σ 2 a(RM) = 0.85 -0.7 = 0.15). We confirm that ℎ 0 F! $ under this model is the random mating genetic variance scaled by the phenotypic variance in the current population (h 2 FS = σ 2 a(RM) / σ 2 P1 = 0.61).
Supplementary Table 11. Mean (and standard error of the mean, s.e.m.) for variance components from the individual model estimated using REML from 500 replicate simulations. s.e.m.

SUPPLEMENTARY NOTE 4: Expectations of genetic parameters from different experimental designs
This supplementary note describes the expectations of heritability estimates from different experimental designs under assortative mating. We convert each estimate from this study and classic twin estimates from the literature into two comparable parameters, (i) the equilibrium heritability (ℎ *+ $ ), and (ii) the random mating genetic variance divided by the equilibrium phenotypic variance [i.e. ℎ *+ $ (1 − ℎ *+ $ ), under some assumptions and where r is the phenotypic correlation between mates]. The experimental designs examined are a classic twin design, full-sib IBD regression analysis, and estimates from complex pedigrees (or close relatives).

A note on converting between random and assortative mating variance parameters
Under the assumptions of (i) no common environmental variance, (ii) no non-additive genetic variation and (iii) a population in assortative mating equilibrium, we know the relationship between the random mating (RM) and assortative mating (AM) variance parameters. From Supplementary Table 9, This can also be written as equilibrium parameters in terms of RM parameters 19 , Note that when under random mating = 0, = 1 and ℎ *+ $ = ℎ ./ $ .
We can now write the expected value of the estimates from twin studies, from full-sib IBD regression and from complex pedigree analysis in terms of either RM or AM parameters.

Assumed value for r, the correlation between mates
In subsequent sections of this supplementary note, we adjust estimates from twin studies and full-sib IBD regression for assortative mating. For adjustments, we need to assume a value for r, the correlation between mates. We chose to use estimates independent of our estimate of ̂ from close relatives, and therefore obtained phenotypic correlations of spousal pairs from the literature. The correlation between mates for height was assumed to be 0.

Expectations from twin studies
Let r(MZ) and r(DZ) be the monozygotic and dizygotic twin correlations, σ 2 a(EQ) the additive genetic variance at equilibrium, σ 2 c the common environmental variance, σ 2 d the non-additive genetic variance, σ 2 P(EQ) the phenotypic variance at equilibrium, and ℎ *+ $ , *+ $ and *+ $ be the proportion of equilibrium phenotypic variance explained by additive genetic, non-additive genetic and common environmental variance. Then following Falconer and Mackay 15 (Table 10.5, p. 172) and Lynch and Walsh 12 (Table  7 Supplementary  Table 13 we apply these corrections to recent meta-analysis of twin correlations for height 24 and EA 25 , assuming the correlations between mates ( ) as described in section 4.2.

Expectations from complex pedigrees
In the main text, the inflation factors for correlation between relatives with a relationship of = 1/( + 1) is approximately (1 + ℎ *+ $ ) L , where is the number of meiosis apart ( = 0 for MZ twins). We model the effects of AM explicitly in the main paper and estimate ℎ *+ $ for traits with evidence of AM (i.e. height and EA). In the general case there is no simple solution as the expectation is averaged over different relative pairs and so depends on the specific experimental design. However, if we assume that the population consists of relative pairs with =

SUPPLEMENTARY NOTE 5: Model parameterisation with indirect genetic effects
The purpose of this section is to show how highly parameterised models can become that attempt to draw inference on direct and indirect genetic effects on complex traits and, therefore, that caution needs to be applied when simplified models are used that draw strong conclusions. We begin by noting that there is a strong theoretical body of literature on indirect (associative) effects, as summarised in Lynch and Walsh 12 for maternal effects and in much more detail and with more generality in Walsh and Lynch 26 . The primary literature is from quantitative genetics applied to plant and animal breeding and from evolutionary genetics.
We focus on the simple design of a nuclear family with 2 siblings and assume that we have perfect genetic transmission and phase information. We assume the absence of non-additive genetic variation and non-nuclear (cytoplasmic) inheritance and note that their effects would require additional parameters in the model and would induce confounding with other parameters. We also assume a variance component model, which models composite effects of unobserved or unknown traits in relatives (parents, sib) that influence the phenotype of interest in the focal individual. An alternative approach is a trait-based model, for which the traits in relatives that affect the phenotype of interest in the focal individual need to be known 26 .

Gametic model
Let subscripts T, N, M, P and S denote transmitted, non-transmitted, maternal, paternal and sibling, and x be the (0-1) indicator variable for an allele. The phenotype y of one of the two siblings can be written as a function of transmitted and nontransmitted effects, and their interactions, y = bTMxTM + bTPxTP + bNMxNM + bNPxNP + bSMxSM + bSPxSP + bTMTPxTMxTP + ….. + bSMSPxSMxSP + e There are 6 main allelic effects in this model and 15 pairwise interactions. A number of the interactions have a clear interpretation, for example the effect bTMTP is a parentof-origin effect. The variance of y is partitioned into the contribution of the 6 main effects, their interactions and the contribution a large number of covariance terms. Again, some of these covariance terms have a clear interpretation, for example cov(xTM, xTP), cov(xTM, xNM), cov(xTM, xNP) and cov(xNM, xNP) can all be non-zero when there is gametic phase disequilibrium, e.g. under assortative mating.

Whole genome variance component model
We model the phenotype of an offspring as a function of direct (d) and indirect effects, with the indirect effects partitioned into a Maternal, Paternal and Sibling effect. When considering the phenotype of the offspring, these indirect effects are environmental sources of variation, even though the unknown traits through which the indirect effects operate can have genetic components (A).
Even without interactions, there are 7 variance components (plus a residual) and up to 21 covariances to estimate.
In this model, Apar includes the effect of the environment of the focal (offspring) individual that is correlated with the average genotype of the parents.

Estimation of genetic variance from sibling design ("full-sib IBD regression")
If we have only 2 sibs per family and there is an indirect (social, subscript s) effect of one sibling on the other, then, following the notation and derivations in Walsh and Lynch 26 (chapter 22): y1 = Ad1 + Ed1 + As2 + Es2 and likewise y2 = Ad2 + Ed2 + As1 + Es1 (Bijma et al. 27 define a 'total breeding value' as ATi = Adi + Asi).
Let π be the correlation in additive genetic values between the sibs. This could be the expected value (½) or the realised IBD value. It follows that var(y) = var(Ad) + var(Ed) + var(As) + var(Es) + 2πcov(As, Ad), and cov(y1, y2) = π[var(Ad) + var(As)] + 2cov(As, Ad) Therefore, the expectation of the estimate of additive genetic variance of the sib-IBD regression is [var(Ad) + var(As)] and the intercept is 2cov(As, Ad).
This ignores the contribution of parental indirect effects and the contribution of a common environmental component.