Epistasis between FLG and IL4R Genes on the Risk of Allergic Sensitization: Results from Two Population-Based Birth Cohort Studies

Immune-specific genes as well as genes responsible for the formation and integrity of the epidermal barrier have been implicated in the pathogenesis of allergic sensitization. This study sought to determine whether an epistatic effect (gene-gene interaction) between genetic variants within the interleukin 4 receptor (IL4R) and filaggrin (FLG) genes predisposes to the development of allergic sensitization. Data from two birth cohort studies were analyzed, namely the Isle of Wight (IOW; n = 1,456) and the Manchester Asthma and Allergy Study (MAAS; n = 1,058). In the IOW study, one interaction term (IL4R rs3024676 × FLG variants) showed statistical significance (interaction term: P = 0.003). To illustrate the observed epistasis, stratified analyses were performed, which showed that FLG variants were associated with allergic sensitization only among IL4R rs3024676 homozygotes (OR, 1.97; 95% CI, 1.27–3.05; P = 0.003). In contrast, the effect of FLG variants was masked among IL4R rs3024676 heterozygotes (OR, 0.53; 95% CI, 0.22–1.32; P = 0.175). Similar results were demonstrated in the MAAS study. Epistasis between immune (IL4R) and skin (FLG) regulatory genes exists in the pathogenesis of allergic sensitization. Hence, genetic susceptibility towards a defective epidermal barrier and deviated immune responses could work together in the development of allergic sensitization.


Genotyping - Isle of Wight birth cohort study
DNA was extracted from blood or saliva samples from cohort subjects (n = 1,211). FLG variants R501X, 2282del4, and S3247X were selected for genotyping. DNA samples were interrogated using GoldenGate Genotyping Assays (Illumina, Inc., San Diego, CA) on the BeadXpress Veracode platform (Illumina, Inc., San Diego, CA) per Illumina's protocol. In brief, samples were fragmented and hybridized to the pool of allele-specific primer sets. Following an extension/ligation reaction, the samples were hybridized to the Veracode bead pool and processed on the BeadXpress reader. Data were analyzed using the genotyping module of the GenomeStudio Software package (Illumina, Inc., San Diego, CA). DNA from each subject, plus 37 replicate samples, was analyzed, for a total of 1,248 samples. The quality threshold for allele determination was set at a GenCall score > 0.25 (scores ≤ 0.25 were "no calls"), with n = 1,227 samples (98.3%) retained for further analysis. Analysis of each locus included reclustering of genotyping data using our project data to define genotype cluster positions, with additional manual reclustering to maximize both cluster separation and the 50th percentile of the distribution of the GenCall scores across all genotypes (50% GC score). Participants were classified as having an FLG loss-of-function defect if they carried the minor allele for at least one of the following FLG null variants: R501X, 2282del4, or S3247X.
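The combined FLG classification described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function name, the dictionary input of minor-allele counts, and the handling of no-calls (a participant with no observed minor allele and at least one uncalled variant is left unclassified) are assumptions made for the example.

```python
from typing import Dict, Optional

# The three FLG null variants named in the text.
FLG_VARIANTS = ("R501X", "2282del4", "S3247X")


def flg_loss_of_function(minor_allele_counts: Dict[str, Optional[int]]) -> Optional[bool]:
    """Classify one participant as an FLG loss-of-function carrier.

    `minor_allele_counts` maps each variant name to 0, 1, or 2 copies of the
    minor allele, or None for a no-call (hypothetical input format).
    Returns True if at least one variant carries a minor allele, False if all
    three variants were called with zero minor alleles, and None otherwise
    (assumed missing-data rule, not stated in the source).
    """
    counts = [minor_allele_counts.get(variant) for variant in FLG_VARIANTS]
    if any(count is not None and count >= 1 for count in counts):
        return True
    if all(count is not None for count in counts):
        return False
    return None


if __name__ == "__main__":
    participant = {"R501X": 0, "2282del4": 1, "S3247X": 0}
    print(flg_loss_of_function(participant))
```

Collapsing the three variants into a single present/absent indicator corresponds to treating FLG under a dominant model, as the interaction analysis later does.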
For the IL4R SNPs, an efficient genotype tagging scheme was developed that gave priority to variants that 1) showed strong association with asthma in the Isle of Wight birth cohort, and/or 2) have been reported by others to be associated with asthma/allergy, and/or 3) have functional importance. A literature search for the IL4R gene plus asthma and allergy was used to identify associated variants (SNPs, indels). Functional variants included those that were nonsynonymous, located in conserved DNA, and/or present in DNA regions with gene regulatory potential. Tagger, as implemented in Haploview 3.2, was used with Caucasian HapMap data to develop a tagging scheme for the IL4R gene region, including 10 kb upstream and downstream of the gene. An r² value of 0.85 was the threshold for tagging, and one-, two-, and three-SNP marker combination tests were used. The result was an efficient number of genotyped variants (n = 13) that would provide the information needed to statistically support or exclude the gene in its association with asthma outcomes.
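To make the tagging threshold concrete, the sketch below computes the pairwise linkage disequilibrium measure r² between two biallelic SNPs from phased haplotype counts and checks it against the 0.85 threshold. It is only an illustration of the standard r² formula, not of Tagger's internals: the function names, the haplotype-count input, and the use of "greater than or equal to" at the threshold are assumptions.

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """Pairwise r^2 between two biallelic SNPs from haplotype counts.

    n_AB counts haplotypes carrying allele A at SNP 1 and allele B at SNP 2,
    n_Ab counts A with b, and so on (hypothetical input format).
    r^2 = D^2 / (p_A * (1 - p_A) * p_B * (1 - p_B)), with D = p_AB - p_A * p_B.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    d = p_AB - p_A * p_B
    return d * d / denominator


def is_tagged(r_squared: float, threshold: float = 0.85) -> bool:
    """Assumed rule: a SNP counts as captured when r^2 meets the threshold."""
    return r_squared >= threshold


if __name__ == "__main__":
    r2 = ld_r_squared(n_AB=50, n_Ab=0, n_aB=0, n_ab=50)
    print(r2, is_tagged(r2))
```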
Thirteen IL4R SNPs were genotyped (Table S5). DNA from each subject, plus 37 replicate samples genotyped for control purposes, was analyzed, for a total of 1,248 samples (trios were not available). The quality threshold for allele determination across samples was set at a GenCall score > 0.25, with n = 1,227 samples retained for further analysis. Analysis of each locus included reclustering of genotyping data using our project data to define genotype cluster positions, with additional manual reclustering to maximize both cluster separation and the 50th percentile of the distribution of the GenCall scores across all genotypes (50% GC score). The GenCall score is a quality metric of the Illumina GenomeStudio software that indicates the reliability of the genotypes called on SNP arrays, with scores ranging from 0.0 to 1.0. The proprietary algorithm for converting raw allele intensities into genotypes considers the angle of the clusters, the dispersion of clusters, the overlap between clusters, and intensity. Genotypes with lower GenCall scores are located farther from the center of the cluster.
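The no-call rule can be illustrated with a short sketch. It assumes that genotype calls are available as (genotype, GenCall score) pairs, which is a hypothetical data layout and not the GenomeStudio export format; only the 0.25 threshold comes from the text.

```python
from typing import List, Optional, Tuple

GENCALL_THRESHOLD = 0.25  # threshold stated in the text; scores <= 0.25 are no-calls


def apply_gencall_threshold(
    calls: List[Tuple[str, float]], threshold: float = GENCALL_THRESHOLD
) -> List[Optional[str]]:
    """Keep a genotype only when its GenCall score exceeds the threshold.

    `calls` is a list of (genotype, score) pairs (hypothetical layout).
    Genotypes at or below the threshold are replaced by None ("no call").
    """
    return [genotype if score > threshold else None for genotype, score in calls]


def call_rate(filtered: List[Optional[str]]) -> float:
    """Fraction of entries that remain called after filtering."""
    if not filtered:
        raise ValueError("no calls supplied")
    return sum(genotype is not None for genotype in filtered) / len(filtered)


if __name__ == "__main__":
    example = [("AA", 0.90), ("AB", 0.25), ("BB", 0.26)]
    filtered = apply_gencall_threshold(example)
    print(filtered, call_rate(filtered))
```

As a check on the reported retention, 1,227 of 1,248 samples corresponds to 98.3%.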

Genetic risk models and interaction analysis
Genetic association studies examine relationships between the presence of a genotype (genetic factors) and a phenotype (traits). A genotype is therefore the exposure of interest in genetic epidemiology. Since there are three genotypes for each SNP (e.g., AA, AB, and BB), their analysis requires some manipulation. Assume that one of the alleles (allele B) is associated with increased risk of disease; the other allele (allele A) is then a marker for protection (baseline risk). In practice, statistical analysis of genetic associations is done assuming that the uncommon (minor) allele is the risk marker. For analysis purposes, the three genotypes are coded according to the genetic risk model of interest. Commonly explored genetic risk models are the dominant, recessive, and additive models; to a lesser extent, some studies consider the 'heterosis' (over-dominant) risk model [1][2][3][4].
The dominant, recessive, and heterosis risk models require the three genotypes to be collapsed into two levels (present/absent; see Figure 1). To determine whether the dominant model fits the data, the heterozygote (AB) and variant homozygote (BB) genotypes are collapsed together and assumed to be the risk group, and the wild-type genotype (AA) forms the baseline/referent group. In the recessive model, the assumption is that the variant allele is associated with risk only when present in two copies. Hence, in the analysis for the recessive model, the variant homozygote (BB) genotype is considered the risk group and is compared to the referent group (AA and AB genotypes). In the heterosis (heterozygote advantage or over-dominant) model, the interest is in heterozygosity as opposed to the two homozygote genotypes: the AB genotype forms one group, and the homozygote genotypes (AA and BB) form the comparison category. The heterosis model is rarely explored in the human genetic association literature; however, it provides insights into genomic regions where heterozygote advantage might exist.
The only difference between these models is in which genotype or collapsed genotypes they consider to be the risk genotype: the dominant model considers AB and BB as risk genotypes; the recessive model considers BB as the risk genotype; and the heterosis model considers the AB genotype to have either higher or lower risk compared to the AA and BB genotypes.
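The three collapsed codings can be written as simple indicator functions. This is a minimal sketch using the AA/AB/BB labels from the text, with B as the variant (minor) allele; the function names and the 0/1 coding of the collapsed groups are illustrative assumptions.

```python
VALID_GENOTYPES = ("AA", "AB", "BB")  # B denotes the variant (minor) allele


def _check(genotype: str) -> None:
    """Reject anything other than the three genotype labels used in the text."""
    if genotype not in VALID_GENOTYPES:
        raise ValueError(f"unexpected genotype: {genotype!r}")


def dominant(genotype: str) -> int:
    """1 for AB or BB (risk group), 0 for AA (referent)."""
    _check(genotype)
    return int(genotype in ("AB", "BB"))


def recessive(genotype: str) -> int:
    """1 for BB (risk group), 0 for AA or AB (referent)."""
    _check(genotype)
    return int(genotype == "BB")


def heterosis(genotype: str) -> int:
    """1 for AB (heterozygote), 0 for AA or BB (homozygotes)."""
    _check(genotype)
    return int(genotype == "AB")


if __name__ == "__main__":
    for g in VALID_GENOTYPES:
        print(g, dominant(g), recessive(g), heterosis(g))
```

Each function maps the same three genotypes to a different two-level exposure, which is the only way the models differ.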
On the other hand, in some instances, genotypes are analyzed for a gene dosage effect using the 'additive' model (Figure 1). This model may be likened to the examination of association for different grades of an exposure, such as no smoking, moderate smoking, and heavy smoking.
Given the preceding background, we explored the dominant, recessive, heterosis, and additive genetic risk models for each of the 12 IL4R SNPs when assessing association with allergic sensitization (the outcome). This step was done to select the best-fitting genetic risk model for each of the 12 IL4R SNPs. Hence, for each SNP, four genetic risk models were tested in association with allergic sensitization, and the genetic model with the lowest QIC value was selected as the best-fitting model. Information related to this step is provided in Table S2, which shows the QIC values for each SNP under the different genetic risk models.
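The additive coding and the lowest-QIC selection rule can be sketched as below. Coding the additive model as the number of B alleles (0, 1, 2) is the usual convention and is assumed here. The QIC values in the usage example are made up for illustration and are not taken from Table S2; fitting the regression models that produce the QIC values is outside the scope of this sketch.

```python
from typing import Dict

ADDITIVE_CODES = {"AA": 0, "AB": 1, "BB": 2}  # count of variant (B) alleles


def additive(genotype: str) -> int:
    """Gene-dosage coding: number of copies of the variant allele."""
    if genotype not in ADDITIVE_CODES:
        raise ValueError(f"unexpected genotype: {genotype!r}")
    return ADDITIVE_CODES[genotype]


def best_fitting_model(qic_by_model: Dict[str, float]) -> str:
    """Return the name of the genetic risk model with the lowest QIC."""
    if not qic_by_model:
        raise ValueError("no models supplied")
    return min(qic_by_model, key=qic_by_model.get)


if __name__ == "__main__":
    # Hypothetical QIC values for one SNP (not from the study).
    qic_values = {
        "dominant": 1510.2,
        "recessive": 1512.8,
        "heterosis": 1508.9,
        "additive": 1511.4,
    }
    print(best_fitting_model(qic_values))
```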
After finding the best-fitting genetic risk model for each IL4R SNP, we tested multiplicative statistical interactions on the risk of allergic sensitization by including a product term in regression models between FLG variants (defined using the dominant model as present/absent) and IL4R SNPs (using the best-fitting model found in Table S2). Hence, we evaluated 12 models to determine whether there was statistical interaction between FLG variants and any of the 12 IL4R SNPs on the risk of allergic sensitization. The presence of multiplicative statistical interaction was assessed using the p-value associated with the interaction term (an FDR-adjusted p-value < 0.05 was considered statistically significant). Table 2 shows the p-values of the 12 evaluated interaction terms. Of note, we used the term 'epistasis' to refer to multiplicative statistical interactions (nonadditive effects) between genes at different loci.
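Two pieces of this step lend themselves to a short sketch. First, in a logistic model with a product term, logit P(sensitization) = b0 + b1·FLG + b2·IL4R + b3·(FLG × IL4R), the FLG odds ratio is exp(b1) in the IL4R reference stratum and exp(b1 + b3) in the other stratum. Second, the text does not name the FDR procedure, so the Benjamini-Hochberg step-up adjustment is assumed here. All coefficient and p-values in the example are hypothetical, and binary coding of the IL4R SNP is assumed.

```python
import math
from typing import List, Tuple


def stratum_odds_ratios(beta_flg: float, beta_interaction: float) -> Tuple[float, float]:
    """FLG odds ratios implied by a model with an FLG x IL4R product term.

    Returns (OR when the IL4R indicator is 0, OR when it is 1).
    """
    return math.exp(beta_flg), math.exp(beta_flg + beta_interaction)


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, in the original order.

    Assumed FDR method; the source only says "FDR-adjusted".
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only to show the arithmetic.
    print(stratum_odds_ratios(math.log(2.0), math.log(0.25)))
    # Hypothetical interaction-term p-values.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

An interaction term would then be flagged when its adjusted p-value falls below 0.05.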