Childhood gene-environment interactions and age-dependent effects of genetic variants associated with refractive error and myopia: The CREAM Consortium

Myopia, currently at epidemic levels in East Asia, is a leading cause of untreatable visual impairment. Genome-wide association studies (GWAS) in adults have identified 39 loci associated with refractive error and myopia. Here, the age-of-onset of association between genetic variants at these 39 loci and refractive error was investigated in 5200 children assessed longitudinally across ages 7–15 years, along with gene-environment interactions involving the major environmental risk-factors, nearwork and time outdoors. Specific variants could be categorized as showing evidence of: (a) early-onset effects remaining stable through childhood, (b) early-onset effects that progressed further with increasing age, or (c) onset later in childhood (N = 10, 5 and 11 variants, respectively). A genetic risk score (GRS) for all 39 variants explained 0.6% (P = 6.6E–08) and 2.3% (P = 6.9E–21) of the variance in refractive error at ages 7 and 15, respectively, supporting increased effects from these genetic variants at older ages. Replication in multi-ancestry samples (combined N = 5599) yielded evidence of childhood onset for 6 of 12 variants present in both Asians and Europeans. There was no indication that variant or GRS effects altered depending on time outdoors, however 5 variants showed nominal evidence of interactions with nearwork (top variant, rs7829127 in ZMAT4; P = 6.3E–04).

The refractive errors myopia and hyperopia are common visual disorders that typically require correction with spectacles, contact lenses, or refractive eye surgery. Myopia -particularly with increasing severity -is a leading cause of irreversible visual impairment and blindness due primarily to stretching and thinning of the ocular tissues within the posterior segment of the eye. These changes are associated with an increased risk of retinal detachment, chorioretinal atrophy, choroidal neovascularisation, myopic maculopathy, glaucoma and cataract 1,2 . Myopia is rare in infancy, usually developing during school age or in early adulthood 3 . For current generations of young adults, approximately 30-40% of individuals in Western countries 4,5 and 80% of those in urban areas of East Asia 6,7 have myopia.
Genome-wide association studies (GWAS) in primarily population-based samples [8][9][10][11][12][13][14] and next-generation sequencing (NGS) studies of carefully selected high myopia pedigrees harbouring extremely rare, high penetrance disease-causing mutations [15][16][17][18][19][20] have improved our understanding of the genetics of refractive error and myopia. To date at least 39 distinct loci harbouring common genetic variants showing genome-wide significant association with refractive error have been identified through GWAS. For the genetic variants that contribute most to the burden of myopia in the general population (i.e. the GWAS-identified variants) it is not yet known whether the variants act during very early life, childhood, or in adulthood. This is an important question given that knowledge of the time and mode of action of the causal variants at the associated loci is necessary for detecting children at-risk of myopia (who would benefit most from treatment intervention), and would aid the design of new therapies capable of halting myopia progression.
For environmental risk factors to which most children are exposed, inter-individual differences in genetic susceptibility may account for some of the phenotypic variance 21 . Exposure to nearwork, i.e. reading and other tasks requiring prolonged near vision, has long been proposed as an environmental risk factor for myopia to which children are ubiquitously exposed during their schooling. The total duration of reading, the period of continuous reading, the reading distance between the text and the eyes, and variation in nearwork exposure outside of the school day have each been shown to be associated with refractive error or myopia progression 22,23 . The other most strongly implicated environmental risk factor for myopia is insufficient time spent outdoors [24][25][26] , and it has been suggested that time spent outdoors and time spent performing nearwork activities together underlie the robust association between myopia and educational achievement 2,27 . Gene-environment (GxE) interactions -which in this project we define as marker-phenotype associations whose effects differ statistically depending on whether individuals have been exposed to a high vs. low level of an environmental risk factor -may contribute extensively to variation in disease susceptibility 28 . Given the recent identification of gene-environment interactions involving nearwork or level of education, a key question in myopia research currently is whether GxE interactions contribute to the rising prevalence of myopia and to the higher incidence rate observed in young Asian populations as compared to their European counterparts.
We carried out analyses of pediatric/adolescent cohorts collaborating in the Consortium for Refractive Error And Myopia (CREAM) to investigate whether the top index variants at the 39 loci previously identified in GWAS meta-analyses of adults have early-onset effects manifest during childhood. We also tested for evidence of GxE interactions involving either nearwork or time spent outdoors. A single large cohort with longitudinal measurements of refractive error over much of childhood was used for the primary analyses. Meta-analyses of cross-sectional samples were then used to test for replication.
Royal Victorian Eye and Ear Hospital, the University of Tasmania, and the Australian Twin Registry; WESDR, the Health Sciences Institution Review Board of the University of Wisconsin, Madison.
Participants underwent cycloplegic autorefraction (RAINE, TEST, BATS, GZT, SCORM, STARS) or non-cycloplegic autorefraction (ALSPAC) or subjective refraction (TEDS, WESDR) and the spherical equivalent refractive error averaged between the two eyes was calculated. Parental questionnaires that included items on time spent engaged in nearwork outside of school, and time spent in outdoor activities were used to classify children as spending a high or low amount of time performing nearwork (Table S6) or outdoors (Table S7) each day. Classification was done within each cohort separately, using a median split ("low" group, exposure below median level; "high" group, exposure above median level).
Genetic analysis. DNA samples obtained from blood or saliva were genotyped using either an Illumina or Affymetrix high-density single nucleotide polymorphism (SNP) array, and genotypes at untyped markers were imputed using the 1000-Genomes Project reference panel (see Table S5 for details). Stringent quality control procedures (e.g. imputation quality r 2 or info score > 0.5) were applied to each cohort separately (Supplementary Information). 39 SNPs that showed genome-wide significant association with refractive error in the general adult population in two previous GWAS analyses 8,9 were selected for evaluation (Table S1).

Cross-sectional models and meta-analyses.
For each of the 8 cross-sectional cohorts separately, single SNP tests of association with refractive error were conducted using the following linear regression model: where y i is the spherical equivalent refractive error of the i th participant, of age a i and sex s i and with g i their risk allele dosage on the scale 0-2 for the test SNP, and ε i the residual. Regression coefficients are indicated as β Age , β Sex , and β SNP for the model parameters age, sex and SNP genotype, respectively. Additional G x E interaction models were tested for samples with information available on environmental exposures, nearwork or time outdoors (both exposures coded: 0 = low, 1 = high). For the i th participant, using n i to denote nearwork and t i for time outdoors: Results from the individual cohorts were meta-analyzed in 5599 individuals comprising 5 cohorts of European ancestry (BATS, RAINE, TEDS, TEST, WESDR; N = 3,143; Table 1) and 3 cohorts of Asian ancestry (GZT, SCORM, STARS; N = 2,456; Table 1) using a weighted inverse-variance, fixed effects model 29 . A random effects model was used if Cochran's Q-test for heterogeneity yielded a P-value below 0.05.

Longitudinal study (ALSPAC).
Refractive error was included in the clinical assessments for ages 7, 10, 11, 12 and 15 years in ALSPAC children 30 . Linear mixed models for refractive trajectory were fit as described 30 using the nlme package in R 31 for individuals (N = 5,200; Table 1) who underwent at least 3 refractive assessments and whose genotype data passed quality control filters (as described in the Supplementary Information). Briefly, SNP dosage, age and higher-order age terms (age 2 and age 3 ) were modelled as fixed effects while for each child, the difference from the average refractive error at baseline and the linear rate of change in refractive error were modelled as individual-level random effects, using an autoregressive correlation structure. To examine GxE interactions, initially, 3-way interaction models were tested that included the interaction between SNP, change-from-baseline in age, and environmental exposures (nearwork or time outdoors). If the P-value for the 3-way interaction was >0.05 then models including only 2-way interactions were tested. Quanto 32 was used to gauge the power to detect main and interaction effects in the ALSPAC cohort. These calculations assumed a minor allele frequency (MAF) of 0. 25 739 participants with missing information about time spent performing nearwork), a binary exposure affecting 39% of the cohort (equivalent to that for high vs. low nearwork exposure in ALSPAC) and a refractive error distribution with a mean of zero and a standard deviation of 1.50 D. The estimated power would be conservative given that a linear mixed model analysis will have greater power than a linear model analysis.
Genetic risk score for all 39 SNPs. A genetic risk score was computed by summing the dosage of risk alleles for all 39 SNPs. In individuals of Asian ancestry only 31 of the 39 SNPs were polymorphic (MAF > 0.05) and therefore contributed to the genetic risk score calculation. The frequency distribution of genetic risk score in each sample was normally distributed with a mean of 36 (95% C.I. 29 to 42) alleles in Europeans and 40 (95% C.I. 37 to 42) alleles in Asians. To calculate the variance in refractive error explained by the genetic risk score at a specific age for participants in the ALSPAC cohort, refractive error at age 7.5 years (or at age 15 years) was regressed on genetic risk score using a linear model. The covariates age and sex were not associated with refractive error when included in the age 7.5 or the age 15 year model, and their inclusion did not improve the fit of either model (note that being a birth cohort, the age range was narrow). Hence these covariates were omitted. The variance explained by the genetic risk score was therefore taken as the R 2 value for a model that included the genetic risk score as the only predictor variable.
Pathway analysis. The genes ( Table 2) implicated in having early-onset effects (N = 10 genes) or later-onset effects (N = 11 genes) in the ALSPAC discovery sample were evaluated using PANTHER Version 10.0 (release date May 15, 2015) 33 and DAVID Version 6.7 (release date 27 Jan, 2010) 34 to identify potential functional pathways.

Early-onset and later-onset effects in childhood.
Nine cohorts of children/adolescents were studied ( Table 1). The largest of these, ALSPAC (N = 5,200), which had longitudinal data for refractive error, was used for discovery analyses, and 8 cross-sectional cohorts were used for validation. The discovery cohort had ~80% power to detect an association for a SNP with an effect size of 0.1 D and MAF of 0.25. Of the 39 SNPs examined, 16 showed evidence of onset in childhood ( Table 2 and Table S2). Early-onset associations already manifest at 7.5 years of age were present for 10 SNPs (P = 4.8E-02 to P = 5.3E-03). Later-onset associations that emerged between the ages of 7.5 and 15 were noted for 11 SNPs (P = 4.9E-02 to 8.8E-04 for SNP x Age interaction). Five SNPs showed a main effect at baseline as well as later progressive effects. Examples of SNPs showing evidence of early-onset and later-onset effects are presented in Fig. 1 for early-onset CHRNG SNP rs1881492, later-onset A2BP1 (also known as RBFOX1) rs17648524, and PRSS56 rs1656404 with both effects. For all associated SNPs the "direction of effect" was the same as in the original GWAS 8,9 .
The genetic risk score was very strongly associated with refractive error both at 7.5 years of age (β = − 0.018 D, 95% CI − 0.012 to − 0.024, P = 2.2E-9) and with increasing age (β = − 0.003 D/yr, 95% CI − 0.002 to − 0.004, P = 5.8E-14). By the age of 15 years, the model suggested that the 39 SNPs together would produce a more than 1.0 D difference in refractive error between participants carrying the lowest and highest number of risk  alleles observed (Fig. 2). At age 7.5 years the genetic risk score explained 0.6% of the variation in refractive error (N = 4,566; P = 6.6E-08); at age 15 years the corresponding figure was 2.3% (N = 3,666; P = 6.9E-21). For validation we tested the genetic risk score and 12 of the 16 above SNPs (4 were nearly monomorphic in Asians) in the 8 multi-ethnic cross-sectional study cohorts (combined N = 5,599; Table 1). The average age of the participants varied from 6.6 years-old in the STARS cohort to 20.0 years-old in RAINE. The genetic risk score and 4 SNPs -rs7744813 (KCNQ5), rs7837791 (TOX), rs8000973 (ZIC2) and rs17648524 (A2BP1) -were associated with refractive error (P < 0.05; Table 3). All 4 SNPs had the expected direction of effect and none exhibited evidence of between-cohort heterogeneity. Interestingly, 3 of the 4 SNPs had evidence of both early-onset and later progressive effects in the discovery cohort. Meta-analysis summary plots for the genetic risk score and the individual SNPs tested for replication are presented in Figure S1. There was suggestive evidence that SNPs had larger effect sizes in Asian than in European ancestry participants ( Figure S2). Interactions with time engaged in nearwork. Two types of interactions between SNP genotype and nearwork exposure were evaluated in the ALSPAC discovery cohort: An interaction already present at the baseline age of 7.5 years-old (a 2-way SNP x nearwork interaction) and an interaction that developed progressively during later childhood (a 3-way, SNP x nearwork x age-from-baseline interaction). For a SNP with a risk allele frequency of 0.25, and ignoring the repeated measures nature of the data, the analysis of ALSPAC participants had > 90% power to detect an interaction effect of 0.25 D at α = 0.05 (and > 50% power at α = 1.28E-3, corresponding to a Bonferroni correction for testing all 39 SNPs).
Four of the cross-sectional study cohorts, 1 of European ancestry (TEDS) and 3 of Asian ancestry (GZT, SCORM and STARS), had information available regarding the time participants spent engaged in nearwork (Table S6), allowing tests for replication. In the meta-analysis of all 4 replication studies (Table S3) none of the SNPs that showed nominal evidence of an interaction with nearwork in the ALSPAC discovery cohort showed evidence of replication (all P > 0.16). Likewise, the genetic risk score did not show evidence of an interaction with nearwork in the cross-sectional cohorts (P = 0.49).

Interactions with time spent outdoors. In the discovery cohort, only rs13091182 within ZBTB38
showed nominal evidence of a 3-way interaction involving time outdoors (uncorrected P = 0.028; corrected P > 0.05; Fig. 3f). Surprisingly, the risk allele of rs13091182 was associated with slower progression towards myopia (or less hyperopia) in general and with faster progression towards myopia in children who spent more time outdoors, suggesting a potentially false-positive result. There was no evidence for 2-way SNP x time outdoors interactions (uncorrected P > 0. 20 Table S4). Similarly, the genetic risk score did not show evidence of an interaction with time spent outdoors in the replication cohorts.
Pathway analysis. Pathway analysis identified a single functional pathway for the set of 10 genes (Table 2) implicated in having early-onset effects, namely "hedgehog signalling" (Panther P = 0.043; key genes ZIC2 and BMP4). The set of 11 genes implicated in having later-onset effects did not show enrichment for specific pathways.

Discussion
Early-onset and later-onset SNP effects. Sixteen SNPs showed evidence of effects in childhood in ALSPAC participants (Table 2); 10 SNPs had early-onset effects manifest by age 7.5 years, 11 SNPs had later-onset effects, and 5 SNP had early-onset effects that progressed further during later childhood. For the 12 of these 16 SNPs available in the cross-sectional cohorts, 4 showed evidence of replication (Table 3). There was suggestive evidence that SNP effect sizes were approximately 2 times larger in Asian as compared to European ancestry children/adolescents ( Figure S2). A genetic risk score that captured the effects of all 39 GWAS-identified variants confirmed the involvement of genetic influences acting at an early age (7.5 years) and then increasing further in magnitude across later childhood.
We sought to discover whether the early-onset and later-onset variants clustered according to functional pathway (for example, if GWAS SNPs A and B are causal variants that affect the expression levels of genes X and Y, respectively, and X acts downstream of Y to regulate refractive development, then one might expect the onset age for SNPs A and B to coincide). However, as summarised in Table 4, SNPs associated with early-onset or later-onset effects did not clearly cluster according to the known function(s) of the genes implicated in mediating the SNPs' effects. Pathway analysis confirmed this impression, with only a single functional pathway being identified. Potential reasons for this lack of functional clustering are, first, that many genes in the genome have diverse functions, which are sometimes poorly understood. For instance, during development of the human visual system, an ion channel may play a vital role during early embryonic development of the retina, be a necessary component of the visual cycle, and yet also contribute to neuronal plasticity. Second, precisely which gene or genes mediate the effect of a specific GWAS-identified SNP is not known with certainty for any of the refractive error GWAS SNPs identified to date: While the nearest gene to a GWAS SNP is usually considered the most likely to be involved, this does not always hold true 35 .
The 39 SNPs examined were identified in adult GWAS meta-analyses with sample sizes of approximately 45,000 individuals, and all had small effects (typically 0.1 D per copy of the risk allele). The ALSPAC longitudinal cohort (N = 5,200) had ~80% power to detect an association for a SNP with an effect size of 0.1 D and MAF of 0.25 (but note that the true power would likely have been lower because: refractive development would not be complete by 15 years of age, our models tested primarily for yearly effects rather than cumulative effects, and the "winner's curse" phenomenon 36 , i.e. the over-estimation of effect sizes in the original GWAS investigations). Therefore, a likely reason why some of the 39 SNPs we studied failed to show childhood-onset associations in the longitudinal cohort is limited statistical power. Thus, we cannot conclude that the SNPs that did not show observable childhood-onset associations have an age-of-onset beyond 15 years-old even though they might well do: much larger studies will be required to definitively address this issue. Similarly, the limited concordance between the longitudinal and cross-sectional studies was also likely due to limited statistical power, although 8 of the 12 SNPs tested for replication showed the expected direction of effect (Table 3).   Table 3. Replication meta-analysis results for SNP main effects. SNPs associated with refractive error in the ALSPAC age-of-onset analyses were tested for association with refractive error in 8 independent cohorts of children (5 European ancestry, 3 Asian ancestry). Abbreviations: Chr = Chromosome. GR Score = Genetic risk score. RA = Risk allele. RAF = Risk allele frequency. *SNPs with minor allele frequencies < 0.05 were not examined due to low statistical power. Interactions with environmental exposures. In general there was scant evidence for GxE interactions, especially for SNP x time spent outdoors effects. Given the expected power of > 90% to detect interaction effects with a magnitude 0.25 D or more, this argues against SNP x nearwork or SNP x time outdoors interactions of this size being present for the majority of variants studied, rather than lack of statistical power precluding their discovery.
In the ALSPAC longitudinal analysis the gene-environment interaction between ZMAT4 SNP rs7829127 genotype and nearwork survived correction for multiple-testing (P corr = 0.025). Although this interaction was not replicated in the cross-sectional meta-analyses, variants at this locus have previously been reported to show an interaction with the duration of education in a meta-analysis of 5 studies from Singapore (SNP x education interaction = − 0.42 D, 95% C.I. − 0.15 to − 0.69, P = 0.002) 37 . We did not explore interactions between SNPs and years of education, since in several cohorts the participants were still students. The functional role of ZMAT4 is not known.
Why might GxE interactions involving these 39 SNPs be so scarce? First, differences in environmental risk exposures were not considered in the original GWAS investigations carried out by CREAM 9 and 23andMe 8 . Thus, SNPs with strong interaction effects but no main effects may not have been detected using those GWAS designs. Second, the age range and ethnic diversity of the original GWAS discovery samples were highly varied. Given the substantial increase in the prevalence of myopia in the past few decades, which strongly implicates a major role for environmental risk factors, it seems almost certain that the individuals studied in the CREAM and 23andMe GWAS meta-analyses would have grown up in environments with a wide range of risk exposure profiles depending on the participants' years of birth: young (recently born) individuals would have been exposed to a much more myopiagenic environment than older (more distantly born) adults. Therefore, a variant that increases the risk of myopia only in children who perform excessive nearwork may have shown an (apparent) main effect association with refractive error in a GWAS carried out in a young adult cohort, in which participants were ubiquitously exposed to high nearwork during childhood. However, this same variant may not have shown an association with refractive error in a GWAS on an older cohort, due to the lower nearwork exposure during childhood of the older individuals. Thus, support for the association of such a variant in the CREAM and 23andMe GWAS samples may have been diluted rather than strengthened during the meta-analysis of younger and older cohorts.
Separate from tests for gene-environment interactions, time spent outdoors itself was not associated with myopia in 3 of the 5 cross-sectional studies (GZT, STARS, and TEDS) and the association was of borderline significance in another (TEDS). This lack of an association with time outdoors implies that detecting a SNP x time outdoors interaction would also have been challenging, even after meta-analysis of data from all 5 cohorts.
Interestingly, a large-effect GxE interaction predisoposing children to myopia was identified recently, involving a rare variant at the APLP2 gene locus and time spent reading 38 . APLP2 was implicated in myopia   development through studies in an animal model 39 , which -given the statistical challenge of identifying GxE interaction effects in human populations -suggests that combining findings from animal models and human studies could be a fruitful future approach. We reasoned that correction for multiple testing was not appropriate when examining the age-of-onset of the 39 SNPs investigated, because of compelling existing evidence that by adulthood these SNPs truly are associated with refractive error. That is, our analyses sought to discover whether or not each SNP had an effect during childhood, not whether a group of candidate SNPs were associated with refractive error per se. By contrast, in view of very limited evidence for interactions with environmental exposures for most of the SNPs examined, correction for multiple testing was considered appropriate when evaluating SNP x nearwork and SNP x time outdoors interactions: In these analyses, a large number of independent hypothesis tests were carried out, with little or no prior knowledge that an interaction must be present at some age.
Limitations of the present work. The present work had a number of other limitations. The cross-sectional samples were not matched for age, which prevented us from testing for "early" and "later" onset effects in the replication stage. The level of exposure to nearwork and time outdoors also varied across samples, which meant that imprecisely-matched interaction effects were meta-analysed, potentially reducing statistical power. We chose to categorise time spent performing nearwork and time spent outdoors relative to the median activity level in each study sample because the measurement scales used in the various studies were not standardised (precluding the use of an absolute measure). If in reality these environmental risk factors exert their influence non-linearly -for instance if spending more than a certain threshold number of hours per day outdoors is needed to protect against myopia development -then our approach may have poorly captured the effects of the environmental exposures. For the combined meta-analysis of European and Asian cross-sectional studies, we assumed that each lead SNP tagged the underlying causal variant(s) equally well in European and Asian ancestry individuals, which is an oversimplification. Finally, we chose to examine only a simple, binary GxE model, whereas more complex scenarios may exist [40][41][42] .

Conclusions
Specific myopia-predisposing SNPs were found to differ in the age at which they had their effects, and whether or not these effects got progressively stronger during later childhood. Thus, SNPs implicating the genes CHRNG, CACNA1D, LAMA2, CYP26A1 and BMP4 were associated with early onset changes in refractive error that did not progress further, while SNPs close to PRSS56, KCNQ5, TOX, ZIC2 and SHISA6 showed early-onset effects that became greater still at older ages. Effects that only appeared in later childhood -after the age of 7.5 yearsimplicated the genes CD55, CHD7, RORB, KCNMA1, A2BP1 and GJD2. Gene-environment interactions involving nearwork or time outdoors were rare or absent for the vast majority of the GWAS-identified SNPs, and indeed a genetic risk score that demonstrated very convincing association with early-onset (P = 2.2E-9) and later progressive (P = 5.8E-14) changes in refractive error appeared to act independently of the time children spent in these activities. However, one robust interaction between rs7829127 in ZMAT4 and time spent performing nearwork (nominal P = 6.3E-04, corrected P = 0.025) was observed, replicating a previously-identified interaction involving rs7829127 and years of education 37,43,44 .