Genome-wide association meta-analysis of corneal curvature identifies novel loci and shared genetic influences across axial length and refractive error

Corneal curvature, a highly heritable trait, is a key clinical endophenotype for myopia - a major cause of visual impairment and blindness in the world. Here we present a trans-ethnic meta-analysis of corneal curvature GWAS in 44,042 individuals of Caucasian and Asian with replication in 88,218 UK Biobank data. We identified 47 loci (of which 26 are novel), with population-specific signals as well as shared signals across ethnicities. Some identified variants showed precise scaling in corneal curvature and eye elongation (i.e. axial length) to maintain eyes in emmetropia (i.e. HDAC11/FBLN2 rs2630445, RBP3 rs11204213); others exhibited association with myopia with little pleiotropic effects on eye elongation. Implicated genes are involved in extracellular matrix organization, developmental process for body and eye, connective tissue cartilage and glycosylation protein activities. Our study provides insights into population-specific novel genes for corneal curvature, and their pleiotropic effect in regulating eye size or conferring susceptibility to myopia.

R efractive error is common worldwide and particularly so in Asia, where uncorrected refractive error is one of the major causes of visual impairment and blindness 1,2 . In 2015, uncorrected refractive error caused moderate or severe visual impairment in 116 million people, and blindness in 7.4 million people-these figures are expected to rise to 128 million and 8.0 million, respectively, by 2020 3 . Thus there is a critical need to better understand the genetic basis of how different optical components may contribute to ammetropia, for which corneal curvature represents a main endophenotype.
Corneal curvature (CC) is a key clinical endophenotype for the refractive status of the eye. The corneal air-tissue interface provides approximately two-thirds of the eye's optical power 4 . Thus, changes in the CC significantly affect refractive error, such as myopia. A steeper CC was associated with a more negative/ myopic refractive error. In the emmetropic eye, the refractive power of the eye's optical components (such as CC) must be appropriate to its axial length (AL). If the changes in CC, AL, or other ocular components such as lens thickness or anterior chamber depth are not aligned, refractive errors (myopia or hyperopia) are likely to occur.
Clinically, CC associates with ethnicity 5,6 , age 7 , and anthropometric features (height and weight) 7 . Based on family and twin studies, CC is highly heritable, with 35-95% of inter-individual CC variation attributed to genetic factors [8][9][10][11][12] . Previous genomewide association studies (GWASs) have been successful in identifying more than 160 loci associated with refractive error [13][14][15] , and nine loci associated with AL 16 . Only four loci associated with CC have been previously reported from GWAS analyses: MTOR 17 , CMPK1 18 , and RBP3 18 identified in Asians, and PDGFRA in both Asians 17 and Europeans 11,19 . A further 31 loci in European emmetropes, accounting for an additional 2.3% of variance in CC, have also been provisionally reported 20 . Associated variants identified to date cannot fully explain the additive genetic variance of CC, and hence other genetic variants are likely to contribute to this endophenotype. The difference in the prevalence of myopia in various ethnic groups, particularly in Asia, also suggests that certain CC-associated alleles may be population-specific 11 . Furthermore, the extent to which shared or distinct genetic loci contribute to variation in CC, AL, and spherical equivalent is uncertain.
Thus, we conducted the largest GWAS meta-analysis of CC to date, incorporating both European and Asian cohorts in a single analysis from the Consortium for Refractive Error and Myopia (CREAM) and validated our findings in the United Kingdom (UK) Biobank.

Results
Primary GWAS of corneal curvature. The CREAM discovery cohorts included 29,580 individuals with European ancestry from 18 studies, and 14,462 individuals with Asian ancestry from 10 studies. The demographics of these 44,042 participants are shown in Supplementary Table 1. GWAS analyses for CC were performed at the cohort level for all variants genotyped or imputed using the 1000 Genomes Project data as reference panels 7 Table 2). The genomic control inflation factor (λ GC : 0.872-1.085) showed little evidence of inflation in test statistics at the study level.

(Supplementary
We applied a uniform set of quality-control procedures to all cohort-level GWAS results in CREAM and meta-analysed up to 8.94 million variants (see Methods). For the discovery phase, we performed an inverse-variance-weighted meta-analysis on the European and Asian populations. The quantile-quantile plot for the trans-ethnic meta-analysis (λ GC : 1.119; Supplementary Fig. 1) indicated moderate inflation, and genomic control-adjusted test statistics were generated. The inflation is partially due to polygenicity as the linkage disequilibrium (LD)-score regression intercepts were close to one 21 (LD-score regression intercept of 1.045 in Europeans and 1.013 in Asians). Figure 1 shows the Manhattan plot for the trans-ethnic metaanalysis for CC. We identified 41 loci at genome-wide significance (P < 5.0 × 10 −8 ; Table 1; Supplementary Figs. [2][3][4]. We performed replication analyses of these loci for CC in 88,218 participants with European ancestry from the UK Biobank. Thirty-seven (90.2%) of the 41 lead variants passed genome-wide significance in the UK Biobank data ( Table 1). The signals of the two loci (CMPK1/STIL, RBP3) not reaching genome-wide significance in the replication phase were mainly driven by the CREAM Asian populations. In the combined CREAM and UK Biobank meta-analysis, the five most strongly associated loci were RSPO1 (rs4074961; P = 6.51 × 10 −100 ), HUS1 (rs12702376; P = 2.77 × 10 −86 ), FGF9 (rs9506725; P = 1.21 × 10 −78 ), PDGFRA Both directly genotyped and imputed variants were meta-analysed for corneal curvature in 44,042 CREAM participants. The y-axis represents −log 10 p values for association with corneal curvature, and the x-axis represents genomic position based on human genome build 37, highlighting newly identified loci (arrows in blue; Table 1), loci associated with axial length (labelled with nearest gene names), and loci associated with spherical equivalent (cross in green). The horizontal red line indicates the genome-wide significance level of P < 5.0 × 10 −8 . The horizontal blue line indicates the suggestive significance level of P < 1.0 × 10 −5 .  16 . Here it stands out as the most significant locus for CC in our large meta-analysis. We confirmed associations with CC in the four previously identified loci in our samples: PDGFRA, MTOR, RBP3, and CMPK1 11,[17][18][19] , and in an additional 17 loci provisionally identified in the European emmetropes from the UK Biobank 20 . In total, we identified 20 novel CC loci through single-variant analysis.
Following the age classification scheme adopted by the CREAM consortium, we stratified the CREAM samples into younger (age < 25 years) and older (age ≥ 25 years) groups. In the older group (n = 35,442), the top five genome-wide significant loci were RSPO1, MTOR, PDGFRA1, HUS1, and FGF9 (Supplementary Data 1). There were no novel genome-wide significant hits for CC in both groups. No variants reached genome-wide significance in the younger group (n = 8620), likely due to the insufficient power in contrast to the older group. The effect sizes and directions of effect for the top variants were consistent between the younger and older groups.
Trans-ethnic comparison of genotypic effects in Europeans versus Asians. Among the 41 lead variants identified in the full discovery samples, the effect size and direction of effect were largely consistent across Europeans and Asians (Table 1). To evaluate whether genetic effect sizes were consistent in Europeans versus Asians in the CREAM samples, we compared additive effect sizes (beta coefficient in millimetre per allele) of variants with p-value < 0.01 in both populations. We grouped these variants by inter-population allele frequency discrepancy between the two ancestry groups (<0.1, 0.1-0.3, and >0.3). Overall, the variants were concordant in direction of effects and effect sizes ( Fig. 2a-c). The effect sizes appeared most consistent in variants with little discrepancy in allele frequency (2a; allele frequency difference < 0.1), and less consistent in variants of larger allele frequency discrepancy (2c; allele frequency difference > 0.3).
The high concordance of genetic effects between Europeans and Asians, however, could not rule out the possibility of discrepancy at some loci with large inter-population differences in the allele frequency and LD structure. Three loci in our all-ancestry analyses were driven by European populations (HDAC11/FBLN2 rs2630445, ADAMTS19/CASY3 rs7708378, and FGF9 rs9506725), and two loci were driven by Asian populations (RBP3 rs11204213 and CMPK1/STIL rs60078183). In the ethnic group (either the Asian or European populations) where these variants were not associated, they typically presented as monomorphic in the other ethnic group, as well as those variants in LD with the lead variant (r 2 ≥ 0.8; Supplementary Data 2). Regional plots at HDAC11/FBLN2 rs2630445, CMPK1/STIL rs60078183, and FGF9 rs9506725 were presented in Europeans  Table 3). For GWAS analyses performed separately in Europeans and Asians, there were two Asianspecific loci (EMX2/EMX2OS rs2240776, P = 2.45 × 10 −8 ; NCAPG rs7672919, P = 3.90 × 10 −8 ; Supplementary Table 4); both did not reach genome-wide significance in the combined analysis. In addition, one locus showing suggestive significance in CREAM Europeans (PIEZO2 rs2101976; P = 7.32 × 10 −8 ) has been replicated in the UK Biobank data (P = 8.90 × 10 −14 ).
We calculated SNP-heritability (SNP-h 2 ) using GWAS summary statistics 22 . The SNP-h 2 estimate for CC in Asians (0.196, s.e. = 0.036) was numerically lower than in Europeans (0.267, s.e. = 0.024), but there was no statistical evidence for a meaningful difference. A similar pattern was noted in a previous study 12 .
Association of corneal curvature loci with spherical equivalent and axial length. In further analyses, we assessed the associations of the 41 CC lead variants with spherical equivalent 14,15 in 95,505 participants from the UK Biobank, as well as in a subset of CREAM participants with AL measurement (N = 10,851; Supplementary Table 5). The lead variants were categorized into the following three groups using false discovery rates (FDR) set at a threshold of 1% from the Benjamini-Hochberg procedure 23 ( Fig. 3

and Supplementary Data 3).
Eight CC variants were associated with AL, but not spherical equivalent (Group A, Fig. 3). That is, the effect allele of the variant was associated with eye size, e.g. a larger eye with both a flatter CC and longer AL (positive genetic effects on both CC and AL; bar in red, Fig. 3) or a smaller eye with both a steeper CC and shorter AL (negative genetic effects; bar in blue), but was not associated with spherical equivalent. For instance, HMGA2 rs7959830 T allele was associated with a steeper CC (β = −0.018, s.e. = 0.002, P = 3.02 × 10 −16 ) and shorter AL (β = −0.047, s.e. = 0.014, P = 6.54 × 10 −4 ), but not spherical equivalent (β = 0.005, s.e. = 0.013, P = 0.670). A lack of association with refractive error at these variants might be explained on the basis of the compensatory effects resulting in, a steeper CC (that tends to make the eye more myopic) and shorter AL (that tends to make the eye more hyperopic/less myopic), or a flatter CC and longer AL The compensatory genetic effects for CC and AL relevant to the effects on myopia or hyperopia is illustrated in Fig. 4. Among these variants, the pleiotropic association on AL of HDAC11/FBLN2 rs2630445, ADAMTS19/CHSY rs7708378 were mainly driven from European populations, and RBP3 rs11204213 from Asian populations.
Fifteen CC variants were associated with spherical equivalent (Group B, Fig. 3). Eleven of these variants were not associated with AL. The strongest signals associated with spherical equivalent (P < 1 × 10 −7 ) were CASC15 rs9366426, CHRND/ PRSS56 rs2245601, KAZALD1 rs807037, FBN1 rs9806595, and  RNLS rs166976. Among these 11 variants, the allele associated with a steeper CC was associated with a more negative/myopic refractive error (or a flatter CC and more positive/hyperopic refractive error). For instance, FBN1 rs9806595 T allele was associated with a steeper CC (β = −0.012, s.e. = 0.002, P = 2.23 × 10 −8 ) and more myopic refractive error (β = −0.084, s.e. = 0.014, P = 1.40 × 10 −8 ), but not AL (β = −0.015, s.e = 0.013, P = 0.231). For the remaining CC variants at four loci that were associated spherical equivalent as well as AL (HUS1, RP11-91P17.1, FGF9, and IGFBP5/TNP1), the direction of genetic effect on spherical equivalent may depend on the relative magnitude of effect on CC versus AL. For instance, although both HUS1 rs12702376 C allele and RP11-91P17.1 rs7004112 T allele were associated with steeper CC (that tends to make the eye more myopic) and shorter AL (that tends to make the eye less myopic), HUS1 was associated with a negative/myopic refractive error, while RP11-91P17.1 was associated with a positive/hyperopic refractive error. Among these variants, pleiotropic effects of FGF9 rs9506725 were mainly driven from European populations. The remaining 18 CC variants (Group C, Fig. 3) were not associated with spherical equivalent or AL at FDR > 1%. The association on CC and spherical equivalent at the majority loci was, although not significant, in an expected direction, e.g. with a steeper CC and a more negative/myopic refractive error (or vice versa). Among these variants, CMPK1/STIL rs60078183 was mainly driven by Asian populations. The pleiotropic genetic effects on spherical equivalent of CC-associated variants, together with previously implicated AL variants 16,24 , are summarized in Supplementary Fig. 5.
To visualize functional enrichment of identified gene-sets, we mapped these sets graphically into an enrichment network 27 in Cytoscape 28 . Similarity coefficients greater than 0.375 were used to place these sets together with the interconnectivity drawn by a line. Using comprehensive collections of gene-sets related to CC, we identified enrichments in gene-sets involved in organism development and growth, with components for eye development and connective tissue cartilage, explicitly suggesting a strong genetic link between the size of the eye and organism/body (Fig. 5). We also observed that gene-sets were involved in ECM and glycosylation protein activity, which have previously been suggested in pathways related to central corneal thickness 29 .
To assess any specific roles of subset of CC genes with pleiotropic effects on myopia, we performed gene-set clustering. and effects toward emmetropic and myopic states. The figure illustrates genetic effects of AL (β AL ) might or might not compensate genetic effects of corneal curvature (β CC ) toward myopia or hyperopia. Longer CC (shown by positive β CC ; arrow upward) tends to make the eye hyperopic (dashed line in blue) and longer AL (positive β CC ; arrow downward) tends to make the eye more myopic (dashed line in red). Similarly, steeper CC (negative β CC , arrow downward) tends to make the eye more myopic and shorter AL (negative β CC ; arrow downward) tends to make the eye less myopic. The compensatory pleiotropic effects β AL could offset β CC on myopia or hyperopia at the pleiotropic ratio , as shown in group A. The compensatory pleiotropic effects β AL , however, cannot offset β CC on myopia or hyperopia at smaller pleiotropic ratio β AL β CC , as shown in group B. There might be other pleiotropic effect in Group C, besides AL, to compensate genetic effect of CC on myopia. Het-I 2 , for heterogeneous effects between the variants. All P-value for heterogeneity was >0.05. Pleiotropic effect ratio was calculated at each variant and combined to estimate β AL β CC and heterogeneity using the meta-analysis approach (see Methods). Grouping of A, B, and C was the same as in Fig. 3.
Enriched pathways identified included basement membrane, endoplasmic reticulum lumen, collagen-containing extracellular matrix, ossification, and osteoblast differentiation ( Fig. 5; Diamond node; Supplementary Data 5). The pathways were closely connected to ECM or developmental process, thus underlying a functional heterogeneity in genes exhibiting pleiotropic effects on both CC and refractive error.
Additional pathways identified from the whole-genome data using VEGAS, such as elastic fibre formation, camera-type eye development and O-linked Glycosylation, also displayed connectivity to the ECM, development processes and glycosylation protein activity ( Fig. 5; nodes in green). A few pathways (regulation of stem cell differentiation, system development, etc.) failed to link to any existing gene-sets were identified; this is likely attributable to marginal and uncharacterized CC-genes.
Biological function of the CC-associated loci. We examined gene expression in 20 normal human donor eyes from the ocular tissue database (OTDB; https://genome.uiowa.edu/otdb/). The majority of the genes identified at the 41 loci were expressed in human ocular tissues including cornea, sclera, ciliary body, or lens etc. (Supplementary Data 6). THBS4 had the highest expression in the cornea and sclera, IGFBP5 in the ciliary body and sclera, and PDGFRA in the lens.
We also queried expression quantitative trait loci (e-QTL) database to assess the association between the gene expression and the top CC-variants in different human tissues (see UTLs). Twenty-one index variants were eQTLs for the expression of genes, which resided in, or were adjacent to, the variant (Supplementary Data 7). Among them, the majority of variants were eQTL's for the expression of the nearest gene. There are some exceptions. For example, SNP rs807037 is a missense variant within the KAZALD1 gene and is also an eQTL for the nearby gene SFXN3 in artery, brain, adipose subcutaneous and nerve tissues (P < 2.20 × 10 −5 ). Intronic MTOR rs3737611 is an eQTL for EXOSC10 in thyroid, nerve and artery tissues (P < 5.70 × 10 −5 ). Intronic RSPO1 rs4074961 is an eQTL for itself and nearby genes GNL2, DNALI1 and MEAF6 in various tissues.
The newly identified genes are largely involved in ocular growth/development. LMX1B (encoding a LIM homeodomain class transcription factor) regulates anterior segment morphogenesis and patterning 30 , and is associated with Nail-Patella syndrome 31 . Recently, LMX1B has been reported to be associated with primary open-angle glaucoma, accompanied by development defects of the ocular anterior segments including cornea [32][33][34] . SOX2 (encoding a member of the SRY-related HMG-box family of transcription factors) links to both anophthalmia and microphthalmia 35 . Other CC genes that are also associated with eye or overall morphology include GHSR (associated with craniofacial development 36 ), HMGA2 (linked to body height), and LCORL (linked to skeletal trunk height 37 ).
Five of the implicated genes (ADAMTS3, ADAMTS6, ADAMTS7, ADAMTS19, and ADAMTS20) belong to the ADAMTS protein family, which is closely involved in regulating the organization and function of ECM 38 . For instance, ADAMTS6 has a major role in focal adhesion and tight junction formation, and can alter the deposition of fibrillin microfibrils in epithelial cells 39 . Interestingly, ADAMTS family members also associate with height variation (ADAM28, ADAMTS19, ADAMTS2, ADAMTS3, ADAMTS6, ADAMTSL1, ADAMTSL3), suggesting their pleiotropic roles in body growth 40   Gene-set enrichment analysis for corneal curvature in CREAM data. Enrichment results were mapped as a network of gene-sets (nodes) related by mutual overlap (edges). Node size is proportional to the total number of genes in each set, colour gradient represents the enrichment significance and edge thickness represents the number of overlapping genes between sets. Nodes in red represent gene-sets identified from the g:Profiler enrichment analysis, and in green represent additional gene-sets identified from the VEGAS-pathway analysis. Nodes of diamond show the pathways for the implicated genes associated with both CC and spherical equivalent (Group 2 in Fig. 3). Groups of functionally related gene-sets are circled and labelled (dashed line).
ECM protein fibulin-1, which modulates corneal cell migration by interactions with other ECM components, such as fibronectin 41 . Weill-Marchesani syndrome (that may be associated with thicker and steeper corneas) may also result from dominant mutations in FBN1. Fibronectin and IGFBP5 also bind to each other. This binding regulates the ligand-dependent action of IGFBP5 on insulin-like growth factors, and this has effects on cell proliferation, differentiation, survival, and motility. IGFBP5 shows expression in the human cornea and was down regulated in eyes with keratoconus 42,43 . BMP7 encodes a member of the transforming growth factor -β superfamily that is involved in numerous cellular functions including development, morphogenesis, cell proliferation, apoptosis, and ECM synthesis 44 . BMP7 is key in eye development during embryogenesis, and BMP7knockout mice have been shown to develop anophthalmia 45,46 . OFCC1 (encoding a reticular cytoplasmic protein expressed during embryonic development) in the Medaka fish is associated with the ojoplano ('flat eye') phenotype due to defective eye cup morphogenesis 47 . COL5A1 and COL6A1 encode components of type V and VI fibrillar collagens that are present in the human cornea 48 . COL6A1, together with ADAMTS20, have been reported to be associated with intraocular pressure 49 . COL5A1 mutations are found in classical Ehlers-Danlos Syndrome 50 , which is associated with thinner and steeper corneas 51 . COL5A1 is also a susceptibility locus for central corneal thickness 52,53 . THBS4 encodes an extracellular calcium binding protein that is involved in cell proliferation, adhesion, and migration 54 . FGF9 is involved in the neural patterning of the optic neuroepithelium 55 .
Genes identified as being involved in the Wnt signaling pathway were also implicated in our analysis (SOX2, FRAT1, FRAT2, RSPO1, FGF9; Supplementary Data 5 "canonical Wnt signaling pathway"). Two additional Wnt signalling related genes (ZNRF3 16 and WNT7B 24 ), though not the top genes in this study, were also identified as being associated with CC or axial length. The Wnt signalling pathway has prominent effects on multiple developmental events during embryogenesis 56 , including that of differentiation of the anterior segment of the eye 57,58 , and retinal development 59,60 .

Discussion
In the largest CREAM trans-ethnic GWAS meta-analysis of CC to date (44,042 individuals with replication in 88,218 participants from UK Biobank), we identified novel loci through single-variant analysis, and gene-based tests. SNP-heritability was estimated at 0.267 and 0.196 in Europeans and Asians, respectively. We discovered population-specific loci that existed in both European and Asian ethnic groups, as well as the presence of a high concordance of inter-population genetic effects overall. Variants were involved in coordinating eye size (by affecting CC and AL concurrently) whilst maintaining emmetropia (Group A). Meanwhile, other genetic variants were associated with refractive error (Group B); the genetic effect for AL could not compensate the effect for CC, as showing that the pleiotropic effect ratio β AL β CC was significant smaller than that of "eye size" variants. A third group of variants was also observed that appeared independent in terms of pleiotropic effects. Besides pathways related to the ECM, the implicated genes were significantly enriched in pathways involved in organism development and growth, eye development, connective tissue cartilage and glycosylation protein activity. Implicated genes with pleiotropic effects on refractive error were involved in diverse pathways related to ECM and organism development and growth.
Our data provide insights into novel genes that regulate CC across European and Asian populations. We found trans-ethnic replication of significant loci, and a high concordance of genetic effects in variants with little discrepancy in allele frequency between the two ancestry groups. Our results are robust as 90.2% of CC-associated loci were replicated at a genome-wide significance throughout the UK Biobank. We also confirm the association of the MTOR loci 17 with CC in Europeans, in contrast to the lack of replication in previous Europeans studies with much smaller sample size 11,19 . Although the underlying genetic effects were largely shared between the two ancestry groups, at the same time, population-specific loci were also observed: HDAC11/ FBLN2, ADAMTS19/CHSY3, and FGF9 in Europeans, and RBP3 and CMPK1/STIL previously reported in Asians 18 . In these cases, the lead variants were monomorphic in the other non-significant Asian or European populations, respectively, and any signals barely seen from the flanking variants at these loci. Our data show that the trans-ethnic meta-analysis approach yields shared and unique variants for CC in Europeans and Asians.
We used bioinformatics tools to demonstrate the functional connectivity between the associated genes. The newly identified loci such as LMX1B, SOX2, NHSL1, GHSR, HMGA2, IGFBP5, FRAT1, FRAT2, STIL, USP1, HUS1, STON2, and IGF2 are mainly involved in organism growth and eye development. Additional notable CC candidate genes belong to the ADAMTS family, including ADAMTS7, ADAMTS19, and ADAMTS20 involved in organization and function of ECM. Novel genes, such as FBN1, BMP7, COL6A1, THBS4, FBLN2, and KAZALD1, are involved in both ECM formation and organism development. In addition, CCgenes associated with refractive error were involved in basement membrane, endoplasmic reticulum lumen, collagen-containing extracellular matrix, ossification, and osteoblast differentiation, underlying a functional heterogeneity in genes exhibiting pleiotropic effects on both CC and refractive error.
Our study is the most comprehensive study on pleiotropic effects of CC-associated genes on eye size and refractive error in humans. Several small-size studies also have reported the effects of 'eye-size' genes, such as PDGFRA 11,17,19 and RBP3 18 . Our study confirmed previous findings. Studies in mice and chicken also support the existence of distinctive genetic effects to determine eye sizes, for instance, (1) effects that purely govern eye size, (2) effects restricted to specific ocular dimension (i.e. CC or AL separately), or (3) effects that scale the size of the eye and body simultaneously 61,62 . Clearly, CC-genes are involved in heterogeneous genetic function.
In human emmetropic eyes, CC is highly correlated with AL, and the two are carefully scaled relative to each other 63 . Thus, genetic pathways may exist to simultaneously influence AL and CC while maintaining the emmetropic status 12,61,64 . In our analyses, we have therefore compared our set of CC loci with their respective associations with AL and refractive error. We identified 'eye-size' genes (HMGA2, RSPO1, HDAC11/FBLN2, RBP3, PDGFRA, NHSL1 and ADAMTS19/CHSY3, and INTS6) that were associated with eye size (e.g. a larger eye with both a flatter CC and longer AL, or vice versa), but not refractive error. The compensatory pleiotropic effect for AL could offset CC's effect toward myopia or hyperopia; namely, a genetic determined 1 mm increase (or decrease) for CC accompanying a 2.92 mm increase (or decrease) on average for AL might cancel out their opposite effects on refractive error. This may represent a carefully coordinated scaling of optical components to maintain the eye in an emmetropic state as it grows. These genetic variants may therefore control variation in eye size independent of refractive error. Among these 'eye-size' genes, HMGA2 and HDAC11/ FBLN2 are likely to have pleiotropic effects on both the coordinated scaling of the eye, as well as height 65,66 . Similarly, the ADAMTS19 gene encodes metalloproteinases that belong to the ADAMTS family with the members as human growth genes 40 . This is consistent with the findings that height (body size) and eye size are genetically coordinated 12,62 . Among the other 'eye-size' genes identified, some had roles in Wnt signalling (RSPO1 and HDAC11), platelet-derived growth factor signalling (PDGFRA), and extracellular ligands and calcium binding (FBLN2).
In contrast to the group of 'eye-size' genes that do not affect spherical equivalent, there is another group of CC-implicated variants associated with refractive error, with little or no pleiotropic effect on AL (Group B). The pleiotropic ratio β AL β CC was significantly smaller than that in 'eye-size' variants, therefore, without adequate compensatory effects on AL, these variants may influence the refractive error status of the eye primarily through CC. There is one exception at loci RP11-91P17.1. Variant rs7000412 T allele was associated with steeper CC (that tends to make the eye more myopic) and shorter AL (less myopic) with overall effects towards to a hyperopic refractive error; thus this variant influenced refractive error likely through AL. Five top loci replicated in UK Biobank for refractive error (P < 1 × 10 −7 ), including CHRND/RPSS56, FBN1, CASC15, RNLS, and KAZALD1, have also been reported in previous CREAM GWAS 13 . Ten loci showed significance in replication after accounting for multiple testing (FDR < 0.01). These genes are actively involved in pathways of basement membrane, endoplasmic reticulum lumen and collagen-containing extracellular matrix, linking to ECM and organism development, growth. Plotnikov et al. recently also proposed a genetic link between CC and refractive error in Europeans 67 . Using CC-associated SNPs in emmetropes as instrument variables, they estimated the causal effect of CC on refractive error to be +1.41 D (95% CI, 0.65-2.16) less myopic refractive error per mm flatter cornea. A significant group of CC-genes identified in our study showing association with spherical equivalent corroborates the finding of an association between CC and refractive error.
The remaining CC loci (Group C) were not significantly associated with refractive error or AL. However, in a majority of these variants, the associations with spherical equivalent, although not statistically significant, were in the expected direction-for instance, a flatter cornea and a more hyperopic spherical equivalent, or vice versa. It is unknown whether these loci may also have modulatory effects on other refractive components of the eye (e.g. lens thickness or anterior chamber depth) that may have attenuated its effect on refractive error. In addition, some of these genes in Group 3 (FNDC3B, COL5A1, COL6A1), together with genes in Group 2 (IGFBP5, FGF9, and CWC27/ ADAMT6) was linked to connective tissue disorder (Supplementary Fig. 6) and has been associated with keratoconus 29,68-71 , a disorder of corneal thinning and steepening, implying a possible effect of CC genes regulating refractive error and keratoconus.
In summary, we have identified 47 genome-wide significant loci for CC (of which 26 are new), through a large-scale tansethnic GWAS meta-analysis. The importance of undertaking this study in individuals of different ethnicities cannot be understated as we identified both population-specific loci in Europeans as well as Asians as well as loci that were common between both ethnicities. These findings provide insights into the underlying genetic aetiology of eye growth and may provide pointers for us to explore why myopia is more prevalent in Asians than Europeans. These CC loci can coordinate AL and eye-size to keep human eyes emmetropic, and some play a role in the development of refractive errors primarily through variations in CC. Implicated genes were significantly enriched in a network linking extracellular matrix organization, developmental process for body and eye and glycosylation protein activities. Elucidating and characterising the heterogeneity of such genes that regulate the optical component dimensions of the eye may enable a better understanding of the biology of both emmetropia and ametropia in humans.

Methods
Study populations. The discovery cohorts included 29,580 individuals with European ancestry from 18 studies, and 14,464 with Asian ancestry from 10 studies. General methods, demographics, and phenotyping of the study cohorts have previously been extensively described, and are provided in brief in Supplementary  Table 1 and Supplementary Notes. In the replication phase, 88,218 participants of European ancestry from the UK Biobank who had measurements for CC were included in the replication stage, as well as 95,505 participants of European ancestry (from the UK Biobank) with phenotype information for refractive error 3 . Written informed consent was obtained from all participants in accordance with the Declaration of Helsinki. All studies were performed with the approval of their local Human Research and Ethics Committee.
Phenotype measurements. All participating CREAM cohorts used similar protocols for the collection of keratometry and other ocular biometric measurements. The protocols have been described in detail elsewhere 14,16,59 . In brief, CC radii in the horizontal and vertical meridians were measured using an autokeratometer. The means (in millimetre) of CC from the individuals' two eyes were used for analysis, while the means of the readings from one eye were used when the readings from the other eye were unavailable. Participants were excluded if they had corneal scars, keratoconus, prior refractive or cataract surgery, or other intraocular procedures that could alter CC. For the UK Biobank, participants were excluded from the analyses if they had an eye disorder that may have altered their refractive error or CC (see Supplementary Notes).
Genotyping and imputation. The CREAM study samples were genotyped on either Illumina or Affymetrix platforms. Genotypes were imputed using the 1000 G Genomes Project reference panel (Phase I version 3, March 2012 release). SNPs with low imputation quality were filtered using metrics specific to the imputation method and thresholds used in previous GWAS analyses. The Markov Chain Haplotyping software, IMPUTE 72,73 , or MACH 74 were adopted for imputation. A detailed description regarding genotyping platforms and imputation procedures have been outlined (Supplementary Table 2). Stringent quality control of genotype data was applied in each cohort from CREAM. Samples with low call rates (<95%) or with gender discrepancies were excluded. Cryptically related samples and outliers in population structure from principal component analyses were also excluded. SNPs flagged with missingness >5%, gross departure from Hardy-Weinberg equilibrium (P < 10 −6 ), and minor allele frequency (MAF) < 1% were removed from further analyses. Poorly imputed markers (IMPUTE info < 0.5 or minimac Rsq < 0.5) were excluded. UK Biobank genotyping arrays were imputed to the HRC reference panel and a combined 1000 Genomes Project -UK10K reference panel using IMPUTE4 75 . Data quality control (QC) was described in the Supplementary Notes.
Statistical analyses and meta-analyses. We assumed an additive genetic model where the dosage of each SNP was a continuous variable ranging from 0 to 2 for the effect allele. For each study, an additive allele-dosage regression model, adjusted for both age and sex at each genotyped or imputed SNP, was conducted to determine its association with CC represented as a quantitative trait. An additional adjustment for up to the first five principal components was carried out according to the population substructure in each individual study. For studies that included children and adolescent participants, GWAS analyses were conducted separately by age groups (age ≥ 25 vs. age < 25), as for previous GWAS analyses for corneal astigmatism 76 . Sample outliers with CC values exceeding six standard deviations from the mean were excluded at the study level. The per-SNP meta-analyses were performed in METAL software (https://genome.sph.umich.edu/wiki/METAL) with a weighted inverse-variance approach 77 . A Cochran's Q test was used to assess heterogeneity across studies 78 .
Locus identification and genetic variants annotation. The independent signal from the meta-analysis was determined using LD-clumping procedure in PLINK (https://www.cog-genomics.org/plink2). The index variant was identified (P < 5 × 10 −8 ) in each clump, which was formed for variants with P < 1 × 10 −5 that were in LD (r 2 > 0.1) and within 500 kb of the index variant. The same variant were assigned to no more than one clump. The LD structure was estimated from the European panel in the 1000 Genome Project as the reference population, or Asian panel for meta-analysis summary statistics in Asians. A locus was identified by an index variant with the regions flanking 250 kb on both sides. For those with multiple signals in one locus (500 kb region) or an overlapping of multiple loci identified from the PLINK clumping procedure, conditional analysis was further performed to confirm the independent signals using GCTA-COJO 79 . The LD structure was estimated in the same manner as for LD-clumping procedure in PLINK. The regional plot was drawn for each identified locus from the combined meta-analysis ( Supplementary Fig. 4) using LocusZoom (http://locuszoom.org/). The coordinates and variant identifiers are reported on the NCBI B37 (hg19) genome build, and annotated using UCSC Genome Browser 80 . We identified variants within each of the LD blocks (r 2 ≥ 0.6) in European and Asian populations of the 1000 Genomes Project (100 Kb flanking the top SNP at each locus) to apply functional annotations of transcription regulation using HaploReg 81 (https://pubs. broadinstitute.org/mammals/haploreg/haploreg_v3.php) and Encyclopedia of DNA Elements (ENCODE) 82 data.
Replication in UK Biobank participants. The UK Biobank reported the maximum and minimum corneal power in each eye. After taking the mean of replicate readings, the corneal power in each eye was calculated as the mean of the maximum and minimum values. Corneal power was converted to corneal radius of curvature using the equation CC = (337.5/corneal power). For the genetic analysis of CC in all available participants, we took the average CC of the two eyes as the phenotype. A total of 88,218 individuals were included in the analysis for CC in all available participants. Analyses were performed using BOLT v2.3 22 . Variant genotype, age, sex, genotyping array (coded as 0 or 1 for the UK BiLEVE or UK Biobank Axiom, respectively) and the first 10 PCs were included as covariates.
BOLT uses a mixed model to account for relatedness (kinship) between individuals.
Gene-based tests and pathway analyses. Gene-based testing was conducted using the VEGAS software 25,26 (https://vegas2.qimrberghofer.edu.au/) on the results of separate meta-analyses of GWAS in European and Asian ancestries. Gene-based p-values from different populations were combined by Fisher's method. For samples of European descent, we used the European panel in the 1000 Genome Project as the reference population to estimate patterns of LD. For the Asian ancestry groups, we used the combined 1000 Genome Project Asian samples as the reference population to approximate LD patterns. To include gene regulatory regions, SNPs were included if they fell within 50 kb of the transcription start site of genes.
VEGAS-Pathway analysis 25,26,83 was carried out with pre-specified pathways from Gene Ontology 84 , MSigDB 85 (containing canonical pathways and gene-sets from BIOCARTA, REACTOME, KEGG databases), PANTHER 86 , and pathway commons databases 87 . We filtered these gene-sets to include only pathways with 10-1000 genes, yielding 9734 pathways. Empirical VEGAS-Pathway p values for each pathway were computed by comparing the summed χ 2 test statistics from real data with those generated in 500,000 simulations where the relevant number (according to the size of the pathway) of randomly drawn χ 2 test statistics was summed. To ensure that clusters of genes did not adversely affect results within each pathway, gene-sets were pruned such that each gene was >500 kb away from all other genes in the same pathway. We performed meta-analysis on the two sets of pathway p-values from Asian and European samples by Fisher's method.
We investigated functional annotation of the top identified genes using g: Profiler (https://biit.cs.ut.ee/gprofiler/gost). For g:Profiler, we used "g:GOST" function to perform pathway analysis on identified CC-associated genes. Prespecified pathways include Gene Ontology, pathways from KEGG, Reactome, WikiPathways and protein complexes from CORUM 88 . The significant pathway was claimed at the adjusted p-value < 0.05 after correction for multiple testing.
To investigate the connection between the enriched gene-sets, we mapped these gene-sets into network functional enrichment map analysis 27 . We visualize the network enrichment in Cytoscape software v3.7.1 (https://cytoscape.org/) 28 . Highly similar gene-sets were placed close together with the interconnectivity among genesets drawn by line (edge; similarity coefficient > 0.375).
Association of corneal curvature loci with axial length and spherical equivalent. For all identified CC-associated variants, we assessed their association with refractive error in European participants of UK Biobank (N = 95,505). We further assessed the association of CC-associated variants with AL using a subset of CREAM cohorts (N = 10,851; Supplementary Table 5). False discovery rates (FDR) from the Benjamini-Hochberg procedure were set at 1% as a threshold of statistical significance 23 . We categorized CC variants into three groups: (A) variants were associated with AL, but not spherical equivalent; (B) variants were associated with spherical equivalent; and (C) variants were not associated with spherical equivalent or AL.
Pleiotropic effect ratio β AL β CC at each variant was calculated to quantify relevant genetic effects on AL versus effects on CC and the variance was calculated using Delta method. To estimate the pleiotropic effect ratio for variants in each group, we performed meta-analyses in METAL software (https://genome.sph.umich.edu/ wiki/METAL) with a weighted inverse-variance approach 77 . A Cochran's Q test was used to assess heterogeneity across variants 78 . Z-statistics were used to test the significant difference of the pleiotropic ratio β AL β CC . SNP-heritability estimation. We applied the LD score method 22 (https://github. com/bulik/ldsc) using GWAS summary statistics to estimate SNP-h 2 . After merging SNPs with the HapMap3 Asian samples, we had a total of 1,174,487 and 1,085,659 SNPs for the LD score regression analyses for the European and Asian populations, respectively. The LD score matrix was estimated from the 1000 Genomes Project Asian reference panel, or European reference panel separately, with a 1 cM sliding window. We calculated the heritability using the software Idsc v1.0.0. The resulting regression slope was multiplied by the number of effective SNPs in the reference panel from the 1000 Genomes Project data 22 .
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The summary statistics of the meta-analysis combining studies in CREAM are included in Supplementary Data 8. To protect the privacy of the participants in our cohorts, the datasets generated during and/or analysed during the current study are available from the corresponding authors on reasonable request.