Introduction

Uveitis is an intraocular inflammatory disease that can result in severe visual loss1,2,3 and can be categorized by etiology (infectious or non-infectious) and by affected ocular region (anterior, intermediate, posterior, or panuveitis). Non-infectious uveitis represents the majority of cases in the developed world, with a prevalence of 121 per 100,000 in the US3,4. Anterior uveitis (AU), characterized by inflammation of the iris and/or ciliary body, is the most common type of non-infectious uveitis, with a prevalence of 98 per 100,000 adults in the US, accounting for ~80% of all non-infectious uveitis cases3. AU predominantly affects younger individuals, with a mean age of onset below 40 years5,6.

AU is frequently observed as a complication of spondyloarthropathies (SpAs), such as ankylosing spondylitis (AS), psoriatic arthritis, and inflammatory bowel disease (IBD)7. Hence, most studies so far have focused on AU in the context of a spondyloarthropathy (mainly AS). Notably, all these inflammatory diseases are strongly associated with human leukocyte antigen (HLA) haplotypes. Birdshot Chorioretinopathy has the strongest known HLA association, with HLA-A*29, followed by AS with HLA-B*278. Such an association was first described for AS and AU 50 years ago9,10, and has since been confirmed in several studies11,12,13,14. It has previously been estimated that approximately 50% of all patients with AU are HLA-B*27 positive15, a proportion that increases to above 80% among AS patients with AU11. In addition to HLA-B*27, associations of smaller effect size were described with other HLA alleles including HLA-A*02:01, HLA-B*08, HLA-DRB1*15 and HLA-DPB1*0314, and with other non-HLA common loci, including ERAP1, IL23R, and the 2p15 locus16. To date, most genetic association studies of AU have been conducted in the setting of AS, with the drawback of intertwining the genetic signals of both diseases.

Here, we evaluate several large AU cohorts consisting of 3850 cases in total and match them with over 900,000 controls in the largest AU meta-analysis to date. We describe the underlying genetics of B*27-positive (B*27-pos) and B*27-negative (B*27-neg) AU in the various ancestries to identify strong signals in both sub-types. A B*27-pos analysis identifies a significant, HLA-B*27-dependent protective signal in ERAP1, suggesting an altered immunogenic peptidome as a pathogenetic factor. A complementary analysis of 2984 B*27-neg AU cases identifies both common and rare signals for B*27-neg AU: a genome-wide significant common signal near the HLA Class-II HLA-DPB1 gene, and several genome-wide significant genes that increase the risk for AU identified through gene-burden analyses of rare damaging coding variants. These results shed light on the genetics of AU and stress the importance of whole-exome sequencing in the efforts to decipher the disease’s underlying genetic risks.

Results

Common genetic signals contributing to AU risk

We sequenced eight large EHR-based populations, including 3850 AU cases and 916,549 controls (Table 1). Testing the association of common variants, we discovered two genome-wide significant signals for AU: a risk signal at the HLA-B locus (rs543685299, OR [95% CI] = 3.37 [3.11–3.65], p = 1.03e–196) and a protective signal for rs3198304 at the ERAP1 locus (OR [95% CI] = 0.86 [0.82–0.91], p = 1.1e–8) (Fig. 1). The top ERAP1 SNP showed a consistent direction of effect in 6/8 cohorts (Fig. 2).
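As a reading aid, the reported odds ratios and confidence intervals can be converted back into approximate standard errors, z-scores and two-sided p values. The sketch below assumes a Wald-type 95% interval that is symmetric on the log-odds scale; the associations reported here were estimated with approximate Firth logistic regression, so the recovered p values only approximate the published ones. The function name `wald_from_ci` is illustrative and not part of the study's pipeline.

```python
import math
from statistics import NormalDist

def wald_from_ci(odds_ratio, ci_low, ci_high, level=0.95):
    """Approximate log-OR, SE, z and two-sided p from a reported OR and CI.

    Assumes the interval is symmetric on the log-odds scale (Wald-type).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    beta = math.log(odds_ratio)                      # log odds ratio
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # erfc keeps precision for very large |z|, where 1 - cdf would round to 0
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p_two_sided

# Values reported in the text for the two genome-wide significant signals
for label, or_, lo, hi in [("HLA-B rs543685299", 3.37, 3.11, 3.65),
                           ("ERAP1 rs3198304", 0.86, 0.82, 0.91)]:
    beta, se, z, p = wald_from_ci(or_, lo, hi)
    print(f"{label}: beta={beta:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.1e}")
```

Because the ORs and intervals are rounded to two decimals, the recovered p values match the reported ones only approximately, and the mismatch grows for very large |z|.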

Table 1 Overview of eight cohorts included in the meta-analysis
Fig. 1: Common HLA-B risk and ERAP1 protection with 3850 AU cases and 916,549 controls.

A Manhattan plot depicting the -log10(P value) for all common variants (y-axis) across all chromosomes (x-axis). HLA-B top risk signal is shown by an upward red triangle on chromosome six, while ERAP1 protection is shown by the downward red triangle on chromosome five. Association models were run with age, age², sex and age × sex, and 10 ancestry-informative principal components as covariates. P values are uncorrected and are from two-sided tests performed using approximate Firth logistic regression.

Fig. 2: Top SNPs at the HLA-B and ERAP1 loci across eight cohorts.

A A forest plot depicting the association details for HLA-B top risk variant rs543685299 in each of the eight cohorts tested and including all ancestries. B A forest plot depicting the association details for the top ERAP1 protective intronic variant rs3198304 in the eight cohorts tested and including all ancestries. A meta-analysis result combining all cohorts is the lowest row (bold); the meta-analysis OR is represented by a red diamond. Center points represent odds ratios as estimated by approximate Firth logistic regression, with error bars representing 95% confidence intervals. P values are uncorrected and reflect two-sided tests. Numbers below the cases and controls columns represent counts of individuals with homozygous reference, heterozygous and homozygous alternative genotypes, respectively.

We repeated the analysis while restricting to individuals of European descent (3180 cases and 826,685 controls) and observed similar results for both HLA-B (OR = 3.4, p = 1.1e–185) and ERAP1 (OR = 0.85, p = 1.1e–08, Figs. S1, S2). ERAP1 is an ER-aminopeptidase that trims peptides to be loaded and presented by MHC class-I proteins, and alterations in ERAP1 change the peptidome available to HLA Class I alleles17.

Rare variant analyses identify risk genes contributing to AU risk

We next tested several gene-burden models that incorporated various AF filter thresholds as well as variant deleteriousness scores (see Methods).

The gene burden analyses combining all cohorts exhibited a controlled low inflation of λ = 0.94 (Fig. S3), suggesting that the analysis did not deviate from the expected p value distribution and was well-adjusted for population stratification. This allowed us to confidently identify genes that pass a strict study-wide and genome-wide significance threshold. We used a study-wide significance threshold of p = 2.86e–07 calculated by using the approach from Li & Ji 2005 for multiple testing correction18, and utilized this for the remainder of results discussed here (see Methods for details).
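The inflation factor quoted above is conventionally computed as the median of the observed 1-degree-of-freedom chi-square statistics divided by the median expected under the null (about 0.455). The sketch below shows that standard calculation from a list of p values; it is an illustration rather than the authors' code, and the function name and toy inputs are assumptions.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2   # about 0.455

def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from two-sided p values in (0, 1]."""
    norm = NormalDist()
    # A two-sided p value maps to a 1-df chi-square via the squared normal quantile
    chi2 = [norm.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN

# Toy input: evenly spaced p values mimic a well-calibrated (null) test
null_like = [(i + 0.5) / 1000 for i in range(1000)]
print(round(genomic_inflation(null_like), 3))
```

Values below 1, such as the 0.94 reported here, indicate test statistics that are slightly smaller than expected under the null, that is, mildly conservative tests.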

Five genes reached the study-wide significance threshold (p = 2.86e–07, Methods). The first, IPMK, was significant when considering a model that includes pLoF and missense variants that are strongly deleterious (predicted by 5/5 prediction models), with AF < 0.1%, reaching a high OR [95% CI] = 9.42 [4.44–19.89] with p = 4.4e–09 (Table 2, Supplementary Data 1, 2, Supplementary Information).

Table 2 Top gene burden results for AU

The second genome-wide and study-wide significant gene, IDO2, showed a strong risk signal with OR [95% CI] = 3.61 [2.23–5.7], p = 6.16e–08 for rare (AF < 0.1%) pLoF variants. IDO2 is a LoF-tolerant gene that exhibits a pLI score of 0 and O/E = 0.81 (0.54–1.25)19. The top association burden included seven distinct pLoF variants (Supplementary Data 2, 3, Supplementary Information).

Three additional genes exhibited borderline significant p values and represented results from extremely rare gene-burden masks that consider only variants appearing once in each cohort (singletons). The first gene, ACHE, had six cases, each carrying an extremely rare and distinct damaging missense variant, with OR [95% CI] = 15.29 [5.57–42.0], p = 1.22e–07. Since no pLoF variants are included in this model, this might suggest a toxic gain of function in this gene that could affect AU risk. ACHE codes for acetylcholinesterase, a well-known enzyme that breaks down acetylcholine (ACh). STXBP2 (Syntaxin binding protein 2) was also significant when considering an extremely rare, missense-only gene burden mask, with nine cases carrying distinct variants (OR [95% CI] = 11.66 [4.63–29.39], p = 1.92e–07). Missense and pLoF mutations in STXBP2 are associated with autosomal-recessive Familial Hemophagocytic Lymphohistiocytosis (FHL), a hyperinflammatory syndrome caused by uncontrolled overactivation of the immune response. Uveitis has been reported as a manifestation of hemophagocytic lymphohistiocytosis20,21. Lastly, five extremely rare pLoF variants in ADGRF5, the adhesion G protein-coupled receptor F5 (also called GPR116), aggregate to increase risk for AU (OR [95% CI] = 27.04 [7.73–94.54], p = 2.44e–07). The low number of case carriers observed for ACHE, STXBP2, and ADGRF5 (<10) suggests that further support is required to nominate them as risk genes for AU (Supplementary Information).

ERAP1 signal is strengthened in a B*27-stratified analysis

To better understand the genetic signals underlying HLA-B*27 in the AU cohort, we next controlled for HLA-B*27 in our analyses using the HLA-B*27-tagging SNP rs4349859 as a covariate. When controlling for this SNP, the signal at the HLA locus was diminished, leaving a borderline signal near HLA-DPB1 (OR = 1.15, p = 6e–08, Fig. S4). However, the protective signal on ERAP1 remained genome-wide significant (OR = 0.84, p = 1.7e–8). Thus, after conditioning on the HLA-B*27 signal, we still observed associations at both the ERAP1 and HLA loci. We therefore designed stratified analyses in which we divided the cohorts by HLA-B*27 carrier status using the HLA-B*27 tag SNP rs4349859. The HLA-B*27 stratification resulted in two cohorts: (1) a B*27-pos cohort with samples carrying either one or two copies of the tag SNP, and (2) a B*27-neg cohort with samples carrying zero copies of the tag SNP. The B*27-pos cohort consisted of 856 AU cases and 70,109 controls, suggesting that 22.2% of the analyzed AU cases carry the B*27 allele. This is a significant enrichment compared to the 7.7% B*27 carriers in the controls, a frequency similar to the 6–8% expected for HLA-B*27 in the general US population22.

The final B*27-stratified analysis greatly weakened the HLA-B signal (rs543685299: OR = 1.49; p = 5.6e–3), while the ERAP1 signal remained the only genome-wide significant locus (Fig. 3A, B). Moreover, the protective effect of the ERAP1 variant was stronger when examining the smaller B*27-stratified cohort (rs27710, OR = 0.74, p = 1.3e–9), even though the B*27-pos cohort included only 22.2% of cases and 7.7% of controls from the larger cohort.

Fig. 3: A B*27-pos analysis exhibiting ERAP1 as the only genome-wide significant risk for B*27-AU.

A A Manhattan plot depicting the -log10(P value) for all common variants (y-axis) across all chromosomes (x-axis). ERAP1 top protective signal is shown by a downward red triangle on chromosome five. B A locus zoom plot showing ERAP1. Genome-wide significant threshold of 5e–08 is represented by a dashed gray line. Coding variants are highlighted in black, including labeled rs30187 (K528R). C A forest plot depicting the association details for ERAP1 top risk variant (rs30187) in all cohorts tested. A meta-analysis result combining all cohorts is the lowest row (bold); the meta-analysis OR is represented by a red diamond. Center points represent odds ratios as estimated by approximate Firth logistic regression, with error bars representing 95% confidence intervals. P values are uncorrected and reflect two-sided tests. Numbers below the cases and controls columns represent counts of individuals with homozygous reference, heterozygous and homozygous alternative genotypes, respectively.

The top ERAP1 variant is in perfect LD with the ERAP1 missense variant K528R-rs30187 (Fig. 3B, R² = 1, D′ = 1). This haplotype has been shown to be an eQTL that significantly decreases ERAP1 expression and is associated with other HLA class-I related disorders such as AS and Spondyloarthritis23. The effect of the ERAP1 signal across the B*27-pos cohorts was consistently protective (Fig. 3C). We repeated the analysis while restricting to individuals of European descent, consisting of 808 AU cases and 67,761 controls. The EUR-only, B*27-pos analysis confirmed the significant results for the protective ERAP1 locus with OR [95% CI] = 0.73 [0.66–0.81] and p = 4.1e–10 for the top SNP rs30187 (Fig. S5).

Since AU is commonly observed in other class-I-opathies such as Psoriatic Arthritis and AS, we designed a strict analysis removing all samples diagnosed with either AS (ICD10-M45) or psoriasis (Ps; ICD10-L40) from the smaller B*27-pos cohort. When considering only B*27 carriers that were not diagnosed with either AS or Ps, we identified 618 AU cases and 67,256 controls in all eight cohorts including all ancestries. This sets the proportion of B*27-pos AU cases that are also diagnosed with AS or Ps at 28%. Among the B*27-pos controls, we found 4% to have an AS or Ps diagnosis. In this analysis, the ERAP1 locus showed a similar protective effect (OR = 0.74), with a weaker p = 3e–6 owing to the decreased power of this analysis, but supporting the protective direction of the full analysis (Fig. S6).
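The comorbidity proportions quoted above follow directly from the B*27-pos cohort sizes before and after exclusion. The short calculation below reproduces that arithmetic using only counts stated in the text; the function and variable names are illustrative.

```python
def excluded_fraction(n_before, n_after):
    """Fraction of a cohort removed by an exclusion filter."""
    return (n_before - n_after) / n_before

# B*27-pos cohort sizes before and after removing AS (M45) and psoriasis (L40)
cases_before, cases_after = 856, 618
controls_before, controls_after = 70_109, 67_256

case_frac = excluded_fraction(cases_before, cases_after)
control_frac = excluded_fraction(controls_before, controls_after)

print(f"AU cases with AS or psoriasis: {case_frac:.1%}")
print(f"Controls with AS or psoriasis: {control_frac:.1%}")
```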

Phasing of the ERAP1 locus identifies the risk and protection ERAP1 haplotypes

ERAP1 haplotypes were previously studied in the context of several HLA class I-associated autoimmune diseases including Birdshot Chorioretinopathy (BSCR) and AS24,25,26. The common haplotypes are reported to affect ERAP1 expression levels and enzymatic activity. Haplotypes Hap2 and Hap3 associate with increased expression and enzymatic activity, while Hap10 corresponds to a decrease in both expression and activity17,25. The main SNPs that distinguish between these sets of haplotypes, K528R (rs30187) and D575N (rs10050860) show a distinct eQTL effect on ERAP1 expression as observed in the GTEx data for many tissues, the strongest including whole blood (p = 4.4e–78), skeletal muscle (p = 4.6e–49) and lung (p = 6.1e–46) for rs30187, and skeletal muscle (p = 5.0e–43), whole blood (p = 9.8e–31) and esophagus (p = 1.0e–14) for rs1005086027. We therefore set out to examine all possible ERAP1 haplotypes and their effect on AU risk.

The phasing of ERAP1 common SNPs that construct the ERAP1 haplotypes included: (a) the extraction of the distinctive imputed ERAP1 SNPs, (b) phasing the dosage data, and (c) classifying individual SNPs in each sample into one of the 10 defined haplotypes (described in the Methods section). We then modeled the association of each haplotype with case-control status, including the covariates of age and sex. The results pointed to Hap2 as the top risk for AU with OR = 1.2 and p = 2.1e–09 (Table 3). Interestingly, Hap1, which differs from Hap2 only by variants I12 (rs72773968) and 127P (rs26653) and occurs at a similar frequency, was not significant, suggesting that residues 12 and 127 contribute to the Hap2 risk. When tested individually, 12T has OR = 1.13 (p = 6.8e–04) and R127 has OR = 1.08 (p = 2.7e–03) in the most-powered variant level association including the full cohort.

Table 3 ERAP1 haplotype associations with AU

The results were most pronounced for Hap10, which presented a strong protective signal (OR = 0.83, p = 2.8e–10) driving most of the ERAP1 protective signal observed in the previous analyses. Hap10 represents the strongest common eQTLs that decrease the expression of ERAP1 including 528R and 575N. We also identified a protective effect for Hap6 (OR = 0.85; p = 2e–04) that shares most SNPs with Hap10 including 528R and 730E, with the two SNPs having strong effects on decreased ERAP1 expression and activity27,28.

We applied the same approach to the smaller B*27-pos cohort. We hypothesized that, since the ERAP1 signal is specific to this HLA allele background, the haplotype effects would become more prominent. The analyses confirmed this hypothesis, presenting the same risk direction for Hap2 (OR = 1.38, p = 5.38e–07) and protection for Hap10 (OR = 0.71, p = 4.4e–07), exhibiting stronger effects and weaker p values due to the significant loss of power (Supplementary Data 4). The protective effect of Hap6 is also more prominent in the B*27-pos cohort, with a strong protective OR = 0.7 and a similar p = 4e–04; the signal is thus maintained in the much smaller cohort, owing to the stronger depletion in cases.

An additive effect for B*27-AU risk with the combined effect of having two copies of HLA risk alleles and the ERAP1 risk haplotypes

The effects of the ERAP1 risk-Hap2 and protective-Hap10 were assessed in the above analysis in all subjects or subjects carrying at least one copy of the HLA-B*27 allele. We next constructed a model to test the effect of homozygous/heterozygous ERAP1 haplotypes on different HLA-B*27 backgrounds, using samples that carry no B*27 risk alleles, one, or two copies of the B*27 risk-allele. We defined zero HLA-B and ERAP1 protection haplotypes (two copies of Hap10 and no copies of Hap2) as the reference risk genotype (i.e. OR = 1), and assessed the risk of the ERAP1 Hap10 and Hap2 on a B*27 negative background (Fig. 4, left panel and Supplementary Data 5), compared to having one (Fig. 4, middle panel) and two copies of the HLA-B*27 allele (Fig. 4, right panel). We found a moderate risk increase of OR = 1.4 for individuals carrying two protective ERAP1-Hap10 copies with one copy of the HLA-B*27 allele, which increased by more than four times (OR = 6.3) when replacing two copies of ERAP1-Hap10 with two copies of ERAP1-Hap2. The maximum risk combination (two ERAP1-Hap2 and two HLA-B*27 alleles) reached a large OR = 36.9. We found that even with two copies of HLA-B*27, having two copies of ERAP1-Hap10 reduces the AU risk back to a model-estimated OR = 2.1. However, we did not observe cases carrying two copies of Hap10 and two copies of HLA-B*27, suggesting that the risk for AU when having two copies of ERAP1 Hap10 is even lower. This result supports the hypothesis that ERAP1-Hap2 (increased activity and expression of ERAP1) plays a role in the processing of the antigenic peptide(s) presented by HLA-B*27 in AU. The Hap10 haplotype, which is associated with decreased activity and expression, might process a peptidome that lacks the antigenic peptide(s).

Fig. 4: The combined risk for AU with HLA-B*27 and ERAP1 haplotypes.

The effect of homozygous and heterozygous ERAP1 haplotypes Hap2 and Hap10 on different HLA-B*27 backgrounds. Zero HLA-B and ERAP1 protection haplotypes combination (two copies of Hap10 and no copies of Hap2) was defined as the reference risk genotype (i.e. OR = 1, first column on left panel). The assessed risk of the ERAP1 Hap10 and Hap2 combinations on a B*27 negative background is shown (left panel and Supplementary Data 5). Middle panel is the same as left panel, but for one copy of HLA-B*27. Right panel is the same as left panel, but for two copies of HLA-B*27.

HLA-DPB1 is a significant risk for B*27-neg AU

We next included only cases and controls not carrying the B*27-tagging SNP (B*27-neg AU). This cohort consisted of 2984 B*27-neg AU cases and 844,709 B*27-neg controls. This analysis revealed a genome-wide significant signal at rs6914651, in an HLA Class-II gene region near HLA-DPB1, which had not previously been associated with AU [OR = 1.18 (1.11–1.25), p = 1.6e–08] (Fig. 5). With an allele frequency of 0.277, rs6914651 tags a common signal that might reflect a coding variant within HLA-DPB1 that is associated with AU risk, which in turn might point to a specific HLA-DPB1 allele that increases risk of HLA-B*27 negative AU. To answer this question, we followed with two additional analyses: (1) imputing the HLA-DPB1 alleles and testing for association of each allele with case-control status, and (2) fine-mapping of the region near HLA-DPB1 to uncover the genetic signals that underlie this significant association. Testing the associations of class-II HLA alleles showed HLA-DPB1*04:01 as a protective allele (p = 7.2e–06, OR = 0.89) and HLA-DPB1*03:01 as a risk allele (p = 2.4e–04, OR = 1.2, Supplementary Data 6). However, when adjusting for the top SNP (rs6914651) in the regression model, neither HLA-DPB1*03:01 nor HLA-DPB1*04:01 was nominally significant, suggesting that it might not be a specific HLA-DPB1 allele that affects AU risk (Supplementary Data 7). Instead, rs6914651 acts as an eQTL for HLA-DPB1 and significantly decreases its expression, supporting an effect on AU risk by decreasing HLA-DPB1 expression27. The results of fine-mapping the DPB1 region also suggested that the signal originates not from HLA-DPB1 itself, but from the region downstream of HLA-DPB1, where a long stretch of non-coding variants share similar posterior inclusion probabilities (Fig. S7).

Fig. 5: HLA-DPB1 is a significant risk for B*27-neg AU.

A A Manhattan plot depicting the -log10(P value) for all common variants (y-axis) across all chromosomes (x-axis). HLA-DPB1 top risk signal is shown by an upward red triangle on chromosome six. B A locus zoom plot showing all common and rare signals on HLA-DPB1. Genome-wide significant threshold of 5e–08 is represented by a dashed gray line, above which there is a stretch of high LD variants downstream of HLA-DPB1. Association models were run with age, age², sex and age × sex, and 10 ancestry-informative principal components as covariates. P values are uncorrected and are from two-sided tests performed using approximate Firth logistic regression.

Gene burden analyses of B*27-neg AU

We next asked whether the gene burden analyses using the B*27-neg AU cohort replicate previous results with the full cohort. This question is highly relevant to deciphering the mechanism underlying both sub-types of AU. We found that both IPMK and IDO2 replicated a similar direction of risk in the B*27-neg cohort (Table 4, Supplementary Information). Neither gene showed a significant association in the B*27-pos cohort, suggesting these mechanisms of risk pertain to B*27-neg AU.

Table 4 IPMK and IDO2 genes are associated with B*27-neg AU

Aside from IPMK and IDO2, we found support for several additional genes when examining only the B*27-neg cohort (Supplementary Data 8). First, the signal for ADGRF5 has OR [95% CI] = 27.6 [10.1–75.5] and p = 1e–10, due to the addition of two B*27-neg cases carrying singleton pLoF variants. While the number of cases is still low (<10), this gives additional support for ADGRF5 to be involved in risk of AU. In addition, the same STXBP2 model including singleton damaging missense variants was also strengthened to OR [95% CI] = 14.8 [6.1–35.8] and p = 2.3e–09 with the addition of one case. As with ADGRF5, this analysis also provided additional support for STXBP2 to be involved in the risk for AU. We further identified PMP22 as a borderline gene with OR [95% CI] = 4.88 [2.7–8.9] and p = 2.3e–07, with a mask that includes rare missense variants and AF < 0.01%. Last, two additional genes received a borderline p value below threshold with six carriers each, for rare pLoF and missense singleton masks, respectively: LDHA (OR [95% CI] = 16.6 [11.6–218.5], p = 4.7e–08) and DPH6 (OR [95% CI] = 16.3 [5.6–47.1], p = 2.8e–07). However, lacking additional support, these candidate genes will require further evidence to be considered as AU risk genes.

Discussion

Anterior uveitis (AU) is often studied as a manifestation of systemic autoimmune diseases, with high prevalence in seronegative spondyloarthropathies including ankylosing spondylitis, psoriatic arthritis, arthritis associated with IBD and reactive arthritis29,30,31. Until now, it has been difficult to disentangle AU analysis from the other diseases that are well recorded and that allow much larger studies. With EHR data for almost one million samples, we were able to study the genetics of AU specifically by focusing on individual ICD diagnosis codes and removing the most common co-morbidities from the cohort for better elucidation of the genetic signals. We include in our analyses whole-exome sequence data for the full set of samples, allowing discovery of genes in which rare coding changes impact AU risk. We incorporate all ancestries into comprehensive analyses that dissect the contributions from different ancestral populations. Consequently, we explore the underlying genetics of HLA-B*27 uveitis and distinguish HLA-B*27 positive and HLA-B*27 negative uveitis as two genetically distinct diseases. Although clinical manifestations overlap considerably, B*27-pos AU is typically characterized by more robust inflammation and is more likely to recur than B*27-neg AU32,33.

When stratifying the cohorts by HLA-B*27, we observe that the proportion of B*27 carriers among AU cases falls within a limited range around 25%, much lower than the approximately 50% previously reported9, although those earlier estimates were based on smaller studies34. The AFR population in our datasets could also have reduced the proportion of HLA-B*27 carriers, since B*27 is much less frequent in the AFR ancestry (~1.7% of AFR controls). The large cohorts at hand still allowed us to study the much smaller B*27-pos cohort and to elucidate the clear effect of ERAP1 as the strongest factor affecting AU risk and protection in HLA-B*27 AU. ERAP1 has been previously reported as nominally conferring risk for AU in AS cases12,14. However, previous studies were underpowered to detect a significant AU-ERAP1 signal, and most cases had a major AS diagnosis, which made it hard to disentangle the AS diagnosis from the AU analysis.

Different combinations of non-synonymous SNPs give rise to the ten main ERAP1 haplotypes with differences in enzymatic activity and/or expression levels17. By utilizing the full size of the cohort and dissecting the samples into the ten common ERAP1 haplotypes, we were able to see the strong protection of two copies of Hap10, which may offer disease prevention even for individuals carrying two copies of the strongest B*27 AU risk alleles.

The ERAP1 low-activity low-expression Hap10 haplotype was previously shown to protect against AS, and to express a different B*27 peptidome than the risk Hap2 by providing reduced trimming of peptides17,35. In this context, the strong Hap10 homozygous protection that we observe for AU suggests that ERAP1 expression and enzymatic activity are lowest in these carriers. This also suggests that the peptidome shaped by Hap10 could be deficient in the antigenic peptide(s) that activate the immune response in AU cases.

The stratification by HLA-B*27 also enabled us to observe a clear common genetic risk for B*27-neg uveitis in the form of class-II HLA-DPB1. This signal was distinct from B*27-pos AU and points to a distinct mechanism for the two diseases. While B*27-pos AU is driven by antigenic peptide(s), where we hypothesize that ERAP1 variants affect the peptides available for presentation by HLA-B*27, the mechanism of disease in B*27-neg AU may differ. The participation of a class-II gene, either directly by a specific allele, or indirectly through a change in expression, suggests a mechanism similar to celiac disease, where an exogenous immunogenic factor initiates the cascade that leads to pathogenicity36. In the case of AU, this might be an event such as cataract surgery that exposes immune cells to tissues that are normally sequestered (the crystalline lens); however, further investigations are required to confirm such a hypothesis.

The availability of rare variants from exome sequencing, in addition to genotyping and imputation, allowed us to identify two genes where rare genetic variants affect AU risk. In the case of IPMK (Inositol Polyphosphate Multikinase), we find that missense and loss-of-function variants combine to show increased risk of disease. IPMK’s catalytic activities yield water-soluble inositol polyphosphates, and IPMK is considered a signaling hub in mammalian cells that coordinates the activity of various signaling networks, including regulation of TLR-induced innate immunity37. IPMK promotes Toll-like receptor–induced inflammation by stabilizing TRAF6 (Tumor Necrosis Factor Receptor–Associated Factor 6), a critical mediator of TLR signaling38. While this might be a valid mechanism affecting AU pathology, it remains to be seen exactly how IPMK affects AU risk.

For IDO2 (Indoleamine 2,3-dioxygenase 2) we observe significant risk through a clear loss-of-function mechanism that is shared between the studied cohorts. IDO2, like IDO1, was reported necessary for the differentiation of regulatory T cells in vitro and has been shown to play a pro-inflammatory role in the development of B cell-mediated autoimmune arthritis39,40. It is therefore plausible that the loss of IDO2 disrupts T-cell regulation and affects the T-cell mediated response in the anterior chamber, contributing to the patho-mechanism of AU. While a precise role for IPMK and IDO2 in regulating immune tolerance in the anterior segment remains opaque, it is relevant to note that both proteins are expressed locally. Single-nucleus RNA sequencing data made available through The Broad Institute demonstrate IPMK and IDO2 expression in both the iris (irido-) and ciliary body (-cyclitis) (Figs. S8–S11, https://singlecell.broadinstitute.org/single_cell/study/SCP1841/).

Taken together, these results highlight the underlying and distinct genetics of B*27-pos and B*27-neg AU, presenting them as two genetically distinct diseases. We further identify the protection of ERAP1-Hap10, which raises the enticing prospect of ERAP1’s therapeutic potential in the management of AU. This is particularly relevant in B*27-pos AU where recurrent episodes of inflammation, difficult to control with topical steroids, put patients at increased risk of vision-threatening complications. Last, we uncover several risk genes for B*27-neg AU, including a common locus at HLA-DPB1 that affects AU risk, as well as two risk genes and several candidate genes affecting disease risk through rare variation causing loss-of-function (as in IDO2) and/or changes to the protein sequence (as in IPMK), thus further elucidating the genetic risks for AU.

Methods

Study populations

Genome-wide association analyses were performed in eight cohorts including the U.K. Biobank cohort41 and the Geisinger Health System MyCode cohort42. Other datasets include 29,237 participants from the Malmö Diet and Cancer Study43, 41,537 participants from the University of Pennsylvania Penn Medicine BioBank44, 29,845 participants from the Mount Sinai BioMe BioBank45, 49,071 from the Colorado Center for Personalized Medicine Biobank46, and 40,217 from the UCLA ATLAS Community Health Initiative47,48. We also included 115,418 participants from the MAYO-RGC Project Generation, which brings together successfully sequenced participants from the Mayo Clinic Biobank (N = 53,227)49 and from 30 Mayo-based disease registries/studies. This study was reviewed and approved by the Mayo Clinic IRB (#09-007763).

We included 829,865 participants of European ancestry, 42,790 of African ancestry, 13,870 of South Asian ancestry, 9305 of East Asian ancestry, 18,868 with ancestry from the Americas, and 5653 of other ancestries, for whom genotyping, exome-sequencing data and phenotype data were available (full breakdown of cohorts in Supplementary Data 9). Cases were selected based on the “ICD10: H20 Iridocyclitis” diagnosis code; controls were defined as individuals without the ICD10: H20 code.
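The case definition above amounts to a simple rule over each participant's diagnosis codes. The sketch below assumes, hypothetically, that codes are available as ICD-10 strings per participant and that any H20 subcode (for example H20.0) counts as a case; the actual EHR extraction logic is not described in this section, and the function names and toy records are illustrative.

```python
def is_au_case(icd10_codes):
    """True if any diagnosis code falls under ICD-10 H20 (iridocyclitis)."""
    return any(code.strip().upper().startswith("H20") for code in icd10_codes)

def split_cases_controls(participants):
    """Split a {participant_id: [codes]} mapping into case and control ID lists."""
    cases = [pid for pid, codes in participants.items() if is_au_case(codes)]
    controls = [pid for pid, codes in participants.items() if not is_au_case(codes)]
    return cases, controls

# Hypothetical toy records, not study data
toy = {"p1": ["H20.0", "M45"], "p2": ["L40.0"], "p3": [], "p4": ["h20.9"]}
cases, controls = split_cases_controls(toy)
print(cases, controls)
```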

Ethical compliance

Ethical approval for the UK Biobank was previously obtained from the North West Center for Research Ethics Committee (11/NW/0382). The work described herein was approved by UK Biobank under application number 26041. Approval for Geisinger Health System MyCode analyses was provided by the Geisinger Health System Institutional Review Board under project number 2006-0258. Informed consent was obtained for all study participants. Appropriate consent for the University of Pennsylvania Penn Medicine BioBank was obtained from each participant regarding storage of biological specimens, genetic sequencing and genotyping, and access to all available EHR data. This study was approved by the Institutional Review Board of the University of Pennsylvania and complied with the principles set out in the Declaration of Helsinki. All subjects participating in the MAYO-RGC Project Generation provided informed consent for use of specimens and data in genetic and health research, and ethical approval for Project Generation was provided by the Mayo Clinic IRB (#09-007763). Ethical approval and consent for the Colorado Center for Personalized Medicine Biobank was reviewed and approved by the Colorado Multiple Institutional Review Board (#15-0461). All research performed in the UCLA ATLAS Community Health Initiative study conformed with the principles of the Helsinki Declaration. All individuals provided written informed consent to the original recruitment of the UCLA ATLAS Community Health Initiative. Patient Recruitment, Sample Collection for Precision Health Activities at UCLA is a study approved by the UCLA Institutional Review Board (UCLA IRB #17-001013). All research performed in this study uses de-identified data (without any Protected Health Information data) with no possibility of re-identifying any of the participants. The Mount Sinai BioMe BioBank study protocols were approved by the institutional review board of the Icahn School of Medicine at Mount Sinai. Written informed consent was obtained for all study participants. All participants in the Malmö Diet and Cancer Study provided written informed consent, and the study was approved by the Lund University Ethics Committee (MDC LU 51-90; cadmium sub-study 2009/633).

Exome sequencing and whole-genome genotyping

For analyses of common variants, we used array genotyping data imputed with the TOPMed reference panel50,51. Exome sequencing was performed at the Regeneron Genetics Center using a custom automated sample preparation approach. Samples were captured with IDT xGen v1 or Twist Comprehensive Exome probes and sequenced using Illumina HiSeq 2500-v4 or Illumina NovaSeq instruments, with 75-bp paired-end reads and two index reads. The GRCh38 human genome reference sequence and Ensembl, version 85, gene definitions were used for variant identification and annotation. For the COLORADO, MAYO-CLINIC and UCLA cohorts sequenced with Twist, probes also included the Twist Diversity SNP panel, for which multi-point refinement was conducted using GLIMPSE prior to further genotype QC and imputation52. For exome coding variants, we classified variants from most to least deleterious in the following order: frameshift, stop–gain, stop–loss, splice acceptor, splice donor, in-frame insertion or deletion (indel), missense, and other annotations. Frameshift, stop–gain, stop–loss, splice-acceptor, and splice-donor alleles were categorized as predicted loss-of-function variants. We predicted the functional effects of missense variants in silico using five algorithms: SIFT53, Polyphen-2 HDIV54, Polyphen-2 HVAR54, LRT55 and MutationTaster56.
Because genes differ in the types and frequencies of their potentially causative variants, we used the functional annotation of the variants in each gene to generate seven pseudo-genotypes based on the combined variant burden: predicted loss-of-function (pLoF) variants; pLoF variants plus missense variants that were predicted to be deleterious by five of five algorithms; pLoF variants plus missense variants that were predicted to be deleterious by at least one of five algorithms; pLoF variants plus any missense variants; missense variants that were predicted to be deleterious by five of five algorithms; missense variants that were predicted to be deleterious by at least one of five algorithms; and finally, any missense variants at all (these categories are similar to those used previously57). We used the alternative-allele frequency and functional annotation of each variant to generate seven genotypes based on the combined variant burden: pLoF variants with alternative-allele frequency thresholds of 1%, 0.1%, 0.01% and singletons, and pLoF variants plus missense variants that were predicted to be deleterious, with alternative-allele frequency thresholds of 1%, 0.1%, 0.01% and singletons.
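The seven annotation-based burden categories described above can be illustrated with a minimal Python sketch. The names used here (`Variant`, `MASKS`, `masks_for`, `is_carrier`) and the consequence labels are hypothetical and are not taken from the study pipeline. The sketch also assumes that a pseudo-genotype marks a person as a carrier when they carry at least one qualifying variant below the frequency threshold; the text does not state how pseudo-genotypes were coded, so this is only one plausible reading.

```python
from dataclasses import dataclass

# Consequences treated as predicted loss-of-function (pLoF), per the text.
# The string labels themselves are illustrative, not the pipeline's.
PLOF = {"frameshift", "stop_gain", "stop_loss", "splice_acceptor", "splice_donor"}


@dataclass(frozen=True)
class Variant:
    consequence: str      # most deleterious annotation for the variant
    aaf: float            # alternative-allele frequency
    n_deleterious: int = 0  # how many of the five algorithms call it deleterious


def is_plof(v: Variant) -> bool:
    return v.consequence in PLOF


def is_missense(v: Variant, min_calls: int = 0) -> bool:
    return v.consequence == "missense" and v.n_deleterious >= min_calls


# The seven burden categories listed in the text, as predicates on a variant.
MASKS = {
    "pLoF": lambda v: is_plof(v),
    "pLoF+missense_5of5": lambda v: is_plof(v) or is_missense(v, 5),
    "pLoF+missense_1of5": lambda v: is_plof(v) or is_missense(v, 1),
    "pLoF+missense_any": lambda v: is_plof(v) or is_missense(v),
    "missense_5of5": lambda v: is_missense(v, 5),
    "missense_1of5": lambda v: is_missense(v, 1),
    "missense_any": lambda v: is_missense(v),
}


def masks_for(v: Variant) -> set:
    """Return the names of all burden categories that include this variant."""
    return {name for name, keep in MASKS.items() if keep(v)}


def is_carrier(carried: list, mask: str, max_aaf: float) -> bool:
    """Assumed coding: carrier if any carried variant qualifies under the
    mask and has an alternative-allele frequency below max_aaf."""
    keep = MASKS[mask]
    return any(keep(v) and v.aaf < max_aaf for v in carried)


if __name__ == "__main__":
    person = [Variant("missense", aaf=0.004, n_deleterious=5),
              Variant("synonymous", aaf=0.2)]
    print(sorted(masks_for(person[0])))
    print(is_carrier(person, "missense_5of5", max_aaf=0.01))
    print(is_carrier(person, "missense_5of5", max_aaf=0.001))
```

Expressing each category as a predicate makes the nesting visible: every pLoF variant falls into the four pLoF-containing categories, and a missense variant called deleterious by all five algorithms falls into six of the seven.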

Statistical analysis

We estimated associations between genotypes and phenotypes by fitting linear regression models (for quantitative traits) or Firth bias-corrected logistic regression models (for binary traits) using the REGENIE software, version 2+58. Analyses were stratified according to cohort and ancestry and were adjusted for age, age squared, sex, age-by-sex, and age squared–by–sex interaction terms; experimental batch-related covariates; the first 10 common variant–derived genetic principal components; the first 20 rare variant–derived principal components; and a polygenic score generated by REGENIE, which robustly adjusts for relatedness and population structure58. We performed a meta-analysis of association results across cohorts and ancestries with a fixed-effect inverse-variance–weighted approach. We report results for TOPMed-imputed data for common variants, defined by a minor allele frequency greater than 0.5%, and we report results for exome-sequenced rare coding variants that had a minor allele count greater than five in both cases and controls. For gene burden analyses, we tested each of the variant-burden categories mentioned above at four alternative-allele frequency thresholds: less than 1%, less than 0.5%, less than 0.1%, and less than 0.01%. These seven categories and four thresholds produce 28 pseudo-genotypes for each gene, but they are not fully independent of one another, given the overlapping annotations and frequency thresholds. Thus, we calculated an appropriately adjusted Bonferroni significance level for these variant-burden tests, using a method recommended by a review of multiple-testing correction methods in non-independent genetic tests18,59. Calculating the effective number of independent tests from the correlation matrix of these variant-burden tests in our meta-analysis gave a value of 9.002158 tests per gene. Multiplying this value by the number of genes tested (19,446) and using the product as the correction factor for an alpha level of 0.05 gave an exome-wide significance level of P = 2.86e-07.
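The arithmetic behind the exome-wide threshold, together with the standard fixed-effect inverse-variance-weighted combination, can be sketched in Python as follows. The function names are hypothetical, and the example effect sizes passed to `ivw_fixed_effect` are invented purely for illustration. Only the alpha level, the effective number of tests per gene, and the gene count come from the text. The sketch does not reproduce the estimation of the effective number of tests from the correlation matrix, which follows the cited method.

```python
import math


def ivw_fixed_effect(betas, ses):
    """Textbook fixed-effect inverse-variance-weighted meta-analysis.

    betas: per-cohort effect estimates (e.g. log odds ratios)
    ses:   their standard errors
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def exome_wide_threshold(alpha, n_eff_tests_per_gene, n_genes):
    """Bonferroni-style threshold using the effective number of tests."""
    return alpha / (n_eff_tests_per_gene * n_genes)


if __name__ == "__main__":
    # Values stated in the text: alpha = 0.05, 9.002158 effective tests
    # per gene, 19,446 genes tested.
    threshold = exome_wide_threshold(0.05, 9.002158, 19446)
    print(f"{threshold:.2e}")  # 2.86e-07

    # Illustrative (made-up) per-cohort estimates, not study results.
    beta, se = ivw_fixed_effect([0.2, 0.4], [0.1, 0.1])
    print(beta, se)
```

For comparison, a naive Bonferroni correction over all 28 pseudo-genotypes per gene would use 0.05 / (28 × 19,446), a roughly threefold stricter threshold than the one obtained with about nine effective tests per gene.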

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.