Introduction

Psoriasis (OMIM no. 177900) is a common immune-mediated disease characterized by painful, red scaly patches of skin. Psoriasis affects 0.2–2% of individuals and disease etiology likely involves multiple genetic and environmental factors. Psoriatic patients can develop an inflammatory arthritis and are at an elevated risk of comorbidities such as heart disease, metabolic syndrome and cardiovascular disease;1 they also exhibit increased risk for other immune-mediated disorders such as Crohn’s disease, celiac disease, ankylosing spondylitis, leprosy, multiple sclerosis, systemic sclerosis and rheumatoid arthritis.2 Genome-wide association studies (GWAS)3, 4, 5, 6, 7 have identified more than 41 independent signals in 36 disease-susceptibility regions3, 7, 8 for individuals of European ancestry, although much of the genetic contribution to psoriasis remains unexplained.

The strongest psoriasis susceptibility locus lies within the major histocompatibility complex (MHC) and is termed PSORS1 (psoriasis susceptibility locus 1). Other important contributors include IL12B, IL23R, IL23A, TNFAIP3, IL13, RNF114 and TNIP1.4, 5, 6, 9 Although GWAS provides an efficient strategy for identifying disease susceptibility loci, detailed assessments of the contribution of these loci to disease require systematic investigation of common and rare variants in each locus. Previous fine-mapping studies in psoriasis have focused on the MHC, helping localize the primary risk locus10, 11 and identify multiple psoriasis susceptibility alleles.8, 12

Using a combination of genotyping and imputation,13 we examined in detail the genetic variation in each of the eight psoriasis susceptibility regions (MHC, IL12B, IL23A, IL23R, TNIP1, TNFAIP3, IL13 and RNF114) in a large sample of subjects of European ancestry, over 97% of whom were not previously subjected to fine-mapping. We sought to identify the variant with strongest association in each locus as well as to find out whether multiple variants might be independently associated in the same locus.

Materials and methods

Samples and markers

We collected 5067 samples of European ancestry from five collaborating centers (Table 1). Boundaries for five of the fine-mapping regions were defined to include all genes of interest in their entirety and all single-nucleotide polymorphisms (SNPs) with association P-values in the CASP-GWAS4 passing a predefined threshold (see Supplementary Table 2 for details). For TNFAIP3, we considered a broader region (476 kb on chr6.hg19.g.138001712–138478136) that covers the candidate disease risk regions for several autoimmune diseases, including psoriasis, psoriatic arthritis (PsA), rheumatoid arthritis and systemic lupus erythematosus. In total, 408 SNPs were picked from this region. The MHC region was defined to include four possible MHC disease loci – a 224-kb candidate interval for PSORS1,10 a secondary locus not in linkage disequilibrium (LD) with PSORS1, which was tagged by rs2022544 chr6.hg19.g.32321004G>A near HLA-DRA,4 and two possible PsA loci tagged by rs12175489 chr6.hg19.g.31377587G>A in MICA and rs3130180 chr6.hg19.g.33020286G>C in HLA-DPA1/HLA-DPB1 (unpublished data). As LD is so extensive in the MHC region and candidate intervals have not been determined for the secondary loci, our fine-mapping region was designed to include all but 250 kb of the classic MHC. We designed an Illumina (San Diego, CA, USA) iSelect custom genotyping array with 2269 tagging SNPs in the eight regions and 5463 SNPs outside the eight regions. All SNPs from two MHC panels designed for the Illumina Golden Gate platform (mapping panel GS0006599-OPA and exon-centric panel GS0006598-OPA) falling within the 3.37 Mb candidate region (chr6.hg19.g.29 882 021–33 252 022) were included. SNPs were removed if they had Illumina Infinium design scores <0.5 or if they had an Infinium I design (ie, A/T or C/G SNPs requiring two bead assays for one genotyping). 
In addition, 31 SNPs in this region were included to tag HLA-Cw6-bearing risk haplotypes.10 For the other seven regions, all SNPs from the August 2009 release of the 1000 Genomes project with MAF≥0.01 in CEU samples were considered as candidates. After excluding SNPs with Infinium design scores <0.6 or with Infinium I designs, tagging SNPs14 were selected using an LD value of r2≥0.8 as a threshold. A backup SNP was added for every SNP tagging more than five other variants to guard against genotyping failure, and all SNPs in moderate to high LD (r2≥0.5) with the most strongly associated SNP in each region were unconditionally included. Eighty-six additional SNPs for these seven regions were also genotyped, including 63 high-priority variants (5′-UTR, 3′-UTR, essential splice site, nonsynonymous and synonymous coding, gained stop codon) from the 1000 Genomes call set and 23 SNPs within the TNFAIP3 region that are associated with other autoimmune disorders.15, 16
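
The tagging step can be illustrated with a minimal greedy sketch. It assumes that pairwise r2 values are already available in a dictionary (`ld`, a hypothetical input) and repeatedly picks the SNP that captures the most not-yet-tagged SNPs at r2≥0.8. The tagging software cited above may use a different algorithm, and the backup-SNP and forced-inclusion rules are not modeled here.

```python
def greedy_tag_snps(snps, ld, threshold=0.8):
    """Pick tag SNPs so that every SNP is tagged at r2 >= threshold.

    snps: list of SNP identifiers.
    ld:   dict mapping frozenset({a, b}) -> pairwise r2
          (pairs missing from the dict are treated as r2 = 0).
    """
    def r2(a, b):
        return 1.0 if a == b else ld.get(frozenset((a, b)), 0.0)

    untagged = set(snps)
    tags = []
    while untagged:
        # Choose the SNP that captures the most untagged SNPs;
        # ties go to the SNP listed first in `snps`.
        best = max(snps, key=lambda s: sum(r2(s, u) >= threshold for u in untagged))
        captured = {u for u in untagged if r2(best, u) >= threshold}
        tags.append(best)
        untagged -= captured
    return tags


# Hypothetical toy region with five SNPs and three high-LD pairs.
toy_snps = ["rsA", "rsB", "rsC", "rsD", "rsE"]
toy_ld = {
    frozenset(("rsA", "rsB")): 0.95,
    frozenset(("rsB", "rsC")): 0.85,
    frozenset(("rsD", "rsE")): 0.90,
}
toy_tags = greedy_tag_snps(toy_snps, toy_ld)
```

In this toy input, rsB captures rsA and rsC, and rsD captures rsE, so two tags cover all five SNPs.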

Table 1 Summary of samples based on cohort

Samples with <95% genotyping rate or with extreme heterozygosity were excluded from analysis. Sixty-three samples that were duplicates or close relatives of other samples in the study (identified with the program relativeFinder17) and 185 samples with evidence of non-European ancestry (identified with principal component analysis; Supplementary Figure 1) were also excluded from further analysis. A total of 4806 samples (2699 cases and 2107 controls) were analyzed. Genotypes and phenotypes for the 4806 samples and 7732 markers analyzed in this study have been submitted to the NCBI dbGaP database at http://www.ncbi.nlm.nih.gov/gap (dbGaP accession number: phs000019.v1.p1).

Genotype imputation

We used information on patterns of haplotype variation available in European ancestry (EUR) samples from the 1000 Genomes Project13 (Phase I release, version 3) to increase the density of the variants in the eight regions. We first used MaCH18 to phase the genotype data and then used Minimac19 for imputation. We restricted analysis of genetic variants in the eight known regions to those that were genotyped on the array or could be imputed with acceptable quality (estimated r2 between the imputed dosage and true genotypes ≥0.3).

We used SNP2HLA20 and a data set collected by the Type 1 Diabetes Genetics Consortium (T1DGC) as a reference panel to impute 8961 variants in the MHC region (126 classical HLA 2-digit alleles, 298 HLA 4-digit alleles, 1276 polymorphic amino-acid residues of these classical alleles and 7261 SNPs). Alleles and amino acids were imputed for eight HLA genes that were genotyped in the T1DGC reference panel (HLA-A, -B, -C, -DPA1, -DPB1, -DQA1, -DQB1 and -DRB1). These variants were merged with 40 074 SNPs and indel variants in the MHC region that were imputed using the 1000 Genomes data set as a reference panel, but were not part of the T1DGC reference set, giving a combined data set of 49 035 MHC variants for analysis.

Association analysis

We used a variance component mixed-model association test implemented in the EMMAX software package to identify disease-associated variants.21 Variance component models correct for a wide range of potential problems, including population structure, by modeling pairwise genotype similarity between individuals (4265 independent markers outside the 41 known psoriasis susceptibility loci22 were used to estimate kinship coefficients). We began with an unconditional analysis of all genotyped and well-imputed variants in the eight regions, selecting the most statistically significant signal in each region that survived Bonferroni correction for the total number of variants tested. We then used a stepwise forward selection method to select iteratively additional signals, conditional on signals from the previous iterations.
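
The control flow of the stepwise conditional analysis can be sketched as follows. This is a simplified illustration, not the EMMAX-based procedure itself: the association test is abstracted as a caller-supplied function (`test`, a hypothetical name) that returns the P-value of a variant given the variants already selected, and region-by-region selection is not modeled.

```python
def forward_selection(variants, test, alpha, max_rounds=10):
    """Stepwise forward selection of independent association signals.

    variants: list of variant identifiers.
    test:     callable (variant, conditioned_on) -> P-value of `variant`
              when the variants in the tuple `conditioned_on` are covariates.
    alpha:    per-test significance threshold (e.g. Bonferroni-corrected).
    """
    selected = []
    for _ in range(max_rounds):
        candidates = [v for v in variants if v not in selected]
        if not candidates:
            break
        pvals = {v: test(v, tuple(selected)) for v in candidates}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break  # no remaining variant passes the threshold
        selected.append(best)
    return selected


# Hypothetical P-values, keyed by the signals already conditioned on.
toy_pvalues = {
    (): {"v1": 1e-20, "v2": 1e-10, "v3": 0.2},
    ("v1",): {"v2": 0.3, "v3": 1e-8},
    ("v1", "v3"): {"v2": 0.4},
}
toy_signals = forward_selection(
    ["v1", "v2", "v3"],
    lambda variant, conditioned_on: toy_pvalues[conditioned_on][variant],
    alpha=1e-6,
)
```

In the toy data, v2 loses significance once v1 is conditioned on (as expected for a variant in LD with v1), whereas v3 only becomes significant after conditioning, which mirrors the behavior of a secondary signal.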

Genomic control inflation factors (λGC) were calculated for each round of conditional analysis based on P-values at a set of 4265 independent markers outside the 41 known psoriasis susceptibility loci.22 We estimated odds ratios (ORs) for the associated markers using a logistic regression model that included, as ancestry covariates, coordinates obtained from PLINK23 for the top 10 dimensions of a multidimensional scaling analysis of an identity-by-state distance matrix based on variants outside the eight regions. We estimated the variance in liability (locus-specific heritability)24 that can be explained by the significant SNPs as determined under a liability threshold model25 (assuming that the multiple loci have an additive effect on the risk of psoriasis). We used the prevalence of the disease as 0.02 and calculated risk ratios from our estimated ORs using an iterative approach.26 We used ANNOVAR27 for functional annotation of our significant signals and their proxies (r2≥0.9 in 1000 Genomes Project EUR samples).
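
A minimal sketch of a liability threshold calculation for a single biallelic variant is given below. It is one common formulation and is assumed, not confirmed, to resemble the cited method. The inputs (risk allele frequency and genotype relative risks) and the function name are illustrative; Hardy-Weinberg proportions and a residual liability variance of 1 are further assumptions, and the conversion from ORs to risk ratios is not shown.

```python
from statistics import NormalDist


def liability_variance_explained(risk_allele_freq, rr_het, rr_hom, prevalence=0.02):
    """Fraction of liability variance explained by one biallelic variant.

    rr_het / rr_hom are relative risks of the heterozygote and the risk-allele
    homozygote compared with the non-risk homozygote.
    """
    nd = NormalDist()
    p = risk_allele_freq
    geno_freq = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]  # Hardy-Weinberg
    rel_risk = [1.0, rr_het, rr_hom]

    # Baseline penetrance chosen so that the genotype-weighted average
    # penetrance equals the population prevalence.
    f0 = prevalence / sum(g * r for g, r in zip(geno_freq, rel_risk))
    penetrance = [f0 * r for r in rel_risk]
    if max(penetrance) >= 1:
        raise ValueError("relative risks imply a penetrance of 1 or more")

    # Mean liability of each genotype relative to the baseline genotype,
    # with the within-genotype liability variance fixed at 1.
    t0 = nd.inv_cdf(1 - penetrance[0])
    means = [t0 - nd.inv_cdf(1 - f) for f in penetrance]

    overall = sum(g * m for g, m in zip(geno_freq, means))
    var_g = sum(g * (m - overall) ** 2 for g, m in zip(geno_freq, means))
    return var_g / (1 + var_g)
```

Under an additive model across loci, per-variant values from such a calculation would be summed to give the locus-level and overall figures.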

For association analysis of the MHC variant data set that includes HLA alleles and amino acids, we implemented a forward-selection regression method using a logistic model, controlling for the ancestry covariates. Significance was defined by Bonferroni correction for 49 035 variants tested. We also compared the log odds of the HLA-C*06-B*57 haplotype to other HLA-C*06-bearing haplotypes in a logistic model and used bootstrap methods to determine significance of the difference in log ORs.
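
The bootstrap comparison of log ORs can be sketched as below. This is a simplified percentile bootstrap and not necessarily the variant used in the analysis: individuals are assumed to fall into exactly one of three groups (haplotype group "A", haplotype group "B", or a reference group "ref"), ancestry covariates are ignored, and all counts are hypothetical.

```python
import math
import random
from collections import Counter


def log_or(cases_exp, ctrls_exp, cases_ref, ctrls_ref):
    # 0.5 is added to every cell (Haldane-Anscombe correction) so that
    # a resample with an empty cell does not cause a division by zero.
    return math.log(((cases_exp + 0.5) * (ctrls_ref + 0.5)) /
                    ((ctrls_exp + 0.5) * (cases_ref + 0.5)))


def diff_log_or(counts):
    """log OR of group A minus log OR of group B, both versus 'ref'."""
    a = log_or(counts["A", 1], counts["A", 0], counts["ref", 1], counts["ref", 0])
    b = log_or(counts["B", 1], counts["B", 0], counts["ref", 1], counts["ref", 0])
    return a - b


def bootstrap_diff_log_or(counts, n_boot=1000, seed=0):
    """Return the observed difference and a two-sided percentile-bootstrap P-value."""
    rng = random.Random(seed)
    cells = list(counts)
    weights = [counts[c] for c in cells]
    n = sum(weights)
    observed = diff_log_or(counts)
    diffs = []
    for _ in range(n_boot):
        # Resampling individuals with replacement is equivalent to drawing
        # n cell labels with probabilities proportional to the cell counts.
        resampled = Counter(rng.choices(cells, weights=weights, k=n))
        diffs.append(diff_log_or(resampled))
    frac_le0 = sum(d <= 0 for d in diffs) / n_boot
    frac_ge0 = sum(d >= 0 for d in diffs) / n_boot
    return observed, min(1.0, 2 * min(frac_le0, frac_ge0))


# Hypothetical counts keyed by (group, case status); 1 = case, 0 = control.
toy_counts = {
    ("A", 1): 300, ("A", 0): 50,
    ("B", 1): 300, ("B", 0): 200,
    ("ref", 1): 1000, ("ref", 0): 1500,
}
toy_observed, toy_p = bootstrap_diff_log_or(toy_counts, n_boot=200)
```

The resolution of the P-value is limited by the number of replicates, so a real analysis would use many more than the 200 drawn here.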

Results

Using the 2269 genotyped SNPs in the eight regions, we obtained 49 239 SNPs and insertions and deletions (indels) with estimated imputation quality r2≥0.3 (Supplementary Table 1). Around 70% of the well-imputed 1000 Genomes variants (estimated imputation quality r2≥0.3) were common variants (minor allele frequency ≥0.05) (Supplementary Figure 2). Variants in the MHC region account for >80% of the imputed variants. For association analyses, we used a Bonferroni significance threshold of 1.01 × 10−6 (=0.05/49 239) to correct for multiple testing.
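
The threshold arithmetic can be reproduced in a few lines. The two example P-values are those reported later in the Results for the unconditional TNFAIP3 and IL23A signals; the variable and function names are illustrative.

```python
N_VARIANTS = 49239        # genotyped and well-imputed variants tested
FAMILYWISE_ALPHA = 0.05

# Bonferroni: divide the family-wise error rate by the number of tests.
bonferroni_threshold = FAMILYWISE_ALPHA / N_VARIANTS


def is_significant(p_value, threshold=bonferroni_threshold):
    return p_value < threshold


tnfaip3_passes = is_significant(5.90e-7)   # unconditional TNFAIP3 signal
il23a_passes = is_significant(1.53e-6)     # unconditional IL23A minimum P-value
```

The computed threshold agrees with the reported 1.01 × 10−6 to the precision given, and the IL23A minimum P-value falls just short of it in the unconditional round.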

We identified five significant signals from an initial unconditional analysis of the eight loci and identified four additional signals after conditional analyses (Table 2). Taken together, they explain 12.09% of the variance in liability for psoriasis or 18.3% of its estimated heritability. Among the nine independent signals, three were from the MHC, two were from IL12B and one each was from IL13, IL23A, TNFAIP3 and TNIP1. The λGC values for the unconditional round and two conditional rounds were 1.03, 1.02 and 1.02, respectively. The Q–Q plots are shown in Supplementary Figure 3.
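
For readers unfamiliar with λGC, a minimal sketch of the standard calculation is shown here: each P-value at the null markers is converted to a 1-degree-of-freedom chi-square statistic, and the median statistic is divided by the median of the chi-square distribution under the null (about 0.455). The function name is illustrative, and P-values are assumed to lie in (0, 1].

```python
from statistics import NormalDist, median


def genomic_control_lambda(pvalues):
    """Genomic control inflation factor from two-sided P-values in (0, 1]."""
    nd = NormalDist()
    # A two-sided P-value p corresponds to |z| = -inv_cdf(p / 2);
    # squaring z gives the 1-df chi-square statistic.
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in pvalues]
    expected_median = nd.inv_cdf(0.75) ** 2  # median of chi-square(1)
    return median(chi2) / expected_median


# Evenly spaced P-values mimic a well-calibrated test, giving lambda close to 1.
uniform_pvalues = [(i + 0.5) / 1000 for i in range(1000)]
lambda_null = genomic_control_lambda(uniform_pvalues)
```

Values close to 1, such as those reported above, indicate little residual inflation from population structure or relatedness.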

Table 2 Summary of results from association analyses

The most strongly associated variant (rs114255771 chr6.hg19.g.31316699G>A) was situated between HLA-C and HLA-B (P-value=2.94 × 10−74; OR=4.04; variance explained (Ve)=5.23%; risk allele frequency in case/control (fratio)=0.19/0.08). The two additional independent signals in the MHC region were rs6924962 chr6.hg19.g.31358302C>T (P-value=3.21 × 10−19; OR=1.82; Ve=1.64%; fratio=0.23/0.16) near the MICA gene, and rs892666 chr6.hg19.g.29924728C>T (P-value=1.11 × 10−10; OR=1.48; Ve=1.10%; fratio=0.70/0.66) near HLA-A (Figure 1). The three MHC signals together accounted for 7.97% of the total variance in disease liability.

Figure 1

Regional association plot for MHC. Association plots for MHC region in the unconditional round, the first conditional round and the second conditional round are shown in (a), (b) and (c), respectively. The horizontal axis represents the base position on chromosome 6 (from 27 to 34 Mb on build 37). The left-vertical axis represents the negative log (base 10) of the P-value in the corresponding round (denoted by the circles) and the right-vertical axis represents the recombination rate (in cM/Mb) (denoted by the blue lines). The color coding for the circles is used to represent the linkage disequilibrium with the most significant SNP in the region in each round.

IL12B had two independent signals – rs62377586 chr5.hg19.g.158766022G>A (P-value=7.42 × 10−16; OR=1.43; Ve=0.9%; fratio=0.74/0.67) and rs918518 chr5.hg19.g.158826493C>T (P-value=3.22 × 10−11; OR=1.44; Ve=0.74%; fratio=0.28/0.21) – accounting for 1.64% of variance in liability. Figure 2 shows the regional association plots for the IL12B region for the unconditional and conditional analyses.

Figure 2

Regional association plot for IL12B. Association plots for the IL12B region in the unconditional round and the first conditional round are shown in (a) and (b), respectively. The horizontal axis represents the base position on chromosome 5 (from 158.7 to 159 Mb on build 37). The left-vertical axis represents the negative log (base 10) of the P-value in the corresponding round (denoted by the circles) and the right-vertical axis represents the recombination rate (in cM/Mb) (denoted by the blue lines). The color coding for the circles is used to represent the linkage disequilibrium with the most significant SNP in the region in each round.

Signals were also detected in the IL13, TNIP1, TNFAIP3 and IL23A loci, respectively, at rs1295685 chr5.hg19.g.131996445C>T (P-value=1.65 × 10−7; OR=1.31; Ve=0.39%; fratio=0.83/0.79), rs17728338 chr5.hg19.g.150478318G>A (P-value=4.15 × 10−13; OR=1.81; Ve=0.59%; fratio=0.09/0.05), rs642627 chr6.hg19.g.138206783G>A (P-value=5.90 × 10−7; OR=1.26; Ve=0.34%; fratio=0.31/0.26) and rs61937678 chr12.hg19.g.56653695T>C (P-value=1.82 × 10−7; OR=1.58; Ve=1.16%; fratio=0.80/0.77). Taken together, these signals explain 2.48% of the variance in liability (Figure 3). The IL23A region did not have a significant hit in the unconditional round (minimum P-value=1.53 × 10−6 at rs61937678 with Bonferroni-adjusted P-value cutoff=1.01 × 10−6). However, the same SNP reached significance in an analysis adjusting for rs114255771 (MHC), rs62377586 (IL12B), rs1295685 (IL13), rs17728338 (TNIP1) and rs642627 (TNFAIP3).
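
As a consistency check, the per-signal Ve values reported above can be summed under the additive assumption. The heritability figure derived at the end is only what the two reported percentages (12.09% and 18.3%) jointly imply; it is not a value stated in the text.

```python
# Variance in liability explained (Ve, in %) per signal, as reported in the text.
ve_percent = {
    "MHC": [5.23, 1.64, 1.10],
    "IL12B": [0.90, 0.74],
    "IL13": [0.39],
    "TNIP1": [0.59],
    "TNFAIP3": [0.34],
    "IL23A": [1.16],
}

per_locus = {locus: round(sum(values), 2) for locus, values in ve_percent.items()}
total_ve = round(sum(per_locus.values()), 2)

# If 12.09% of liability variance corresponds to 18.3% of heritability,
# the heritability assumed in that calculation is total_ve / 0.183 (in %).
implied_heritability_percent = total_ve / 0.183
```

The sums reproduce the 7.97% (MHC), 1.64% (IL12B) and 12.09% (total) figures, and the implied heritability is roughly 66%.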

Figure 3

Regional association plot for IL13, TNIP1, TNFAIP3 and IL23A. Association plots for the IL13, TNIP1, TNFAIP3 and IL23A regions are shown in (a), (b), (c) and (d), respectively. The horizontal axis represents the base position of the corresponding region. The left-vertical axis represents the negative log (base 10) of the P-value in the corresponding round (denoted by the circles) and the right-vertical axis represents the recombination rate (in cM/Mb) (denoted by the blue lines). The color coding for the circles is used to represent the linkage disequilibrium with the most significant SNP in the region in each round.

We compared our results with those from a recent meta-analysis of GWAS and Immunochip data for psoriasis7 (Table 3). As assessed using the phase I 1000 Genomes data set, our primary signal in the MHC (rs114255771) exhibited complete LD (r2=1) with rs4406273 chr6.hg19.g.31266090G>A. This SNP was the primary hit in that meta-analysis and also happens to be an excellent surrogate for HLA-Cw*0602, the leading candidate for the PSORS1 risk allele10, 11 (unpublished data). Our secondary MHC signal (rs6924962) is in close proximity (8.9 kb) to and exhibits moderately strong LD (r2=0.56) with the secondary MHC signal from the GWAS–Immunochip meta-analysis (rs13437088 chr6.hg19.g.31355119T>C).

Table 3 Comparison of results from this study with Tsoi et al7

We also compared our results for the MHC region with other fine-mapping studies of the same region8, 12 (Figure 4). Although the primary signals agree with each other, the conditional signals of similar rank are not in high LD with each other. Our secondary signal (rs6924962) exhibits moderate LD with the fifth signal rs66609536 chr6.hg19.g.31362120G>A (r2=0.56) in Knight et al8 and the third signal rs13437088 chr6.hg19.g.31355119T>C (r2=0.56) in Feng et al.12

Figure 4

Comparison of results from this study with results from Knight et al8 and Feng et al.12 Comparison of the results from this study with the results from Knight et al8 and Feng et al12 is shown in (a) and (b), respectively. The SNPs on the vertical axes represent the significant SNPs from this study and the SNPs on the horizontal axes represent the significant SNPs from Knight et al8 and Feng et al12 in (a) and (b), respectively. The color coding of each block (one per pairing of a SNP on the horizontal axis with a SNP on the vertical axis) represents the strength of linkage disequilibrium between the corresponding SNPs, which was calculated as a measure of corroboration between the studies.

We identified the same variant near IL13 (rs1295685) that was identified in the GWAS–Immunochip meta-analysis, and our primary signals in TNIP1 (rs17728338), TNFAIP3 (rs642627) and IL12B (rs62377586) are in strong LD with the top markers for each locus in that analysis (rs2233278 chr5.hg19.g.150467189C>G (r2=0.96), rs582757 chr6.hg19.g.138197824A>G (r2=0.97) and rs4379175 chr5.hg19.g.158804928G>T (r2=0.84), respectively). The signal at rs61937678 in the IL23A region is in physical proximity (96.51 kb) of the signal detected at rs2066819 chr12.hg19.g.56750204G>A in the meta-analysis, although the LD between these two SNPs is nominal (r2=0.2).

We mined the imputed data set for potential causal alleles by examining genome-wide significant variants and their near-perfect proxies (r2>0.9 in 1000 Genomes Project EUR samples); the results are summarized in Supplementary Table 3. Using ANNOVAR,27 we found that, of the 39 examined variants, only one (rs20541 chr5.hg19.g.131995964T>A) is exonic. SNP rs20541 is a nonsynonymous SNP in IL13 that is in strong LD (r2=0.97) with our best IL13 association signal (SNP rs1295685), located in the 3′-UTR of IL13. Nearly all of the remaining 38 variants are either intergenic or intronic based on known genes in the UCSC28 and RefSeq29 databases. To assess the possible functional roles of non-protein-coding variants in our data, we used ENCODE30 RNA-seq data to assess transcribed regions as well as ENCODE30 data and chromHMM31 predictions to categorize the noncoding variants into enhancers, repressors, promoters, insulators, and so on in normal human epidermal keratinocytes (NHEKs) (Supplementary Table 3). While a few of the putative intergenic variants lie within transcribed regions identified in the GM12878 cell line, none of them lie within conserved genomic regions (using phastCons 44-way alignments). Some variants are within enhancer regions based on H3K4Me1, H3K4Me3 and H3K27Ac chromatin marks in the GM12878 cell line, and a total of eight variants in TNIP1, IL12B, MHC and TNFAIP3 are located in strong enhancers in NHEKs. Variants in TNIP1, IL13, IL12B, TNFAIP3 and the primary MHC signal lie within DNase I-hypersensitive sites, but no variants lie within CTCF (zinc-finger protein) binding sites. Moreover, eight variants in IL13, TNIP1 and TNFAIP3 lie within transcription factor binding sites delineated by ChIP-Seq data from the ENCODE project.30

To assess potential causal variants in the MHC region, we investigated the squared Pearson’s correlations between imputed dosages of the three MHC SNP signals from this study and dosages of 2-digit resolution alleles of six MHC genes (HLA-A, -B, -C, -DRB1, -DQB1, MICA) that were genotyped on a subset of 1429 Canadian individuals from our sample.32 The risk allele of each of these signals is tagged best by an allele of a nearby gene (Table 2), with r2 between rs114255771 and HLA-C*06=0.90, between rs6924962 and MICA*002=0.34 and between rs892666 and HLA-A*03=0.38. HLA-A*03 is positively correlated with the protective T allele of rs892666; the strongest squared correlation of the C risk allele of this SNP among all 2-digit HLA alleles is with HLA-A*02 (r2=0.16).
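
The squared Pearson correlation between two dosage vectors can be computed as in the following sketch (function name and toy data are illustrative). Because squaring discards the sign, the direction of the correlation has to be checked separately, as in the HLA-A*03 example above, where the allele correlates with the protective rather than the risk allele.

```python
def squared_correlation(x, y):
    """Squared Pearson correlation (r2) between two equal-length dosage vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r2 is undefined for a constant vector")
    return sxy * sxy / (sxx * syy)


# Hypothetical dosages for five individuals (imputed values may be fractional).
snp_dosage = [0.0, 1.0, 1.9, 0.1, 1.0]
hla_allele_dosage = [0, 1, 2, 0, 1]
toy_r2 = squared_correlation(snp_dosage, hla_allele_dosage)
```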

We also imputed 126 classical 2-digit alleles, 298 4-digit alleles and 1276 polymorphic amino-acid residues of eight HLA genes (HLA-A, -B, -C, -DPA1, -DPB1, -DQA1, -DQB1 and -DRB1), and combined these with more than 40 000 imputed SNP and indel variants in the MHC region. The combined variant set was then subjected to stepwise forward logistic regression to identify independent association signals and to determine the relationship of these potentially causative HLA gene variants with the best SNP and indel signals in the MHC region. As shown in Supplementary Table 4, the two best independent signals from this analysis with significant P-values after Bonferroni correction (rs114255771, rs6924962) are the same two variants identified in the previous analysis of SNPs and indels only (Table 2); the third best signal (rs1059514 chr6.hg19.g.29911190G>A) is in moderate LD (r2=0.41) with the third best signal (rs892666) in the previous analysis.

We identified HLA-C*06:02 as the most strongly allied protein-coding variant for the first SNP signal (rs114255771), which is consistent with evidence presented just above (Supplementary Table 4). However, rs114255771 shows association of much greater strength and significance (OR=4.04; P-value=2.6 × 10−66) than either HLA-C*06:02 (OR=2.84; P-value=1.9 × 10−55) or rs4406273 (OR=2.85; P-value=2.0 × 10−55), a near-perfect surrogate for HLA-C*06:02. This is an unexpected result, because SNPs rs114255771 and rs4406273 score as in complete LD (r2=1.000) in the 379 individuals of European descent in the phase I 1000 genomes data set. We hypothesized that this discrepancy may be caused by suboptimal imputation quality of rs114255771 in our analysis sample (predicted r2 of imputed and actual genotyped dosages=0.752), whereas rs4406273 is genotyped for our sample, leading to an observed r2 between rs114255771 and rs4406273 of 0.899 rather than 1.000. We tested this hypothesis using a test sample of 5393 cases and controls with empirically determined genotypes for rs114255771, rs4406273 and HLA-C*06:02; 2761 individuals in the test sample overlap the sample analyzed by this study. LD among these three variants in the test sample was very high (r2=0.980–0.989), and their associations with psoriasis were very similar (OR=3.19–3.24; P-value=3.8 × 10−77–1.8 × 10−75). From these results, it is clear that if rs114255771 had been genotyped in the analysis sample its association with psoriasis would have been nearly identical to that seen for HLA-C*06:02 or rs4406273.

Although the superior association of rs114255771 in the stepwise analysis can be accounted for as an artifact of its suboptimal imputation, it appears to be driven by a genuine genetic effect. In our sample, the imputed risk allele of rs114255771 is more strongly correlated with genotyped alleles of the extended ancestral haplotype HLA-A*02:01-C*06:02-B*57:01-MICA*017-DRB1*07:01-DQB1*03:03 than are the genotyped risk allele of rs4406273 or HLA-C*06:02 itself (data not shown). Thus, the stronger tagging by rs114255771 of the HLA-C*06:02-B*57:01 haplotype appears to result from its being more accurately imputed for this haplotype than for other HLA-C*06:02-bearing haplotypes. Whatever the reason, the better tagging of HLA-C*06-B*57 results in a stronger disease association for imputed rs114255771 than for HLA-C*06, because, as shown in Table 4, the HLA-C*06-B*57 haplotype is indeed more strongly associated with psoriasis than are other HLA-C*06-bearing haplotypes (OR=3.19 vs 2.07; P-value for difference in log ORs=0.002). The estimated ORs of the various haplotype groups in Table 4 suggest a multiplicative model where the HLA-C*06-B*57 haplotype is hypothesized to increase risk for psoriasis by the product of the ORs of the individual HLA-C*06 and HLA-B*57 alleles (joint OR of 3.19≈OR of 2.07 for HLA-C*06 without HLA-B*57 × OR of 1.61 for HLA-B*57 without HLA-C*06). However, given the low frequency of HLA-B*57 haplotypes lacking HLA-C*06 (0.005 cases/0.002 controls), a larger sample would be needed to test this hypothesis.
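
The arithmetic behind the multiplicative model can be made explicit using the three ORs quoted above (variable names are illustrative). Under a multiplicative model the ORs multiply, which is the same as the log ORs adding.

```python
import math

or_c06_without_b57 = 2.07   # HLA-C*06 haplotypes lacking HLA-B*57
or_b57_without_c06 = 1.61   # HLA-B*57 haplotypes lacking HLA-C*06
or_joint_observed = 3.19    # HLA-C*06-B*57 haplotype

# Expected joint OR if the two alleles act multiplicatively.
or_joint_expected = or_c06_without_b57 * or_b57_without_c06

# Departure from the multiplicative model, on the ratio and log scales.
ratio_observed_to_expected = or_joint_observed / or_joint_expected
log_or_departure = math.log(or_joint_observed) - math.log(or_joint_expected)
```

The expected joint OR is about 3.33, so the observed value of 3.19 lies within roughly 4% of the multiplicative prediction. No standard error is attached to this comparison here, in line with the caveat about sample size.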

Table 4 Summary of association for different MHC haplotype groups

Discussion

Using a custom-designed SNP array, we conducted a fine-mapping study of eight loci known to be associated with psoriasis, confirming associations and revealing multiple independent association signals in the MHC and IL12B regions. Because this experiment did not encompass all subsequently described data sets, its power to detect conditional signals was limited. Nevertheless, this study serves as a largely independent corroboration of the findings of a recent large meta-analysis of psoriasis,7 as only 145 samples are shared between the two studies (representing 2.9% of the samples in this study and 0.4% of those in the meta-analysis). Five of the association signals identified by the two studies (IL13, TNIP1, TNFAIP3, primary MHC and the primary IL12B signal in this study and secondary IL12B signal in the GWAS–Immunochip meta-analysis7) are in strong LD between studies (r2=0.84–1.00) and explain a very similar amount of the variance in disease liability (7.45% in this study vs 7.78% in the meta-analysis) (Table 3). Hence, these signals likely tag the same underlying causative variants. However, three other pairs of association signals (the unconditional IL23A signals, the first conditional signals in the MHC, and our conditional IL12B signal vs the primary IL12B signal in the recent meta-analysis7) are in weaker LD between studies (r2=0.20–0.56), explain very different amounts of the variance in liability (3.54% in this study vs 1.31% in the meta-analysis), and could thus be tagging different disease variants.

Our results for the MHC region provide a completely independent replication of two previous fine-mapping studies of this same region.8, 12 Although the primary signals agree with each other across all three studies, the conditional signals of similar rank are not in high LD with each other. Our third signal from MHC (rs892666) does not exhibit high LD with any previously identified MHC signal, but lies only 7.8 kb from the fourth signal of the Knight et al8 study. One explanation for the differences among studies is that there is high power for correctly tagging the very strong HLA-Cw06 signal as the most important risk locus in all three studies, but less power for detecting and correctly ordering the more weakly associated secondary signals. Thus, we would expect unanimity in selecting the primary signals but variation in the identity and rank order of other signals until sample sizes become very large.

Adding imputed protein alleles and polymorphic amino-acid residues for eight HLA genes to the SNPs and indels in the association analysis of the MHC region revealed that SNP rs114255771 tags HLA-C*06:02, which has been shown by multiple studies as the leading candidate for the PSORS1 locus in the MHC.10, 11 Although all three SNP signals in the MHC are allied with coding variants of nearby genes, our finding remains that the SNPs are more strongly associated with disease than the putative causative gene variants. For the best SNP signal, this stronger association was demonstrated to be an artifact of reduced imputation quality. The greater strength of association of the other two SNP signals compared with the best HLA variants could also be a result of imputation, especially considering that most of the SNPs were imputed using a different reference panel than was used for imputing the HLA variants. Alternatively, we cannot rule out the possibility that the causative variants underlying the association of these other two SNPs affect MHC genes other than the eight HLA genes included in our analysis, especially given that SNP2HLA does not impute the HLA-like MICA alleles.

Analysis of imputed HLA haplotypes demonstrated that the HLA-C*06-B*57 haplotype is more strongly associated with psoriasis than are other HLA-C*06-bearing haplotypes. This differential association has been suggested previously, albeit in much smaller studies,10, 12, 33 but to our knowledge this is the first time the difference in strength of association has been shown to be statistically significant.

All eight loci were densely genotyped and further enhanced by imputation, enabling us to study rare and common variants including indels that might provide clues about disease mechanisms or increase the heritability explained by these loci.34 Still, none of the best signals identified by the conditional analyses were either indels or rare variants. In total, the nine association signals obtained from this study passing the multiple-testing threshold explain 12.09% of the overall variance in liability for psoriasis or 18.3% of its estimated heritability.

We searched for causal markers among the 39 variants comprising our significant signals and their proxies (Supplementary Table 3) and found only one nonsynonymous SNP (rs20541), which is a proxy (LD r2=0.97) for our best IL13 association signal (rs1295685). The minor A allele of rs20541 changes amino-acid 130 of IL-13 from arginine (R) to glutamine (Q), and has been associated with increased IgE levels and risk for asthma,35 and reduced risk of helminth infections.36 Notably, this variant has been shown to be more active than the 130R wild-type allele in inducing STAT6 phosphorylation and activation.37, 38 Consistent with the important role of Th17 cells in psoriasis pathogenesis39 and the antagonistic effects of IL13 on IL17 production by human Th17 lymphocytes,40 this allele exerts a protective effect against psoriasis.7 Most of the signals detected in this study might be regulatory as they fall outside the genes and conserved regions, but many fall in transcribed regions, DNase I-hypersensitive sites and transcription factor binding sites. Similar results have recently been reported for rheumatoid arthritis41 and several other complex genetic disorders.42, 43, 44, 45 Using chromHMM predictions, we found that some of our variants lie within enhancer regions, including strong enhancers in NHEKs. As the principal cell of the epidermis, the keratinocyte (represented here by NHEKs) is by far the majority cell type present in the skin and is greatly expanded in psoriatic lesions relative to normal skin.7, 46 Moreover, there is evidence from functional assays that the HLA-Cw6 association may relate to altered expression of this gene owing to altered enhancer function, rather than variation in its antigen-presenting properties.47, 48 Additionally, a number of recent papers attest to tissue-specific phenotypes that arise from selective silencing of TNFAIP3 in various mouse cell types.49

Our study was too small for signals near IL23R and RNF114 to reach genome-wide significance. However, we did find nominally significant association for the already established signals in these two regions, rs9988642 chr1.hg19.g.67726104T>C (IL23R) and rs495337 chr20.hg19.g.48522330C>T (RNF114).4, 6, 9 SNP rs495337 was genotyped in our data and had a P-value of 9.20 × 10−5, whereas the SNP rs9988642 was imputed (estimated imputation r2=0.96) and had a P-value of 1.46 × 10−5. Notably, rs9988642 is in strong LD with the rs11209026 chr1.hg19.g.67705958G>A variant,7 which specifies glutamine at residue 381 in the IL-23 receptor, is associated with reduced risk of psoriasis, and abrogates signaling through the receptor.50, 51

In conclusion, we provide evidence for the existence of multiple signals in previously known loci associated with psoriasis, replicating and extending the findings of other fine-mapping studies in a largely independent sample, thereby refining the candidate regions for each association signal. The MHC region explains more variance than all other loci combined, and even the secondary MHC signal explains more variance than any other variant (or locus). Our analysis further indicates that HLA-C, MICA and HLA-A are good candidates for the three MHC signals identified on the basis of SNP associations. Outside the MHC, our studies provide clues to the potential causative mechanisms of the association signals (eg, the two IL12B signals are likely non-exonic and therefore regulatory and not protein-changing, whereas the IL13 signal may exert its effect through a nonsynonymous variation with effects on IL13 protein function), and annotation analyses have identified plausible regulatory mechanisms for noncoding variants. Whatever the mechanisms involved, the observation that each of the loci achieving significance in this study has a plausible role in immunosurveillance7 strongly supports further functional exploration of psoriasis-associated genetic variation in the context of host defense.