Introduction

Natural Killer (NK) cells are large granular lymphocytes, comprising 5ā€“15% of peripheral blood lymphocytes, and are key innate effector cells1. Functions include cytotoxicity towards virally infected and malignant cells, and production and secretion of cytokines including IFNĪ³, defining NK cells as prototypical Group 1 innate lymphoid cells2. NK cells express germline-encoded receptors, the best-characterised being Killer cell Immunoglobulin-like Receptors (KIRs), which, dependent on associated intracellular signalling domains, can be activating or inhibitory. Viral infection and dysplasia elicit concomitant induction of stress antigens, triggering activating KIRs, and MHC Class I molecule downregulation, removing the ligands for inhibitory KIRs. Immunodeficiencies distinguished by NK cell deficiency3,4,5,6 or dysfunction7 are illustrative of the role of NK cells in human health, and are characterised by susceptibility to viral infection, in particular herpesviruses, and early-onset malignancy.

Genome-wide association studies have provided unparalleled insights into the genetic determinants of human health and disease8. The majority of associated alleles are non-coding and thought to exert phenotype via the regulation of gene expression. Understanding which cell types and conditions such variants demonstrate activity in is vital to determining pathogenic mechanisms. Expression quantitative trait loci (eQTL) analysis of immune cells isolated from blood has demonstrated high degrees of cellular specificity in regulatory variant function. Whilst analyses of NK cell eQTL have been performed, sample sizes have been small and further data are required9, especially for the identification of trans-regulatory variants which have smaller effect sizes. We describe an eQTL study performed across 245 healthy individuals of Northern European ancestry. We demonstrate that NK cells display many hundreds of eQTL that are not observed in other cell types, and find evidence of NK cell-specific master-regulatory regions. These eQTL colocalise with disease-associated loci, providing novel insights into the role of NK cells in human health and disease.

Results

cis-eQTL mapping

NK cells were negatively selected from PBMCs from 245 healthy European adults, and gene expression quantified using Illumina gene expression arrays, giving a readout of 18,078 genes post QC. Whole-genome imputation of genome-wide genotyping data yielded high-quality genotypes at 6,012,996 autosomal loci. We defined cis-acting variants within 1Mb of the TSS, mapping cis eQTL under an additive linear model in QTLtools10. To alleviate the effects of confounding variation we included PCs of expression data as covariates in the model, finding 32 PCs maximised cis eQTL discovery (Supplementary Fig.Ā 1). We identified cis eQTL at 3951 autosomal genes of 17,342 tested (22.7%) at FDRā€‰<ā€‰0.05 (Fig.Ā 1a, Supplementary DataĀ 1). By conditioning on the peak eSNP at each significant cis eQTL we further identified an additional 594 independent cis eQTLs at 528 genes; 528 genes have 2 independent eSNPs, 63 genes have 3, and 3 genes have 4 (Fig.Ā 1a, Supplementary DataĀ 2).

Fig. 1: cis eQTL in NK cells.
figure 1

a Total significant (FDRā€‰<ā€‰0.05) primary and conditional cis eQTL in NK cells from 245 individuals. Numbers of eQTL with evidence of colocalisation with at least one GWAS trait (RTCā€‰>ā€‰0.9) are highlighted (pink). b Frequencies of cis eQTL (FDRā€‰<ā€‰0.05) specific to NK cells or shared with other immune cells; monocytes, neutrophils, CD4+ and CD8+ T cells. c Enrichment of shared causal loci between NK cell eQTL (total, left panel; NK cell-specific, right panel) and GWAS traits (nā€‰=ā€‰100). Traits are compared with the enrichment observed for height as background. Significant traits (FDRā€‰<ā€‰0.05) are highlighted (pink). Point size is proportional to a trait's number of GWAS-significant loci. d Effect of rs1788098 genotype on CD226 expression in NK cells (nā€‰=ā€‰245 individuals). Box and whisker plot; boxes depict the upper and lower quartiles of the data, and whiskers depict the range of the data excluding outliers (outliers are defined as data-pointsā€‰>ā€‰1.5ā€‰Ć—ā€‰the inter-quartile range from the upper or lower quartiles). e The effect of rs1788098 genotype on CD226 expression is specific to NK cells. Significant eQTL effects are highlighted (green). f The CD226 eQTL in NK cells colocalises with a risk locus for type-1 diabetes. SNPs are coloured according to strength of LD (CEU population) to the peak eSNP (rs1788098); brown r2ā€‰>ā€‰0.8, orange 0.5ā€‰<ā€‰r2ā€‰ā‰¤ā€‰0.8, yellow 0.2ā€‰<ā€‰r2ā€‰ā‰¤ā€‰0.5, grey r2ā€‰ā‰¤ā€‰0.2. g Association of rs1788097 (exact proxy for rs1788098 in European populations, r2ā€‰=ā€‰1) with autoimmune diseases and haematological indices. GWAS-significant (pā€‰<ā€‰5ā€‰Ć—ā€‰10āˆ’8) associations are highlighted (pink). h Regional association plot of the primary IRF5 eQTL in NK cells. i The primary IRF5 eQTL does not colocalise with a GWAS risk locus in the IRF5 region for systemic lupus erythematosus. j Conditioning on the peak IRF5 eSNP reveals an independent, secondary eQTL for IRF5 in NK cells. k The secondary IRF5 eQTL colocalises with the IRF5 region for systemic lupus erythematosus. Genotype to phenotype correlations were calculated with linear regression. P values are two-sided.

We used a Bayesian approach, implemented in moloc11, to compare the evidence for shared or independent effects at NK eQTL (this dataset, n=245) with CD4+ T cells12 (nā€‰=ā€‰282), CD8+ T cells12 (nā€‰=ā€‰271), monocytes13 (nā€‰=ā€‰414) and neutrophils14 (nā€‰=ā€‰101). Whilst we find evidence (PPā€‰>ā€‰0.8) supporting sharing of a single regulatory effect at 2161/3951 (54.7%) NK eQTL with at least one other cell type, the data best supports (PPā€‰>ā€‰0.8Ā for a model of association being an eQTL in NK cells alone or independent to eQTLs in other cell types) observed NK eQTL being unique to NK cells at 588/3,951 (14.9%) genes (Fig.Ā 1b, Supplementary DataĀ 3). Amongst these NK-cell-specific eQTL, 270/588 (49.1%) regulate expression of genes for which there are other independent eQTL in at least one other immune cell. Thus, whilst many NK cell-specific eQTL may be reflective of cell-restricted expression, half represent differential regulation of widely expressed genes. In keeping with this observation, there is no significant difference (pā€‰=ā€‰0.152) in median expression of genes with shared or NK cell-specific eQTL (Supplementary Fig.Ā 2).

cis eQTL function

Gene expression is controlled by cis-acting regulatory elements and genetic variation within these can lead to regulatory divergence. Correspondingly, eQTL are significantly enriched within regions of active chromatin, promoters, enhancers and TSS15. To assess the distribution of cis eQTLs in NK cells we plotted densities of observed eQTL around functional chromatin states and transcription factor binding sites, calculating relative enrichment of NK eQTL around each feature (Supplementary Fig.Ā 3, Supplementary DataĀ 4). NK cis eQTLs demonstrate clustering around TSS (pā€‰<ā€‰2ā€‰Ć—ā€‰10āˆ’8), transcribed regions (pā€‰<ā€‰2ā€‰Ć—ā€‰10āˆ’8), enhancers (pā€‰=ā€‰0.0015) and weak enhancers (pā€‰=ā€‰0.0087), and are underrepresented in regions of the genome with repressed activity (pā€‰<ā€‰2ā€‰Ć—ā€‰10āˆ’8). Cell-type specific eQTL and conditional eQTL have been demonstrated to show greater overlap with distal regulatory elements, as compared to primary, shared eQTL16. In keeping with this, conditional eQTL (secondary, tertiary and quaternary eQTL) in NK cells are enriched for enhancers (pā€‰=ā€‰0.0018), but show no enrichment around promoters (Supplementary Fig.Ā 3). Similarly, while NK cell-specific eQTL demonstrate clustering at promoter sites (pā€‰<ā€‰2.2ā€‰Ć—ā€‰10āˆ’7) and no significant clustering around enhancers, they demonstrate enrichment for weak enhancers (pā€‰=ā€‰0.00053), significantly more so than NK eQTL shared across immune cells, (pā€‰=ā€‰0.0019, fold change = 3.93).

We similarly observed significant enrichment of overlap between NK cell eQTL and transcription factor binding sites, demonstrating enrichment at 44/52 (84.6%) of transcription factors tested (Supplementary Fig.Ā 3, Supplementary DataĀ 4). The transcription factor binding site enrichment observed across NK cell eQTL encompasses a broad range of transcription factors, many of which are active in immune cells (e.g. NFKB, IRF7, PAX5). In particular, ETS1 has been previously demonstrated to direct NK cell differentiation17. We observed no significant enrichment between conditional eQTL in NK cells and transcription factor binding sites. Among NK cell-specific eQTL, we identified significant enrichment of overlap at 14/52 (26.9%) of transcription factor binding sites, including EGR1, POU2F2 and ZEB1 which are predicted to act as transcription factor repressors active in determining ILC subset plasticity18.

To better understand the biological pathways subject to regulatory variation specific to NK cells, we performed enrichment analysis of genes with an NK cell-specific eQTL (against a background of all tested genes) in XGR19. In that analysis, NK cell-specific eQTL are enriched for genes within 26 biological pathways annotated by Gene Ontology Biological Processes (GOBP, Supplementary DataĀ 5, Supplementary Fig.Ā 4). Enriched biological processes highlight the established biology of NK cells, including pathways mediating cell death (GO:0006915, apoptotic process; GO:1902042, negative regulation of extrinsic apoptotic signalling pathway via death domain receptors; GO:0097190, apoptotic signalling pathway; GO:0042981, regulation of apoptotic process; GO:0006919, activation of cysteine-type endopeptidase activity involved in apoptotic process), anti-viral host defence (GO:0009615, response to virus; GO:0051607, defense response to virus), and innate immune signalling (GO:0032496, response to lipopolysaccharide; GO:0033209, tumour necrosis factor-mediated signalling pathway).

cis eQTL disease associations

To investigate the disease informativeness of NK cells we adopted two complementary approaches. We firstly used Regulatory Trait Concordance (RTC)20 to determine the proportion of NK cell eQTL predicted to share a causal variant with NHGRI-EBI GWAS Catalog studies (nā€‰=ā€‰5823). Among primary cis NK cell eQTL, 1242/3951 (31.4%) eQTL are predicted to share the same functional variant with at least one GWAS trait (Fig.Ā 1a, Supplementary DataĀ 6). Among eQTL predicted to be unique to NK cells, 161/588 (27.4%) loci are predicted to colocalise with at least one GWAS-significant locus. Conditional eQTL are similarly informative of human disease and GWAS traits, with 174/594 (29.3%) conditional eQTL in NK cells colocalising with GWAS trait-associated loci.

Next, to better understand the impact that regulatory variation modifying gene expression in NK cells has on human health and disease, we used evidence of colocalisation between NK cell eQTL and GWAS traits, derived with coloc21, to estimate the degree of enrichment of NK eQTL colocalisation with GWAS traits as compared to that expected by chance. Height GWAS loci are enriched for regulatory variation operating in connective tissue and mesenchymal stem cells but not immune cells22, and any shared causal variants with NK cell eQTL are likely to be representative of the proportion one would expect by chance (i.e. shared causal loci are likely to represent effects mediated by other cell types)23. We therefore used a well-powered GWAS of height in European populations (UK Biobank, URL: http://www.nealelab.is/uk-biobank/) as our background against which to compare enrichment in other traits. We performed enrichment analysis across 100 traits (Supplementary DataĀ 2); 40 UK Biobank continuous traits (anthropometrics and haematological indices) and 60 NHGRI-EBI GWAS Catalog traits (well-powered, case-control studies in populations of European ancestry). In that analysis, we identify 8 GWAS traits with significant enrichment (FDRā€‰<ā€‰0.05) of colocalisation with NK cell eQTL, and 4 traits with significant enrichment of colocalisation with NK cell-specific eQTL (Fig.Ā 1c, Supplementary DataĀ 7). Among both NK cell-specific and shared eQTL, we observe marked enrichment of shared causal loci with autoimmune diseases (inflammatory bowel disease, multiple sclerosis, type-1 diabetes) and a range of haematological indices including platelet, monocyte and lymphocyte count (Fig.Ā 1c). To investigate whether the choice of height as background for enrichment could bias our results, we recapitulated our analysis using whole-body impedance (UK Biobank phenotype code: 23106_irnt) as an alternative background. In that analysis we find no evidence that choice of background significantly affects our results, with our enrichment estimates derived using height as background being well-correlated (rā€‰=ā€‰0.575, pā€‰=ā€‰5.51ā€‰Ć—ā€‰10āˆ’6) with those using whole-body impedance (Supplementary Fig.Ā 5).

Genes with NK cell-specific eQTL include CD226 (Fig.Ā 1d, e), which encodes CD226 (also known as DNAX accessory molecule-1), a surface-expressed immunoglobulin superfamily glycoprotein, widely expressed on NK cells24. CD226 engages CD155 on antigen-presenting cells, an interaction increasing cytotoxicity and IFNĪ³ production, that is competitively inhibited by NK-expressed TIGIT and CD9625. The NK cell-specific eQTL for CD226 (peak eSNP: rs1788098, pā€‰=ā€‰1.92ā€‰Ć—ā€‰10āˆ’19) colocalises with genetic loci for a range of haematological indices and autoimmune diseases; including IBD and T1DM (Fig.Ā 1f, g). The direction of effect of CD226 expression on autoimmune disease risk is of reduced expression increasing disease risk, suggesting less robust NK cell responses (potentially directed at viral pathogens or autoreactive T cells) increases the risk of autoimmunity. Indeed, NK cells have been shown to restrain the activity of CD4+ T cells in a CD226-dependent manner in the context of multiple sclerosis26. Other NK cell-specific eQTL informative for human disease include; the caspase protease CASP8 (peak eSNP: rs3769821, pā€‰=ā€‰1.24ā€‰Ć—ā€‰10āˆ’8) which colocalises with risk loci for multiple malignancies (breast cancer, non-small cell lung cancer, melanoma, oesophageal squamous cell carcinoma), and the fucosyltransferase FUT11 (peak eSNP: rs11000765, pā€‰=ā€‰1.91ā€‰Ć—ā€‰10āˆ’18) which colocalises with asthma and lung function.

As expected, NK eQTL shared with other immune cells are also highly informative with respect to human health and disease. For instance, an ERAP2 eQTL (peak eSNP: rs1363974, pā€‰=ā€‰1.33ā€‰Ć—ā€‰10āˆ’76), predicted to be shared across all immune cells tested (NK cells, CD4+ and CD8+ T cells, monocytes and neutrophils), is a genetic determinant of neutrophil and lymphocyte percentage (Supplementary Fig.Ā 6). As described above, conditional eQTL are also informative with regard to human disease risk (Supplementary DataĀ 2). A secondary eQTL for IRF5, encoding the transcription factor IRF5, a positive regulator of type-1 interferon production, provides an example of this (Fig.Ā 1hā€“k). Here the primary eQTL at IRF5 (peak eSNP: rs7796963, pā€‰=ā€‰2.37ā€‰Ć—ā€‰10āˆ’65) has no evidence of colocalisation with human disease traits, however the secondary eQTL (peak eSNP: rs17424921, pā€‰=ā€‰5.39ā€‰Ć—ā€‰10āˆ’9) shows strong evidence of a shared causal variant with systemic lupus erythematosus.

trans eQTL

Detecting trans eQTL within purified cell types can elucidate cell-specific regulatory networks13. We defined trans-acting regulatory variation as variantsā€‰>ā€‰5Mb from the TSS, and tested for associations at 18,078 genes under an additive linear model in QTLtools10, alleviating confounding variation through the inclusion of expression data PCs as covariates in the model (12 PCs maximising trans discoveryā€”Supplementary Fig.Ā 1). We identified 84 independent loci (totalling 2,266 SNPs) demonstrating trans effects (FDRā€‰<ā€‰0.05) to 64 genes (Fig.Ā 2a, Supplementary DataĀ 8). Regulatory variation acting in trans is enriched for cis eSNPs27,28, suggesting that a proportion of trans-acting variation operates via cis effects on upstream regulatory genes. Consistent with this, of the 64 genes with trans eQTL, 25 have evidence of cis mediation (colocalisation probability of cis and trans effectā€‰>ā€‰0.8) from 16 genes (Supplementary DataĀ 8).

Fig. 2: trans eQTL in NK cells.
figure 2

a Manhattan plot depicting significant trans eQTL in NK cells. Physical position (x axis) represents location of the target gene. Coloured points represent significant (FDRā€‰<ā€‰0.05) trans eQTL; 2266 trans SNP-gene associations, affecting expression of 64 genes. Regulatory networks are highlighted as follows; GNLY (blue), MC1R (orange), UVSSA (yellow), KIR (pink), all others (green). b Effect of rs1866140 genotype on GNLY expression in NK cells (nā€‰=ā€‰245 individuals). c A trans-regulatory network of 5 genes, mediated in cis by GNLY expression. d Effect of rs1866140 genotype on GNLY regulatory network genes in trans (nā€‰=ā€‰245 individuals). PP4, posterior probability of a shared causal variant with GNLYĀ cis eQTL. e Effect of rs117406136 genotype on MC1R expression in NK cells (nā€‰=ā€‰245 individuals). f A trans-regulatory network of 4 genes, mediated in cis by MC1R expression. g Effect of rs117406136 genotype on MC1R regulatory network genes in trans (nā€‰=ā€‰245 individuals). PP4, posterior probability of a shared causal variant with MC1Rcis eQTL. h Effect of rs111632154 genotype on UVSSA expression in NK cells (nā€‰=ā€‰245 individuals). i A trans-regulatory network of 4 genes, mediated in cis by UVSSA expression. j Effect of rs111632154 genotype on UVSSA regulatory network genes in trans (nā€‰=ā€‰245 individuals). PP4, posterior probability of a shared causal variant with UVSSAĀ cis eQTL. k A trans-regulatory network of 10 genes, mediated mediated by KIR2DS4del. l Effect of KIR2DS4del genotype on KIR regulatory network genes in trans (nā€‰=ā€‰245 individuals). Box and whisker plots; boxes depict the upper and lower quartiles of the data, and whiskers depict the range of the data excluding outliers (outliers are defined as data-pointsā€‰>ā€‰1.5ā€‰Ć—ā€‰the inter-quartile range from the upper or lower quartiles). Genotype to phenotype correlations were calculated with linear regression. P values are two-sided.

Among trans-acting loci, 7 trans eSNPs affect more than one gene distally, and show master-regulatory properties. For instance, a trans-regulatory network mediated by a NK-cell-specific cis eQTL for GNLY (peak eSNP: rs1866140, pā€‰=ā€‰8.86ā€‰Ć—ā€‰10āˆ’23), determines expression of 5 genes in trans; ARHGAP30, EDNRB, KIAA1586, N4BP2, NOP56 (Fig.Ā 2bā€“d). GNLY encodes granulysin, a secreted cytolytic molecule with broad antimicrobial activity and cytotoxicity towards infected and malignant cells29. In addition to cytolytic activity, granulysin also induces immune cell chemotaxis and pro-inflammatory cytokine production in monocytes. The expression of a secreted molecule as a cis mediator of a trans-regulatory network within a cell type is highly analogous to the previously-described trans eQTL network mediated by LYZ, encoding the bactericidal gene lysozyme30,31,32, and supports a model in which the expression of key secreted genes is associated with gene expression networks in trans.

A second example of a trans-regulatory network, affecting DRD3, FAM169B, JADE1, PDHA2 and SNORD85 expression, is mediated in cis by an eSNP affecting MC1R expression (Fig.Ā 2eā€“g, peak eSNP: rs11642267, pā€‰=ā€‰5.80ā€‰Ć—ā€‰10āˆ’35). The MC1RĀ cis eQTL is shared with CD4+ and CD8+ T cells (Supplementary DataĀ 5), and the lead SNP in the non-imputed genotyping was rs2228479, encoding a non-synonymous polymorphism (V92M) within MC1R and in perfect LD (R2ā€‰=ā€‰1) with rs11642267. Given this, we tested for allele-specific expression of this allele across five individuals heterozygous for rs2228479, using the C-BASE assay33,34, observing for all individuals significantly increased expression of the minor allele (pā€‰=ā€‰0.01ā€“pā€‰=ā€‰2.3ā€‰Ć—ā€‰10āˆ’7, combined pā€‰<ā€‰2.2ā€‰Ć—ā€‰10āˆ’16, Ļ‡2 test). Notably no ASE was observed in monocytes at this SNP (Supplementary Fig.Ā 7). MC1R encodes the melanocortin-1 receptor (MC1R), a high-affinity receptor for Ī±-melanocyte-stimulating hormone (Ī±MSH), and most robustly expressed in melanocytes, with associations with tanning35, freckling, hair and eye colour36, melanoma37 and non-melanoma skin cancer38. As well as its effects on pigmentation, MC1R/Ī±MSH signalling has well-established immunomodulatory functions, and MCR1 is surface expressed on B cells, T cells and NK cells39. Whilst the effect of MC1R/Ī±MSH signalling in NK cells is undefined, NK cell dysfunction, as part of broader immunodeficiency, is well-characterised as part of syndromic disorders of hypopigmentation, e.g. Chediak-Higashi, Hermansky-Pudlak and Griscelli syndromes40. All of these syndromes are characterised by impaired secretory lysosome function, impairing melanin secretion from melanocytes as well as cytotoxic granule secretion from NK cells and cytotoxic T cells. Our description here of an NK cell trans-regulatory network mediated by MC1R expression suggests that the NK cell immune defect seen in syndromic disorders of hypopigmentation may not be confined to defective cytotoxic degranulation.

We also observe a trans-regulatory network, affecting AVP, SPTBN4, PINLYP and SIM2 expression, which is mediated in cis by an eSNP affecting UVSSA expression (Fig.Ā 2hā€“j, peak eSNP: rs111632154, pā€‰=ā€‰1.65ā€‰Ć—ā€‰10āˆ’40). UVSSA encodes UV-sensitivity scaffold protein A, a cofactor for RNA polymerase II, which facilitates the release of transcriptional machinery from damaged, transcriptionally active DNA, allowing transcription-coupled nucleotide excision repair41,42,43. Mutations in UVSSA underlie UV-sensitive syndrome, an autosomal recessive disorder characterised by cutaneous photosensitivity without predisposition to malignancy44. The UVSSAĀ cis eQTL has strong evidence of colocalisation with bone mineral density GWAS loci (PP4ā€‰=ā€‰1). In keeping with this, other defects of transcription-coupled nucleotide excision repair, e.g. Cockayne syndrome, are characterised by features of accelerated aging, including osteoporosis45.

We also identify a trans-regulatory network for which the master regulator maps to the leucocyte receptor complex (LRC, 19q13.4), which encodes KIR genes. Expression of 3 genes (FDRā€‰<ā€‰0.05) in trans to the LRC are regulated by a KIR region SNP: rs640345 (Table S8). The LRC is highly polymorphic in terms of KIR gene content and allelic variation. To better define the cis mediator of this regulatory network, we imputed KIR copy number in the study samples using KIR*IMP46 and remapped trans eQTL in NK cells using KIR copy number imputations. KIR gene/pseudogene copy numbers were well-imputed (Table S9), with estimated imputation accuracies ranging from 76.2% (KIR2DS2) to 96.9% (KIR2DS1) and high levels of concordance with gene presence/absence as determined by PCR (80.7ā€“100%). In that analysis, we find the strongest evidence for trans-regulatory activity at KIR2DS4del: a 22 base-pair deletion in KIR2DS4 which introduces a premature stop codon47. KIR haplotypes are broadly grouped into A and B haplotypes, and for A haplotypes KIR2DS4 is the only activating receptor. KIR2DS4 ligands have been challenging to identify but include HLA-C*05:01-bound peptides derived from epitopes common to a number of bacterial pathogens48. KIR2DS4del has significant regulatory effects on a trans network of 10 genes (Supplementary DataĀ 10, Fig.Ā 2k, l). KIR2DS4del is the lead regulatory variant for all 10 genes within the KIR trans-regulatory network (Supplementary DataĀ 10), with no significant independent associations remaining after conditioning on KIR2DS4del copy number (Supplementary DataĀ 10). In addition, KIR imputation revealed a second KIR trans-regulatory network independent of KIR2DS4del (Supplementary DataĀ 10). That network comprises 3 genes (MYRF, EGLN3, SYNGR3) regulated by KIR3DP1 copy number.

We further sought to replicate evidence for our trans associations in NK cells in the large-scale meta-analysis of whole blood eQTL data compiled by the eQTLGEN consortium49. NK cells represent a small proportion of the circulating leucocytes in whole blood. Given that trans eQTL hubs are frequently thought to be cell-type specific49, our expectation was that our power to replicate trans associations in whole blood eQTL would be limited. Despite this, we were able to replicate a small number of NK cell trans associations observed in our study in the eQTLGEN consortium data (Supplementary DataĀ 8). We find evidence of replication (eQTLGEN experiment-wide FDRā€‰<ā€‰0.05) for the effect of rs3811444 on JAM3 expression and for rs10876864 on KCTD11 expression in whole blood (Supplementary DataĀ 8). In both cases the replicated trans association forms part of a larger trans-regulated gene network in whole blood, with rs3811444 regulating the expression of 97 genes in trans and rs10876864 the expression of 47. Moreover, while there is no direct evidence for replication of the trans network mediated by MC1R expression, rs9939914 (a significant trans eSNP for both DRD3 and SNORD85 in our data) affects the expression of a network of four genes in trans in whole blood; GPR25, PHF17, LDHD, BBS10. These observations are in keeping with a model in which, while some trans-regulatory eSNPs are likely to be shared across cell type and context, the gene network which they regulate will vary according to cell type. In keeping with this, in the case of the trans-regulatory hub including rs10876864, for which we see evidence of replication of its effect on KCTD11 expression in whole blood, we have previously reported trans effects at this locus on LAP3P2, IP6K2 and HELZ2 in B cells and LAP3P2 in monocytes31.

TWAS

A striking feature of our eQTL analysis was the degree of enrichment for autoimmune disease that we observed among cis eQTL in NK cells. To further investigate this we utilised NK cell eQTL data and GWAS summary statistics for autoimmune disease to identify novel autoimmune risk genes within five autoimmune TWAS; ulcerative colitis50, rheumatoid arthritis51, systemic lupus erythematosus52, primary biliary cirrhosis53 and Crohnā€™s disease50. We identified 98 TWAS-significant, independent gene-trait associations (FDRā€‰<ā€‰0.05) for autoimmune diseases (Fig.Ā 3, Supplementary DataĀ 11ā€“15); ulcerative colitis 21/2998 genes tested, rheumatoid arthritis 5/3015 genes tested, systemic lupus erythematosus 18/3045 genes tested, primary biliary cirrhosis 18/3035 genes tested, Crohnā€™s disease 31/3077 genes tested. We then sought to define which of these gene-trait pairs are explained by an NK cell-specific cis eQTL, that is instances in which the most likely model includes an eQTL signal unique to NK cells (see ā€œMethodsā€) and the eQTL in NK cells colocalises with the GWAS signal (PPā€‰>ā€‰0.8). In that analysis, 6 gene-traits pairs are driven by gene expression in NK cells specifically; CD226 expression modifies systemic lupus erythematosus and primary biliary cirrhosis risk, TMEM163 expression modifies primary biliary cirrhosis risk, CALHM6 expression modifies ulcerative colitis risk, and both ZMIZ1 and OSM modify Crohnā€™s disease risk. We considered a gene-trait association to be novel if there was no GWAS-significant locus within 1Mb of the gene in the traitā€™s GWAS used as input for the TWAS, and if there was no colocalisation evidence (L2G scoreā€‰ā‰„ā€‰0.3) supporting gene-trait association in Open Targets Genetics54 (URL: https://genetics.opentargets.org, accessed 28/01/2021). Of the 98 TWAS-significant trait-associated genes, 27 define novel gene-trait associations (TableĀ 1).

Fig. 3: NK cell TWAS in autoimmune diseases.
figure 3

Manhattan plot of autoimmune disease TWAS. All significant trait-associated genes (nā€‰=ā€‰98) are labelled. Novel gene-trait associations (no GWAS-significant locus within 1Mb and no evidence supporting gene-trait association in Open Targets Genetics) are highlighted (yellow, nā€‰=ā€‰27). Novel gene-trait associations which colocalise (posterior probability colocalisationā€‰>ā€‰0.8) with an NK cell-specific eQTL are circled (pink, nā€‰=ā€‰3).

Table 1 Novel gene-trait associations identified by TWAS.

We were particularly interested in identifying novel gene-trait associations for which the evidence supports a shared causal variant between the trait GWAS and an eQTL specific to NK cells (defining colocalisation and NK cell specificity as above). We were able to identify three instances of this. Complementary to our identification of NK cell expression CD226 as a risk factor for type-1 diabetes and multiple sclerosis (Fig.Ā 1), we identify NK cell expressionĀ of CD226 as a determinant of both systemic lupus erythematosus and primary biliary cirrhosis. We also identified NK cell expression of TMEM163 as a determinant of primary biliary cirrhosis. TMEM163 encodes a transmembrane zinc transporter55, which is hypothesised to mediate zinc accumulation into lysosomes56. Notably, zinc deficiency has been reported to be associated with reduced NK cell cytolytic activity57. Genetic variation affecting TMEM163 expression has not previously been implicated in the pathogenesis of autoimmune disease. However, in an interesting parallel to our description of a trans-regulatory network in NK cells mediated by MC1R expression, eQTL for TMEM163 in whole blood49 colocalise (PPH4ā€‰=ā€‰0.96, Open Targets Genetics) with genetic determinants of hair colour (UK Biobank, URL: http://www.nealelab.is/uk-biobank/). TMEM163 is not regulated in trans by the MC1R locus in NK cells, however, these data suggest a model in which the shared biological determinants of pigmentation and inflammation operate in NK cells and modify risk of immune-mediated disease. Immunodeficiencies with combined NK cell dysfunction and disorders of pigmentation, including Chediakā€“Higashi, Hermanskyā€“Pudlak and Griscelli syndromes40, are characterised by impairment of secretory lysosome function. Given the described role of TMEM163 in lysosome composition, it is tempting to speculate that TMEM163 may modify both pigmentation and risk of immune-mediated disease through effects on melanin secretion and NK cell cytotoxic degranulation.

Protein phenotypes of NK cell eQTL

Mapping of regulatory genetic variation has been largely undertaken in healthy individuals. Moreover, our eQTL mapping data defines regulatory genetic variation as a determinant of RNA expression, but does not provide direct evidence for genetic control of NK cell phenotypes at the protein level. To better understand the relevance of eQTL mapping data to protein level expression of key NK cells markers, and to demonstrate the relevance of eQTL mapping data to a disease context, we correlated genotype at NK cell eSNPs with expression of key surface markers on NK cells from patients with metastatic melanoma58. Using multiparametric flow cytometry, we sought to confirm the effect of identified NK cell eSNPs on CD226 and CD57 (B3GAT1-encoded) protein expression. We further correlated KIR2D antigen expression with KIR2DS4del copy number, which we have defined as a determinant of a large trans-regulatory network.

T cell and NK cell-expressed CD226 is a determinant of anti-tumour responses59, including in the context of metastatic melanoma60. In a separate cohort of patients with metastatic melanoma, we confirm that that the peak eSNP for CD226 is a determinant of CD226/DNAM1 protein expression on NK cells (pā€‰=ā€‰0.03, Fig.Ā 4). It has previously been shown that the balance between CD226 and TIGIT (a competitive inhibitor of CD226) expression on CD4+ Tregs is a correlate of clinical response to immune checkpoint blockade in metastatic melanoma61. Our identification of an NK cell-specific eQTL, which modulates CD226 protein expression on NK cells, provides the opportunity for future studies to leverage genetic variation to better understand the cellular context of CD226 expression as a determinant of response to immune checkpoint blockade.

Fig. 4: NK cell eQTL determine protein expression on NK cells in patients with metastatic melanoma.
figure 4

Genotype-phenotype correlations are calculated by linear regression. P values are calculated with likelihood ratio tests. Expression of CD57 (orange, nā€‰=ā€‰78) and CD226/DNAM1 (blue, nā€‰=ā€‰24) protein on NK cells is determined by the respective RNA eSNPs in patients with metastatic melanoma. KIR2D antigen expression on NK cells in patients with metastatic melanoma (nā€‰=ā€‰78) is modified by KIR2DS4del copy number (pink). Box and whisker plots; boxes depict the upper and lower quartiles of the data, and whiskers depict the range of the data excluding outliers (outliers are defined as data-pointsā€‰>ā€‰1.5ā€‰Ć—ā€‰the inter-quartile range from the upper or lower quartiles). Genotype to phenotype correlations were calculated with linear regression. P values are two-sided.

CD57 expression defines a subpopulation of mature, cytotoxic, CD56dim NK cells62. Our eQTL mapping analysis identified a cis eQTL for B3GAT1 (which encodes CD57) expression in NK cells (peak eSNP: rs77478906, pā€‰=ā€‰1.90ā€‰Ć—ā€‰10āˆ’11). This eQTL is predicted to be specific to NK cells, and colocalises with a GWAS63 risk locus for prostate cancer (PP4ā€‰=ā€‰0.949), with decreased NK cell B3GAT1 expression being associated with increased prostate cancer risk. We confirm the effect of rs77478906 on CD57 NK cell protein expression in patients with metastatic melanoma (pā€‰=ā€‰3.85ā€‰Ć—ā€‰10āˆ’5, Fig.Ā 4), demonstrating that the prostate cancer risk allele, rs77478906:C, is associated with a decreased proportion of CD57+ NK cells. The proportion of CD57+ NK cells have been observed to increase with age64 and CMV infection65, but it remains unclear how this interacts with malignancy risk or prognosis. Our study demonstrates that regulatory genetic variation is also a key determinant of the CD57+ NK cell proportion, and highlights that studies seeking to understand the relationship between CD57+ NK cells, age, CMV and malignancy should also consider rs77478906 genotype.

Finally, having identified KIR2DS4del copy number as a mediator of a trans-regulatory network in NK cells, we sought to confirm a role for imputed KIR2DS4del copy number on KIR antigen expression in the same patient cohort. In metastatic melanoma patients, KIR2DS4del copy number is associated with the proportion of KIR2D+ NK cells (pā€‰=ā€‰0.046, Fig.Ā 4), with increased copies of the deletion being associated with decreased KIR2D expression. It has previously been demonstrated that a population of CD56dimCD57+CD69+CCR7+KIR+ NK cells are expanded in tumour infiltrated lymph nodes of patients with melanoma, and that these NK cells are key effectors of anti-tumoral cytotoxicity66. Our data suggest that proportions of CD57+ and KIR+ NK cells in the periphery are in part genetically determined, and that in the case of KIR positivity, this is associated with large-scale transcriptional changes in trans. A complete understanding of NK cell phenotypes as a predictor of cancer predisposition or outcome needs to reflect genetically determined inter-individual differences in NK cell phenotype at baseline.

Discussion

Knowledge of the genetic determinants of gene expression in immune cells has proved highly informative in advancing our understanding of immune-mediated disease. Here, we have defined the regulatory landscape of gene expression in primary human NK cells. In doing so, we identified thousands of eQTL operating in NK cells, and demonstrated that a significant proportion of this regulatory variation appears specific to these cells.

Regulatory variation active in NK cells is highly enriched for disease-associated genetic variation. By combining our eQTL mapping with colocalisation analysis, incorporating the results of multiple genetic association studies, we elucidate many examples where regulatory variation specific to NK cells are associated with human disease risk. As such, our data directly implicate gene expression in NK cells in a range of human traits and diseases, most strikingly for autoimmune diseases. We expand on these findings, leveraging our NK eQTL data to perform TWAS in five autoimmune diseases, defining novel roles for NK cell expression of CD226 and TMEM163 in systemic lupus erythematosus and primary biliary cirrhosis. Furthermore, we illustrate the effects of identified NK cell eSNPs on the expression of key NK cell markers at the protein level in a separate cohort of well-characterised patients with metastatic melanoma, replicating these observations at the protein level and illustrating the potential relevance to disease.

Our findings are in keeping with an emerging role for NK cells in autoimmune disease. This role is supported by observational data across a number of autoimmune diseases, identifying decreased NK cell number in the periphery as a marker of disease risk or activity67,68,69, and accumulation of NK cells at sites of autoimmunity70,71. The presence of NK cells at sites of autoimmunity could suggest a model in which NK cells directly contribute to tissue damage through cytolysis and/or cytokine secretion, or alternatively a model in which NK cells have an immunoregulatory role, restraining the activity of autoreactive immune cells. Recent data have highlighted the importance of the latter model, defining immunoregulatory roles for NK cells in controlling the activity of CD4+ T cells26,72.

We identified multiple trans eQTL operating in NK cells, most notably trans-regulatory networks mediated by MC1R, GNLY, UVSSA, and KIR regions. The identification of a large trans-regulatory network mediated by a common deletion in an activating KIR receptor (KIR2DS4del) is particularly intriguing and is analogous to previous descriptions of master-regulatory variation to the MHC31,73. Variation in KIR type has been implicated in the pathogenesis of infectious and autoimmune disease. It is highly likely that the KIR-mediated trans-regulatory network we have identified here will have important roles in human disease risk, and that this role will emerge as the KIR region is better defined, through imputation46 or sequencing74 approaches, in GWAS cohorts.

Our study has some limitations. Most notably, the expression data we present here was generated with microarray technology rather than with RNA sequencing. Large-scale meta-analysis of eQTL mapping in whole blood have demonstrated excellent concordance between studies performed with microarray and RNASeq49. However, within this dataset we are unable to identify genetic mediators of expression in NK cells for transcripts lacking probe coverage, alternative splicing and exon/transcript usage.

In summary, we have described the cis- and trans-regulatory landscapes of gene expression in human NK cells. We have integrated these data with epigenetic and GWAS datasets deriving important insights into NK cell biology and its impact on human health and disease.

Methods

Study samples and power

Healthy individuals of European ancestry (nā€‰=ā€‰245) were recruited following written informed consent (Oxfordshire Research Ethics Committee COREC reference 06/Q1605/55). Participants had a median age of 28 years (range 18ā€“66) and 117 of 245 were male. PBMCs, purified by Ficoll-Pacque density gradient centrifugation, were isolated from whole blood collected into EDTA-containing tubes (Vacutainer system, Becton Dickinson). Cells were washed in Hanksā€™ balanced salt solution (Invitrogen) and enumerated with a haemocytometer.

CD56+CD3āˆ’ NK cells were isolated by magnetic-activated cell sorting negative selection (MACS, cat 130-092-657, Miltentyi Biotec). Cells snap-frozen in RLT reagent (Qiagen) prior to RNA extraction. Total RNA was extracted using RNAeasy Mini kits (Qiagen), quantified using a NanoDrop (ThermoFisher), and BioAnalyzer quantification in a subset.

We estimated study power to detect trait-associated genetic variation in cis and trans across a range of minor allele frequencies and numbers of independent tests (Supplementary Fig.Ā 8). In each case we used powerEQTL75 to calculate power for Ī±ā€‰=ā€‰0.05/number of tests and an effect size and standard deviation of 0.1376. For cis power calculations we assume a variable number of independent tests, defined as the number of independent SNPs in a cis testing window (1ā€“5000) multiplied by the number of genes tested (18,000). For trans power calculations we assume 18ā€‰Ć—ā€‰109 independent tests, assuming 18,000 genes and 1ā€‰Ć—ā€‰106 independent SNPs genome-wide. Assuming an effective sample size of 245, our study has 80% power to detect a cis eQTL with a causal eSNP with minor allele frequencyā€‰>ā€‰0.08, assuming 100 independent tests in the cis testing window. In trans, our study has 80% power to detect a trans eQTL with a causal eSNP with minor allele frequencyā€‰>ā€‰0.12.

Patients with metastatic melanoma (nā€‰=ā€‰78) were recruited as described previously58. Patients provided blood samples to the Oxford Radcliffe Biobank (Oxford Centre for Histopathology Research ethical approval nos. 16/A019 and 18/A064), following written, informed consent. Patients had a median age of 70 years (range 29ā€“95) and 43 of 78 were male.

Gene expression quantification

Total RNA from 245 individuals was quantified using Illumina HumanHT-12 v4 BeadChip gene expression arrays including 47,231 probes. Probe sequences mapping to more than one genomic locus, and probe sequences containing common genomic variation (minor allele frequencyā€‰>ā€‰1%) were excluded from further analysis. Gene expression estimates were normalised (random-spline normalisation) before variance-stabilising transformation using the R package lumi77. Samples were processed in 2 batches, and gene expression estimates corrected using the the R package ComBat78. Following quality control, gene expression quantified at 29,011 probes, mapping to 18,078 unique genes were included in the analysis.

Genotyping and imputation

Genomic DNA was extracted from whole blood using Gentra Puregene Blood Kits (Qiagen) according to manufacturerā€™s instructions, before dsDNA quantification using PicoGreen (Invitrogen). Genome-wide genotypes at 733,202 loci were generated using HumanOmniExpress-12v1.0 BeadChips (Illumina). SNP quality control (QC) filters were applied as follows: minor allele frequency (MAF)ā€‰<ā€‰4%, Hardy-Weinberg equilibrium (HWE) pā€‰<ā€‰1ā€‰Ć—ā€‰10āˆ’6, plate effect pā€‰<ā€‰1ā€‰Ć—ā€‰10āˆ’6 and SNP missingnessā€‰>ā€‰2%. We calculated identity by descent and principal components of LD-pruned genome-wide genotyping data to identify related individuals and sample outliers respectively. Samples with call ratesā€‰<ā€‰98% were excluded from analysis. QC metrics were calculated in PLINK.

Following QC, 609,704 variants were taken forward for genome-wide imputation with pre-phasing using SHAPEIT79 and imputation using IMPUTE280. We used 1000G Phase 1 as a reference panel. Following imputation, SNPs with imputation info scoresā€‰<ā€‰0.9 or HWE pā€‰<ā€‰1ā€‰Ć—ā€‰10āˆ’6 were excluded from further analysis. Genotypes at 6,012,996 loci were taken forward for eQTL mapping. Principal components analysis of genome-wide genotyping data did not identify population outliers. Comparison of these genetic principal components with those of 1000G project reference samples confirmed European ancestry in all study samples (Supplementary Fig.Ā 9). Genotyping QC did not identify related samples. All 245 individuals were included in the analysis.

For patients with metastatic melanoma, following genomic DNA extracted as described above, samples were genotyped using the Infinium Global Screening Array 24v3 (Illumina). Following QC (as above), 297,971 SNPs were taken forward for phasing and genome-wide imputation using Haplotype Reference Consortium v181 as a reference panel. Phasing was performed with Eagle282 and imputation with Minimac483, as implemented in the Michigan Imputation Server v1.2.483. Following imputation, we retained SNPs with MAFā€‰>ā€‰0.04, R2ā€‰>ā€‰0.7 and HWE pā€‰>ā€‰1ā€‰Ć—ā€‰10āˆ’6 for downstream analysis.

We used SHAPEIT phased genotypes at 100 SNPs in the KIR region on chromosome 19 to perform KIR copy number imputation using KIR*IMP v1.2.046. KIR*IMP uses a bespoke reference panel of genotype and KIR copy number from individuals of European ancestry from the UK (698 individuals from 348 families). To assess the performance of KIR copy number imputation, we assayed the presence or absence of 12 functional KIR genes (2DL1-5, 2DS1-5, 3DL1, 3DS1) and two pseudogenes (2DP1 and 3DP1) in a subset of the study samples (nā€‰=ā€‰171). For each gene/pseudogene we used sequence-specific primers to amplify PCR products from genomic DNA. PCR products were separated with agarose gel electrophoresis and stained with ethidium bromide. The presence or absence of each KIR gene/pseudogene was confirmed with two complementary PCR reactions, resulting in two PCR products of different length at each locus.

eQTL analysis

QTLtools10 was used for eQTL mapping under an additive linear model, including PCs of gene expression data to limit the effect of confounding variation. cis eQTL were defined as regulatory variation within 1Mb of the associated probe and trans loci thoseā€‰>ā€‰5Mb from the associated probe. Inclusion of 32 and 12 PCs in the eQTL analysis maximised eQTL discovery in cis and in trans respectively (Supplementary Fig.Ā 1). Prior to eQTL mapping, expression data were corrected for PCs before rank normal transformation. For cis eQTL mapping, QTLtools approximates a permutation test at each phenotype, controlling for multiple-testing burden at the level of each phenotype. We used 10,000 permutations for cis eQTL mapping. To account for cases in which multiple probes mapped to the same gene we retained the probe with most significant, nominal cis association, as implemented with the --grp-best option in QTLtools. A second level of multiple-testing correction across all phenotypes tested was applied in R using qvalue84. Forward-backward stepwise regression implemented in QTLtools10 was used to define multiple independent signals reaching a permutation-based significance threshold, mapping lead eSNPs at each signal to define identify secondary, tertiary and quaternary cis eQTL. For all analyses, we considered an FDRā€‰<ā€‰0.05 to be significant.

We mapped trans eQTL using QTLtools in two stages, firstly fitting additive linear models for each phenotype:genotype pair generating nominal p values for each association, before determining the significance threshold by permuting phenotypes 1000 times. Correlation between phenotypes (but not genotypes) is maintained within each permutation run. By analogy to our approach for cis mapping, for cases in which multiple probes mapped to the same gene we retained the probe with the single most significant trans association for each gene. To alleviate false-positive associations in our trans analysis driven by SNPs in repetitive regions85, we excluded SNPs falling within repeat regions as defined by the UCSC RepeatMasker track86, taking forward 2,853,403 SNPs for analysis. False-positive trans associations can also be secondary to cross-mapping artefacts, whereby an array probe maps to paralogous genes mis-identifying a cis association as one acting in trans for a distal paralogous gene. To address this, following mapping in trans for each gene for which we identified a trans eQTL we used Re-Annotator87 to re-map the array probe to a 10ā€‰Mb region centred on the identified trans eSNP. Allowing up to 6 mismatches, this re-mapping procedure identified 14 potential false-positive trans-eQTLs, which were excluded from further analysis. For significant trans associations, we grouped eSNPs located within a window of 1Mb, affecting the expression of a single gene, as potentially belonging to a single trans signal. We then tested whether each of these trans signals represented a single association, recapitulating mapping in trans for that probe conditioning on the peak eSNP.

Identification of NK cell-specific eQTL

To determine sharing of NK cell cis eQTL with other cells, moloc11 was used in R to compare association at NK cell cis eQTL between NK cells and four other previously published datasets of eQTL in primary immune cell subsets from individuals of European ancestry12,13,14. Moloc adopts a Bayesian framework to compare models of association across multiple traits at a given genetic locus, using summary statistics. We applied moloc at all identified cis eQTL in NK cells, defining the evidence for shared regulatory effects on gene expression across NK cells (this dataset, nā€‰=ā€‰245), CD4+ T cells (nā€‰=ā€‰282), CD8+ T cells (nā€‰=ā€‰271), monocytes (nā€‰=ā€‰414) and neutrophils (nā€‰=ā€‰101). We considered an eQTL to be unique to NK cells where the posterior probability of the eQTL not being shared with another cell typeā€‰>ā€‰0.8, and where a unique eQTL in NK cells was the most likely model overall (either an eQTL in NK cells is the only eQTL at that locus, or where other eQTL at the same locus are distinct from the eQTL in NK cells).

Allele-specific expression

To validate the effect of regulatory variation at MC1R, we used the C-BASE allele-specific expression assay33,34, designing primers to full-length MC1R and cloning PCR product of cDNA and genomic from NK cells from 5 heterozygous individuals heterozygous using TA Cloning kit (Thermofisher). In addition, the same approach was used from monocyte cDNA samples for a separate 5 heterozygote individuals. Cloning product was transformed and colonies isolated with Taqman genotyping being performed on isolated colonies (spotted with pipette tip into PCR solution) using a probe designed to detect both alleles. Ninety-six colonies were tested for each sample, significance of effect for NK cells was performed using a ChiSquared test (1df), comparing ratios of genomic to cDNA. A paired T test was performed for the monocyte samples.

Colocalisation and enrichment

We used ENCODE functional annotations88 to interrogate colocalisation of chromatin states and transcription factor binding sites in lymphoblastoid cell lines (LCLs - GM12878) with eQTL in NK cells. We used ChIP-seq defined transcription factor binding sites for 52 transcription factors, and segmentation of the LCL genome into 7 functional tracks defined by ChromHMM89 and SegWay90; promoter regions, promoter flanking regions, enhancers, weak enhancers or open chromatin cis-regulatory elements, CTCF enriched elements, transcribed regions, and repressed or low activity regions. We calculated the density of functional annotations surrounding each cis eSNP, counting the number of annotations falling into 1kb bins up to 1Mb from each feature. We used a permutation approach implemented in QTLtools to calculate evidence for enrichment of eSNPs with functional annotations, calculating the frequency of observed overlap between a given functional annotation and an eSNP, comparing this to the number of overlaps expected by chance (permuting phenotypes across all probes tested). We tested for enriched biological pathways among cis NK cell-specific eQTL using the GOBP database in XGR19. Genes with cis NK cell-specific eQTL were tested against a background of all genes tested in the eQTL analysis, using a hypergeometric test.

We used two approaches to test for colocalisation of causal variants between cis eQTL and GWAS traits. Firstly we used Regulatory Trait Concordance (RTC)20 implemented in QTLtools. RTC integrates GWAS hits, eQTL data and local LD structure to quantify the decrease in eQTL significance when the phenotype is residualised for the GWAS hit. We used RTC to test for colocalisation at cis eQTL with 87,919 GWAS-significant loci (pā€‰<ā€‰5ā€‰Ć—ā€‰10āˆ’8) from 5,823 GWAS studies/traits downloaded on 28/01/2021 from the NHGRI-EBI GWAS Catalog8 (Supplementary DataĀ 16). An advantage of the RTC method is that it only requires the identity of GWAS-significant peak SNPs, rather than complete GWAS summary statistics. We considered RTC scoresā€‰>ā€‰0.9 to be significant. Secondly, we used the R package coloc v5.1.021 to identify evidence of causal variants shared by NK cell eQTL and GWAS loci. Coloc adopts a Bayesian approach to compare evidence for independent or shared association signals for two traits at a given genetic locus. We tested for colocalisation between NK cell primary cis eQTL and 100 GWAS traits (Supplementary DataĀ 7) with evidence of trait association (pā€‰<ā€‰Ć—ā€‰10āˆ’6) within a 250kb window of the peak eSNP at each significant NK cell primary cis eQTL. We used the coloc.signals function91 to allow for multiple, independent causal signals in both traits. We considered a posterior probabilityā€‰>ā€‰0.8 supporting a shared causal locus to be significant.

To test for enrichment of colocalisation of GWAS traits among NK cell eQTL, we used Fisher exact tests to compare the proportion of NK cell eQTL within 250ā€‰kb of a GWAS locus with evidence of trait association (pā€‰<ā€‰Ć—ā€‰10āˆ’6) for which their is evidence of a shared causal locus, comparing this to the proportion observed for a null GWAS trait (height). We considered traits for which there were at least 5 NK cell eQTL within 250ā€‰kb of a GWAS locus. Throughout we applied FDR correction to account for the number of annotations and traits tested.

TWAS analysis

We performed transcriptome-wide association studies (TWAS) implemented in FUSION92 of five autoimmune diseases, using functional gene expression weights calculated from the NK cell eQTL data described here to impute NK cell gene expression into previously published GWAS of ulcerative colitis50, Crohnā€™s disease50, systemic lupus erythematosus52, primary biliary cirrhosis53 and rheumatoid arthritis51. For each gene, we correct normalised gene expression for PCs of gene expression (nā€‰=ā€‰32 as for eQTL mapping) and genome-wide genotypes (nā€‰=ā€‰10) and estimate cis genetic heritability (SNPs within 1Mb of the TSS), retaining significantly heritable genes (pā€‰<ā€‰0.05). For heritable genes we cross-validated genotype-imputed gene expression estimates, selecting the best-performing model (highest R2) from 5 prediction models; best linear unbiased predictor, Bayesian linear mixed model, elastic-net regression (Ī±ā€‰=ā€‰0.5), LASSO regression and single best eQTL. We then pass functional weights from the best-performing of these models and GWAS summary statistics to FUSION to compute TWAS association statistics for each gene. We applied FDR correction across all genes included in each TWAS, considering FDRā€‰<ā€‰0.05 to be significant. To address whether trait-associated genes were conditionally independent, we performed conditional stepwise regression including all significant TWAS genes on a single chromosome. Finally, to complement the TWAS analysis, at each TWAS-significant gene we used coloc to assess the probability that the GWAS risk locus and NK cell eQTL share a causal variant.

Multiparametric flow cytometry

Whole blood from each volunteer was collected into EDTA-containing collection tubes (Vacutainer System, Becton Dickinson). PBMCs were purified by Ficoll-Pacque density gradient centrifugation, before storage at āˆ’80ā€‰Ā°C. Frozen PBMCs were thawed, counted and plated in 96-well v-bottomed polystyrene plates (Corning, USA) at a concentration of 106 cells per well. After staining with viability dye (LIVE/DEAD FixableNear-IR Dead Cell Stain, cat L10119, Thermo-Fisher Scientific) at a concentration of 1:1000 in PBS, cells were surface immunostained with BUV496-conjugated anti-CD3 (1:150 dilution, clone UCHT1, cat 612940, BD Biosciences), BUV395-conjugated anti-CD56 (1:50 dilution, clone NCAM16.2, cat 563555, BD Biosciences), VioGreen-conjugated PanKIR2D (1:50 dilution, cat 130-128-216, Miltenyi Biotec), BV785-conjugated anti-CD57 (1:50 dilution, clone QA17A04, cat 393329, Biolegend) and PE-conjugated anti-CD226 (1:100 dilution, clone 11A8, cat 338305, Biolegend) in HBSS supplemented with 5% FBS. Immunostaining was performed in the dark, on ice for 30ā€‰min. Flow cytometry was performed using a BD LSR Fortessa X20 flow cytometer, and the data analysed (Supplementary Fig.Ā 10) using FlowJo v10.8 Software (BD Life Sciences). Proportions of CD57+, CD226+ and KIR2D+ NK cells were logit transformed prior to analysis, and genotype-phenotype associations tested with linear regression. P values are calculated with likelihood ratio tests.

Reporting summary

Further information on research design is available in theĀ Nature Research Reporting Summary linked to this article.