Although chronic lymphocytic leukemia (CLL) is characterized by a strong familial risk, the genetic basis of inherited susceptibility to CLL is largely unknown. The increased risk of Hodgkin lymphoma (HL) and non-Hodgkin lymphoma (NHL) in relatives of CLL patients suggests a common etiology for B-cell lymphoproliferative disorders (LPDs), possibly mediated through HLA variation.1 Moreover, as B-cell proliferation is part of an adaptive immune response, which can be initiated by major histocompatibility complex (MHC)-restricted T-cell activation, an influence of HLA on CLL pathogenesis is plausible.
It has recently been demonstrated that single nucleotide polymorphism (SNP) variation within the 6p21 region can accurately predict alleles at HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQA1 and HLA-DQB1 loci.2 Furthermore, HLA alleles can be accurately predicted from the SNPs used in a genome-wide association study (GWAS). To investigate the role of genetic variation in the MHC region in the etiology of CLL, we have applied this methodology to SNP data from a GWAS of CLL.
The 517 CLL cases (364 male) analyzed in the GWAS have been previously documented.3 Briefly, they comprised 155 CLL cases with a relative affected with CLL or a related B-cell LPD, ascertained through the International CLL Linkage Consortium (ICLLLC), and 362 cases ascertained through the Leukemia Research CLL4 trial. The genome-wide SNP scan was conducted using Illumina Infinium HD Human370 Duo BeadChips.3 As controls, we used genotype data generated by the Wellcome Trust Case Control Consortium on 2930 individuals from the British 1958 Birth Cohort (58BC). Collection of samples and information from subjects was undertaken with informed consent and ethical review board approval. After rigorous quality control, which excluded samples and SNPs with poor call rates, SNPs showing significant departure from Hardy–Weinberg equilibrium, and samples showing evidence of relatedness or ancestral differences,3 SNP genotypes were available for 503 cases and 2698 controls.
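The Hardy–Weinberg filter mentioned above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the goodness-of-fit formulation and the default thresholds (`min_call_rate`, `hwe_alpha`) are assumptions chosen for illustration, as the text does not state the cut-offs applied.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Return (chi-square, P) for a 1-df Hardy-Weinberg goodness-of-fit test."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def snp_passes_qc(n_aa: int, n_ab: int, n_bb: int, n_samples: int,
                  min_call_rate: float = 0.95, hwe_alpha: float = 1e-5) -> bool:
    """Hypothetical SNP-level filter; both thresholds are illustrative only."""
    call_rate = (n_aa + n_ab + n_bb) / n_samples
    _, p_hwe = hwe_chi_square(n_aa, n_ab, n_bb)
    return call_rate >= min_call_rate and p_hwe >= hwe_alpha
```

In practice the Hardy–Weinberg test is usually applied to control genotypes only, so that a true disease association in cases is not filtered out.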
For single SNP and haplotype analysis we considered the MHC at 6p21 to be defined by a 5-Mb region bordered by the RFP and MLN genes (rs4324798 at 28 884 096 bp and rs767896 at 33 899 493 bp; Figure 1). We initially considered the 1149 SNPs mapping to this 5-Mb region, testing each SNP for association with CLL using the Cochran–Armitage trend test. Odds ratios (ORs) and associated 95% confidence intervals (CIs) were calculated by unconditional logistic regression. At the 0.05 threshold, 88 SNPs showed evidence of an association, compared with the 57.5 expected by chance (P<0.001). Associations at the 0.01 threshold are reported in Supplementary Table 1. The strongest single association was shown by rs6904029, which maps to 30 051 046 bp within the non-coding sequence of HLA complex group 9 that is intronic to HLA-A (P=1.38 × 10⁻⁴; Figure 1 and Supplementary Table 1). To interrogate the relationship between MHC variation and CLL further, we studied haplotypes using HelixTree software v6.4.2 (Golden Helix, Bozeman, MT, USA), with sliding windows of 5 and 13 contiguous markers. This analysis provided little evidence for disease alleles carried on rare haplotypes that were missed by the single SNP analyses (Figure 1).
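For readers unfamiliar with the Cochran–Armitage trend test, the following Python sketch shows the standard additive-score form of the statistic for a single SNP. It is an illustration under stated assumptions rather than the software used in the study: the function name and the genotype counts in the usage note are hypothetical, and the scores (0, 1, 2) assume an additive model for the number of minor alleles.

```python
import math
from typing import Sequence, Tuple


def cochran_armitage_trend(cases: Sequence[int], controls: Sequence[int],
                           scores: Sequence[float] = (0, 1, 2)) -> Tuple[float, float]:
    """Return (chi-square, P) for the 1-df Cochran-Armitage trend test.

    cases / controls hold genotype counts ordered as (AA, AB, BB).
    """
    n = [r + s for r, s in zip(cases, controls)]  # per-genotype totals
    total = sum(n)
    total_cases = sum(cases)
    total_controls = sum(controls)
    sum_tr = sum(t * r for t, r in zip(scores, cases))
    sum_tn = sum(t * k for t, k in zip(scores, n))
    sum_t2n = sum(t * t * k for t, k in zip(scores, n))
    numerator = total * (total * sum_tr - total_cases * sum_tn) ** 2
    denominator = total_cases * total_controls * (total * sum_t2n - sum_tn ** 2)
    if denominator == 0:
        raise ValueError("test undefined: no variation in genotype or status")
    chi2 = numerator / denominator
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With hypothetical counts, `cochran_armitage_trend((50, 100, 150), (150, 100, 50))` gives a large statistic, whereas identical genotype proportions in cases and controls give a statistic of zero. The expected number of nominally significant SNPs quoted above corresponds to 1149 × 0.05 ≈ 57.5.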
To validate the rs6904029 association, we genotyped an additional series of 919 unrelated CLL cases (599 male) ascertained through the ICLLLC and 1477 healthy individuals recruited from the National Study of Colorectal Cancer Genetics (NSCCG; 1999–2006; n=1269) and the Royal Marsden Hospital Trust/Institute of Cancer Research Family History and DNA Registry (1999–2004; n=208), using allele-specific PCR KASPar chemistry (KBiosciences). Successful genotyping was obtained for 897 cases and 1429 controls. This analysis provided further evidence for an association between rs6904029 and CLL risk (P=0.01; Table 1), and in the combined analysis, the OR for CLL associated with the rs6904029-A genotype was 1.24 (95% CI: 1.13–1.36; P=1.00 × 10⁻⁵). The association remained statistically significant after applying a Bonferroni correction to adjust for multiple testing (Padjusted=0.011).
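The Bonferroni adjustment can be checked with a short Python sketch. The number of tests is an assumption: the text does not state it explicitly, but taking the 1149 SNPs in the 5-Mb region as the number of comparisons reproduces the reported adjusted value. The function and constant names are hypothetical.

```python
def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P value, capped at 1."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


# Assumption: one test per SNP in the 5-Mb MHC region analysed above.
N_SNPS_TESTED = 1149

adjusted = bonferroni_adjust(1.00e-5, N_SNPS_TESTED)
print(f"{adjusted:.3f}")  # prints 0.011
```

Because SNPs within the MHC are in strong linkage disequilibrium, a Bonferroni correction over all 1149 markers is conservative.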
To survey the relationship between HLA alleles and CLL risk, we predicted class I and II HLA alleles using HLA*IMP software (version 1.3), making use of a reference database from the HapMap Project and 58BC controls.2 To ensure prediction accuracy, we only analyzed alleles that were predicted with a posterior probability >90%; this criterion was met by >88% of the data. Five alleles showed an association with CLL risk at the 0.05 threshold: HLA-A*0201, HLA-A*3101, HLA-B*1401, HLA-C*0802 and HLA-DRB1*1101 (Supplementary Table 2). Only the HLA-A*0201 association (P=3.12 × 10⁻⁴) remained statistically significant after adjustment for multiple testing. Carrier status for HLA-A*0201 conferred a 1.32-fold increased risk of CLL (95% CI: 1.13–1.53). The SNP rs6904029 is a proxy for HLA-A*0201 (P<10⁻⁷; D′=0.96, r²=0.91) but is not significantly correlated with other HLA-A alleles.
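The D′ and r² values quoted for rs6904029 and HLA-A*0201 are standard pairwise linkage disequilibrium measures. The Python sketch below shows how they are defined when presence or absence of the HLA allele is treated as a biallelic marker. It assumes phased haplotype frequencies are already available; the function name is hypothetical and the frequencies in the usage note are invented for illustration, not taken from the study.

```python
from typing import Tuple


def ld_stats(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D', r^2) for two biallelic markers.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first marker (e.g. a SNP allele)
    p_b  : frequency of allele B at the second marker (e.g. an HLA allele)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both markers must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2
```

For example, `ld_stats(0.3, 0.3, 0.3)` describes two markers whose alleles always occur together and returns values of 1 (up to floating-point error) for both measures. A high D′ with a slightly lower r², as reported above, is what one expects when the two alleles are almost always found on the same haplotype but differ somewhat in frequency.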
A recent GWAS of CLL reported an association with SNPs mapping to the HLA-DRB5 and HLA-DQA1 loci.4 In our study, the best evidence for an association in this region was provided by rs660895, which maps to 32 685 358 bp (P=0.09), and by HLA-DRB1*1101 (P=0.038; Supplementary Table 2).
To examine whether carrier status for HLA-A*0201 was associated with restricted immunoglobulin gene usage, we made use of previously generated data on CLL4 cases.5 Although immunoglobulin heavy chain variable region (IGHV) usage was non-random, with the VH3, VH1 and VH4 families expressed at the highest frequencies (50%, 27% and 15%, respectively; Supplementary Table 3), overall there was no difference in usage of specific VH subtypes by HLA-A*0201 genotype in the cases after correction for multiple testing.
Our data provide compelling evidence for an HLA class I association and specifically that HLA-A*0201 is associated with an increased CLL risk. There is increasing evidence that T-cell dysfunction in CLL may contribute to disease etiology. Specifically, T-cells in CLL may be unable to start, maintain and complete an immune response to the malignant B-cell and other antigens, and may be directly involved in sustaining the tumor. In addition, in the context of T-cell cross-talk, CD4+ T-cells in CLL have been identified in the pseudofollicle/proliferation centers of the involved tissues, and their physical contact with CLL cells suggests an important role in the activation and survival of CLL cells.6 A role for the HLA-A*02 allele in evoking an effective immune response is supported by the observation that HLA-A*02 is associated with reduced persistence of hepatitis B viral infection7 and the finding of an underrepresentation of HLA-A*02 in patients with tuberculosis.8 The HLA-A*02 allele has also been consistently shown to afford protection against multiple sclerosis.9 HL displays a strong HLA class I association, with underrepresentation of HLA-A*02 associated with Epstein-Barr virus (EBV)-positive disease.10 Intriguingly, in a recent GWAS of HL in the MHC region, rs6904029 provided one of the strongest SNP associations for EBV-positive HL, the A-allele conferring an OR of 0.46.11 Following EBV infection, infected memory B-cells escape immune detection by downregulation of viral antigens. Activation of replicative (that is, lytic) infection and outgrowth of latently infected cells is kept under tight control by HLA and cytotoxic T-lymphocytes (CTLs). HLA-A*02, and HLA-A*0201 in particular, are known to present peptides from a wide range of EBV lytic and latent antigens, including the latent membrane proteins LMP2 and LMP1; hence, an HLA-A*0201-restricted CTL response provides an eminently plausible biological basis for disease etiology in EBV-positive HL.
The CLL association with HLA-A*0201 is, however, analogous to that shown in nasopharyngeal carcinoma (NPC), in which an increased risk of NPC is associated with HLA-A*0201 carrier status.12 The non-random usage of variable domain elements of IGHV provides evidence of selection by chronic antigen stimulation or selection through the B-cell receptor. The absence of a strong association between HLA-A*0201 genotype and specific IGHV subtypes in CLL cases, however, argues against a simple environmental basis for disease development.
Although we observed a strong class I association with CLL in this study, it is entirely plausible, given the overrepresentation of associations seen at the 0.05 threshold, that genetic variation in other HLA alleles may impact on disease, albeit less profoundly. The recent GWAS of CLL reported by Slager et al.4 found an association with SNPs mapping to the 6p21.32 region, which encompasses the HLA-DQA1 and HLA-DRB5 genes. It is intriguing that the association was confined to familial CLL cases, as familial disease is essentially indistinguishable from sporadic CLL.5 The region has recently been shown to harbor variants associated with other B-cell malignancies, including follicular lymphoma and diffuse large B-cell lymphoma.13, 14 Given the familial co-aggregation of CLL and NHL, the 6p21.32 association could reflect common genetic susceptibility to a range of B-cell LPDs. However, despite ∼30% of the CLL cases in our study having a family history of CLL or a related B-cell LPD, we did not find statistically significant support for a disease locus for CLL within the HLA-DQA1/HLA-DRB5 region.
In conclusion, our analysis provides support for MHC variation influencing CLL risk and, in particular, for a role of HLA-A*0201 in disease etiology. In terms of impact, HLA variation has a weaker effect on CLL risk than the recently identified non-HLA loci;3 this is in stark contrast to HL, a disease in which genetic susceptibility is primarily defined by HLA.15 Finally, although speculative, the reciprocal HLA-A*02 associations seen for HL and CLL raise the possibility that a differential response to viral infection, such as EBV, also plays a role in the development of CLL.
Leukemia Lymphoma Research provided principal funding for the study. Additional funding was provided by Cancer Research UK and the Arbib fund. We acknowledge National Health Service funding for the Royal Marsden Biomedical Research Center. Finally, we are grateful to all individuals for their participation.
Human Chromosome 6 Project Overview: http://www.sanger.ac.uk/HGP/Chr6
British 1958 birth cohort: http://www.b58cgene.sgul.ac.uk
NCBI dbSNP: http://www.ncbi.nlm.nih.gov/SNP/
Online Mendelian Inheritance in Man (OMIM): http://www.ncbi.nlm.nih.gov/Omim/
Golden Helix: http://www.goldenhelix.com/
Supplementary Information accompanies the paper on the Leukemia website (http://www.nature.com/leu)