In addition to HLA, recent genome-wide association studies (GWASs) of Hodgkin’s lymphoma (HL) have identified susceptibility loci for HL at 2p16.1, 8q24.21 and 10p14. In this study, we perform a GWAS meta-analysis with published GWAS (totalling 1,465 cases and 6,417 controls of European background), and follow up the most significant association signals in 2,024 cases and 1,853 controls. A combined analysis identifies new HL susceptibility loci mapping to 3p24.1 (rs3806624; P=1.14 × 10−12, odds ratio (OR)=1.26) and 6q23.3 (rs7745098; P=3.42 × 10−9, OR=1.21). rs3806624 localizes 5′ to the EOMES (eomesodermin) gene within a p53 response element affecting p53 binding. rs7745098 maps intergenic to HBS1L and MYB, a region previously associated with haematopoiesis. These findings provide further insight into the genetic and biological basis of inherited susceptibility to HL.


Hodgkin’s lymphoma (HL) is a common lymph node cancer of germinal centre B-cell origin, which is characterized by malignant Hodgkin and Reed–Sternberg (HRS) cells mixed with a dominant background population of reactive lymphocytes and other inflammatory cells1. Although Epstein–Barr virus (EBV) infection may be causally related to a number of cases, there is little evidence to support the involvement of other environmental risk factors2.

Evidence for inherited genetic influence on susceptibility is provided by the increased familial risk and very high concordance between monozygotic twins3. Although the risk of HL is well recognized to be influenced by HLA genotype variation within the major histocompatibility complex, much of the familial risk is thought to be the consequence of non-HLA genotype variation. Support for this hypothesis has come from recent genome-wide association studies (GWASs) of HL that, in addition to demonstrating multiple HLA associations4,5,6,7, have identified non-HLA susceptibility loci at 2p16.1 (REL), 8q24.21 and 10p14 (GATA3)7.

In this study, to identify additional susceptibility loci for HL, we conducted an independent primary scan of German patients with classical HL (cHL) and performed a genome-wide meta-analysis with one previously published GWAS of UK cHL patients. The most significant single nucleotide polymorphisms (SNPs) were analysed in two additional series totalling 2,024 cases and 1,853 controls. A combined analysis identified new HL susceptibility loci mapping to 3p24.1 and 6q23.3.


GWAS and meta-analysis

In the primary scan (German-GWAS), 1,001 cHL cases, ascertained by the German Hodgkin Study Group during 1998–2007, were genotyped using the Illumina Human OmniExpress-12 v1.0 arrays. For controls, we used genotype data on 1,226 individuals enrolled into the Heinz Nixdorf Recall study genotyped using Illumina OmniExpress-12 v1.0. A total of 133 case samples were removed during quality control steps for reasons including a failure to genotype, duplicates, closely related individuals or non-CEU ancestry (CEU-Utah residents with Northern and Western European ancestry; Figs 1 and 2).

Figure 1: GWAS data quality control.

Details are provided of the samples and SNPs used in UK-GWAS and quality control of German-GWAS. Quality control of UK-GWAS is reported in previously published work42.

Figure 2: Identification of samples of non-European origin for the German GWAS.

The first two principal components of the analysis are plotted in each figure. (a) German controls (black crosses) and cases (grey crosses) are shown together with the HapMap CEU (Utah residents with Northern and Western European ancestry) individuals (red), CHB (Han Chinese in Beijing)+JPT (Japanese in Tokyo) individuals (purple) and YRI (Yoruba in Ibadan) individuals (green). (b) The same plot is shown after the removal of cases and controls of non-European origin. Respective data on UK-GWAS are provided in previously published work42.

The UK-GWAS has been previously reported7; briefly, 622 cHL cases were genotyped using Illumina 660w-Quad BeadChips. Genotype frequencies were compared with publicly accessible genotype data generated by the UK Wellcome Trust Case–Control Consortium 2 study of 2,930 individuals from the 1958 British Birth Cohort (58C) and 2,737 individuals from the UK Blood Service collections that had been genotyped using the Illumina Human1.2M-Duo Custom_v1 Array. There was no evidence of systematic bias between these two series, which were combined to provide genotype data for 5,667 controls7. Quality control steps for the UK GWAS have been previously reported7.

After filtering of genotype data from each GWAS on the basis of pre-specified quality control measures, 296,129 autosomal SNPs were common to both case–control series. Quantile–quantile plots of the genome-wide Armitage trend test χ2-values showed that there was minimal inflation of the test statistics rendering substantial cryptic population substructure or differential genotype calling between cases and controls unlikely in either GWAS (genomic control inflation factor8, λgc=1.03 and 1.09 in UK and German-GWAS, respectively; Fig. 3). In the combined analysis under a fixed effects model, we identified 27 SNPs in 21 genomic regions not previously associated with cHL risk, which showed good evidence for a relationship (that is, P<5.0 × 10−5 fixed effects model inverse-variance weighted meta-analysis test; Fig. 4, Supplementary Table S1).
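The Armitage trend test referred to above can be illustrated with a minimal Python sketch. It assumes additive genotype scores of 0, 1 and 2 copies of the risk allele; the function name and the toy counts are hypothetical and are not taken from the study data. The P-value uses the upper tail of a χ2 distribution with 1 degree of freedom, which equals erfc(√(χ2/2)).

```python
import math


def armitage_trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x 3 genotype table.

    cases, controls: genotype counts for 0, 1 and 2 copies of the risk allele.
    Returns (chi2, p), where chi2 has 1 degree of freedom.
    """
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls
    columns = [r + s for r, s in zip(cases, controls)]

    # Trend statistic: weighted difference between case and control counts.
    t = sum(w * (r * n_controls - s * n_cases)
            for w, r, s in zip(weights, cases, controls))

    # Variance of the trend statistic under the null hypothesis.
    bracket = sum(w * w * n * (n_total - n) for w, n in zip(weights, columns))
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            bracket -= 2 * weights[i] * weights[j] * columns[i] * columns[j]
    variance = bracket * n_cases * n_controls / n_total

    if variance == 0:  # monomorphic SNP or an empty group: no evidence either way
        return 0.0, 1.0
    chi2 = t * t / variance
    p = math.erfc(math.sqrt(chi2 / 2))  # two-sided P-value, 1 d.f.
    return chi2, p


# Toy table (hypothetical counts): chi2 is 20.0 for these numbers.
chi2, p = armitage_trend_test(cases=(10, 20, 30), controls=(30, 20, 10))
print(chi2, p)
```

With the additive weights used here the statistic equals N times the squared correlation between genotype score and case status, which is why it has a single degree of freedom.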

Figure 3: Quantile–quantile (Q–Q) plots of χ2-values for association.

(a) UK cases and UK controls (λ=1.03); (b) German cases and German controls (λ=1.09).
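As an illustration of how an inflation factor such as those quoted for the Q–Q plots can be obtained, the sketch below uses the widely used median-based estimator: the median observed χ2 divided by the median of a 1-d.f. χ2 distribution. This is a simpler, swapped-in estimator and not the one used in the study, which based λ on the 90% least significant SNPs. Function names are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom (about 0.455),
# computed as the square of the 75th percentile of a standard normal.
CHI2_1DF_MEDIAN = statistics.NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(chi2_values):
    """Median-based genomic control inflation factor (lambda)."""
    values = list(chi2_values)
    if not values:
        raise ValueError("need at least one test statistic")
    return statistics.median(values) / CHI2_1DF_MEDIAN


def expected_chi2_quantiles(n):
    """Expected ordered 1-d.f. chi-square values for n tests (x axis of a Q-Q plot)."""
    normal = statistics.NormalDist()
    return [normal.inv_cdf((1 + (i - 0.5) / n) / 2) ** 2 for i in range(1, n + 1)]


# Statistics that follow the null distribution exactly give lambda = 1;
# scaling every statistic by 1.09 gives lambda = 1.09.
null_stats = expected_chi2_quantiles(1001)
print(genomic_inflation(null_stats))
print(genomic_inflation([1.09 * x for x in null_stats]))
```

Plotting the sorted observed statistics against `expected_chi2_quantiles(len(observed))` gives the Q–Q plot itself; values of λ close to 1 indicate little inflation.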

Figure 4: Genome-wide P-values plotted against their respective chromosomal positions.

Shown are the genome-wide P-values (two-sided) obtained using the Cochran–Armitage trend test from 296,129 autosomal SNPs in 1,465 HL cases and 6,417 controls. The red horizontal line represents the genome-wide significance threshold level (P=5.0 × 10−8). The blue horizontal line represents the significance threshold at which SNPs were selected for genotyping in UK-replication 1 (P<5.0 × 10−5). Previously identified risk loci at REL, HLA, PVT1 and GATA3 are labelled.

Replication genotyping

To validate these findings, we genotyped the SNP showing the best evidence of association in each of the 21 regions in the UK-replication 1 series. In the combined analysis, rs3806624 and rs7745098 showed consistent evidence for an association with HL risk applying a fixed effects model to Armitage trend test statistics (combined P-value<5.0 × 10−7; Supplementary Table S2) and these SNPs were genotyped in the UK-replication 2 series. In a combined analysis of all data sets, the rs3806624 and rs7745098 associations were statistically significant on a genome-wide basis (that is, P<5.0 × 10−8 fixed effects model inverse-variance weighted meta-analysis test; Fig. 5; Supplementary Table S3).

Figure 5: Plot of the ORs of HL associated with rs3806624 and rs7745098.

Studies were weighted according to the inverse of the variance of the log of the OR calculated by unconditional logistic regression. Horizontal lines: 95% confidence intervals (95% CI). Box: OR point estimate; its area is proportional to the weight of the study. Diamond (and broken line): overall summary estimate under fixed effects model, with CI given by its width. Unbroken vertical line: null value (OR=1.0).
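The weighting scheme described in this legend can be sketched in a few lines of Python. The example assumes each study contributes a log OR and its standard error; the function name and the input values are hypothetical and do not reproduce the study estimates.

```python
import math


def fixed_effects_meta(log_ors, std_errs):
    """Inverse-variance weighted fixed effects meta-analysis of log odds ratios."""
    weights = [1.0 / se ** 2 for se in std_errs]  # weight = 1 / variance of log OR
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, log_ors)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    return {
        "beta": beta,
        "se": se,
        "or": math.exp(beta),
        "ci95": (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)),
        "p": math.erfc(abs(z) / math.sqrt(2)),  # two-sided normal P-value
        "rel_weights": [w / total for w in weights],  # drives box area in a forest plot
    }


# Hypothetical per-study estimates (log OR, standard error).
summary = fixed_effects_meta([0.25, 0.18, 0.22], [0.08, 0.06, 0.07])
print(summary["or"], summary["ci95"], summary["p"])
```

Because the weights depend only on the standard errors, larger studies with tighter confidence intervals dominate the summary estimate, which is what the box areas in the plot convey.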

rs3806624 localizes to 3p24.1 (27,764,623 bp; combined P=1.14 × 10−12 fixed effects model inverse-variance weighted meta-analysis test, odds ratio (OR)=1.26) 5′ to the EOMES (eomesodermin; MIM:604615) gene (Fig. 6) within a 110-kb region of linkage disequilibrium (LD). rs7745098 localizes to 6q23.3 (135,415,004 bp; combined P=3.42 × 10−9 fixed effects model inverse-variance weighted meta-analysis test; OR=1.21) and maps intergenic to the HBS1L (HBS1-like protein; MIM:612450) and MYB (V-MYB avian myeloblastosis viral oncogene homologue; MIM:189990) genes. To explore the 3p24.1 and 6q23.3 regions of association further, we imputed unobserved genotypes in GWAS cases and controls using 1000 Genomes data. This analysis did not reveal a significantly stronger association at 3p24.1 or 6q23.3 than that provided by rs3806624 and rs7745098, respectively (Fig. 6).

Figure 6: Regional plots of 3p24.1 and 6q23.3 associations with recombination rates and chromatin state segmentation tracks.

(a) 3p24.1 and (b) 6q23.3. Association results of both genotyped (triangles) and imputed (circles) SNPs in the GWAS samples, together with recombination rates. −log10 P-values (y axis) of the SNPs are shown according to their chromosomal positions (x axis). The top genotyped SNP in each combined analysis is shown as a large triangle and is labelled by its rsID. Colour intensity of each symbol reflects the extent of LD with the top genotyped SNP: white (r2=0) through to dark red (r2=1.0). Genetic recombination rates, estimated using HapMap Utah residents of Western and Northern European ancestry (CEU) samples, are shown with a light blue line. Physical positions are based on NCBI Build 37 of the human genome. Also shown are the relative positions of genes and transcripts mapping to the region of association. Genes have been redrawn to show the relative positions; therefore, maps are not to physical scale. The lower panel shows the exons and introns of the gene of interest, the observed SNP and the chromatin state segmentation track (ChromHMM).

Subtype analysis

HL is biologically heterogeneous2, something reflected in the histology of tumour subtypes. In addition, a hallmark of cHL epidemiology is the bimodal age-specific incidence, and it has been argued that the disease in young adults and that in older adults are aetiologically different; in particular, there is a lower prevalence of EBV in young cHL cases (that is, <40 years), with ~70% of mixed cellularity cHL being EBV-positive compared with only ~20% of nodular sclerosis cHL. We assessed by case-only analysis the relationship between rs3806624 and rs7745098 genotype and sex, age, histology and EBV status in cHL (Supplementary Table S4). Using case-only logistic regression, a strong association was seen between rs3806624 and age at diagnosis of cHL, with an increased prevalence of the G risk allele in cases diagnosed before age 40 (P=8.35 × 10−4 Wald test on the coefficient). None of the other associations were statistically significant.

Impact of SNPs on heritability of HL

There was no evidence of significant interaction between either rs3806624 or rs7745098 and the previously identified risk loci at 2p16.1 (rs1432295), 6p21.32 (rs6903608), 8q24.21 (rs2608053), 8q24.21 (rs2019960) and 10p14 (rs501764), an observation compatible with each locus having an independent effect on HL risk. To quantify the impact of the known loci on the heritability associated with common variation at the non-HLA-linked loci (3p24.1, 6q23.3, 2p16.1, 8q24.21, 8q24.21 and 10p14), using data from UK-GWAS and German-GWAS we computed the receiver operator characteristic associated with rs3806624, rs7745098, rs1432295, rs2608053, rs2019960 and rs501764 genotypes. The area under the curve corresponding to these variants was 0.63 translating into them collectively accounting for ~7% of the familial risk.

Deciphering association signals

The functional basis of many GWAS signals can be ascribed to sequence changes having an impact on gene expression and sequence conservation in non-coding regions has been shown to be a good predictor of cis-regulatory sequences. Using publicly accessible expression quantitative trait loci (eQTL) data on lymphoblastoid cell lines (LCLs) and T cells, we examined whether either rs3806624 or rs7745098 genotype is associated with differential expression. Although the associations identified did not show consistent statistically significant evidence of cis-acting regulatory effects in publicly accessible eQTL data, this does not preclude the possibility that the causal variants at these disease loci have subtle effects on expression, as the dynamic range of transcripts is small. Further, it is likely that only a cumulative long-term imbalance in expression of target genes will influence cHL development, and expression differences may be relevant only to a specific subpopulation of B cells, which may not be well modelled by EBV-transformed lymphocytes.

To examine whether any directly typed or imputed SNPs lie within or very close to a putative transcription factor-binding/enhancer element, we conducted a bioinformatic search of each region of association. rs3806624 maps within a predicted promoter with histone marks of regulatory elements associated with promoters shown in multiple cell types; rs7745098 is within a highly evolutionary conserved region with predicted weak enhancer activity (Fig. 6; Supplementary Table S5).


We have identified common variants on chromosome 3p24.1 and 6q23.3 that influence the risk of cHL. Some (HLA; 8q24.21 rs2608053; 5q31 rs20541) but not all of the previously identified risk loci for cHL have differential effects according to histology and EBV status5,7. Our study findings suggest that 3p24.1 and 6q23.3 have generic effects on the development of cHL akin to 10p14 variation7.

As rs3806624 localizes 5′ to the EOMES gene and there are no other known genes within this region of LD, there is a high likelihood that the functional basis of the 3p24.1 association is mediated through variation in this gene a priori. Although we cannot exclude the possibility that rs3806624 is simply acting as a marker for the 3p24.1 association, interrogation of the association signal through imputation failed to recover a statistically stronger association. Moreover, as rs3806624 maps within a p53 response element affecting p53 binding, with the risk allele G displaying weaker binding than the A allele9 it is likely that the SNP is causal.

EOMES is a member of the TBR1 subfamily of T-box genes that have a critical role in embryogenesis and pluripotency. Although EOMES seems important in extranodal natural killer/T-cell lymphoma10, evidence for a role of EOMES in cHL aetiology is currently lacking. A high proportion of the reactive infiltrate in cHL tumours is however composed of Thelper2 (Th2)-like cells with a Tregulatory phenotype and crosstalk between these cells and the HRS cells seems essential for tumour growth11. Notably, a key characteristic of HRS cells is the production of cytokines and chemokines driven by GATA3 and T-bet expression and other T-cell transcription factors12. EOMES is necessary for full effector differentiation of CD8+ T cells complementing the actions of T-bet and acting as a key regulatory gene in the development of cell-mediated immunity13; CD8+ T cells deficient in Eomes and T-bet fail to differentiate into functional killers required for viral defence14. Differential expression of Eomes and T-bet seems to facilitate the cooperative maintenance of the pool of antiviral CD8+ T cells during chronic viral infection15, something highly pertinent to cHL.

CD44-positive memory Th2 cells expressing interleukin-5 (Il5) have lower levels of Eomes than Th2 cells lacking Il5 expression16. We have previously shown that variation at GATA3 is a determinant of cHL risk7. Downregulation of Eomes is required for Il5 expression and Eomes suppresses Gata3 transcriptional activity by inhibiting Gata3 binding to the Il5 promoter16. These findings support a role for an extended pathway involving GATA3 and EOMES genes in the aetiology of cHL.

rs170934 that maps 314 kb telomeric to EOMES (28,079,085 bp) has previously been shown to be a risk factor for multiple sclerosis17. Intriguingly, familial clustering of HL and multiple sclerosis is seen suggesting a common aetiological basis to both diseases18. This coupled with the biology of EOMES strengthens the link between autoimmunity and HL and raises the possibility of a model by which T-cell activation has a role in both diseases through aberrant antigenic response.

rs7745098 maps intergenic to HBS1L and MYB, a region of the genome, which has previously been reported to have a substantial role in haematopoiesis with polymorphisms influencing platelet19, white cell count20 and haemoglobin levels21. Although evidence for a role for HBS1L in haematological malignancy is lacking, MYB, through its interaction with p300, controls the proliferation and differentiation of haematopoietic stem and progenitor cells.

Although we do not find evidence of cis-acting regulatory effects in publicly accessible eQTL data from analysis of LCLs or T cells, steady-state levels of RNA in lymphocytes at a single time point and in cycling mature cells may not adequately capture the impact of differential expression in tumorigenesis. In summary, we have identified new risk loci at 3p24.1 and 6q23.3 for cHL, which provide additional insight into the development of this B-cell malignancy.



Collection of samples and clinicopathological information from subjects was undertaken with informed consent and relevant ethical review board approval in accordance with the tenets of the Declaration of Helsinki (in the United Kingdom, the Royal Marsden Hospital NHS Trust and Multicentre Research Ethics Committee; in Germany, The Ethics committee of the University of Cologne).

Discovery subjects

The German-GWAS comprised 1,001 cHL cases (597 men; mean age at diagnosis=34.8 years, s.d.=12.3) ascertained by the German Hodgkin Study Group during 1998–2007. Cases were genotyped using Illumina Human OmniExpress-12 v1.0 arrays according to the manufacturer’s protocols (Illumina, San Diego, USA). For controls, we used genotype data on 1,226 individuals enrolled into the Heinz Nixdorf Recall study genotyped using Illumina OmniExpress-12 v1.0.

The UK-GWAS has been previously reported7; briefly, 622 cHL cases (63 men, mean age at diagnosis=24.4 years, s.d.=9.6) were genotyped using Illumina 660w-Quad BeadChips. Genotype frequencies were compared with publicly accessible genotype data generated by the UK Wellcome Trust Case–Control Consortium 2 (ref. 22) study of 2,930 individuals from the 1958 British Birth Cohort (58C)23 and 2,737 individuals from the UK Blood Service collections that had been genotyped using the Illumina Human1.2M-Duo Custom_v1 Array. There was no evidence of systematic genotyping bias between these two series, which were combined to provide genotype data for 5,667 controls7.

Quality control of GWAS data sets

DNA samples with GenCall scores <0.25 at any locus were considered ‘no calls’. A SNP was deemed to have failed if <95% of DNA samples generated a genotype at the locus. A sample was deemed to have failed if <95% of SNPs were successfully genotyped. The same quality control metrics on the German-GWAS data were applied as in the UK-GWAS7. We computed identity-by-state (IBS) probabilities for all pairs (cases and controls) to search for duplicates and closely related individuals amongst samples (defined as IBS≥0.80, thereby excluding first-degree relatives). For all identical pairs the sample having the highest call rate was retained. To identify individuals who might have non-Western European ancestry, we merged our case and control data with phase II HapMap samples (60 Western European (CEU), 60 Nigerian (YRI), 90 Japanese (JPT) and 90 Han Chinese (CHB)). For each pair of individuals, we calculated genome-wide IBS distances on markers shared between HapMap and our SNP panel, and used these as dissimilarity measures on which to perform principal component analysis.
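The pairwise identity-by-state screen described above can be sketched as follows. This is a minimal illustration that assumes genotypes are coded as 0, 1 or 2 copies of a reference allele with None for a no call; the function names and sample identifiers are hypothetical, and the calculation is a plain proportion of shared alleles rather than the exact output of any particular software package.

```python
from itertools import combinations


def ibs_proportion(g1, g2):
    """Proportion of alleles shared identical-by-state between two samples.

    Each SNP contributes 2 - |g1 - g2| shared alleles out of 2; SNPs with a
    missing call in either sample are skipped. Returns None if no SNP is shared.
    """
    shared = 0
    typed = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)
        typed += 1
    return shared / (2 * typed) if typed else None


def flag_close_pairs(genotypes, threshold=0.80):
    """Return (id1, id2, ibs) for every pair at or above the IBS threshold."""
    flagged = []
    for (id1, g1), (id2, g2) in combinations(genotypes.items(), 2):
        ibs = ibs_proportion(g1, g2)
        if ibs is not None and ibs >= threshold:
            flagged.append((id1, id2, ibs))
    return flagged


# Hypothetical samples: "s1" and "s2" are duplicates, "s3" is unrelated to both.
samples = {"s1": [0, 1, 2, 1], "s2": [0, 1, 2, 1], "s3": [2, 1, 0, 1]}
print(flag_close_pairs(samples))
```

The complements of these proportions (1 − IBS) can serve as the pairwise dissimilarities on which a principal component analysis is run after merging with reference samples.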

We filtered out SNPs having a minor allele frequency <1% or a call rate <95% in cases or controls. We also excluded SNPs showing departure from Hardy–Weinberg equilibrium at P<1 × 10−7 (χ2-test with 1 degree of freedom or Fisher’s exact test if a cell count was <5). After stringent quality control filtering (Fig. 1), we analysed 296,129 autosomal SNPs, common to all cases and controls, in 1,465 cHL cases and 6,417 controls. Cluster plots were manually inspected for all SNPs considered for replication.
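The SNP-level filters above can be illustrated with a short Python sketch operating on genotype counts for one SNP in one sample group. It implements only the 1-d.f. χ2 test for Hardy–Weinberg equilibrium and omits the Fisher’s exact alternative used for small cell counts; the function name, argument names and default thresholds are assumptions chosen to mirror the text.

```python
import math


def snp_qc(n_aa, n_ab, n_bb, n_missing,
           maf_min=0.01, call_min=0.95, hwe_p_min=1e-7):
    """Call rate, minor allele frequency and HWE chi-square test for one SNP."""
    typed = n_aa + n_ab + n_bb
    if typed == 0:
        return {"call_rate": 0.0, "maf": None, "hwe_p": None, "passes": False}

    call_rate = typed / (typed + n_missing)
    freq_b = (n_ab + 2 * n_bb) / (2 * typed)
    maf = min(freq_b, 1 - freq_b)

    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = (typed * (1 - freq_b) ** 2,
                2 * typed * freq_b * (1 - freq_b),
                typed * freq_b ** 2)
    chi2 = sum((obs - exp) ** 2 / exp
               for obs, exp in zip((n_aa, n_ab, n_bb), expected) if exp > 0)
    hwe_p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 d.f.

    passes = maf >= maf_min and call_rate >= call_min and hwe_p >= hwe_p_min
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p, "passes": passes}


# Hypothetical counts: a well-behaved SNP and one with no heterozygotes at all.
print(snp_qc(250, 500, 250, 0))
print(snp_qc(500, 0, 500, 0))
```

In a case–control setting the function would be applied to cases and controls separately, and a SNP retained only if it passes in both groups.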

Replication series and genotyping

UK-replication 1 comprised 1,071 UK patients diagnosed with cHL of either mixed cellularity HL (269 men; mean age at diagnosis 46.9 years, range 15–86) or nodular sclerosis HL (306 men; mean age at diagnosis 31.4 years, range 15–49). The patients were ascertained through the National Study of Hodgkin’s Lymphoma Genetics (http://www.public.ukcrn.org.uk), an ongoing study of HL established in 2008. Controls were healthy individuals recruited through the National Study of Colorectal Cancer Genetics (n=1,186; 367 men; mean age 58.6)24 and the Royal Marsden Hospital Trust/Institute of Cancer Research Family History and DNA Registry (n=102; 17 men; mean age 59.3) with no personal history of malignancy. Both cases and controls were UK residents and had self-reported European ancestry.

UK-replication 2 comprised 953 cHL cases (defined according to ICD10 C81.0-3; 290 men, mean age of diagnosis (AOD)=38 years, s.d.=16 years) ascertained from the Scotland and Newcastle Epidemiological Study of Hodgkin Disease (SNEHD), the Young Adult Hodgkin Case–Control Study (YHCCS) and the Epidemiology and Cancer Statistics Group Lymphoma Case–Control Study (ELCCS; http://www.elccs.info). Full details of the SNEHD, YHCCS and ELCCS studies were provided previously. Briefly, SNEHD involved ascertainment of incident cases from Scotland and Northern England during 1993–1997. YHCCS was based on newly diagnosed cases aged 16–24 years from Northern England during 1991–1995. ELCCS comprised cases residing in the north of England aged 16–69 years, with newly diagnosed, non-human immunodeficiency virus-related HL, during 1998–2003. UK population controls were obtained from SNEHD, YHCCS and ELCCS (n=565, 326 men, mean age 41 years, s.d.=17 years). The EBV status of cHL tumours was determined by immunohistochemical staining for EBV latent membrane antigen-1 and/or EBV-encoded RNA in situ hybridization using sections of paraffin-embedded material.

Genotyping was performed using competitive allele-specific PCR KASP chemistry (LGC, Hertfordshire, UK). Primers used are listed in Supplementary Table S6. To ensure quality of genotyping in all assays, at least two negative controls and 6–7% duplicates (showing a concordance >99.9%) were genotyped. Call rates were >95% per 384-well plate for each SNP; cluster plots were visually examined by two researchers.

Statistical and bioinformatic analysis

Main analyses were undertaken using R (v2.10.1; http://www.r-project.org), Stata v.10 (College Station, Texas, USA) and PLINK (v1.07)25 software. The association between each SNP and risk was assessed by the Cochran–Armitage trend test. The adequacy of the case–control matching and possibility of differential genotyping of cases and controls were formally evaluated using quantile–quantile plots of test statistics. The inflation factor λ was based on the 90% least significant SNPs8. ORs and associated 95% confidence intervals were calculated by unconditional logistic regression. Meta-analysis was conducted using standard methods under a fixed effects model26,27. Cochran’s Q statistic to test for heterogeneity and the I2 statistic to quantify the proportion of the total variation due to heterogeneity were calculated28. I2-values≥75% are considered characteristic of large heterogeneity28. Associations by age, sex, histology and EBV status were examined by case-only analyses.
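The heterogeneity statistics named above can be written out in a few lines. The sketch below assumes per-study log ORs and standard errors as inputs; the function name and example values are hypothetical. It returns Q, its degrees of freedom and I2, and leaves the P-value for Q (a χ2 tail with k−1 degrees of freedom) to a statistics library.

```python
def heterogeneity(log_ors, std_errs):
    """Cochran's Q, its degrees of freedom and I-squared (as a percentage)."""
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)

    # Q: weighted sum of squared deviations from the fixed effects estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1

    # I-squared: share of total variation beyond that expected by chance,
    # truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Hypothetical example: two discordant studies give Q = 8 and I-squared = 87.5%.
print(heterogeneity([0.1, 0.5], [0.1, 0.1]))
```

By the criterion quoted in the text, an I2 of 87.5% would count as large heterogeneity, and a fixed effects summary would then need to be interpreted with caution.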

Assuming a sibling relative risk of 3.93 (ref. 29) and a prevalence of 2.8 × 10−5 (Surveillance Epidemiology and End Results data), we made use of receiver operator characteristic curve analysis30 to estimate the proportion of the genetic variance on the liability scale attributable to variation at rs3806624, rs7745098, rs1432295, rs6903608, rs2608053, rs2019960 and rs501764.

Prediction of untyped SNPs was carried out using IMPUTEv2 (ref. 31) based on the 1000 Genomes phase 1 integrated variant set (b37) from March 2012. Imputed data were analysed using SNPTEST v2 to account for uncertainties in SNP prediction32. LD metrics were calculated in PLINK (ref. 25) using 1000 Genomes data and plotted using SNAP33. LD blocks were defined on the basis of HapMap recombination rate (cM/Mb) as defined using the Oxford recombination hotspots34 and on the basis of distribution of confidence intervals defined by Gabriel et al.35

To explore the epigenetic profile of association signals, we made use of chromatin state segmentation in LCL data generated by the ENCODE Project36. The states were inferred from ENCODE Histone Modification data (H4K20me1, H3K9ac, H3K4me3, H3K4me2, H3K4me1, H3K36me3, H3K27me3, H3K27ac and CTCF) binarized using a multivariate Hidden Markov Model. We made use of RegulomeDB37 and HaploReg38 to examine whether any of the SNPs or their proxies (that is, r2>0.8 in the 1000 Genomes CEU reference panel) annotate putative transcription factor-binding/enhancer elements.
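The r2 criterion used to define proxies can be made concrete with a small sketch. It assumes phased haplotypes from a reference panel are available as pairs of alleles coded 0 or 1 at the two SNPs; the function names and the example haplotypes are hypothetical, and the calculation follows the textbook definition r2=D2/(pA(1−pA)pB(1−pB)) rather than the output of a specific tool.

```python
def ld_r2(haplotypes):
    """Squared correlation (r2) between two biallelic SNPs from phased haplotypes.

    haplotypes: iterable of (a, b) pairs, each allele coded 0 or 1.
    Returns None if either SNP is monomorphic in the sample.
    """
    haps = list(haplotypes)
    n = len(haps)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haps) / n
    p_b = sum(b for _, b in haps) / n
    p_ab = sum(1 for a, b in haps if a == 1 and b == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


def is_proxy(haplotypes, threshold=0.8):
    """True if the second SNP is a proxy for the first (r2 above the threshold)."""
    r2 = ld_r2(haplotypes)
    return r2 is not None and r2 > threshold


# Hypothetical haplotypes: perfectly correlated SNPs, then independent SNPs.
print(ld_r2([(0, 0), (1, 1), (0, 0), (1, 1)]))
print(ld_r2([(0, 0), (0, 1), (1, 0), (1, 1)]))
```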

Relationship between SNP genotype and mRNA expression

To examine for a relationship between SNP genotype and expression, we made use of publicly available expression data generated on LCLs and T cells from HapMap3, Geneva and the MuTHER pilot data using Sentrix Human-6 Expression BeadChips (Illumina)39,40,41.

Additional information

How to cite this article: Frampton, M. et al. Variation at 3p24.1 and 6q23.3 influences the risk of Hodgkin’s lymphoma. Nat. Commun. 4:2549 doi: 10.1038/ncomms3549 (2013).


  1. The biology of Hodgkin’s lymphoma. Nat. Rev. Cancer 9, 15–27 (2009).

  2. Infectious aetiology of Hodgkin and non-Hodgkin lymphomas: a review of the epidemiological evidence. J. Intern. Med. 264, 537–548 (2008).

  3. Concordance for Hodgkin’s disease in identical twins suggesting genetic susceptibility to the young-adult form of the disease. N. Engl. J. Med. 332, 413–418 (1995).

  4. A genome-wide meta-analysis of nodular sclerosing Hodgkin lymphoma identifies risk loci at 6p21.32. Blood 119, 469–475 (2011).

  5. Genome-wide association study of classical Hodgkin lymphoma and Epstein-Barr virus status-defined subgroups. J. Natl Cancer Inst. 104, 240–253 (2012).

  6. Multiple Hodgkin lymphoma-associated loci within the HLA region at chromosome 6p21.3. Blood 118, 670–674 (2012).

  7. A genome-wide association study of Hodgkin's lymphoma identifies new susceptibility loci at 2p16.1 (REL), 8q24.21 and 10p14 (GATA3). Nat. Genet. 42, 1126–1130 (2010).

  8. Population structure, differential bias and genomic control in a large-scale, case-control association study. Nat. Genet. 37, 1243–1246 (2005).

  9. Human single-nucleotide polymorphisms alter p53 sequence-specific binding at gene regulatory elements. Nucleic Acids Res. 39, 178–189 (2010).

  10. Transcription factors engaged in development of NK cells are commonly expressed in nasal NK/T-cell lymphomas. Hum. Pathol. 42, 1319–1328 (2011).

  11. The microenvironment in classical Hodgkin lymphoma: an actively shaped and essential tumor component. Semin. Cancer Biol. doi:10.1016/j.semcancer.2013.07.002 (2013).

  12. Expression of the T-cell transcription factors, GATA-3 and T-bet, in the neoplastic cells of Hodgkin lymphomas. Am. J. Pathol. 166, 127–134 (2005).

  13. Control of effector CD8+ T cell function by the transcription factor Eomesodermin. Science 302, 1041–1043 (2003).

  14. Anomalous type 17 response to viral infection by CD8+ T cells lacking T-bet and eomesodermin. Science 321, 408–411 (2008).

  15. Progenitor and terminal subsets of CD8+ T cells cooperate to contain chronic viral infection. Science 338, 1220–1225 (2012).

  16. Eomesodermin controls interleukin-5 production in memory T helper 2 cells through inhibition of activity of the transcription factor GATA3. Immunity 35, 733–745 (2011).

  17. Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis. Nature 476, 214–219 (2011).

  18. Re: familial clustering of Hodgkin lymphoma and multiple sclerosis. J. Natl Cancer Inst. 97, 543–544; author reply 544–545 (2005).

  19. New gene functions in megakaryopoiesis and platelet formation. Nature 480, 201–208 (2011).

  20. Identification of nine novel loci associated with white blood cell subtypes in a Japanese population. PLoS Genet. 7, e1002067 (2011).

  21. Seventy-five genetic loci influencing the human red blood cell. Nature 492, 369–375 (2012).

  22. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007).

  23. Cohort profile: 1958 British birth cohort (National Child Development Study). Int. J. Epidemiol. 35, 34–41 (2006).

  24. National study of colorectal cancer genetics. Br. J. Cancer 97, 1305–1309 (2007).

  25. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  26. Statistical aspects of the analysis of data from retrospective studies of disease. J. Natl Cancer Inst. 22, 719–748 (1959).

  27. Meta-analysis in clinical trials. Control Clin. Trials 7, 177–188 (1986).

  28. Quantifying heterogeneity in a meta-analysis. Stat. Med. 21, 1539–1558 (2002).

  29. Highly increased familial risks for specific lymphoma subtypes. Br. J. Haematol. 146, 91–94 (2009).

  30. The genetic interpretation of area under the ROC curve in genomic profiling. PLoS Genet. 6, e1000864 (2010).

  31. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).

  32. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet. 39, 906–913 (2007).

  33. SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics 24, 2938–2939 (2008).

  34. A fine-scale map of recombination rates and hotspots across the human genome. Science 310, 321–324 (2005).

  35. The structure of haplotype blocks in the human genome. Science 296, 2225–2229 (2002).

  36. Discovery and characterization of chromatin states for systematic annotation of the human genome. Nat. Biotechnol. 28, 817–825 (2010).

  37. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 22, 1790–1797 (2012).

  38. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 40, D930–D934 (2012).

  39. Common regulatory variation impacts gene expression in a cell type-dependent manner. Science 325, 1246–1250 (2009).

  40. The architecture of gene regulatory variation across multiple human tissues: the MuTHER study. PLoS Genet. 7, e1002003 (2011).

  41. Patterns of cis regulatory variation in diverse human populations. PLoS Genet. 8, e1002639 (2012).

  42. A genome-wide association study of Hodgkin's lymphoma identifies new susceptibility loci at 2p16.1 (REL), 8q24.21 and 10p14 (GATA3). Nat. Genet. 42, 1126–1130 (2010).



In the United Kingdom, Leukaemia and Lymphoma Research (LLR; 10021) provided principal funding for the study. Support from Cancer Research UK (C1298/A8362 supported by the Bobby Moore Fund) and the Lymphoma Research Trust is also acknowledged. This study made use of control genotyping data generated by the Wellcome Trust Case–Control Consortium. We acknowledge use of genotype data from the British 1958 Birth Cohort DNA collection, which was funded by the Medical Research Council Grant G0000934 and the Wellcome Trust Grant 068545/Z/02. A full list of the investigators who contributed to the generation of the data is available from http://www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under awards 076113 and 085475. At the Institute of Cancer Research, sample and data acquisition was supported by Breakthrough Breast Cancer and the European Union, and we acknowledge NHS funding to the NIHR Biomedical Research Centre. At the MRC University of Glasgow Centre for Virus Research funding is provided by Leukaemia and Lymphoma Research (12022), the Scotland and Newcastle Epidemiological Study of Hodgkin Disease (SNEHD) was funded by the Kay Kendall Leukaemia Fund and the Young Adult Hodgkin Case–Control Study (YHCCS) was funded by the LLR. The German-GWAS made use of genotyping data from the population-based Heinz Nixdorf Recall (HNR) study. The HNR study is supported by the Heinz Nixdorf Foundation (Germany). In addition, the study is funded by the German Ministry of Education and Science and the German Research Council (DFG; Project SI 236/8-1, SI236/9-1, ER 155/6-1 and ER 155/6-2). Funding was provided to LE by the Medical Faculty of the University Hospital of Essen (IFORES). The genotyping of the Illumina HumanOmni-1 Quad BeadChips of the HNR subjects was financed by the German Centre for Neurodegenerative Disorders (DZNE), Bonn. 
For help with UK sample collection, we thank Hayley Evans, James Griffin, Joanne Micic, Susan Blackmore, Beverley Smith, Deborah Hogben, Alison Butlin, Jill Wood, Margot Pelerin, Alison Hart, Katarzyna Tomczyk and Sarah Chilcott-Burns. Finally, we are grateful to all the patients and individuals for their participation, and we thank the clinicians, investigators, other hospital staff and study staff who contributed to the blood sample and data collection for these studies.

Author information


  1. Division of Genetics and Epidemiology, Institute of Cancer Research, Sutton, Surrey SM2 5NG, UK

    • Matthew Frampton
    • Peter Broderick
    • Jayaram Vijayakrishnan
    • Rosie Cooke
    • Victor Enciso-Mora
    • Amy Lloyd
    • Amy Holroyd
    • Anthony J. Swerdlow
    • Richard S. Houlston
  2. Division of Molecular Genetic Epidemiology, German Cancer Research Centre, Heidelberg 69120, Germany

    • Miguel Inacio da Silva Filho
    • Hauke Thomsen
    • Asta Försti
    • Kari Hemminki
  3. Center for Primary Health Care Research, Lund University, 205 02 Malmö, Sweden

    • Asta Försti
    • Kari Hemminki
  4. Institute of Human Genetics, University of Bonn, Bonn D-53127, Germany

    • Per Hoffmann
    • Markus M. Nöthen
  5. Genomics Research Group, Medical Genetics, University Hospital Basel, Basel 4031, Switzerland

    • Per Hoffmann
  6. German Center for Neurodegenerative Diseases (DZNE), Bonn 53175, Germany

    • Markus M. Nöthen
  7. Institute for Medical Informatics, Biometry and Epidemiology, University Hospital Essen, University of Duisburg–Essen, Essen 45122, Germany

    • Lewin Eisele
    • Karl-Heinz Jöckel
  8. Department of Internal Medicine, University Hospital of Cologne, Cologne 50937, Germany

    • Sabine Ponader
    • Elke Pogge von Strandmann
    • Andreas Engert
  9. Department of Health Sciences, Epidemiology and Cancer Statistics Group, University of York, York YO10 5DD, UK

    • Tracy Lightfoot
    • Eve Roman
  10. MRC University of Glasgow Centre for Virus Research, Glasgow G61 1QH, UK

    • Annette Lake
    • Dorothy Montgomery
    • Ruth F. Jarrett
  11. Division of Breast Cancer Research, Institute of Cancer Research, Sutton, Surrey SM2 5NG, UK

    • Anthony J. Swerdlow



R.S.H. and K.H. designed the study and provided overall project management. R.S.H. drafted the manuscript with contributions from P.B. In the United Kingdom, R.S.H. was responsible for overall project management, development and database development, and oversaw laboratory analyses; A.J.S. provided samples for UK-GWAS; R.C. provided data on samples for UK-GWAS; M.F. and V.E.-M. performed bioinformatic and statistical analyses; P.B. performed sample and laboratory coordination; and J.V., A.L. and A.H. performed validation genotyping. For UK-Replication 2, A.L. and D.M. prepared samples, and T.L., E.R., D.M. and R.F.J. provided samples and data. In Germany, A.F., H.T. and M.I.d.S.F. performed bioinformatic and statistical analyses; P.H. and M.M.N. were responsible for German-GWAS analysis; K.-H.J. and L.E. provided the German control samples; and S.P., E.P.v.S. and A.E. were responsible for German HL patients. All authors contributed to the final paper.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Richard S. Houlston.

Supplementary information

PDF files

  1. Supplementary Information

    Supplementary Tables S1-S6
