The clustering of different types of B-cell malignancies in families raises the possibility of shared aetiology. To examine this, we performed cross-trait linkage disequilibrium (LD)-score regression of multiple myeloma (MM) and chronic lymphocytic leukaemia (CLL) genome-wide association study (GWAS) data sets, totalling 11,734 cases and 29,468 controls. A significant genetic correlation between these two B-cell malignancies was shown (Rg = 0.4, P = 0.0046). Furthermore, four of the 45 known CLL risk loci were shown to associate with MM risk and five of the 23 known MM risk loci associate with CLL risk. By integrating eQTL, Hi-C and ChIP-seq data, we show that these pleiotropic risk loci are enriched for B-cell regulatory elements and implicate B-cell developmental genes. These data identify shared biological pathways influencing the development of CLL and MM, and further our understanding of the aetiological basis of these B-cell malignancies.
Chronic lymphocytic leukaemia (CLL) and multiple myeloma (MM) are both B-cell malignancies, which arise from the clonal expansion of progenitor cells at different stages of B-cell maturity1,2,3. Evidence for inherited predisposition to CLL and MM comes from the six- and two-fold increased risk of the respective diseases seen in relatives of patients4.
Recent genome-wide association studies (GWAS) have transformed our understanding of genetic susceptibility to the B-cell malignancies, identifying 45 CLL5,6,7,8 and 23 MM risk loci9,10,11,12,13. Furthermore, statistical modelling of GWAS data indicates that common genetic variation is likely to account for 34% of CLL and 15% of MM heritability6,14. Epidemiological observations on familial cancer risks across the different B-cell malignancies suggest an element of shared inherited susceptibility, especially between CLL and MM4.
Linkage disequilibrium (LD) score regression is a method which exploits the feature of a test statistic for a given single nucleotide polymorphism (SNP), whereby that test statistic will incorporate the effects of correlated SNPs15. Conventional LD score regression regresses trait χ2 statistics against the LD score for a given SNP, with the coefficient of the regression line providing an estimate of trait heritability. This method can be modified by instead regressing the product of SNP Z-scores from two traits against the SNP LD score, with the slope providing an estimate of genetic covariance between the two traits16. This method can be applied to summary statistics, is not biased by sample overlap and does not require multiple traits to be measured for each individual.
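The regression described above can be illustrated with a minimal sketch. It assumes the standard cross-trait model, in which the expected product of Z-scores for a SNP with LD score l is sqrt(N1 × N2) × ρg × l / M plus an intercept, where M is the number of SNPs and ρg the genetic covariance. The function names are hypothetical, and the fit is an unweighted least-squares regression; the ldsc software itself uses a weighted regression with block-jackknife standard errors, which this sketch does not reproduce.

```python
from math import sqrt


def ols(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def genetic_covariance(z1, z2, ld_scores, n1, n2, n_snps):
    """Unweighted cross-trait LD score regression (illustrative only).

    Regresses the per-SNP product z1 * z2 on the LD score and rescales
    the slope: rho_g = slope * M / sqrt(n1 * n2).
    Returns (rho_g, intercept).
    """
    products = [a * b for a, b in zip(z1, z2)]
    slope, intercept = ols(ld_scores, products)
    return slope * n_snps / sqrt(n1 * n2), intercept


def genetic_correlation(rho_g, h2_trait1, h2_trait2):
    """Normalise genetic covariance by the two SNP heritabilities."""
    return rho_g / sqrt(h2_trait1 * h2_trait2)
```

A non-zero intercept in this regression absorbs the contribution of sample overlap, which is why the slope, and hence the covariance estimate, is not biased by it.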
By analysis of GWAS data for MM and CLL and applying cross-trait LD score regression, we have been able to demonstrate a positive genetic correlation between CLL and MM. We find evidence of shared genetic susceptibility at 10 known risk loci and by integrating promoter capture Hi-C (PCHi-C) data, ChIP-seq and expression data we provide insight into the shared biological basis of CLL and MM.
GWAS data sets
The data from six previously reported MM GWAS9,10,11,12,13 are summarized in Supplementary Table 1. All these studies were based on individuals of European ancestry and comprised: Oncoarray-GWAS (878 cases, 7054 controls), UK-GWAS (2282 cases, 5197 controls), Swedish-GWAS (1714 cases, 10,391 controls), German-GWAS (1508 cases, 2107 controls), Netherlands-GWAS (555 cases, 2669 controls) and US-GWAS (780 cases, 1857 controls).
The data from three previously reported CLL GWAS8,9,10,11,12,13 are summarized in Supplementary Table 2. All these studies were based on individuals of European ancestry and comprised: CLL UK1 (505 cases and 2698 controls), CLL UK2 (1236 cases and 2501 controls) and CLL US (2174 cases and 2682 controls).
Collection of patient samples and associated clinico-pathological information was undertaken with written informed consent and relevant ethical review board approval at respective study centres in accordance with the tenets of the Declaration of Helsinki.
Specifically, approval was obtained for the Myeloma-IX trial from the Medical Research Council (MRC) Leukaemia Data Monitoring and Ethics committee (MREC 02/8/95, ISRCTN68454111), for the Myeloma-XI trial from the Oxfordshire Research Ethics Committee (MREC 17/09/09, ISRCTN49407852), for HOVON65/GMMG-HD4 (ISRCTN 644552890; METC 13/01/2015), HOVON87/NMSG18 (EudraCTnr 2007-004007-34, METC 20/11/2008) and HOVON95/EMN02 (EudraCTnr 2009-017903-28, METC 04/11/10), and from the University of Heidelberg Ethical Commission (229/2003, S-337/2009, AFmu-119/2010), University of Arkansas for Medical Sciences Institutional Review Board (IRB 202077), Lund University Ethical Review Board (2013/54), the Norwegian REK 2014/97, and the Danish Ethical Review Board (no: H-16032570).
Specifically, the centres for UK-CLL1 and UK-CLL2 are: UK Multi-Research Ethics Committee (MREC 99/1/082); GEC: Mayo Clinic Institutional Review Board, Duke University Institutional Review Board, University of Utah, University of Texas MD Anderson Cancer Center Institutional Review Board, National Cancer Institute, ATBC: NCI Special Studies Institutional Review Board, BCCA: UBC BC Cancer Agency Research Ethics Board, CPS-II: American Cancer Society, ENGELA: IRB00003888—Comite d’ Evaluation Ethique de l’Inserm IRB #1, EPIC: Imperial College London, EpiLymph: International Agency for Research on Cancer, HPFS: Harvard School of Public Health (HSPH) Institutional Review Board, Iowa-Mayo SPORE: University of Iowa Institutional Review Board, Italian GxE: Comitato Etico Azienda Ospedaliero Universitaria di Cagliari, Mayo Clinic Case-Control: Mayo Clinic Institutional Review Board, MCCS: Cancer Council Victoria’s Human Research Ethics Committee, MSKCC: Memorial Sloan-Kettering Cancer Center Institutional Review Board, NCI-SEER (NCI Special Studies Institutional Review Board), NHS: Partners Human Research Committee, Brigham and Women’s Hospital, NSW: NSW Cancer Council Ethics Committee, NYU-WHS: New York University School of Medicine Institutional Review Board, PLCO: (NCI Special Studies Institutional Review Board), SCALE: Scientific Ethics Committee for the Capital Region of Denmark, SCALE: Regional Ethical Review Board in Stockholm (Section 4) IRB#5, Utah: University of Utah Institutional Review Board, UCSF and UCSF2: University of California San Francisco Committee on Human Research, Women’s Health Initiative (WHI): Fred Hutchinson Cancer Research Center and Yale: Human Investigation Committee, Yale University School of Medicine. Informed consent was obtained from all participants.
The diagnosis of MM (ICD-10 C90.0) in all cases was established in accordance with World Health Organization guidelines. All samples from patients for genotyping were obtained before treatment or at presentation. The diagnosis of CLL (ICD-10-CM C91.10, ICD-O M9823/3 and 9670/3) was established in accordance with the International Workshop on Chronic Lymphocytic Leukaemia guidelines.
Standard quality-control measures were applied to the GWAS17. Specifically, individuals with low SNP call rate (<95%) as well as individuals evaluated to be of non-European ancestry (using the HapMap version 2 CEU, JPT/CHB and YRI populations as a reference) were excluded. For apparent first-degree relative pairs, we excluded the control from a case-control pair; otherwise, we excluded the individual with the lower call rate. SNPs with a call rate <95% were excluded as were those with a MAF <0.01 or displaying significant deviation from Hardy–Weinberg equilibrium (P < 10−5). GWAS data were imputed to >10 million SNPs using IMPUTE2 v4 (for CLL) and IMPUTE2 v2.3 (for MM) software in conjunction with a merged reference panel consisting of data from 1000 Genomes Project18 (phase 1 integrated release 3, March 2012) and UK10K19. Genotypes were aligned to the positive strand in both imputation and genotyping. We imposed predefined thresholds for imputation quality to retain potential risk variants with MAF >0.01 for validation. Poorly imputed SNPs with an information measure <0.80 were excluded. Tests of association between imputed SNPs and MM were performed under an additive model in SNPTEST v2.520. The adequacy of the case-control matching and the possibility of differential genotyping of cases and controls were evaluated using a Q–Q plot of test statistics. The inflation factor λ was estimated from the 90% least-significant SNPs, and λ1000 was also assessed. Details of SNP QC are provided in Supplementary Tables 3 and 4. Four principal components, generated using common SNPs, were included to limit the effects of cryptic population stratification in the US-CLL data set. Eigenvectors for the GWAS data sets were inferred using smartpca (part of EIGENSOFT) by merging cases and controls with phase II HapMap samples.
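As an illustration of the λ1000 assessment mentioned above, the following sketch rescales an observed inflation factor to a study of 1000 cases and 1000 controls, using the commonly applied formula λ1000 = 1 + (λ − 1) × (1/ncases + 1/ncontrols) / (1/1000 + 1/1000). The function name and the example λ value are hypothetical, and the estimation of λ itself from the 90% least-significant SNPs is not shown.

```python
def lambda_1000(lambda_obs, n_cases, n_controls):
    """Rescale an inflation factor to 1000 cases and 1000 controls."""
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lambda_obs - 1.0) * scale


# Hypothetical lambda of 1.05 with the UK-GWAS sample sizes from the text.
rescaled = lambda_1000(1.05, n_cases=2282, n_controls=5197)
```

Because inflation grows with sample size, the rescaled value allows studies of different size to be compared on a common footing.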
Meta-analyses were performed under a fixed-effects inverse-variance model using META v1.621. Cochran's Q-statistic, to test for heterogeneity, and the I2 statistic, to quantify the proportion of the total variation due to heterogeneity, were calculated.
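The calculation can be sketched as follows for a single SNP. This is a generic illustration of fixed-effects inverse-variance weighting with Cochran's Q and I2, not the META implementation; the function name and inputs (per-study log odds ratios and standard errors) are assumptions made for the example.

```python
from math import sqrt


def fixed_effects_meta(betas, ses):
    """Inverse-variance fixed-effects meta-analysis for one SNP.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors
    Returns (pooled beta, pooled SE, Cochran's Q, I2 as a percentage).
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = sqrt(1.0 / w_sum)
    # Q is the weighted sum of squared deviations from the pooled effect.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I2 is the share of variation beyond that expected by chance.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return beta, se, q, i2
```

Under the null hypothesis of no heterogeneity, Q follows a χ2 distribution with one fewer degrees of freedom than the number of studies.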
LD score regression
To investigate genetic correlation between MM and CLL, we implemented cross-trait LD score regression by Bulik-Sullivan et al.16. Using summary statistics from the GWAS meta-analysis we implemented filters as recommended by the authors16. Specifically, we filtered SNPs to INFO >0.9 and MAF >0.01, harmonized to HapMap3 SNPs with 1000 Genomes EUR MAF >0.05, removed indels and structural variants, removed strand-ambiguous SNPs and removed SNPs where alleles did not match those in 1000 Genomes. This was performed by running the munge_sumstats.py script included with ldsc. We ran ldsc.py, part of the ldsc package, excluding the HLA region. We report heritability estimates on the observed scale. There is no distinction between observed and liability scale genetic correlation for case/control traits16.
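The per-SNP filters listed above can be illustrated with a short sketch. It is not the ldsc code: the record fields (info, maf, a1, a2) are hypothetical names chosen for the example, and the harmonization to HapMap3 SNPs and the allele matching against 1000 Genomes are not shown.

```python
# Allele pairs that read the same on both strands and so cannot be
# oriented unambiguously between data sets.
STRAND_AMBIGUOUS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def keep_snp(snp, min_info=0.9, min_maf=0.01):
    """Return True if a summary-statistic record passes the filters."""
    a1, a2 = snp["a1"].upper(), snp["a2"].upper()
    if snp["info"] <= min_info or snp["maf"] <= min_maf:
        return False
    # Indels and structural variants have multi-base or symbolic alleles.
    if len(a1) != 1 or len(a2) != 1 or a1 not in "ACGT" or a2 not in "ACGT":
        return False
    if (a1, a2) in STRAND_AMBIGUOUS:
        return False
    return True


records = [
    {"info": 0.98, "maf": 0.20, "a1": "A", "a2": "G"},   # retained
    {"info": 0.98, "maf": 0.20, "a1": "A", "a2": "T"},   # strand-ambiguous
    {"info": 0.85, "maf": 0.20, "a1": "C", "a2": "T"},   # low INFO
    {"info": 0.98, "maf": 0.20, "a1": "AT", "a2": "A"},  # indel
]
filtered = [r for r in records if keep_snp(r)]
```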
Shared risk loci
To identify pleiotropic risk loci, that is genetic loci that influence two traits, we identified SNPs previously reported to be associated with each disease at genome-wide significance (P < 5 × 10−8), as well as highly correlated variants (r2 > 0.8) at the 45 and 23 known risk loci for CLL and MM, respectively. Within these correlated variant sets at each locus, we determined how many of the CLL susceptibility loci were associated with MM at region-wide significance after Bonferroni correction for multiple testing (i.e. Padj < 0.05/45). We then repeated the process, examining MM susceptibility SNPs in CLL, applying a significance level of Padj < 0.05/23. A full list of results is summarized in Supplementary Data File 1 and 2.
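The region-wide thresholds follow directly from the number of loci tested, as the sketch below shows. The helper names and the dictionary of per-locus minimum P-values are hypothetical and serve only to illustrate the selection step.

```python
def region_wide_threshold(n_loci, alpha=0.05):
    """Bonferroni-corrected significance threshold for n_loci tests."""
    return alpha / n_loci


def pleiotropic_loci(min_p_by_locus, n_loci, alpha=0.05):
    """Loci whose best correlated variant passes the corrected threshold."""
    threshold = region_wide_threshold(n_loci, alpha)
    return sorted(locus for locus, p in min_p_by_locus.items() if p < threshold)


cll_in_mm_threshold = region_wide_threshold(45)  # CLL loci tested in MM
mm_in_cll_threshold = region_wide_threshold(23)  # MM loci tested in CLL
```

These evaluate to approximately 0.0011 and 0.0022, respectively.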
A variation of LD score regression, namely stratified LD score regression, can be used to partition heritability according to different genomic categories. For both MM and CLL we applied stratified LD score regression across the baseline model used in Finucane et al.22. We plotted the enrichment of functional categories for each disease; enrichment is defined as the proportion of SNP heritability in a category divided by the proportion of SNPs in that category. We excluded from our plot the additional flanking regions around each functional category, which the authors designed to allow observation of enrichment of SNP heritability in intermediary regions. A plot of the results is shown in Supplementary Figure 1.
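In Finucane et al., the enrichment of a category is the proportion of SNP heritability attributed to the category divided by the proportion of SNPs that fall within it. A minimal sketch of that ratio is given below; the function name and the example numbers are hypothetical, and the estimation of per-category heritability by stratified LD score regression is not shown.

```python
def category_enrichment(h2_category, h2_total, snps_category, snps_total):
    """Enrichment = (share of SNP heritability) / (share of SNPs)."""
    prop_h2 = h2_category / h2_total
    prop_snps = snps_category / snps_total
    return prop_h2 / prop_snps


# Hypothetical category holding 10% of SNPs but 20% of heritability.
example = category_enrichment(0.02, 0.10, 100_000, 1_000_000)
```

A value above 1 indicates that the category explains more heritability than expected from its size alone.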
Variant set enrichment
To examine enrichment in specific histone mark binding across shared risk loci, we adapted the method of Cowper-Sal lari et al.23. Briefly, for each risk locus, a region of strong LD (defined as r2 > 0.8 and D′ > 0.8) was determined, and these SNPs were considered the associated variant set (AVS). Publicly available ChIP-seq data for six histone marks from naive B cells were downloaded from the Blueprint Epigenome Project24. For each mark, the overlap of the SNPs in the AVS and the binding sites was assessed to generate a mapping tally. A null distribution was produced by randomly selecting SNPs with the same characteristics as the risk-associated SNPs, and the null mapping tally was calculated. This process was repeated 10,000 times, and P-values were calculated as the proportion of permutations where the null mapping tally was greater than or equal to the AVS mapping tally. An enrichment score was calculated by normalizing the tallies to the median of the null distribution. Thus, the enrichment score is the number of standard deviations of the AVS mapping tally from the median of the null distribution tallies. An enrichment plot for naive B cells is shown in Supplementary Figure 2.
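The tally, permutation P-value and enrichment score can be sketched as follows. Several details are assumptions made for the illustration: a locus is counted once if any of its SNPs overlaps a binding site, positions are treated as lying on a single chromosome, the population standard deviation is used, and the null tallies are taken as given rather than generated from matched random SNP sets. All function names are hypothetical.

```python
from statistics import median, pstdev


def overlaps(position, peaks):
    """True if a SNP position falls inside any (start, end) peak."""
    return any(start <= position <= end for start, end in peaks)


def mapping_tally(variant_sets, peaks):
    """Number of loci with at least one SNP overlapping a peak."""
    return sum(any(overlaps(p, peaks) for p in snps) for snps in variant_sets)


def variant_set_enrichment(observed_tally, null_tallies):
    """Permutation P-value and enrichment score for one histone mark.

    P-value: share of null tallies >= the observed tally.
    Score:   (observed - median of null) / SD of null.
    """
    n = len(null_tallies)
    p_value = sum(t >= observed_tally for t in null_tallies) / n
    sd = pstdev(null_tallies)
    if sd > 0:
        score = (observed_tally - median(null_tallies)) / sd
    else:
        score = float("nan")
    return p_value, score
```

With 10,000 permutations the smallest non-zero P-value that this procedure can return is 1 × 10−4.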
We considered chromatin mark overlap enrichment for genome-wide significant loci in different cell types using the methodology of Trynka et al.25. This approach scores GWAS SNPs based on proximity to a chromatin mark and the fold-enrichment of the respective chromatin mark, assessing significance using a tissue-specific permutation method. We obtained ChIP-seq data for H3K4me3 from primary blood cells and CLL samples downloaded from the Blueprint Epigenome Project24. In addition, we included in our analysis four MM cell lines (KMS11, JJN3, MM1-S and L363), processed as previously described26. A heat map of results is shown in Supplementary Figure 3.
eQTL analyses were performed using publicly available whole-blood data downloaded from GTEx27. The relationship between SNP genotype and gene expression was assessed using Summary-data-based Mendelian Randomization (SMR) analysis as per Zhu et al.28. Briefly, if bxy is the effect size of x (gene expression) on y (the slope of y regressed on the genetic value of x), bzx is the effect of z (the genetic variant) on x, and bzy is the effect of z on y, then bxy = bzy/bzx is the effect of x on y. To distinguish pleiotropy from linkage, where the top associated cis-eQTL is in LD with two causal variants, one affecting gene expression and the other affecting the trait, we tested for heterogeneity in dependent instruments (HEIDI), using multiple SNPs in each cis-eQTL region. Under the hypothesis of pleiotropy, bxy values for SNPs in LD with the causal variant should be identical. For each probe that passed the significance threshold for the SMR test, we tested the heterogeneity in the bxy values estimated for multiple SNPs in the cis-eQTL region using HEIDI.
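The SMR estimate and its test statistic can be illustrated for a single top cis-eQTL. The sketch uses the approximate test statistic of Zhu et al., TSMR = (zzy2 × zzx2) / (zzy2 + zzx2), which is compared with a χ2 distribution with one degree of freedom. The function name and the input values are hypothetical, and the HEIDI test is not shown.

```python
from math import erfc, sqrt


def smr_test(b_zx, se_zx, b_zy, se_zy):
    """SMR effect estimate and approximate test for one SNP.

    b_zx, se_zx: eQTL effect of the SNP on expression and its SE
    b_zy, se_zy: GWAS effect of the SNP on the trait and its SE
    Returns (b_xy, T_SMR, P-value).
    """
    b_xy = b_zy / b_zx
    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy
    t_smr = (z_zy ** 2 * z_zx ** 2) / (z_zy ** 2 + z_zx ** 2)
    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = erfc(sqrt(t_smr / 2.0))
    return b_xy, t_smr, p_value


# Hypothetical summary statistics for one probe.
b_xy, t_smr, p_smr = smr_test(b_zx=0.5, se_zx=0.05, b_zy=0.1, se_zy=0.02)
```

The statistic is dominated by the weaker of the two associations, so a strong eQTL alone is not sufficient to produce a significant SMR result.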
GWAS summary statistics files were generated from the meta-analysis. For the disease discovery GWAS, we set a threshold for the SMR test of PSMR < 2.5 × 10−5, corresponding to a Bonferroni correction for the number of probes which demonstrated an association in the SMR test. For all genes passing this threshold we generated plots of the eQTL and GWAS associations at the locus, as well as plots of GWAS and eQTL effect sizes (i.e. input for the HEIDI heterogeneity test). HEIDI test P-values <0.05 were considered as reflective of heterogeneity. This threshold is, however, conservative for gene discovery because it retains fewer genes than when correcting for multiple testing. SMR plots for significant eQTLs are shown in Supplementary Figures 4 and 5, and a summary of results is shown in Supplementary Table 5.
Genetic correlation and heritability
We performed cross-trait LD-score regression using summary statistics from two recent GWAS meta-analyses based on 7717 MM cases and 21,587 controls and 4017 CLL cases and 7881 controls (Fig. 1, Supplementary Tables 1–4). While these data sets have been previously subject to quality control (QC)5,6,7,9,10,11,12, for the current analysis we implemented additional filtering steps as per Bulik-Sullivan et al.16, resulting in 1,055,728 harmonized SNPs between the two data sets. Heritability estimates from cross-trait LD score regression of 9.2% (±1.8%) and 22% (±5.9%) were comparable with previous estimates for MM14 and CLL6. LD-score regression revealed a significant positive genetic correlation between MM and CLL, with an Rg value of 0.44 (P = 4.6 × 10−3).
Identification of pleiotropic risk loci
We identified SNPs previously reported to be associated with each disease at genome-wide significance (P < 5 × 10−8), as well as highly correlated variants (r2 > 0.8) at the 45 and 23 known risk loci for CLL and MM, respectively. To identify pleiotropic risk loci, that is genetic loci that influence two traits, we determined how many of the CLL susceptibility loci were associated with MM at region-wide significance after Bonferroni correction for multiple testing (i.e. Padj < 0.05/45). We then repeated the process, examining MM susceptibility SNPs in CLL, applying a significance level of Padj < 0.05/23. Of the 45 CLL risk loci, four were associated with MM (Padj < 0.0011) while, of 23 MM risk loci, five were significantly associated in CLL (Padj < 0.0022) (Table 1, Fig. 2). Correlated SNPs (r2 > 0.8) at 3q26.2 are associated with both CLL and MM at genome-wide significance (Fig. 2), bringing the total number of pleiotropic loci to 10.
Trynka et al. have recently shown that chromatin marks highlighting active regulatory regions overlap with phenotype-associated variants in a cell-type-specific manner25. As H3K4me3 was shown to be the most phenotypically cell-type-specific chromatin mark, we examined cell-type specificity of the 10 pleiotropic risk loci by analysing H3K4me3 chromatin marks in normal haematopoietic cells and CLL patient samples from Blueprint, and de novo data on KMS11, MM1S, JJN3 and L363 MM cell lines. Cell types showing the strongest enrichment of risk SNPs at H3K4me3 marks included naive B cells and CD38- B cells. Notably, variants at 2q31.1, 6p25.3, 8q24.21, 16q23.1 and 22q13.33 were enriched for H3K4me3 in naive B cells (Supplementary Figure 3).
Most GWAS signals map to non-coding regions of the genome29,30 and influence gene expression through chromatin looping interactions31,32. Application of partitioned heritability analysis, stratifying across 53 genomic categories, demonstrated enrichment of CLL and MM heritability in functional elements of the genome, in particular FANTOM5 enhancers (CLL and MM), transcription start sites (CLL), and 5′ untranslated and coding regions (MM) (Supplementary Figure 1). Furthermore, we found significant enrichment of SNPs in the shared loci within regions of active chromatin, as indicated by the presence of H3K27ac and H3K4me3 marks in naive B cells, supporting the principle that SNPs in shared loci influence risk through regulatory effects (Supplementary Figure 2). To identify target genes we analysed PCHi-C data on naive B cells from Blueprint24. We also sought to gain insight into the possible biological mechanisms for associations by performing an expression quantitative trait locus (eQTL) analysis using mRNA expression data on blood from GTEx. Applying Summary data-based Mendelian Randomization (SMR) methodology, we tested for pleiotropy between GWAS signal and cis-eQTL for genes to identify a causal relationship. Broadly, our analysis of the shared loci groups them into those which act on B-cell regulation and differentiation and those which underpin the distinctive biology of cancer; specifically, loci relating to genome instability, angiogenesis and dysregulated apoptosis (Supplementary Table 6).
Of the shared loci, three were related to B-cell regulation. These included 10q23.31, where composite evidence from looping interactions in naive B cells and from correlation between GWAS effect size and expression supports two candidate genes: ACTA2, encoding smooth muscle (α)-2 actin, a protein involved in cell movement and contraction of muscles33, and FAS, a member of the TNF-receptor superfamily. FAS has a central role in regulating the immune response through apoptosis of B cells34,35. At 2q31.1, looping interactions implicated transcription factor SP3, which has been shown to influence expression of germinal centre genes36,37. Variants at 6p25.3 reside in the 3′-UTR of IRF4, which has an established role in B-cell regulation38,39 and MM oncogenesis40,41.
Three of the 10 loci contain genes with roles in maintenance of genomic stability. Specifically, evidence from expression and PCHi-C data implicated RFWD3 at 16q23.1. This gene encodes an E3 ubiquitin-protein ligase, which has been shown to promote progression to late stage homologous recombination through ubiquitination and timely removal of RAD51 and RPA at sites of DNA damage42 and is necessary for replication fork restart43. Variants in this locus demonstrated enrichment of H3K4me3 marks in two samples of naive B cells, which represent a plausible cell of disease origin. rs58618031 (7q31.33) maps 5′ of POT1, the protection of telomeres 1 gene, which is part of the shelterin complex and functions to maintain chromosomal stability44,45. Variant rs1317082 at 3q26.2 is located proximal to TERC, a gene which has been shown to influence telomere length46. Additionally, we observed looping interactions to a number of genes at 3q26.2 including SEC62, which has been proposed as a cancer biomarker47,48,49,50. Intriguingly, variants at this locus (3q26.2) have been implicated in colorectal51, thyroid52 and bladder53 cancer.
Several genes were implicated at 22q13.33 by looping interactions, namely SCO2, LMF2, ODF3B, TYMP/ECGF1, NCAPH2, SYCE3 and ARSA, with TYMP/ECGF1 and SCO2 demonstrating evidence of correlation in GWAS and eQTL effect size, albeit not significant after correction for multiple testing (PSMR = 2.38 × 10−4 and 3.19 × 10−4). Variants within this locus were enriched in H3K4me3 chromatin marks in both CD38- B cells and inflammatory macrophages. TYMP (alias ECGF1) encodes thymidine phosphorylase, which is often overexpressed in tumours and has been linked to angiogenesis54,55. A detailed study on this gene has implicated TYMP in the development of lytic bone lesions in MM, via a mechanism involving activation of PI3K/Akt signalling and increased DNMT3A expression resulting in hypermethylation of RUNX2, osterix, and IRF856. Furthermore, SCO2 (synthesis of cytochrome c oxidase), also mapping to this locus, has been implicated in the development of breast57,58 and gastric59 cancer and leukaemia60, through glucose metabolism reprogramming61, a hallmark of cancer62. The tumour suppressor p53 regulates metabolic pathways and apoptosis in part through SCO2 and the p53-transactivated TP53-induced glycolysis and apoptosis regulator (TIGAR)58,59,61.
Finally, although these data were insufficient to decipher the basis of the 8q24.21 association, this locus has also been shown to harbour risk SNPs for other cancers, which localize within distinct LD blocks and likely reflect tissue-specific effects on cancer risk through regulation of MYC30.
Our analysis provides evidence of a genetic correlation between MM and CLL. Furthermore, we have identified shared genetic susceptibility at 10 known risk loci. While requiring biological validation, integration of data from PCHi-C, chromatin mark enrichment and eQTL at shared loci has provided insight into how these loci may confer susceptibility to both CLL and MM. Applying a working hypothesis that the loci may act in pleiotropic fashion, we selected relevant cells representing a common tissue of disease origin; namely naive B cells.
A significant genetic correlation between MM and CLL, as well as the discovery of risk loci shared between them, supports epidemiological data demonstrating elevated familial risks between these B-cell malignancies4. Furthermore, the shared loci we identified could be broadly grouped into those containing genes related to B-cell regulation and differentiation and those containing genes involved in angiogenesis, genome stability and apoptosis, supporting the tenet that these alleles can influence aetiology of either disease. With the expansion of GWAS of the B-cell malignancies, more detailed characterisation of common underlying risk alleles and affected pathways can inform the biology of B-cell oncogenesis.
SNP genotyping data that support the findings of this study have been deposited in Gene Expression Omnibus with accession codes GSE21349, GSE19784, GSE24080, GSE2658 and GSE15695; in the European Genome-phenome Archive (EGA) with accession code EGAS00000000001; in the European Bioinformatics Institute (part of the European Molecular Biology Laboratory) (EMBL-EBI) with accession codes E-MTAB-362 and E-TABM-1138; and in the database of Genotypes and Phenotypes (dbGaP) with accession code phs000207.v1.p1. The remaining data are contained within the paper and Supplementary Files or available from the author upon request. Naive B-cell Hi-C data used in this work are publicly available from the Blueprint Epigenome Project [https://osf.io/u8tzp/]. ChIP-seq data for H3K27ac, H3K4Me1, H3K27Me3, H3K9Me3, H3K36Me3 and H3K4Me3 from naive B cells are publicly available and were obtained from the Blueprint Epigenome Project [http://www.blueprint-epigenome.eu/].
Barlogie, B. & Gale, R. P. Multiple myeloma and chronic lymphocytic leukemia: parallels and contrasts. Am. J. Med. 93, 443–450 (1992).
Shaffer, A. L., Rosenwald, A. & Staudt, L. M. Lymphoid malignancies: the dark side of B-cell differentiation. Nat. Rev. Immunol. 2, 920–932 (2002).
Kuppers, R. Mechanisms of B-cell lymphoma pathogenesis. Nat. Rev. Cancer 5, 251–262 (2005).
Sud, A., et al. The landscape of familial risk of hematological malignancies: an analysis of 153,115 cases. (Under review).
Crowther-Swanepoel, D. et al. Common variants at 2q37.3, 8q24.21, 15q21.3 and 16q24.1 influence chronic lymphocytic leukemia risk. Nat. Genet. 42, 132–136 (2010).
Law, P. J. et al. Genome-wide association analysis implicates dysregulation of immunity genes in chronic lymphocytic leukaemia. Nat. Commun. 8, 14175 (2017).
Speedy, H. E. et al. A genome-wide association study identifies multiple susceptibility loci for chronic lymphocytic leukemia. Nat. Genet. 46, 56–60 (2014).
Berndt, S. I. et al. Genome-wide association study identifies multiple risk loci for chronic lymphocytic leukemia. Nat. Genet. 45, 868–876 (2013).
Went, M. et al. Identification of multiple risk loci and regulatory mechanisms influencing susceptibility to multiple myeloma. Nat. Commun. 9, 3707 (2018).
Mitchell, J. S. et al. Genome-wide association study identifies multiple susceptibility loci for multiple myeloma. Nat. Commun. 7, 12050 (2016).
Broderick, P. et al. Common variation at 3p22.1 and 7p15.3 influences multiple myeloma risk. Nat. Genet. 44, 58–61 (2011).
Chubb, D. et al. Common variation at 3q26.2, 6p21.33, 17p11.2 and 22q13.1 influences multiple myeloma risk. Nat. Genet. 45, 1221–1225 (2013).
Swaminathan, B. et al. Variants in ELL2 influencing immunoglobulin levels associate with multiple myeloma. Nat. Commun. 6, 7213 (2015).
Mitchell, J. S. et al. Implementation of genome-wide complex trait analysis to quantify the heritability in multiple myeloma. Sci. Rep. 5, 12473 (2015).
Bulik-Sullivan, B. K., Loh, P. R., Finucane, H. K., Ripke, S. & Yang, J. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
Bulik-Sullivan, B., Finucane, H. K., Anttila, V., Gusev, A., Day, F. R. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
Turner, S., et al. Quality control procedures for genome-wide association studies. Curr. Protoc. Hum. Genet. Chapter 1:Unit1.19 (2011).
1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).
Huang, J. et al. Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel. Nat. Commun. 6, 8111 (2015).
Marchini, J., Howie, B., Myers, S., McVean, G. & Donnelly, P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet. 39, 906–913 (2007).
Liu, J. Z. et al. Meta-analysis and imputation refines the association of 15q25 with smoking quantity. Nat. Genet. 42, 436–440 (2010).
Finucane, H. K., et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
Cowper-Sal lari, R. et al. Breast cancer risk-associated SNPs modulate the affinity of chromatin for FOXA1 and alter gene expression. Nat. Genet. 44, 1191–1198 (2012).
Fernandez, J. M. et al. The BLUEPRINT data analysis portal. Cell Syst. 3, 491–5.e5 (2016).
Trynka, G. et al. Chromatin marks identify critical cell types for fine mapping complex trait variants. Nat. Genet. 45, 124–130 (2013).
Li, N. et al. Genetic predisposition to multiple myeloma at 5q15 is mediated by an ELL2 enhancer polymorphism. Cell Rep. 20, 2556–2564 (2017).
GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
Freedman, M. L. et al. Principles for the post-GWAS functional characterization of cancer risk loci. Nat. Genet. 43, 513–518 (2011).
Sud, A., Kinnersley, B. & Houlston, R. S. Genome-wide association studies of cancer: current insights and future perspectives. Nat. Rev. Cancer 17, 692–704 (2017).
Orlando, G., Kinnersley, B., Houlston, R. S. Capture Hi-C library generation and analysis to detect chromatin interactions. Curr. Protoc. Hum. Genet. 98, e63 (2018).
Mifsud, B. et al. Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat. Genet. 47, 598 (2015).
Guo, D. C. et al. Mutations in smooth muscle alpha-actin (ACTA2) lead to thoracic aortic aneurysms and dissections. Nat. Genet. 39, 1488–1493 (2007).
Akagi, T., Yoshino, T. & Kondo, E. The Fas antigen and Fas-mediated apoptosis in B-cell differentiation. Leuk. Lymphoma 28, 483–489 (1998).
Figgett, W. A. et al. The TACI receptor regulates T-cell-independent marginal zone B cell responses through innate activation-induced cell death. Immunity 39, 573–583 (2013).
Park, S. R. et al. HoxC4 binds to the promoter of the cytidine deaminase AID gene to induce AID expression, class-switch DNA recombination and somatic hypermutation. Nat. Immunol. 10, 540–550 (2009).
Steinke, J. W. et al. Identification of an Sp factor-dependent promoter in GCET, a gene expressed at high levels in germinal center B cells. Mol. Immunol. 41, 1145–1153 (2004).
Willis, S. N. et al. Transcription factor IRF4 regulates germinal center cell formation through a B cell-intrinsic mechanism. J. Immunol. 192, 3200–3206 (2014).
Nutt, S. L., Hodgkin, P. D., Tarlinton, D. M. & Corcoran, L. M. The generation of antibody-secreting plasma cells. Nat. Rev. Immunol. 15, 160–171 (2015).
Zhang, S. et al. IRF4 promotes cell proliferation by JNK pathway in multiple myeloma. Med. Oncol. 30, 594 (2013).
Ohguchi, H. et al. The KDM3A-KLF2-IRF4 axis maintains myeloma cell survival. Nat. Commun. 7, 10258 (2016).
Inano, S. et al. RFWD3-mediated ubiquitination promotes timely removal of both RPA and RAD51 from DNA damage sites to facilitate homologous recombination. Mol. Cell 66, 622–34.e8 (2017).
Elia, A. E. et al. RFWD3-dependent ubiquitination of RPA regulates repair at stalled replication forks. Mol. Cell 60, 280–293 (2015).
Rice, C. et al. Structural and functional analysis of the human POT1-TPP1 telomeric complex. Nat. Commun. 8, 14928 (2017).
Pinzaru, A. M. et al. Telomere replication stress induced by POT1 inactivation accelerates tumorigenesis. Cell Rep. 15, 2170–2184 (2016).
Codd, V. et al. Identification of seven loci affecting mean telomere length and their association with disease. Nat. Genet. 45, 422 (2013).
Jung, V. et al. Genomic and expression analysis of the 3q25-q26 amplification unit reveals TLOC1/SEC62 as a probable target gene in prostate cancer. Mol. Cancer Res. 4, 169–176 (2006).
Linxweiler, M., Schick, B. & Zimmermann, R. Let's talk about Secs: Sec61, Sec62 and Sec63 in signal transduction, oncology and personalized medicine. Signal Transduct. Target. Ther. 2, 17002 (2017).
Bergmann, T. J., Fumagalli, F., Loi, M. & Molinari, M. Role of SEC62 in ER maintenance: a link with ER stress tolerance in SEC62-overexpressing tumors? Mol. Cell Oncol. 4, e1264351 (2017).
Greiner, M. et al. Sec62 protein level is crucial for the ER stress tolerance of prostate cancer. Prostate 71, 1074–1083 (2011).
Houlston, R. S. et al. Meta-analysis of three genome-wide association studies identifies susceptibility loci for colorectal cancer at 1q41, 3q26.2, 12q13.13 and 20q13.33. Nat. Genet. 42, 973–977 (2010).
Gudmundsson, J. et al. A genome-wide association study yields five novel thyroid cancer risk loci. Nat. Commun. 8, 14517 (2017).
Figueroa, J. D. et al. Genome-wide association study identifies multiple loci associated with bladder cancer risk. Hum. Mol. Genet. 23, 1387–1398 (2014).
Deves, C. et al. The kinetic mechanism of Human Thymidine Phosphorylase - a molecular target for cancer drug development. Mol. Biosyst. 10, 592–604 (2014).
Bijnsdorp, I. V. et al. Thymidine phosphorylase in cancer cells stimulates human endothelial cell migration and invasion by the secretion of angiogenic factors. Br. J. Cancer 104, 1185–1192 (2011).
Liu, H. et al. Thymidine phosphorylase exerts complex effects on bone resorption and formation in myeloma. Sci. Transl. Med. 8, 353ra113 (2016).
Liu, F. et al. The oncoprotein HBXIP promotes glucose metabolism reprogramming via downregulating SCO2 and PDHA1 in breast cancer. Oncotarget 6, 27199–27213 (2015).
Won, K. Y. et al. Regulatory role of p53 in cancer metabolism via SCO2 and TIGAR in human breast cancer. Hum. Pathol. 43, 221–228 (2012).
Kim, S. H., Choi, S. I., Won, K. Y. & Lim, S. J. Distinctive interrelation of p53 with SCO2, COX, and TIGAR in human gastric cancer. Pathol. Res. Pract. 212, 904–910 (2016).
Papadopoulou, L. C., Kyriazou, A. V., Bonovolias, I. D. & Tsiftsoglou, A. S. Imatinib inhibits the expression of SCO2 and FRATAXIN genes that encode mitochondrial proteins in human Bcr-Abl(+) leukemia cells. Blood Cells Mol. Dis. 53, 84–90 (2014).
Nath, A. & Chan, C. Genetic alterations in fatty acid transport and metabolism genes are associated with metastatic progression and poor prognosis of human cancers. Sci. Rep. 6, 18669 (2016).
Hanahan, D. & Weinberg, R. A. The hallmarks of cancer. Cell 100, 57–70 (2000).
In the United Kingdom, Myeloma UK and Bloodwise (#05001, #06002 and #13044) provided principal funding. Additional funding was provided by Cancer Research UK (C1298/A8362 supported by the Bobby Moore Fund) and The Rosetrees Trust. M.W. is supported by funding from Mr Ralph Stockwell. A.S. is supported by a clinical fellowship from Cancer Research UK and the Royal Marsden Haematology Research Fund. This study made use of genotyping data on the 1958 Birth Cohort generated by the Wellcome Trust Sanger Institute (http://www.wtccc.org.uk). We are grateful to all investigators who contributed to NSCCG and GELCAPS, from which controls in the replication were drawn. We also thank the staff of the CTRU University of Leeds and the NCRI Haematology Clinical Studies Group. We thank the High-throughput Genomics Group at the Wellcome Trust Centre for Human Genetics (funded by Wellcome Trust grant reference 090532/Z/09/Z) for the generation of UK myeloma Oncoarray data. The BCAC study would not have been possible without the contributions of the following: Manjeet K. Bolla, Qin Wang, Kyriaki Michailidou and Joe Dennis. BCAC is funded by Cancer Research UK (C1287/A10118, C1287/A16563). For the BBCS study, we thank Eileen Williams, Elaine Ryder-Mills and Kara Sargus. The BBCS is funded by Cancer Research UK and Breast Cancer Now and acknowledges NHS funding to the National Institute for Health Research (NIHR) Biomedical Research Centre (BRC), and the National Cancer Research Network (NCRN). We thank the participants and the investigators of EPIC (European Prospective Investigation into Cancer and Nutrition). The coordination of EPIC is financially supported by the European Commission (DG-SANCO) and the International Agency for Research on Cancer.
The national cohorts are supported by: Ligue Contre le Cancer, Institut Gustave Roussy, Mutuelle Générale de l’Education Nationale, Institut National de la Santé et de la Recherche Médicale (INSERM) (France); German Cancer Aid, German Cancer Research Center (DKFZ), Federal Ministry of Education and Research (BMBF) (Germany); the Hellenic Health Foundation, the Stavros Niarchos Foundation (Greece); Associazione Italiana per la Ricerca sul Cancro-AIRC-Italy and National Research Council (Italy); Dutch Ministry of Public Health, Welfare and Sports (VWS), Netherlands Cancer Registry (NKR), LK Research Funds, Dutch Prevention Funds, Dutch ZON (Zorg Onderzoek Nederland), World Cancer Research Fund (WCRF), Statistics Netherlands (The Netherlands); Health Research Fund (FIS), PI13/00061 to Granada, PI13/01162 to EPIC-Murcia, Regional Governments of Andalucía, Asturias, Basque Country, Murcia and Navarra, ISCIII RETIC (RD06/0020) (Spain); Cancer Research UK (14136 to EPIC-Norfolk; C570/A16491 and C8221/A19170 to EPIC-Oxford), Medical Research Council (1000143 to EPIC-Norfolk, MR/M012190/1 to EPIC-Oxford) (UK). We thank the SEARCH and EPIC teams which were funded by a programme grant from Cancer Research UK [A16561] and supported by the UK NIHR BRC at the University of Cambridge. We thank Breast Cancer Now and The Institute of Cancer Research (ICR) for support and funding of the UKBGS, and the study participants, study staff, and the doctors, nurses and other health care providers and health information sources who have contributed to the study. We acknowledge NHS funding to the Royal Marsden/ICR NIHR BRC. UKGPCS would like to thank The Institute of Cancer Research and The Everyman Campaign for funding support. 
The UKGPCS acknowledges The Prostate Cancer Research Foundation, Prostate Action, The Orchid Cancer Appeal, The National Cancer Research Network UK, The National Cancer Research Institute (NCRI), the NIHR funding to the NIHR Biomedical Research data managers and consultants for their work in the UKGPCS study and urologists and other persons involved in the planning, and data collection of the CAPS study. Genotyping of the OncoArray was funded by the US National Institutes of Health (NIH) [U19 CA 148537 for ELucidating Loci Involved in Prostate cancer SuscEptibility (ELLIPSE) project and X01HG007492 to the Center for Inherited Disease Research (CIDR) under contract number HHSN268201200008I]. Additional analytic support was provided by NIH NCI U01 CA188392 (PI: Schumacher). The PRACTICAL consortium was supported by Cancer Research UK Grants C5047/A7357, C1287/A10118, C1287/A16563, C5047/A3354, C5047/A10692, C16913/A6135, European Commission's Seventh Framework Programme grant agreement n° 223175 (HEALTH-F2-2009-223175), and The National Institute of Health (NIH) Cancer Post-Cancer GWAS initiative grant: No. 1 U19 CA 148537-01 (the GAME-ON initiative). We would also like to thank the following for funding support: The Institute of Cancer Research and The Everyman Campaign, The Prostate Cancer Research Foundation, Prostate Research Campaign UK (now Prostate Action), The Orchid Cancer Appeal, The National Cancer Research Network UK, The National Cancer Research Institute (NCRI) UK. We are grateful for support of NIHR funding to the NIHR Biomedical Research Centre at The Institute of Cancer Research and The Royal Marsden NHS Foundation Trust. The APBC BioResource, which forms part of the PRACTICAL consortium, consists of the following members: Wayne Tilley, Gail Risbridger, Renea Taylor, Judith A Clements, Lisa Horvath, Vanessa Hayes, Lisa Butler, Trina Yeadon, Allison Eckert, Pamela Saunders, Anne-Maree Haynes, Melissa Papargiris. 
The US GWAS was supported by a grant from the National Institutes of Health (P01CA055819). The German study was supported by the Dietmar-Hopp-Stiftung, Germany, the German Cancer Aid (110131), the Federal Ministry of Education and Research (CLIOMMICS 01ZX1309), The German Research Council (DFG; Project SI 236/8-1, SI 236/9-1, ER 155/6-1 and the DFG CRI 216), the Harald Huppert Foundation and the Multiple Myeloma Research Foundation. The patients were collected by the GMMG and DSMM studies. The German GWAS made use of genotyping data from the population-based HNR study, which is supported by the Heinz Nixdorf Foundation (Germany). The genotyping of the Illumina HumanOmni-1 Quad BeadChips of the HNR subjects was financed by the German Center for Neurodegenerative Disorders (DZNE), Bonn. We are grateful to all investigators who contributed to the generation of this data set. The German replication controls were collected by Peter Bugert, Institute of Transfusion Medicine and Immunology, Medical Faculty Mannheim, Heidelberg University, German Red Cross Blood Service of Baden-Württemberg-Hessen, Mannheim, Germany. This work was supported by research grants from the Swedish Foundation for Strategic Research (KF10-0009), the Marianne and Marcus Wallenberg Foundation (2010.0112), the Knut and Alice Wallenberg Foundation (2012.0193), the Swedish Research Council (2012–1753), the Royal Swedish Academy of Science, ALF grants to the University and Regional Laboratories (Labmedicin Skåne), the Siv-Inger and Per-Erik Andersson Foundation, the Medical Faculty at Lund University, the Borås foundation for cancer research, and the Swedish Society of Medicine. We thank Jörgen Adolfsson, Tomas Axelsson, Anna Collin, Ildikó Frigyesi, Patrik Magnusson, Bertil Johansson, Jan Westin and Helga Ögmundsdóttir for their assistance.
This work was supported by Center for Translational Molecular Medicine (BioCHIP), a clinical research grant from the European Hematology Association, an EMCR Translational Research Grant, a BMBF grant from CLIOMMICS (01ZX1309A) and FP7 grant MSCNET (LSHC-Ct-2006-037602). N.W. was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number P20GM125503. We thank the staff of the HOVON, as well as patients and physicians at participating sites. In addition, we also thank Jasper Koenders, Michael Vermeulen, André Uitterlinden and Nathalie van der Velde for their assistance. We are indebted to the clinicians who contributed samples to Swedish, Norwegian and Danish biobanks. We are indebted to the patients and other individuals who participated in the project. The Genome-Wide Association Study (GWAS) of Non-Hodgkin Lymphoma (NHL) project, from which US-CLL samples were obtained, was supported by the intramural program of the Division of Cancer Epidemiology and Genetics (DCEG), National Cancer Institute (NCI), National Institutes of Health (NIH). The data sets have been accessed through the NIH database for Genotypes and Phenotypes (dbGaP) under accession # phs000801. A full list of acknowledgements can be found in supplementary note (Berndt SI et al., Nature Genet., 2013, PMID: 23770605).
Conflict of interest
The authors declare that they have no conflict of interest.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.