Introduction

Diffuse large B-cell lymphoma (DLBCL), the major subtype of non-Hodgkin's B-cell lymphoma, accounts for >30% of all lymphoid malignancies worldwide1 and >43% in Japan.2 Although the combination of cyclophosphamide, doxorubicin, vincristine, prednisone and rituximab has extended overall survival among DLBCL patients,3 a substantial proportion of patients still suffer from persistent or recurrent disease. To further improve the clinical outcome of DLBCL, elucidation of its underlying pathogenesis is essential.

Along with the established environmental risk factors such as older age, infectious diseases and congenital immunodeficiency, genetic factors also have a major role.4, 5, 6 Consistent with this notion, a nearly 10-fold relative risk for DLBCL was reported among first-degree relatives of DLBCL patients.7 Thus, it is conceivable that the pathogenesis of this aggressive disease is largely determined by a complex interaction between genetic and environmental factors. To clarify the genetic susceptibility factors for DLBCL, we conducted a single-nucleotide polymorphism (SNP)-based genome-wide association study (GWAS) among clinically and histopathologically confirmed DLBCL patients.

Materials and methods

Study participants

In this study, we conducted a two-stage GWAS and a subsequent replication analysis using a total of 399 DLBCL cases and 4243 control subjects. The demographic details of the study participants are summarized in Supplementary Table 1. For the first stage of the GWAS, 934 Japanese control DNA samples were obtained from the Osaka-Midosuji Rotary Club, Osaka, Japan. DLBCL cases in the first and second stages and control subjects in the second stage were obtained from BioBank Japan.8 For the replication study, 106 histopathologically confirmed DLBCL cases and 400 age- and gender-matched healthy controls were selected from Aichi Cancer Center, Japan.9 All participants provided written informed consent. This research project was approved by the ethics committees at each institute.

SNP genotyping

In all, 74 DLBCL cases and 934 healthy controls in the first stage and 2909 controls in the second stage were genotyped using the Illumina HumanHap550v3 Genotyping BeadChip (Illumina, San Diego, CA, USA). Standard SNP quality control filters (Hardy–Weinberg equilibrium test P-value of >1.0 × 10−6 in controls, minor allele frequency of >0.01 and genotyping completeness of 99%) were applied, leaving 444 361 SNPs for the association analysis. Case samples in the second stage and case–control samples in the replication analysis were genotyped by the multiplex PCR-based Invader assay (Third Wave Technologies, Inc., Madison, WI, USA).10
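The three quality control filters described above can be sketched per SNP as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the Hardy–Weinberg test shown is the 1 d.f. chi-square approximation (the text does not say which form of the test was applied), and computing minor allele frequency and completeness over all genotyped samples is also an assumption.

```python
import math


def hwe_chisq_p(n_aa, n_ab, n_bb):
    """P-value of a 1 d.f. chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0  # no genotypes called: nothing to test
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no deviation to test
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.erfc(math.sqrt(chi2 / 2.0))  # chi-square upper tail, 1 d.f.


def passes_qc(control_counts, all_counts, n_samples,
              hwe_p_min=1.0e-6, maf_min=0.01, call_rate_min=0.99):
    """Apply the HWE, minor allele frequency and completeness filters to one SNP.

    control_counts, all_counts: (AA, AB, BB) genotype counts among called
    controls and among all called samples (assumed inputs).
    n_samples: number of samples in which genotyping was attempted.
    """
    n_called = sum(all_counts)
    if n_called == 0:
        return False
    freq_a = (2 * all_counts[0] + all_counts[1]) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)
    call_rate = n_called / n_samples
    return (hwe_chisq_p(*control_counts) > hwe_p_min
            and maf > maf_min
            and call_rate >= call_rate_min)
```

A SNP is retained only when all three conditions hold; the default thresholds mirror the values quoted in the text.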

Statistical analysis

The association was tested by logistic regression analysis after adjusting for age and gender in the first stage of the GWAS. In the second stage and the replication analysis, the statistical significance of the association with each SNP was assessed using the 1 d.f. Cochran–Armitage trend test. Odds ratios (ORs) and confidence intervals (CIs) were calculated using the major allele as the reference allele. The meta-analysis was conducted using the Mantel–Haenszel method. Heterogeneity among studies was examined by the Breslow–Day test. Software and web tools used in this study are summarized in Supplementary data.
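For readers unfamiliar with the trend test and the per-allele odds ratio, the sketch below shows one way to compute both from genotype counts. It is illustrative only: the function names are hypothetical, the additive weights (0, 1, 2) and the Woolf (log-based) confidence interval are assumptions, and the study itself reports using R and PLINK rather than code of this kind.

```python
import math


def cochran_armitage_trend(case_counts, control_counts, weights=(0, 1, 2)):
    """Return (chi-square, P-value) of the 1 d.f. Cochran-Armitage trend test.

    case_counts, control_counts: genotype counts ordered as
    (major homozygote, heterozygote, minor homozygote).
    """
    totals = [r + s for r, s in zip(case_counts, control_counts)]
    n_cases, n_controls = sum(case_counts), sum(control_counts)
    n = n_cases + n_controls
    sum_wr = sum(w * r for w, r in zip(weights, case_counts))
    sum_wn = sum(w * t for w, t in zip(weights, totals))
    sum_w2n = sum(w * w * t for w, t in zip(weights, totals))
    numerator = n * (n * sum_wr - n_cases * sum_wn) ** 2
    denominator = n_cases * n_controls * (n * sum_w2n - sum_wn ** 2)
    if denominator == 0:
        return 0.0, 1.0  # monomorphic SNP or an empty group: no trend to test
    chi2 = numerator / denominator
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def allelic_odds_ratio(case_counts, control_counts, z=1.96):
    """Per-allele OR (minor vs major allele as reference) with a Woolf CI."""
    a = case_counts[1] + 2 * case_counts[2]        # minor alleles, cases
    b = 2 * case_counts[0] + case_counts[1]        # major alleles, cases
    c = control_counts[1] + 2 * control_counts[2]  # minor alleles, controls
    d = 2 * control_counts[0] + control_counts[1]  # major alleles, controls
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```

With z = 1.96 the interval is a 95% CI; an OR above 1 indicates that the minor allele is more frequent in cases than in controls.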

Candidate gene analysis

We searched PubMed for association studies of lymphoma and obtained a list of 182 candidate genes that had previously been analyzed by several researchers in different populations using the candidate gene approach. We selected tag SNPs located in these genes from the Illumina 550k platform (Illumina; Supplementary Table 3). The association between these tag SNPs and DLBCL was tested in the 74 DLBCL cases and 934 healthy controls of the first stage by logistic regression analysis with adjustment for age and gender.

Software and web tools

For general statistical analyses, we used the R statistical environment version 2.6.1 (cran.r-project.org) or plink-1.05 (pngu.mgh.harvard.edu/~purcell/plink/).11 The Haploview software was used to draw linkage disequilibrium maps.12 The Primer3-web v0.3.0 web tool (http://frodo.wi.mit.edu) was used to design primers. The r2 values between SNPs were annotated using the WGAViewer software.13 In silico functional annotation of SNPs was performed using the FastSNP web tool (http://fastsnp.ibms.sinica.edu.tw/fastSNP2/pages/input_CandidateGeneSearch.jsp).14

Results

First-stage GWAS

In the first stage, we genotyped 74 DLBCL cases and 934 healthy control samples at 550 000 SNP loci. The association between each SNP and DLBCL was assessed by logistic regression analysis after adjusting for age and gender. We plotted the observed logistic P-values against the expected P-values in a quantile–quantile plot and confirmed that there was little or no population stratification (λ=1.03, Supplementary Figure 1). To identify disease susceptibility loci, we further investigated the 500 top-ranked SNPs (P<1 × 10−3, Supplementary Figure 2) in additional cases and controls.
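The inflation factor λ quoted above is commonly estimated as the median observed 1 d.f. chi-square statistic divided by its expected median under the null hypothesis. The sketch below assumes that median-based estimator (the text does not state how λ was computed) and uses a hypothetical function name; values close to 1 indicate little inflation of the test statistics.

```python
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Estimate lambda from two-sided P-values, each in the interval (0, 1]."""
    normal = NormalDist()
    # Convert each P-value back to a 1 d.f. chi-square statistic: z squared,
    # where z is the normal quantile at p / 2.
    chi2_stats = [normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN
```

Uniformly distributed P-values give λ close to 1, whereas systematically small P-values, as expected under population stratification, push λ above 1.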

Second-stage GWAS

Analysis of the second-stage cohort, which consisted of 219 DLBCL cases and 2909 controls, at the 500 top SNPs revealed significant association (Armitage P<0.05) at 15 SNPs (Supplementary Table 2). The Mantel–Haenszel test was performed at these 15 SNPs to assess the overall degree of association by combining the first-stage and second-stage datasets. Notably, SNPs rs4551233 and rs4443228 showed strong association with DLBCL, with Mantel–Haenszel P-values of 7.06 × 10−7 and 7.23 × 10−7 and ORs of 1.57 (95% CI 1.32–1.88) and 2.43 (95% CI 1.7–3.45), respectively. We also observed improved associations at SNPs rs11222532, rs17811655, rs1381795 and rs751837. However, no SNP exceeded the genome-wide significance threshold (P<1.13 × 10−7, after Bonferroni correction).
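The threshold of 1.13 × 10−7 is consistent with dividing a significance level of 0.05 by the 444 361 SNPs that passed quality control. The text does not state the denominator explicitly, so that choice is an assumption in the short check below, which also compares the two lead P-values against the threshold.

```python
ALPHA = 0.05
N_SNPS_TESTED = 444_361  # SNPs passing quality control (assumed denominator)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that controls the family-wise error rate."""
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_SNPS_TESTED)

# Combined Mantel-Haenszel P-values of the two lead SNPs reported in the text.
lead_p_values = {"rs4551233": 7.06e-7, "rs4443228": 7.23e-7}
genome_wide_hits = [snp for snp, p in lead_p_values.items() if p < threshold]
```

Both lead P-values are roughly six times larger than the threshold, so `genome_wide_hits` is empty.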

Replication analysis using the Aichi Cancer Center DLBCL cohort

The 15 candidate SNPs from our two-stage GWAS were subsequently genotyped in an independent age- and gender-matched DLBCL replication cohort consisting of 106 cases and 400 controls from Aichi Cancer Center. Genotyping of this cohort revealed significant association (Table 1) at rs7097 (P=4.89 × 10−2; OR 1.37; 95% CI 1.01–1.87), which is located on chromosome 13q12–q13 (Figure 1a). In addition, we observed nominal association at rs751837 (P=6.19 × 10−2), which is located within a 120 kb linkage disequilibrium block on chromosome 14q32.32 (Figure 1b). The meta-analysis revealed consistent association at rs7097 (P=6.57 × 10−6; OR 1.43; 95% CI 1.23–1.67) and rs751837 (P=3.3 × 10−7; OR 3.5; 95% CI 2.12–5.88) with the same direction of effect (Supplementary Figure 3).
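The Mantel–Haenszel method pools stage-specific 2 × 2 tables into a single odds ratio and test statistic. The sketch below is a minimal version under stated assumptions: the function name is hypothetical, each stage is summarized as an allele-count table, the chi-square statistic is computed without continuity correction, and neither the confidence interval nor the Breslow–Day heterogeneity test is shown.

```python
import math


def mantel_haenszel(strata):
    """Pool 2x2 tables across strata (for example, study stages).

    strata: iterable of (a, b, c, d) = (case minor alleles, case major
    alleles, control minor alleles, control major alleles).
    Returns (pooled OR, chi-square without continuity correction, P-value).
    """
    or_num = or_den = 0.0
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        or_num += a * d / n
        or_den += b * c / n
        observed += a
        expected += (a + b) * (a + c) / n  # E(a) under no association
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    chi2 = (observed - expected) ** 2 / variance
    return or_num / or_den, chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

Passing one table per stage yields a pooled estimate in which each stage contributes according to its size, so a small replication cohort cannot dominate the combined odds ratio.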

Table 1 Meta-analysis of the two candidate SNPs combining the first, second and replication stages
Figure 1

(a) Regional association plot on 13q12–13. The upper panel displays the distribution of −log10 P-values (Cochran–Armitage trend test) according to their physical positions on 13q12–13. The dotted vertical line indicates the physical position of SNP rs7097 and its combined P-value. The second panel shows the r2 values for proxies of rs7097, which are based on the HapMap JPT population. Annotated genes are shown at the top of the linkage disequilibrium block. (b) Regional association plot on 14q32. The upper panel displays the distribution of −log10 P-values (Cochran–Armitage trend test) according to their physical positions on 14q32. The dotted vertical line indicates the physical position of SNP rs751837 and its combined P-value. The second panel shows the r2 values for proxies of rs751837, which are based on the HapMap JPT population. Annotated genes are shown at the top of the linkage disequilibrium block. BBJ, BioBank Japan; ACC, Aichi Cancer Center.

Discussion

Our study identified two genetic variants that were consistently associated with increased risk of DLBCL. SNP rs7097 is located in a region shared by the 3′ untranslated region of POLR1D and the promoter region of LNX2 (ligand of numb-protein X 2; Figure 1a). POLR1D encodes the 16 kDa RNA polymerase I polypeptide D. Given the role of POLR1D in transcription, it is hard to explain how its polymorphisms would affect DLBCL pathogenesis. On the other hand, LNX2 encodes a PDZ domain-containing ring finger 1 protein, which may function as an E3 ubiquitin ligase.15 Higher levels of LNX family members have been shown to cause ubiquitin–proteasomal degradation of the Numb protein, which in turn enhances Notch signaling.15 As activated Notch signaling is implicated in hematological cancers16 such as B-cell lymphoma,17 SNP rs7097 might have functional consequences for DLBCL. Although in silico prediction suggests that allele G at SNP rs7097 can create a binding site for the GATA-2 transcription factor (Supplementary Figure 4a), identification of the specific causative variant at this locus is necessary to understand its impact on DLBCL.

The second candidate SNP, rs751837, is located within intron 3 of CDC42BPB (also known as MRCK β, myotonic dystrophy kinase-related Cdc42-binding kinase β; Figure 1b). According to in silico prediction, allele T at rs751837 shows preferential binding of the CdxA transcription factor (Supplementary Figure 4b), implying a possible role of this variant in the transcriptional regulation of CDC42BPB. However, further characterization of this region is required to delineate the effect of other variants at this locus. CDC42BPB is a member of the multidomain serine/threonine protein kinase family. MRCK isoforms affect cellular structures by modulating Cdc42 actions during cytoskeletal reorganization,18 and have been shown to interact with the potent tumor-promoting agent phorbol ester and with diacylglycerol, albeit with weaker affinity.19 Thus, CDC42BPB might have a role in carcinogenesis by transducing diacylglycerol signals, similar to protein kinase C or RasGRPs. In addition, CDC42BPB is located in the chromosome 14q32 region, which is inherently susceptible to frequent chromosomal translocations, especially in lymphoma. It has been reported that the presence of a 14q32 translocation in DLBCL is associated with higher frequencies of other chromosomal aberrations.20 Thus, further analysis of the correlation between rs751837 and 14q32 translocation might shed light on the molecular mechanism underlying lymphoma-specific chromosomal translocation.

The SNPs identified in this study, however, failed to reach the conservative Bonferroni threshold (P<5 × 10−8) for GWAS, which could be because of the small sample size analyzed in this study and/or the clinical heterogeneity of DLBCL. In this regard, it is noteworthy that a recent GWAS among Caucasians identified the 6p21.33 locus as a susceptibility region for follicular lymphoma; however, the same study failed to identify genetic susceptibility factors for DLBCL despite a relatively larger sample size (783 cases and 3377 controls).21 In contrast, we found two DLBCL susceptibility loci that seem to be functionally related to lymphoma pathogenesis. This is probably because of the genetic homogeneity of the Japanese population or the difference in genetic background between Asians and Caucasians.22

We also attempted to determine the association between DLBCL and candidate genes previously analyzed by several researchers in different populations. The analysis revealed nominal association (P<0.05) at SNPs located in 52 candidate genes (Supplementary Table 3), which are mainly implicated in immune response, apoptosis and cytokine signaling. These findings suggest that many genetic variations may contribute to the pathogenesis of DLBCL.

As DLBCL exhibits clinical, histological, immunological and molecular heterogeneity,23, 24, 25 a more detailed DLBCL classification (for example, germinal center DLBCL, non-germinal center DLBCL or activated B-cell-like DLBCL) may be necessary to identify subtype-specific genetic variants from both candidate gene analysis and GWAS. In this regard, a global collaborative effort might be required to obtain sufficient samples for a meta-analysis to identify stronger genetic susceptibility factors for subtypes of DLBCL. Nevertheless, our findings support the role of a genetic component in DLBCL susceptibility and may contribute toward the understanding of DLBCL pathogenesis.