To the Editor:

Multiple myeloma (MM) is one of the most common hematological malignancies, accounting for 20% of all newly diagnosed hematological cancers [1]. The most recent data from Cancer Today show that in 2020 the number of new MM cases was 176,404 worldwide (https://gco.iarc.fr/today/home).

Established risk factors for MM include age, male sex, African ancestry, obesity, chronic inflammation, exposure to pesticides, organic solvents, and radiation [2]. Familial aggregation of MM and its precursor monoclonal gammopathy of undetermined significance (MGUS) suggests that genetic factors play a role in risk of MM as well [3]. Genetic variability has been identified as a risk factor for MM, including 25 common genetic loci identified in genome-wide association studies (GWAS). However, estimates of heritability show that many more loci remain to be found [4].

A key question is therefore how to find new causative variants. The stringent significance threshold usually used in GWAS (p < 5 × 10⁻⁸) accounts for the many statistical tests being performed but may result in false negatives. Reducing the number of tests relaxes the required significance threshold, thereby increasing the statistical power to detect an association with MM risk for each SNP. One strategy for reducing the number of tests is to examine only SNPs with a higher prior probability of association according to meaningful biological criteria. We looked for novel MM risk loci using a two-phase large-scale association study that prioritized polymorphisms with predicted functional impact, a strategy that has been used for other cancers and has led to the discovery of new loci [5,6,7]. Functional variants are known to be more likely to be associated with disease development [8].
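The relationship between the number of tests and the significance threshold can be illustrated with a simple Bonferroni correction. The sketch below is illustrative only: the family-wise error rate of 0.05 and the round figure of one million independent tests are conventional assumptions that recover the usual GWAS threshold, and the count of 1,000 prioritized SNPs is hypothetical rather than a number used in this study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Assumption: about one million independent common variants tested genome-wide.
genome_wide = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical: only 1,000 functionally prioritized SNPs are tested.
prioritized = bonferroni_threshold(0.05, 1_000)

print(f"{genome_wide:.1e}")  # 5.0e-08
print(f"{prioritized:.1e}")  # 5.0e-05
```

Testing fewer, better-motivated SNPs raises the per-test threshold by three orders of magnitude in this toy setting, which is the source of the gain in power.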

We used data from the International Lymphoma Epidemiology Consortium (InterLymph) for discovery and from the German-speaking Myeloma Multicenter Group (GMMG), the International Multiple Myeloma rESEarch (IMMEnSE) consortium, as well as summary statistics from the FinnGen study for replication, for a total of 5982 MM cases and 266,173 controls. Detailed characteristics of the study populations are shown in the Supplementary Methods and Supplementary Table 1.

Candidate SNPs for replication were selected based on their association with MM risk and their functional role. First, we obtained summary results, including odds ratios (OR), 95% confidence intervals (95%-CI), and p-values, for the top SNPs of the InterLymph GWAS. Subsequently, all SNPs in the InterLymph MM data set with p < 5 × 10⁻⁴ (N = 4396) were looked up in the first replication dataset, the GMMG GWAS. We did not consider SNPs from 15 loci that were reported to be associated at genome-wide significance level in previous GWASs. All SNPs with significant p-values (p < 0.05) in the GMMG GWAS and ORs going in the same direction in both datasets were selected. The next step was to annotate the selected SNPs (N = 136) for their predicted function, using several bioinformatic tools and databases. Specifically, we looked at expression and splicing quantitative trait loci (eQTLs and sQTLs), SNPs located in transcription factor binding sites (TFBS), SNPs in long non-coding RNAs (lncSNPs), SNPs located within enhancers, and polymorphisms located in gene coding regions (missense, stop-gain, stop-loss, and synonymous SNPs). Supplementary Table 2 shows the details of the 136 SNPs and their predicted functional characterization. The resulting list from all annotations was pruned for linkage disequilibrium (LD) using the LDlink portal (https://ldlink.nci.nih.gov/). Only SNPs with pairwise r² < 0.6 were kept, resulting in a total of 12 independent loci on 9 chromosomes. Replication in IMMEnSE and FinnGen was attempted for SNPs that showed an association with risk in the meta-analysis of the InterLymph and GMMG GWAS and had at least one in silico functional annotation.
After exclusion of SNPs that had already been analysed in IMMEnSE in the context of previous projects and shown not to be significantly associated with MM risk (on chromosomes 6, 8, 12 and 21), four SNPs (rs12038685, rs2664188, rs12652920, rs28199) showed a low p-value of association with MM risk and had at least one predicted functional annotation; these were therefore chosen for replication in IMMEnSE (Supplementary Table 3). An in-depth description of the SNP functional annotation and selection, as well as the technical details of the genotyping and quality control, can be found in the Supplementary Methods.
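The first filtering steps described above (discovery p-value, nominal replication p-value, concordant direction of effect) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the `SnpResult` record, the function names, and the three toy SNPs with their values are all hypothetical, and functional annotation and LD pruning are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class SnpResult:
    """Hypothetical record holding summary statistics for one SNP."""
    snp_id: str
    or_discovery: float
    p_discovery: float
    or_replication: float
    p_replication: float


def same_direction(or_a: float, or_b: float) -> bool:
    """True when both odds ratios lie on the same side of 1."""
    return (or_a - 1.0) * (or_b - 1.0) > 0


def select_candidates(results, p_disc=5e-4, p_rep=0.05):
    """Keep SNPs passing both p-value thresholds with concordant effects."""
    return [
        r for r in results
        if r.p_discovery < p_disc
        and r.p_replication < p_rep
        and same_direction(r.or_discovery, r.or_replication)
    ]


# Toy data (hypothetical identifiers and values, for illustration only).
toy_results = [
    SnpResult("snp_A", 1.20, 1e-4, 1.15, 0.01),  # passes every filter
    SnpResult("snp_B", 1.25, 2e-4, 0.90, 0.02),  # opposite directions
    SnpResult("snp_C", 0.85, 3e-4, 0.88, 0.20),  # not significant in replication
]

selected = select_candidates(toy_results)
print([r.snp_id for r in selected])
```

The default thresholds mirror the values stated in the text (p < 5 × 10⁻⁴ in discovery, p < 0.05 in replication); everything else is an assumption made for the example.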

The association between each SNP and MM risk was analysed with logistic regression models, estimating ORs, their 95%-CI, and the associated p-values. The analyses were adjusted for age (at diagnosis for MM cases and at recruitment for controls), sex, and the first 10 principal components for GWAS data, or country of origin in IMMEnSE, which lacks GWAS data. All meta-analyses were conducted with R, using a fixed-effect model to combine the summary statistics of the different studies. The I² statistic was computed to quantify heterogeneity across studies.

rs28199, on chromosome 5, was associated with MM risk in IMMEnSE (OR = 1.19, 95% C.I. = 1.03–1.39, p = 0.018) and FinnGen (OR = 1.17, 95% C.I. = 1.05–1.31, p = 0.014). The G allele of this SNP was significantly associated with increased MM risk at a genome-wide level in the meta-analysis of the four datasets (OR = 1.18, 95% C.I. = 1.11–1.23, p = 3.18 × 10⁻¹⁰), with no heterogeneity among the studies (I² = 0) (Table 1, Fig. 1).

Table 1: Association results of the SNPs selected for replication in IMMEnSE.
Fig. 1: Meta-analysis result of rs28199.

Forest plot of the meta-analysis using a fixed-effect model across all four datasets. Heterogeneity was assessed using the I² statistic. OR = odds ratio, 95% CI = 95% confidence interval.

This polymorphism was selected because it is predicted to affect the binding sites of three transcription factors: IRF1, STAT2_STAT1 and FOXP1. The strongest effect of the SNP was calculated for IRF1 (interferon regulatory factor 1), a member of the IRF family that was first recognized for its role as an activator of genes involved in both innate and acquired immune responses. IRF1 activates a set of target genes associated with the regulation of the cell cycle, apoptosis and the immune response [9, 10]. According to the SNP2TFBS database, rs28199 is predicted to modify an IRF1 binding site, leading to stronger binding, which could in turn promote oncogenesis considering the set of genes that IRF1 regulates. The minor allele of rs28199 is located within a regulatory region which, according to the Variant Effect Predictor tool (https://www.ensembl.org/info/docs/tools/vep/index.html) and HaploReg (https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php), binds the CTCF protein, a highly conserved zinc finger protein with various cellular regulatory roles. Perturbations of CTCF binding cause different types of 3D genome reorganization and may activate neighboring oncogenes [11]. Among the genes that CTCF regulates is STK10, which encodes a serine/threonine-protein kinase that is highly expressed in hematopoietic tissue [12]. In various lymphoid cells, rs28199-G is associated with increased expression of STK10. Overexpression of STK10 has been reported in several cancer types, including acute myeloid leukemia (AML), another blood malignancy [13, 14].

We used data from the 500 Functional Genomics cohort of the Human Functional Genomics Project (HFGP; http://www.humanfunctionalgenomics.org/site/) to explore the possible role of the four SNPs selected for the final replication steps in modulating the immune response. Namely, we tested whether any of the SNPs of interest were cytokine expression quantitative trait loci (cQTLs) using data from in vitro stimulation experiments, as well as absolute numbers of 91 blood-derived cell populations and levels of 103 serum or plasma inflammatory proteins. The cQTL analyses showed that rs28199-G is also associated with an increased blood level of interleukin-6 (IL-6) (beta = 0.075, p = 0.002). IL-6 is a cytokine with a well-established role as a growth and survival factor in MM [15]. Specifically, in line with our results, increased IL-6 levels promote the proliferation, survival, drug resistance, and migration of MM cells, thereby facilitating disease progression [16]. The counts of cell populations and the levels of serum or plasma inflammatory proteins were not significantly associated with the SNPs of interest.

In conclusion, we identified a new genetic association for MM, supported by functional biological explanations, thus highlighting the importance of secondary analysis using functional approaches for GWAS.