Introduction

Head and neck cancers (HNC) represent a group of patients with increasing loco-regional control and survival rates [1,2,3]. Curative treatment is usually made up of surgery, radiotherapy (RT) or a combination. Primary RT is often preferred for reasons of cosmesis and preserved organ function. However, RT also comes with the price of RT-induced toxicity. In HNC this toxicity is primarily characterised by mucositis, dysphagia, fibrosis of the skin, atrophy of the epithelial linings and xerostomia. The risk of RT-induced toxicity is influenced by various treatment and patient-related factors. A review of normal tissue complication (NTCP) models from 2010–2017 in standard fractionated RT of HNC identified the following parameters to be associated with RT-induced toxicity endpoints: Mean dose, maximal dose, irradiated volumes, the dose leading to a 50% complication rate (D50), RT technique (Intensity-Modulated Radiation Therapy (IMRT) vs. 3D Conformal RT technique), gender, concomitant chemotherapy and stage of T-site [4]. However, increasing evidence also points to an individual inherent predisposition of developing toxicity [5] and in 2009, the Radiogenomics Consortium [6, 7] was established, facilitating research efforts in genetic variations associated with RT-induced toxicity. In studies of breast-, head and neck and prostate cancer cohorts, several SNPs associated with different morbidities after RT have been identified [8,9,10,11]. Two recent GWA studies explored RT-induced toxicity (temporal lobe brain injury and oral mucositis, respectively) in nasopharyngeal HNC patients [12, 13]. And another recent study of 37 patients identified associations between somatic genetic variants in HNC tumour tissue and severity of RT-induced morbidity, which is the first study to link genetic variants in the tumour to surrounding normal tissue morbidity [14]. However, GWASs with respect to radiation-induced toxicity in non-nasopharyngeal HNC patients remain to be investigated. The present study aimed to identify genetic variants associated with either specific RT-induced toxicity endpoints or a general proneness to develop toxicity after RT across endpoints. We conducted a two-stage genome-wide study comprising a total of 1780 European HNC patients with larynx and oro/hypopharynx as the primary tumour site. Further validation was pursued in an Asian cohort of HNC patients with nasopharynx as the primary tumour site.

Materials and methods

We performed a two-stage GWAS including 1780 HNC patients with oropharyngeal/hypopharyngeal or laryngeal cancer from two European cohorts. In a discovery study including 1183 patients, common genetic variants were tested for their association with either specific radiation-induced morbidities or with a general susceptibility to develop toxicity after RT across endpoints. In a replication study including 597 patients, the genome-wide significant variants identified in the discovery study were tested in an independent cohort of comparable HNC patients and if successfully replicated, variants were meta-analysed. A subsequent attempt to replicate significant findings in a cohort of 235 Asian HNC patients with nasopharyngeal cancer was made.

Discovery study

DAHANCA cohorts

Patients in the discovery study were treated within The Danish Head and Neck Cancer Group (DAHANCA) protocols in Denmark for head and neck squamous cell carcinoma in the period of 2000–2011. Eligibility criteria were stringently set to ensure a uniform cohort as follows: Histologically verified pharyngeal or laryngeal squamous cell carcinomas T1-4N0-3M0. Curative intent treatment regimen, consisting of primary RT ± concomitant treatment, >600 days recurrence-free (curative single lymph node extirpation on the neck was accepted) and available biological tissue with germline DNA. Patients were excluded by the death of any cause or switch to palliative treatment regimen during primary RT. Patients were censored by recurrence date in case of recurrence after 600 days. RT doses were 66, 68 (accelerated 6 fx/week regime) or 70 Gy (standard 5 fx/week regime). Fractionation doses were 2 Gy/fx. Concomitant treatment included weekly cisplatin and Nimorazole. Cisplatin was offered to patients with N1–3 disease from 2007, dosage 40 mg/m2 once weekly during RT. Nimorazole was administered to all patients except patients with glottic laryngeal cancers T1N0M0. RT technique was IMRT from 2006. Before this, 3D-Conformal RT was applied. The 1183 patients included were from the following cohorts:

DAHANCA10 (N = 81): In a randomised phase III study, the effect of Darbepoetin-alfa in anaemic patients (haemoglobin < 14.5 g/dl for men and < 13.5 g/dl for women) was analysed with loco-regional failure as the primary endpoint. Patients were eligible by the following criteria: Squamous cell carcinoma of the larynx, oropharynx, hypopharynx or oral cavity T1-4N0-3M0 (excluding stage I glottic carcinomas), WHO performance score 0–2, age ≥18 years. The study was interrupted after an interim analysis showing the inferiority of the Darbepoetin-alfa arm [15].

DAHANCA19 (N = 485): In a randomised, open-label phase III trial in 2007–2012, the effect of zalutumumab, an EGFR-inhibitor, was analysed with loco-regional failure as the primary endpoint. Patients were eligible by histologically verified squamous cell carcinoma of the oral cavity, larynx, oropharynx or hypopharynx, stage T1-4N0-3M0 (excluding stage I + II glottic carcinomas), WHO performance score 0–2, age ≥18 years [16].

DAHANCA Prospective biobank protocol (N = 617): Consecutive patients with available biobank material, age ≥ 18 years, treated according to standardised national DAHANCA guidelines [17].

Endpoints

Follow-up data were prospectively registered weekly during RT and after completion of RT course, at 2 weeks and at 3, 6, 12, 24, 36, 48 and 60 months. RT-induced toxicity in the present study was scored as the maximal grade at any time during treatment or follow-up.

Acute toxicity endpoints comprised acute dysphagia and mucositis between week 2 after RT start and 14 days after treatment completion. Late endpoints comprised late dysphagia, xerostomia, mucosal atrophy and skin fibrosis between 600 days and 5 years after treatment completion. Under the hypothesis that mucosal atrophy and skin fibrosis are manifestations of similar biological pathway processes, these entered a combined fibrosis/atrophy endpoint. Additionally, tube feeding at 6 months was included as a binary variable. This endpoint represents a composite multifactorial endpoint dependent on not only RT- and patient-related factors but also on factors such as inter-institutional variation in the availability of specialised knowledge in tube feeding and cultural differences in when to start and when to end tube feeding. It was included as a surrogate marker for severe toxicity after RT.

Acute and late endpoints were registered ordinally according to standardised DAHANCA forms, which are comparable with LENT-SOMA scales [18]. For the purpose of the present study, endpoints were dichotomised into “moderate-severe” and “severe”. Grading and cut-off points are available in Supplementary Table 1. Severe mucositis (corresponding to ulceration) was not tested independently due to a low number of events.

With the purpose of normalising and combining acute, late and global endpoints, Standardised Total Average Toxicity (STAT) scores were estimated aggregating ordinal endpoints (Supplementary Table 1). This method was previously described elsewhere [19]. STATacute included acute dysphagia and mucositis. STATlate included late dysphagia, xerostomia, fibrosis and atrophy. STATglobal combined STATlate and STATacute. Tube feeding was not included in the STAT scores.

Covariates

Available covariates were age, sex, total RT dose, concomitant chemotherapy, protocol, and a surrogate for irradiated volume (Supplementary Table 1). As dose-volume histograms were not available, a trichotomised surrogate marker for irradiated volume was estimated from the cancer site and TNM stage. The rationale was as follows: Irradiated volumes depended on site and stage. Patients with T1a-T1bN0M0 glottic laryngeal carcinomas received small box fields to the larynx without an elective volume. The remaining patients with N0 disease received bilateral (or unilateral for tonsillar tumours without the involvement of soft palate or base of tong) elective volumes by the following criteria: Oropharyngeal and laryngeal cancers without subglottic disease: lymph node levels II and III. Hypopharyngeal cancers: levels II, III and IV. Subglottic laryngeal cancers: levels II, III, IV and VI. In general, patients with N1–3 disease received a larger irradiated volume than patients with N0 disease as the positive nodes were irradiated to the prescribed tumour dose and elective volumes could be expanded beyond those described above. Based on this, the volume surrogate cut-offs were: Volume 1 = T1a-1bN0M0 glottic laryngeal carcinomas, Volume 2 = all other TxN0 sites, Volume 3 = TxN1-3 carcinomas.

Genotyping, imputation, quality control and bioinformatics

Germline whole-genome DNA was extracted from buffy coats and SNP genotyping was done using the Infinium OncoArray-500K Bead Chip (Illumina Inc., San Diego, USA). Quality control processes adhered to OncoArray guidelines [20] and are available in Supplementary Document and Supplementary Table 2. Haplotype estimation (phasing) was done using SHAPEITv2(r837). Imputation was done with IMPUTE2 using the last phase of 1000 Genomes, mapped to build GRCh37/hg19 as a reference.

Statistical analysis

The cohort was analysed as a pooled dataset of 1183 patients from DAHANCA10, DAHANCA19 and the DAHANCA Prospective biobank protocol as they were eligible by the criteria set. Covariates considered clinically relevant were tested against endpoints in a univariate logistic regression model for binary endpoints and a linear regression model for STAT (continuous) scores (Supplementary Fig. 1). If significant at an α = 0.05 level they entered multivariate analysis and by the method of backwards stepwise selection, final models were created. The total dose was included in the final models for all endpoints. The volume surrogate entered models for all endpoints except severe late dysphagia and tube feeding. Concomitant chemotherapy entered models for acute endpoints. Sex entered models for xerostomia and STAT scores. The protocol entered models for xerostomia, STATlate and STATglobal. Principal components were introduced stepwise in analysis and included if significant (α = 0.05).

Associations between genotypes and endpoints were tested using a logistic regression model for binary endpoints and a linear regression model for STAT (continuous) scores. All analyses were adjusted for covariates in a case-control design. A log-additive genetic model was assumed for single endpoints and a linear model was assumed for STAT scores. SNPs were included as a number of effect alleles or imputed genotype dosage resulting in a per-allele odds ratio (OR) or a regression coefficient (β) for each SNP. The significance level was set to p < 5 × 10−8 and thus was not adjusted for testing multiple endpoints. This choice was based on including genetic variants suggestively significant in the replication phase.

In a Bayesian approach, we estimated the posterior probability of a false discovery (pBayes), assuming a prior probability = 10−4 and a prior variance = 0.352 [21, 22]. Top variants with a posterior probability <10% were considered a plausible true positive discovery. Data preparation and statistical analyses were carried out using Stata13 (StataCorp, TX USA), R Statistical Package v3.3.0 [23] and SNPTEST [24]. The study is reported in compliance with the STROGAR guidelines for radiogenomic studies [25].

Power analysis

Power was estimated using the Genetic Association Study (GAS) Power Calculator [26] and is illustrated for selected endpoint frequencies in Supplementary Fig. 2. For the discovery phase, in general associations with ORs <1.7 were not likely to be identified with a power of 80%.

Replication studies

Dutch cohort

The study population consisted of 1102 HNC patients eligible by the same criteria as in the discovery study and treated with definitive or postoperative RT or Chemo-RT from 2007 to 2017. All patients were included in the UMCG-HNC prospective data registration program, in which all baseline patient-, tumour- and treatment characteristics, as well as acute and late toxicity, were prospectively scored. Patients were followed up to 5 years after completion of treatment (NCT02435576 at clinicaltrials.gov).

Endpoints, SNPs and covariates

A replication study was performed on genetic variants reaching GWAS significance in the discovery phase and including suggestively significant associations with p < 10−5 for the specific endpoint. The replication study included the same covariates associated with the endpoint as in the discovery study. In the discovery cohort, acute toxicity was assessed weekly up to the second week after completion of treatment, whereas in the replication cohort, acute toxicity was assessed weekly during radiation and up to 5 weeks after completion of RT. The included endpoint, mucositis, was defined as moderate or severe on at least one time point between the second week after the start of RT and the fifth week after completion of RT.

Genotyping, imputation, quality control and bioinformatics

Out of 1,102 patients, 607 were genotyped on Illumina human hap 550k v.3.0 and 495 on Illumina global screening array. Quality control procedures were performed using standard procedures. First, samples and SNPs with call rates below 98% and MAF < 1% were removed. We checked the ethnicity of subjects using multidimensional scaling (MDS) clustering of our samples with HapMap Phase3 individuals using EIGENSTRAT [27]. Samples that deviated more than 3 SD from the mean of their closest clusters were removed. Second, genotypes that passed the QC were imputed on the backbone of Haplotype Reference Consortium reference panel (HRC) version R1.1. Third, we further performed post-imputation QC involved removing SNPs with an imputation quality (info) score of R2 < 0.3, with a MAF < 0.01, SNPs that had a discordant MAF (maximum allowed difference < 0.15) compared to the reference panel, and strand ambiguous AT/CG SNPs and multi-allelic SNPs. Two imputed datasets were then merged together using PLINK (version 1.90). As the genotyping array differed between the discovery and the replication cohorts, some variants from the discovery phase were not reproduced and thus not analysed in the replication phase.

Statistical analysis

We used logistic regression analysis and included covariates similar to the discovery study. The top ten principal components were included. The effect estimates were reported as OR with their corresponding 95% CI and associated p-values. The summary results of both discovery and replication studies were meta-analysed using an inverse variance-weighted fixed effects model as implemented in METAL (version 2011-03-25) [28]. A p-value <0.05 was considered statistically significant for replication analysis. For the meta-analysis, we maintained a significance threshold of p < 5 × 10−8. In all cases, it was a part of the quality control to check that directionality of effect and that MAFs were comparable between discovery and replication phases.

Singapore cohort

Genetic variants identified in the discovery phase and successfully replicated were independently tested in an Asian cohort of 235 HNC patients with nasopharyngeal carcinoma treated with curative intent with Radiotherapy (70 Gy/35 fractions) ± chemotherapy (clinicaltrials.gov: NCT04340024). Germline DNA was whole-exome sequenced using the Agilent V6 + UTR NovaSeq platform. Genotyping, quality control and bioinformatics was performed using Genome Analysis Tool KIT (GATK v 4.1.8.0) and adhering to the GATK best practices workflow (https://github.com/gatk-workflows/gatk4-germline-snps-indels). Germline variants were called for each sample using HaplotypeCaller [29]. This was followed by a joint genotyping step via GATK GenotypeGVCF across the whole cohort. The cohort followed the same protocol as the Dutch cohort in terms of endpoints, covariates included, statistical analysis methods and p-value significance cut-off.

Results

Discovery study

Genotyping and patient characteristics

Genotyping and imputation resulted in a dataset of 14,640,905 common variants in 1237 patients including duplicate controls. After Quality Control processes, 7,178,424 variants (SNPs, deletion/insertion variations (indels) and copy number variations (CNVs)) and 1183 patients remained in analysis. Stepwise outcomes from QC processes are available in Supplementary Table 2. Clinical characteristics are displayed in Table 1.

Table 1 Patient characteristics of discovery and replication cohorts.

GWAS–QQ and Manhattan plots

The 7,178,424 SNPs were tested for associations with acute dysphagia, mucositis, late dysphagia, xerostomia, skin fibrosis, fibrosis/atrophy, tube feeding at 6 months, STATacute, STATlate and STATglobal. The first 7 endpoints were analysed as binary outcomes, and the first six of these were analysed at two different cut points, ‘moderate-to-severe’ and ‘severe’ (Supplementary Table 1). Due to a low number of severe events, mucositis was only evaluated as ‘moderate-to-severe’. The analysis identified a significant association for mucositis with QQ and Manhattan plots shown in Fig. 1. QQ and Manhattan plots for all endpoints are shown in Supplementary Fig. 3.

Fig. 1: GWAS results for moderate/severe mucositis.
figure 1

QQ plot (left) and Mangattan plot (right).

In moderate-to-severe acute dysphagia, an imputed single SNP, rs2024705 on chromosome X, reached genome-wide significance. Imputation quality was rather poor (info = 0.44). This SNP represents an intron variant in LINC00632 (long intergenic non-protein-coding RNA 632) and was considered a potential false-positive finding.

Mucositis

Three SNPs (rs28419191, rs10875554 and rs1131769) tagged a locus on chromosome 5 that reached genome-wide significance for association with radiation-induced mucositis (Table 2 and Fig. 2). The strongest association was found for effect allele C (allele frequency 0.87) of rs28419191 with OR = 2.27; 95% CI 1.70–3.05; p = 4.4 ∙ 10−8; pBayes = 0.068. The SNPs were well imputed (imputation info score range 0.95–0.96) and in close LD with each other (Supplementary Fig. 4A). The locus spans an area from 4 kb upstream of the ECSCR gene (rs28419191 and rs10875554) to an exon in the STING1 gene, where rs1131769T > C represents a missense variant changing the codon from H (His) > R (Arg). The locus is in cis-eQTL with DNAJC18, PROB1, SLC23A1, SPATA24 and in sQTL with STING1 [30]. DNAJC18, PROB1, SLC23A1 and SPATA24 have a similar expression pattern across tissue types and ECSCR and STING1 share a different expression pattern (Supplementary Fig. 5).

Table 2 Summary results of the association between a locus on chromosome 5 and RT-induced mucositis in discovery, replication and meta-analysis.
Fig. 2: Locus Zoom plot for a locus on chromosome 5 associated with moderate/severe mucositis.
figure 2

Variants in orange or red colour are in Linkage Disequilibrium with the specified SNP rs28419191. The locus is in cis-eQTL with genes marked in red and in sQTL with gene marked in green.

ECSCR (Endothelial Cell Surface expressed Chemotaxis and apoptosis Regulator protein, previously known as Endothelial Cell-Specific Molecule 2 (ECSM2)) is expressed mainly in endothelial cells and blood vessels [31] and is involved in the regulation of chemotaxis and apoptosis through modulation of various pathways including EGFR, VEGF and FGFR signaling [32, 33].

STING1 (STimulator of INterferon response cGAMP interactor 1, previously known as transmembrane protein 173 (TMEM173)) has several splice variants. It is critically involved in the β-interferon synthesis and inflammation/immune system function [34] and non-canonical activation of a STING signalling pathway has been observed following DNA damage [35].

DNAJC18 is a member (C18) of the DnaJ heat shock protein family (Hsp40) [36]. The transcripts of this family function as chaperone molecules protecting other cellular proteins across pro- and eukaryotic species [37].

PROB1 (PROline rich Basic protein 1) is a protein-coding gene [36]. Its function in humans is not known, although one study has identified a possible association between alterations in this gene and the eye disorder keratoconus [38].

SLC23A1 (SoLute Carrier family 23 member 1, formerly SVCT1) is a member of the solute carrier (SLC) group of proteins and is involved in the transport of ascorbic acid (Vitamin C) across the cell membrane [39, 40].

SPATA24 (spermatogenesis associated 24) also a protein-coding gene, expressed in several tissues, in particular in the testes and prostate. It is preserved in several mammals where it plays a role in Cell differentiation as well as spermatogenesis [39].

The study did not identify other loci reaching genome-wide significance. A near-significant association was found between a locus on chromosome 6 and STATacute, available in Supplementary Figs. 4B, 6 and in Supplementary Table 3, where a full list of variants reaching p-values < 10−5 is available for all endpoints.

Replication study

Dutch cohort

Genotyping and patient characteristics

Among 1102 HNC samples, 957 patients and 6,334,277 SNPs passed imputation process and quality controls. The Dutch cohort includes HNC patients treated with both definitive and postoperative RT. Phenotypic quality control yielded 597 patients who were treated with definitive RT and met the remaining eligibility criteria matching the discovery study. Table 1 presents the characteristics of the Dutch replication population.

Replication and meta-analysis

The discovery study identified three variants significantly and 27 variants suggestively associated with mucositis. Two out of 3 genome-wide significant and 21 out of 27 suggestively significant variants were available for analysis in the replication study. We found the rs28419191*C allele with an OR of 1.55 (95% CI 1.03–2.32; p < 0.035) and rs1131769*C allele with an OR of 1.65 (95% CI 1.10–2.47; p < 0.015) significantly associated with increased risk of mucositis in the replication cohort (Table 2). None of the 21 suggestive SNPs from the discovery study was associated with mucositis in the replication cohort (Supplementary Table 3, sheet “Mucositis”). Next, we performed a meta-analysis on summary data for the discovery and replication studies, the combined subjects amounting to 1780 patients. The pooled OR for rs28419191*C allele was 1.96 (95% CI 1.45–2.46; pmeta-analysis = 3.67 × 10−14) and that for rs1131769*C allele was 1.95 (95% CI 1.48–2.41; pmeta-analysis = 4.34 × 10−16) and thus significantly associated to mucositis (Table 2).

Singapore cohort

In 235 samples, 49 patients with nasopharyngeal carcinoma experienced moderate-severe mucositis. Patient characteristics are presented in Table 1. In the locus eligible for analysis, rs28419191 and rs10875554 were unavailable for analysis and only rs1131769, was therefore analysed. This variant failed to replicate with an OR = 0.94 (95% CI 0.46–1.92; p = 0.86). The genotyping quality was good (0.99). However, the effect allele C had a lower frequency than in the DAHANCA and Dutch cohorts with 0.77 vs 0.87 and 0.87, respectively.

Discussion

Many studies in radiogenomics have been conducted since the turn of the millennium and several genomic loci have been identified as significantly associated with RT-induced toxicity. Up to now, breast, prostate, gynaecological and lung cancer cohorts are represented in hypothesis-driven studies of associations between SNPs and RT-induced toxicity. In non-hypothesis-driven GWA studies, breast and prostate cancer cohorts have been explored [8,9,10,11] and two fairly recent studies explored loci associated with temporal lobe injury [12] and oral mucositis [13] reaching p-values of 10−7 after RT for nasopharyngeal carcinoma.

The present study is the first GWAS study to explore associations with RT-induced toxicity in non-nasopharyngeal HNC cohorts. Previous studies in breast cancer cohorts have indicated that genetic variations may predispose to both an endpoint-specific as well as a more general susceptibility of developing RT-induced toxicity [41, 42] and hence supports the idea that bringing a cancer site with new RT-induced toxicity data to the research field of radiogenomics may contribute both site-specific and general value. The study identified and replicated a locus on chromosome 5 in the STING1 gene to be significantly associated with RT-induced mucositis and meta-analysis confirmed this association. The expression pattern falls into two groups, rendering it likely that this locus could be involved by either splice variants in the STING1 pathway with which the locus is in sQTL or the group of genes (DNAJC18, PROB1, SLC23A1 and SPATA24) with which the locus is in cis-eQTL (Supplementary Fig. 5).

An effort was done to replicate this finding in an Asian ancestry cohort from Singapore using whole-exome sequencing data. However, it failed to replicate due to most likely a different tumour site (nasopharynx) bringing a different distribution of RT-induced toxicity, a low sample size and the fact that only one variant, rs1131769, was available for analysis and this had a lower minor allele frequency than in the European ancestry cohorts studied here. In accordance with this chain of reasoning, the loci reaching low p-values in the present study were different from the ones identified in [13] exploring RT-induced oral mucositis in patients with both a different tumour site and ancestry (nasopharyngeal cancers in Chinese patients).

We did not identify SNPs significantly associated with any other acute and late endpoints. This may own to the sample size of 1183 patients in the discovery study, which was a limitation to the study as it rendered the identification of low-penetrance genetic variants less likely.

The significance level in the discovery phase was set to p < 5 × 10−8 and thus was not adjusted for testing multiple endpoints. This was a pragmatic choice that follows consensus in GWA studies and further was based on the considerations that the replication phase included genetic variants suggestively significant, and as the multiple endpoints tested are not individually independent, a Bonferroni estimate of a corrected p-value may increase the risk of false-negative discoveries to a larger extent than it will decrease the risk of false-positive findings. However, we do acknowledge this limitation and recommend that the finding in our study should be validated in other independent comparable cohorts before considered conclusive.

It is our hope that this study can serve such future independent studies as summary data for associations reaching p-values < 10−5 are published in Supplementary Table 3 and may serve as a reference for directionality, effect size and endpoint association. By design, the present study analysed the association between individual SNPs and endpoints. The complexity of various types of epistasis and other causes for modification of gene function (epigenetics, gene-environment interactions) were beyond the scope.

One of the more robust findings in radiogenomics is an association between ATM variants and RT-induced morbidity. A meta-analysis across breast and prostate cancer cohorts identified rs1801516 *A associated with ORs of 1.5 or 1.2 for being in the upper quartile of acute and late toxicity, respectively [9]. We did not identify associations between ATM variants and endpoints in this study which may own to the limitation of sample size, not rendering identification of variants in this range of effect size likely.

In conclusion, this is the first study to identify and replicate a locus associated with RT-induced mucositis in HNC patients with a laryngeal or oro/hypopharyngeal tumour site. It serves to extend the field of radiogenomics and adds a piece to the increasing evidence that an individual genetic predisposition is a contributory factor in the development of toxicity after RT. Collaborative efforts of combining cohorts of HNC patients with solid data for RT-induced morbidity would enable identification of genetic variants with lower penetrance, possibilities for validation and in the longer run, to add biology variables to Normal Tissue Complication Models. Such efforts are warranted under the auspices of the Head and Neck Cancer Group in the Radiogenomics Consortium.

Reporting guidelines

The study was reported complying with the STROGAR guidelines for radiogenomic studies.

Web resources

dbSNP, https://www.ncbi.nlm.nih.gov/SNP/, GAS power calculator, http://csg.sph.umich.edu/abecasis/cats/gas_power_calculator/index.html, GATK, https://gatk.broadinstitute.org/hc/en-us, The Genotype-Tissue Expression (GTEx) Project, gtexportal.org, IMPUTE2, http://mathgen.stats.ox.ac.uk/impute/impute_v2.html, LocusZoom, http://locuszoom.org/, National Institute of Health LD analysis and plot, https://analysistools.nci.nih.gov/LDlink/, PLINK, https://www.cog-genomics.org/plink2, R Software, https://www.r-project.org/, SHAPEIT, https://mathgen.stats.ox.ac.uk/genetics_software/shapeit/shapeit.html, SNPedia, https://www.snpedia.com/index.php/SNPedia, SNPTEST, https://mathgen.stats.ox.ac.uk/genetics_software/snptest/snptest.html.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.