Lifetime Ultraviolet Radiation Exposure and DNA Methylation in Blood Leukocytes: The Norwegian Women and Cancer Study

Ultraviolet radiation (UVR) exposure is a leading cause of skin cancers and a ubiquitous environmental exposure. However, the molecular mechanisms relating UVR exposure to melanoma are not fully understood. We aimed to investigate whether lifetime UVR exposure could be robustly associated with DNA methylation (DNAm). We assessed DNAm in whole blood in three data sets (n = 183, 191, and 125) from the Norwegian Women and Cancer cohort, using Illumina platforms. We studied genome-wide DNAm, targeted analyses of CpG sites indicated in the literature, global methylation, and accelerated aging. Lifetime history of UVR exposure (residential ambient UVR, sunburns, sunbathing vacations and indoor tanning) was collected by questionnaires. We used one data set for discovery and the other two for replication. One CpG site showed a genome-wide significant association with cumulative UVR exposure (cg01884057) (pnominal = 3.96e-08), but was not replicated in either of the two replication sets (pnominal ≥ 0.42). Two CpG sites (cg05860019, cg00033666) showed suggestive associations with the other UVR exposures. We performed extensive analyses of the association between long-term UVR exposure and DNAm. There was no indication of a robust effect of past UVR exposure on DNAm.

methylation, including imputation of LINE-1 specific CpGs, in whole blood. UVR exposure is the main driver for skin photoaging, and we also examined if lifetime UVR exposure could result in an accelerated epigenetic age, estimated from DNA methylation in leucocytes 19 . Analyses were performed in the discovery set and two replication sets.

Materials and Methods
Study samples. The NOWAC cohort includes 172 000 women aged 30-70 years (born 1927-1965) when included in 1991-2006 from a nationwide random sample (response 54%) 20 . Host characteristics and lifetime UVR exposure were collected through questionnaires at baseline and every 4-6 years. Approximately 50 000 women in the NOWAC cohort donated blood samples and constitute the postgenome cohort 21 . The present paper includes controls from three data sets from the postgenome cohort, all cancer-free women at the time of blood sampling and selected as controls in case-control studies of melanoma (discovery set, n = 183 controls), breast cancer (replication set R 1 , n = 191 controls) 22 , and lung cancer (replication set R 2 , n = 125 controls) 5,23 . Matching factors were time since blood sampling and year of birth (1943-1947, 1948-1952, 1953-1957).
The women gave written informed consent to donate blood samples for biomarker analyses. We confirm that all methods employed in the study were performed in accordance with the relevant guidelines and regulations.
UVR exposure. On the basis of ambient UVR hours at place of residence, ambient UVR is categorized as low (northern Norway), medium-low (central Norway), medium (southwestern Norway), and highest (southeastern Norway) 24,25 . In the baseline and follow-up questionnaires, participants reported history of severe sunburns (never, 1, 2-3, 4-5, ≥6 times per year), average number of weeks spent on sunbathing vacations per year (never, 1, 2-3, 4-6, ≥7 weeks) and average use of an indoor tanning device (never; rarely; 1, 2, or 3-4 times/month; >1 time/week) in childhood (≤9 years), adolescence (10-19 years), and adulthood (>19 years) 24 . The reported frequencies of indoor tanning, sunbathing vacations, and sunburns were transformed into equivalents of yearly sessions and multiplied by the length of each interval 16 . The participants were then classified into five categories: non-exposed and quartiles. To capture the tail of the distribution, the upper quartile was further divided into two equally sized groups (i.e. six categories in total). Cumulative UVR exposure was constructed by summing the categories (i.e. scores 0-5) for indoor tanning and sunbathing vacations 16 .
Covariates. Participants reported education (≤10, 11-13, ≥14 years), smoking (never, former, current smoker), and hair color (dark brown/black, brown, blond/yellow, red), which is the best measure of skin sensitivity to UVR in the NOWAC cohort 13,15 .
DNA methylation analyses. DNA was extracted at the HUNT Biobank, Levanger, Norway, and methylation arrays were analyzed at the Institute for Genomic Medicine, Torino, Italy.
DNA was extracted from the blood samples using the QIAsymphony DNA Midi Kit (Qiagen, Crawley, UK), and 1000 ng (discovery) and 500 ng (R 1 and R 2 ) of DNA were bisulfite-converted (EZ-96 DNA Methylation-Gold ™ Kit, Zymo Research, Orange, CA, USA) according to the manufacturer's instructions.
The samples for the discovery set were randomly placed on the plates, and randomly assigned to a row/column position, with equally many cases and controls on each column and plate. The Illumina Infinium MethylationEPIC BeadChips were hybridized according to the manufacturer's protocol. All predicted cross-hybridizing probes (44 210) 26 , out-of-band probes (2843), and all probes with at least one CpG with detection p-value above 0.8 (5504 CpGs) were removed. This left 775 528 CpGs in samples from 183 controls. DNA methylation at LINE-1 CpGs was imputed in the discovery set using the R-package REMP 27 and its default pipeline, without removing cross-hybridizing probes. We assessed only LINE-1 methylation and not other repetitive elements, since this is by far the most studied marker of association between UVR and DNA methylation.
For R 1 and R 2 , the Illumina Infinium HumanMethylation450 BeadChips were hybridized according to the manufacturer's protocol. Plate specific batch effects were corrected using ComBat 28,29 . After quality control that included removal of CpGs with >20% missing and non-specific CpGs, 416 412 autosomal CpGs remained for R 1 and 450 890 for R 2 . Quality controls have been described in detail for R 1 22 and R 2 5 . All three data sets had background subtraction and control normalization performed with minfi to reduce background noise and dye bias 30 . Beta mixture quantile normalization 31 using the wateRmelon R-package 32 was performed for type I and type II probes in the three sets jointly. Cell type composition was estimated using the Houseman algorithm 33 with a reference data set from Reinius et al. 34 . White blood cell composition estimates were obtained for CD4+ and CD8+ T-cells, NK cells, B cells, monocytes, granulocytes, and we estimated the granulocytes-to-leukocytes ratio.
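As an illustration of the exposure scoring described under UVR exposure, the following Python sketch assigns score 0 to non-exposed participants, scores 1-3 to the three lower quartiles of the exposed, and scores 4-5 to the two halves of the upper quartile, and then sums two such scores into a cumulative score. The function names, the use of `statistics.quantiles` for the cut points, and the handling of ties are assumptions made for illustration; this is not the authors' code.

```python
from bisect import bisect_left
from statistics import quantiles


def score_exposure(lifetime_sessions):
    """Map lifetime session counts to ordinal scores 0-5 (hypothetical helper)."""
    exposed = [x for x in lifetime_sessions if x > 0]
    # Octile cut points among the exposed; keep the 25th, 50th, 75th
    # and 87.5th percentiles (the last one splits the upper quartile).
    octiles = quantiles(exposed, n=8)
    cuts = [octiles[1], octiles[3], octiles[5], octiles[6]]
    # Non-exposed get 0; exposed get 1 plus the number of cut points
    # strictly below their value.
    return [0 if x <= 0 else 1 + bisect_left(cuts, x) for x in lifetime_sessions]


def cumulative_uvr(tanning_scores, vacation_scores):
    """Sum the 0-5 scores for indoor tanning and sunbathing vacations."""
    return [t + v for t, v in zip(tanning_scores, vacation_scores)]
```

Under this sketch the cumulative score ranges from 0 to 10. Tied values all receive the same score, so the groups need not be exactly equal in size.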

Statistical analysis.
Correlations between the five UVR variables were estimated using Pearson's correlation coefficient, r. Linear regression was used to study associations between UVR exposure variables and estimated fraction of each cell type component, as well as the lymphocyte to neutrophil ratio, adjusting for age at sampling, smoking status, time in freezer, and data set.
The methylation values were transformed from beta-values to M-values using a logit2 transform. Smoking results in a strong, well-known pattern in DNA methylation, and as a quality control we performed linear regression with smoking status as the main exposure and DNA methylation as the outcome.
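The logit2 transform maps a beta-value b in (0, 1) to M = log2(b / (1 - b)). A minimal Python sketch is given below. The clipping constant `eps`, which keeps the logarithm finite at b = 0 or b = 1, is an assumption and is not taken from the paper.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta-value to an M-value (logit base 2)."""
    # Clip to (eps, 1 - eps) so that fully (un)methylated sites stay finite.
    clipped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clipped / (1.0 - clipped))


def m_to_beta(m):
    """Inverse transform: convert an M-value back to a beta-value."""
    return 2.0 ** m / (2.0 ** m + 1.0)
```

A beta-value of 0.5 corresponds to M = 0, and a beta-value of 0.8 to M = 2, since 0.8 / 0.2 = 4.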
In the genome-wide analysis, DNA methylation was modelled as the outcome and UVR as the covariate in a linear regression model for each CpG, adjusting for age, smoking status, and time in freezer. Additional adjustment was performed for hair color, as a marker of skin sensitivity 15 , and for cell type composition. We tested for interactions between cumulative UVR exposure and hair color at each CpG, and similarly between lifetime sunburns and hair color.
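The per-CpG model can be sketched as an ordinary least squares fit of the M-value on the UVR score plus covariates, repeated for every CpG. The Python sketch below uses only the standard library. The function names are hypothetical, covariates are assumed to be numeric already (a categorical variable such as smoking status would first need dummy coding), and the sketch returns only the UVR coefficient and its standard error, not a p-value.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_cpg(m_values, uvr, covariates):
    """Return (coefficient, standard error) for UVR at one CpG."""
    # Design matrix columns: intercept, UVR score, then the covariates.
    X = [[1.0, float(u)] + [float(c) for c in cov]
         for u, cov in zip(uvr, covariates)]
    n, p = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(X, m_values)) for i in range(p)]
    inv = invert(xtx)
    coefs = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    rss = sum((y - sum(b * x for b, x in zip(coefs, row))) ** 2
              for row, y in zip(X, m_values))
    sigma2 = rss / (n - p)  # residual variance
    return coefs[1], math.sqrt(sigma2 * inv[1][1])
```

In a genome-wide run, `fit_cpg` would be called once per CpG with the same `uvr` and `covariates` and a different vector of M-values, and the resulting p-values would then be corrected for multiple testing.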
We present estimated regression coefficients with standard errors (SE). Note that, as we are testing for trends through ordered categorical exposure variables, the estimated regression coefficients should be interpreted with caution. Furthermore, we used non-negative matrix factorization to summarize the UVR exposure variables and hair color, and to cluster the individuals into three exposure groups. Analysis of variance (ANOVA) was used to test for differences between these groups with regard to each CpG.
All p-value adjustments for multiple testing were done with the Benjamini-Hochberg false discovery rate (FDR) procedure. A CpG site was defined as significant if the FDR adjusted p-value was <0.05, and as replicated if the nominal p-value in any of the replication sets was <0.05. Replication was attempted for the 20 CpGs with lowest p-values in the discovery set.
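The significance and replication rules above can be sketched in Python as follows. `benjamini_hochberg` computes standard BH-adjusted p-values. `is_replicated` is a hypothetical helper that encodes the stated rule (nominal p < 0.05 in any replication set). Neither is the authors' code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def is_replicated(replication_pvalues, alpha=0.05):
    """True if the nominal p-value is below alpha in any replication set."""
    return any(p < alpha for p in replication_pvalues)
```

For instance, assuming all 775 528 CpGs retained in the discovery set were tested, the smallest nominal p-value of 3.96e-08 gives at most 3.96e-08 × 775 528 ≈ 0.031 after adjustment, in line with the reported adjusted value of 0.03. With replication p-values of 0.64 and 0.42, `is_replicated` returns `False`.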
We attempted replication of the previously reported association between ambient UVR exposure and cg26930596 in the PRKCZ gene 10 in all three sets using linear regression.
We assessed global DNA methylation by two indicators: the average over all measured CpGs, and by imputing methylation at CpG sites in LINE-1. Average methylation levels were analyzed using linear regression, with UVR as the exposure and average methylation as the outcome, adjusting for age, smoking, and time in freezer. The association between UVR exposure and LINE-1 CpGs was modeled with two models, one at the level of individual CpGs with linear regression and one at the level of LINE-1 subfamilies using linear mixed models, with subfamilies as grouping factor, adjusting for age, smoking, and time in freezer for both models.
Biological age (PhenoAge) was estimated based on the 513 CpGs published by Levine et al. 19 , out of which 512 were available in the discovery set, 506 in R 1 and 505 in R 2 . Age acceleration phenotype was defined as the difference between the chronological age and the estimated biological age 35 . Linear regression was used with age acceleration as the outcome and UVR as the exposure, adjusting for smoking and time in freezer. All analyses were performed using R software 36 .
Ethics approval. The Medical Ethical Committees of North Norway have approved the NOWAC study and the storage of human biological material, as well as each sub-study used in this project.

Results
Women in the discovery and replication sets were older than women invited to the postgenome cohort (Table 1). Furthermore, R 2 was older, recruited earlier, and had shorter time in freezer, lower education, and more non-smokers compared to the discovery and R 1 sets. UVR exposures in the three sets are presented in Table 2, and R 2 had the lowest proportion of women from the region with highest ambient UVR. Low correlation was found between residential ambient UVR and the other four UVR variables (−0.06 ≤ r ≤ 0.14), and between lifetime sunburns and the other UVR variables (0.09 ≤ r ≤ 0.16). Indoor tanning and sunbathing vacations were moderately correlated (r = 0.30).
Table 1. Characteristics of the women in the discovery and replication sets, and the women invited to participate in the postgenome cohort.
When testing each cell type independently, the UVR exposure variables were not significantly associated with cell type composition in any of the three sets (0.06 ≤ p adjusted ≤ 0.98) (Supplementary Table S1). The lymphocyte to neutrophil ratio was also not significantly associated with any UVR exposure (0.07 ≤ p adjusted ≤ 1). A total of 326 758 CpGs were present in all three sets. In the analysis of smoking, 113 CpG sites had p adjusted < 0.05, of which 58 were replicated in at least one of the two replication sets (Supplementary Table S2).
Differentially methylated CpG sites. The top 20 CpGs associated with each UVR exposure are listed in Supplementary Table S3. Two of the top 20 CpGs replicated in either R 1 (sunburns and cg00033666) or R 2 (ambient UVR and cg05860019). One CpG (cg01884057) was genome-wide significantly associated with UVR exposure (cumulative UVR) in the discovery set (p adjusted = 0.03), but it was replicated neither in R 1 (p nominal = 0.64) nor in R 2 (p nominal = 0.42).
We tested for interaction between lifetime sunburns and hair color and found no interaction for any of the CpGs (p adjusted ≥ 0.42, discovery set). When testing the interaction between lifetime cumulative UVR and hair color, significant interaction was found for one CpG (cg15277477, p adjusted = 4.1e-3), but this was not replicated (p nominal = 0.81 in R 1 and p nominal = 0.99 in R 2 ). After adjustment for cell type composition (Supplementary Table S3), no substantial differences were observed. The correlation coefficients between the top 20 effect estimates in the model with and without this adjustment ranged from 0.80 to 0.99.
The ANOVA comparing each CpG between the groups from the cluster analyses identified two CpGs in the top 20 that were replicated: cg21452538 (p nominal = 3.69e-5 in discovery) was replicated in R 1 (p nominal = 0.03) and cg05967123 (p nominal = 2.75e-5 in discovery) in R 2 (p nominal = 0.02). The main driver of these associations was a factor composed of sunbathing vacations and cumulative UVR exposure.
CpG site indicated in the literature. The CpG cg26930596 in the PRKCZ gene, previously reported to be associated with ambient UVR exposure, was significantly associated with ambient UVR exposure in R 1 (p nominal = 9.34e-3), but not in the discovery set (p nominal = 0.65) or in R 2 (p nominal = 0.28).
Global DNA methylation. Average methylation was not associated with any of the UVR exposure variables in the discovery or replication sets (0.06 ≤ p nominal ≤ 0.93) (Supplementary Table S5). Indoor tanning and cumulative UVR exposure had negative effect estimates in all three sets, sunbathing vacation had positive effect estimates in all three sets, while lifetime sunburns and ambient UVR had a positive effect estimate in the discovery set and negative estimates in both replication sets. In the discovery set, no LINE-1 CpG was significantly associated with any of the UVR exposure variables (data not shown). No LINE-1 subfamily was significantly associated with any of the UVR exposure variables (Supplementary Table S6).
Accelerated aging. Accelerated aging was associated with sunbathing vacations in R 2 (regression coefficient = 1.8, SE = 0.48, p nominal = 1.20e-3), but not in the other two sets (0.08 ≤ p nominal ≤ 0.32). The remaining four UVR exposure variables were not significantly associated with accelerated aging (0.06 ≤ p nominal ≤ 0.88; with the lowest p-value for cumulative UVR in R 2 ).

Discussion
We investigated the association between five UVR exposure variables (residential ambient UVR exposure, lifetime sunburns, lifetime sunbathing vacations, lifetime indoor tanning, and cumulative UVR exposure) and DNA methylation in blood leukocytes in a discovery and two replication sets from the NOWAC cohort.
Only one CpG site (cg01884057) was significantly associated with cumulative UVR exposure, but this finding was not replicated. Additionally, two CpGs were suggestively associated with the other four UVR exposure variables and replicated in one of the replication sets.
The CpG associated with cumulative UVR in our study lies in a DNase hypersensitive region 7 kb upstream of the Adenylate Cyclase 3 (ADCY3) gene, shown to be a potential oncogene 37 . However, no robust association with skin cancer has been indicated. Ambient UVR exposure was suggestively associated with a CpG (cg05860019) about 10 kb upstream, in the shore of a CpG island associated with the One cut homeobox 1 (ONECUT1) gene. This gene is mainly transcribed in liver cells, but is important for cell cycle regulation and potentially associated with tumorigenesis or metastasis of malignant tumors 38 . The CpG suggestively associated with lifetime sunburns (cg00033666) lies in the shore of a CpG island next to the master regulator gene Nuclear Receptor subfamily 2, group F, member 2 (NR2F2). This gene has been suggested as an inhibitor target for melanoma and other cancers 39 . Somatic mutations in NR2F2 have been observed in about 1% of melanomas 40 .
There are few studies on UVR exposure and DNA methylation, and most of the existing studies focus on cell lines or short-term exposure to UVR. The most similar study to ours in terms of design is the study by Aslibekyan et al. 10 , who investigated ambient UVR exposure and DNA methylation in CD4+ T-cells, which have been shown to express the CCR10 receptor when stimulated with sun-induced vitamin D3 41 . One CpG from Aslibekyan et al. 10 was nominally significant (cg26930596) in one of our samples, but with an effect estimate in the opposite direction.
The average beta-value across all methylation probes was not associated with any of the UVR exposure variables, but we observed an indication (not statistically significant) of hypomethylation. This is in line with previous research, which has observed a loss of DNA methylation after UVR exposure 11 . UVR exposure has been linked to LINE-1 hypomethylation in previous studies, but this has not been translated into an increased risk of melanoma 42 . LINE-1 methylation is often used as an indicator for global methylation. In this study, we used both average methylation over all observed CpGs, and imputed CpG levels at LINE-1. Neither was significantly associated with any of the UVR exposures.
UVR exposure is a primary driver of photoaging in the skin, and it can be hypothesized that other tissues could also show accelerated aging after UVR exposure. However, the association between sunbathing vacations and accelerated aging observed in R 2 was not very strong, and was not found in the discovery or the R 1 sets. No other UVR exposures showed a significant association.
An important strength of our study is the detailed life history of solar and artificial UVR exposure in the population-based NOWAC cohort, which has been consistently associated with risk of cutaneous melanoma 13,15-17 and squamous cell carcinoma 14,18 . Indoor tanning irradiances are high in UVA radiation 43 while UVB is the main cause of sunburns 44 . The intensity of the UVR exposure could not be directly assessed through the questionnaires, but since the UVR questions were segmented into age intervals in decades for each individual, estimates of dose were obtained.
Smoking, an exposure with a demonstrably strong epigenetic footprint, has been extensively studied, including in the NOWAC study 5 . As a quality control of the methylation data, we studied the associations with smoking in all three data sets. All significant probes that replicated across our sets have been previously reported in a large meta-analysis on methylation and smoking 45 , which demonstrates that the data were of sufficient quality and sample size to find biomarkers of strong exposures.
A weakness of our study is the lack of a directly exposed tissue: we used whole blood rather than skin samples. When studying the epigenetic patterns relating to environmental exposures or diseases, being as close as possible (in time and space) to the affected tissue is important since the epigenetic profile differs between tissues. Different cell types will also respond differently to the same environmental exposure. However, this has to be balanced against the availability of biosamples. Large-scale, general-purpose biobanks suitable for pre-diagnostic sampling will usually store only blood samples. While skin is the primary exposed tissue to UVR, and thus the most relevant tissue for studying direct effects of the UVR exposure, secondary effects of chronic UVR exposure might also be observed elsewhere, including in circulating lymphocytes 46 . The suppression of the immune system by UVR exposure is documented and is used as treatment for some autoimmune diseases 47 . Under the hypothesis that sustained UVR exposure influences the immune system, the cell type composition would be affected by the UVR exposure, and thus act as a mediator on the path from exposure to outcome. This places cell type composition on the causal pathway between UVR exposure and DNA methylation. Thus, after adjusting for cell type composition we will not be able to identify the total effect of the UVR exposure on DNA methylation, but rather the direct effect that is not caused by changes in the immune cell composition. We did not observe any strong associations between UVR exposures and the estimated white blood cell composition. The only UVR exposure variable showing some potential association with cell type composition was residential ambient UVR, which is also potentially confounded by population substructure. Given that no other UVR exposure showed a consistent association with cell type composition, this association is more likely caused by factors other than UVR exposure.
Moreover, additional adjustment for cell type composition did not change the results.
The three data sets were collected as controls for case-control studies of melanoma (discovery), breast cancer (R 1 ) and lung cancer (R 2 ), which explains the older age of these sets compared to all women invited to the postgenome cohort and also the differences in time in freezer. The long-term storage of whole blood and DNA in biobanks may have a negative effect on DNA yield, but the integrity of DNA methylation does not seem to be affected by this 48 . However, there may be systematic differences between the three samples that are reflected in the time spent in freezer, such as difference in lab procedures, and this variable may serve as a proxy for such differences. To make the three samples more comparable, we therefore included this adjustment in all models.
UVR exposure is the main risk factor for skin cancer, but whether this risk is mediated by DNA methylation remains undetermined. We have made an extensive analysis of the potential association between UVR exposure and DNA methylation in blood, investigating the problem using different statistical approaches. Thus, we feel confident that long-term UVR exposure has little effect on DNA methylation, and if DNA methylation is acting as a mediator of the melanoma risk from chronic UVR exposure, this is not reflected in DNA methylation in white blood cells.

Data availability
The DNA methylation data generated and/or analyzed in the current study can be accessed upon reasonable request to the originating cohort. Access will be conditional on adherence to both local and national ethical and security policy. The R code used for the analyses presented in the paper is available upon request.