ETS transcription factors induce a unique UV damage signature that drives recurrent mutagenesis in melanoma

Recurrent mutations are frequently associated with transcription factor (TF) binding sites (TFBS) in melanoma, but the mechanism driving mutagenesis at TFBS is unclear. Here, we use a method called CPD-seq to map the distribution of UV-induced cyclobutane pyrimidine dimers (CPDs) across the human genome at single nucleotide resolution. Our results indicate that CPD lesions are elevated at active TFBS, an effect that is primarily due to E26 transformation-specific (ETS) TFs. We show that ETS TFs induce a unique signature of CPD hotspots that are highly correlated with recurrent mutations in melanomas, despite high repair activity at these sites. ETS1 protein renders its DNA binding targets extremely susceptible to UV damage in vitro, due to binding-induced perturbations in the DNA structure that favor CPD formation. These findings define a mechanism responsible for recurrent mutations in melanoma and reveal that DNA binding by ETS TFs is inherently mutagenic in UV-exposed cells.

Several experimental studies were performed in fibroblasts. It is not clear that these same mechanisms apply to melanocyte or keratinocyte cell types, which would be the cells most vulnerable to UV mutation-associated cancer. Some validation should accompany the fibroblast studies.
Reviewer #2: Remarks to the Author: This exciting and well-written study by an outstanding team of investigators tests the hypothesis that transcription factor "binding may alter the rate at which UV lesions form in DNA". While this team has begun to investigate this question in yeast cells (their recent paper in Proc Natl Acad Sci U S A 113, 9057-62, 2016), this problem has not been addressed in any significant way in the context of the environment of the mammalian cell nucleus. Thus, this study will have an important lasting impact on the field. What allows the authors to fully address this question, and what sets the study apart from previous work by the Sancar and Morrison laboratories, is the development and validation of a new method, CPD-seq, for analyzing cyclobutane pyrimidine dimer-induced damage at base pair resolution across the mammalian cell genome. The strength of this assay is that it does not require immunoprecipitation of CPD, which in the past probably underestimated the frequency of CPD at sites other than TTs. Using this new tool, the authors nicely demonstrate that "ETS TFs induce a unique signature of UV damage hotspots at specific locations in their binding motif, both in cells and in vitro". The experiments for the most part are well described and carefully repeated to gain strong statistical significance. Moving from an initial observation in mammalian cells to validation with a purified in vitro system is a strength of this study. To this end, Figures 5 & 6 add a significant amount to this study and provide a strong validation of their major hypothesis. Also comparing signature mutations in melanomas to the frequency of CPD at TF is another important strength of this work. Finally, the addition of extensive supplementary data adds significantly to this work. Overall this is an outstanding study that will have lasting impact on the field once published. The following suggestions are to improve the overall impact of the study.
1. The authors down play the significance of the frequency of 6-4 photoproducts in the introduction (top of page 3). The authors indicate, "to a much lesser extent, pyrimidine (6-4) pyrimidone photoproducts (6-4PPs) at dipyrimidine sequences in DNA." The authors should avoid this vague term and indicate that depending on the sequence the frequency of 6-4 photoproduct can range anywhere from 10% to equal numbers. This also raises the question as what happens when there is a 6-4 photoproduct within the same sequence as a CPD -while rare this could happen.
2. Again top of page 3, Citations of previous work. Overall the authors do a OK job, but miss the opportunity on page 3 to include references to recent Sancar papers and a recent paper from the Morrison lab, EMBO J. 2017 Oct 2;36(19):2829-2843. They do cite these papers later, but not in this early context.
3. The authors do a fair job of detailing the various doses of UV-C used in their experiments in the methods, but to make it easier for the reader, it would be helpful to indicate the doses used in the results sections and also the figure legends whenever possible. In the methods the authors state: "NHF1 cells were irradiated with 20 J/m2 or 100 J/m2 UV and were harvested immediately after UV treatment." However which dose goes with which figure is not always clear. This is especially important in Figure legend 1 that sets the bar for the rest of the study. And later in Figure 3 where they are making direct comparisons to mutation frequencies -can one dose be used to extrapolate to mutation frequencies?
4. What is the relative ability to detect a lesion with this sequencing method, i.e. what is the signal to noise and the depth of the sequencing -how high is the resolution? This needs to be clearly stated early in this present paper. Perhaps they did a dose response in their previous work-up of yeast cells (PNAS 2016). Understanding these questions is essential for drawing all the following conclusions.
5. On page 9 the authors make the strong statement, "Elevated UV damage, not lower repair activity, promotes mutagenesis at ETS binding sites." However, they used previous data from one study, Nature 532, 264-7 (2016) to support this claim. Why did they not include recent data from the Sancar laboratory as well? This again occurs in the discussion on page 18. The authors should also cite the Morrison EMBO J paper, as they make a similar claim in their study. Overall this is sure to be controversial so the stronger the argument that can be made in this study the better.
6. The gray and black lines in Figure 4 are difficult to see clearly -a better color scheme is advised.
Reviewer #3: Remarks to the Author: Summary: A long-standing question is why tumor mutations occur where they do. The answer was long assumed to be selection for altered protein function or regulation, but in the past few years factors such as chromatin and replication timing have been shown to also influence the likelihood of causing a driver mutation. Recently, several papers bioinformatically identified transcription factor binding sites (TFBS) as determinants of mutation hotspots in melanoma (notably refs 6 and 9), and at least one prominent paper (ref 9) ascribed this effect to reduced repair at TFBS. The latter paper's observation was a little puzzling because the repair defect was only a few-fold, resulting in reducing an elevated repair in the regions flanking the TFBS back to a repair level typical of the genome as a whole; why would this give that site more than the genome-wide density of mutations?
The current paper is a pivotal contribution to our understanding. It proceeds on two fronts: a closer look at the TFBS sequences involved, and direct measurement of the frequency of UV-induced cyclobutane dimers (CPD) at single bases. Given the prior literature, the CPD measurements are the most novel contribution. The authors find that if they focus on canonical TFBS, rather than all sequences immunoprecipitated by TFBS as prior papers did (a point that is buried in the Discussion, perhaps out of diplomacy): A subclass of TFBS -- ETS1 -- are associated with melanoma mutation hotspots; these sites are hotspots for CPD formation; a 16-fold elevation in CPDs accompanies a 125-fold increase in mutations; repair is elevated at these sites, not suppressed, so the mutation site is chosen by having more initial damage not less repair; melanoma mutations are correlated only with ETS1 sites having a TC or CC sequence, allowing the UV signature C->T mutation; this correlation is mechanistically confirmed using DNA bound to ETS1 and irradiated in vitro, but is prevented by mutant ETS1 that cannot bind to one of the two hotspot sites; these results are nicely explained by the biophysics of CPD formation, which requires certain geometries (bond distances and angles) that are in fact created in DNA-ETF1 crystal structures; and molecular modeling shows that ETS1 shifts the thermally induced range of DNA conformations into the required range of geometries. QED.
Critique: This is a very careful and thoughtful study, with nice controls such as in vitro validation and, within that validation, mutants that abolish the behavior being tested. The writing is clear. Some minor points need to be addressed:
Scientific -1. The two DNA strands are combined for the analysis. The authors' previous paper explains why they do this, but it would help the reader to explain it in the Supplemental here, too.
2. By eyeball, the CPD motif seems to usually or always be on the template strand. Are there examples on the non-transcribed strand that would serve to indicate the role of transcription-coupled repair in converting the CPD to a mutation?
3. The read depth of the experiments needs to be stated. This also bears on the "sample size" reporting requirement.
4. The Introduction states that prior studies using CPD immunoprecipitation suffered from the limitation that they would only precipitate T-containing CPDs. But the original Mori '91 paper states that the TDM-2 antibody binds TT and CT dimers. If there is a later reference showing this not to be the case, it should be cited.

5. A hypothetical point, for the authors to think about. They are careful to say that the CPD is the cause of these melanoma mutation hotspots rather than "reduced repair". Intentionally or not, they don't say repair is not important. This leaves open the possibility that the mutation hotspot is related to repair in an additional way -arising from error-prone repair of the CPD at this site of ETF1-altered DNA conformation. Error-prone repair (as opposed to replicational bypass) is an old idea that I haven't heard discussed lately.
Writing -1. The authors would be better served in the Abstract and Introduction by talking about "mutations in regulatory regions" rather than "non-coding mutations". To readers not in the field, and maybe in it, "non-coding mutations" sounds like "silent mutations" that don't change the amino acid. I sympathize with the desire to get to the CPD data as soon as possible, so perhaps another approach would be to keep the current figures but to call attention to the apparent discrepancy in peak width and say "and therefore we investigated the TFBS specificity in greater detail".
3. p9 line 11. "mCPD formation at the center of the active TFBS"
4. p12 line 13. Table S1 seems to now be S3.

Response to Reviewers:
Referee #1: 1. "According to this study, mutagenesis is enhanced at sites where ETS1 and related family members bind. Perhaps I missed this, but can ETS factors still bind to these UV-mutated binding sites at the same affinity? It is important to be clear about this point."

RESPONSE:
The core ETS binding motif, consisting of TCC (GGA on the opposite strand), is invariant across all ETS binding sites, and is critical for ETS proteins to bind DNA, based on both in vitro binding experiments and structural studies [1,2]. A C-to-T mutation at position 0 relative to the ETS midpoint (i.e., TCC), which is enriched in melanoma, is expected to disrupt ETS transcription factor binding affinity to nonspecific levels. Indeed, a recent study shows that a C-to-T mutation at this location, which mimics the UV-induced mutation in melanomas, significantly reduces binding of GABPA (an ETS family member) to the ETS motif in the SDHD promoter in vitro [3]. We have included a detailed discussion of this important point on page 18 of the revised manuscript.
2. "Several experimental studies were performed in fibroblasts. It is not clear that these same mechanisms apply to melanocyte or keratinocyte cell types, which would be the cells most vulnerable to UV mutation-associated cancer. Some validation should accompany the fibroblast studies."

RESPONSE:
We show using our CPD-seq method that UV-induced CPD lesions are elevated at ETS binding sites in UV-irradiated fibroblasts. We further show, using an in vitro model system, that purified ETS1 protein stimulates the formation of UV-induced CPD lesions at ETS binding sites, and we describe the molecular mechanism likely responsible for elevated CPD formation at these sites. These in vitro studies validate our findings in fibroblasts, and further demonstrate that ETS binding alone, in the absence of any additional, cell-type specific co-factors, induces a unique UV damage signature. Given the conserved DNA binding mechanism by ETS transcription factors across different cell types (and in different species), we expect that the same UV damage signature will occur in melanocyte or keratinocyte cells. We plan to map CPD lesions in these cell types in our future studies, but we believe these experiments are beyond the scope of the current manuscript.
Referee #2: 1. "The authors down play the significance of the frequency of 6-4 photoproducts in the introduction (top of page 3). The authors indicate, 'to a much lesser extent, pyrimidine (6-4) pyrimidone photoproducts (6-4PPs) at dipyrimidine sequences in DNA.' The authors should avoid this vague term and indicate that depending on the sequence the frequency of 6-4 photoproduct can range anywhere from 10% to equal numbers. This also raises the question as what happens when there is a 6-4 photoproduct within the same sequence as a CPD -while rare this could happen."

RESPONSE:
We have modified the sentence (see page 3) to read: "UV light induces the formation of cyclobutane pyrimidine dimers (CPDs) and, to a lesser extent, pyrimidine (6-4) pyrimidone photoproducts (6-4PPs) at dipyrimidine sequences in DNA." We did not feel it was necessary to indicate the exact fold-difference in lesion formation between CPDs and 6-4PPs, since this is described in much more detail in the cited reference, and is not the focus of our study. Since CPD lesions are overall more abundant (~3-4-fold higher than 6-4PPs following UV-C irradiation [4]) and are repaired more slowly than 6-4PPs, and because it has been long recognized that CPDs are the major mutagenic lesion in skin cancers (e.g., [5]), we mapped just the CPD lesions in our study.
It is possible (albeit very unlikely) that a nearby 6-4PP could impact our ability to map a CPD lesion using CPD-seq. If the 6-4PP were within ~150 bp to the 5' side of the CPD lesion on the same DNA strand, we would not be able to PCR amplify and sequence the DNA fragment associated with that particular CPD lesion. However, this is unlikely to have a significant effect on our CPD-seq data, since the UV doses used in our study yielded a relatively low density of UV lesions (i.e., for 100 J/m2, there are roughly 0.8 CPD lesions per kb, based on T4 endo V digestion and alkaline gel electrophoresis of the irradiated DNA sample; the density of 6-4PPs would be even lower than this).
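As a rough, back-of-the-envelope illustration of this argument (a sketch under stated assumptions, not an analysis taken from the manuscript), the chance that a 6-4PP falls within the ~150 bp window can be estimated if lesions are treated as independent and uniformly distributed (a Poisson model). The sketch assumes that the 0.8 CPD per kb figure applies per strand, and that the 6-4PP density is the CPD density divided by the ~3-4-fold ratio quoted above. The function name `prob_blocking_lesion` is hypothetical.

```python
import math


def prob_blocking_lesion(lesions_per_kb: float, window_bp: float) -> float:
    """Poisson probability of at least one lesion within a window.

    Assumes lesions occur independently and uniformly along the strand,
    which ignores sequence context and lesion hotspots.
    """
    expected = lesions_per_kb * window_bp / 1000.0
    return 1.0 - math.exp(-expected)


CPD_PER_KB = 0.8   # density quoted in the response for 100 J/m2
WINDOW_BP = 150    # approximate window 5' of the CPD, from the response

# The ~3-4-fold CPD:6-4PP ratio is applied here as an assumption.
for fold in (3.0, 4.0):
    six4_per_kb = CPD_PER_KB / fold
    p = prob_blocking_lesion(six4_per_kb, WINDOW_BP)
    print(f"CPD:6-4PP ratio {fold:.0f}: 6-4PP density {six4_per_kb:.2f}/kb, "
          f"P(6-4PP within {WINDOW_BP} bp) = {p:.3f}")
```

Under these assumptions, only a few percent of CPD-containing fragments would carry a 6-4PP in the window, in line with the statement that the effect on the CPD-seq data should be small. Clustering of lesions at specific sequences would change the estimate locally.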
2. "Again top of page 3, Citations of previous work. Overall the authors do a OK job, but miss the opportunity on page 3 to include references to recent Sancar papers and a recent paper from the Morrison lab, EMBO J. 2017 Oct 2;36(19):2829-2843. They do cite these papers later, but not in this early context."

RESPONSE:
We have now included the references mentioned by the reviewer near the top of page 3 of the revised manuscript.
3. "The authors do a fair job of detailing the various doses of UV-C used in their experiments in the methods, but to make it easier for the reader, it would be helpful to indicate the doses used in the results sections and also the figure legends whenever possible. In the methods the authors state: 'NHF1 cells were irradiated with 20 J/m2 or 100 J/m2 UV and were harvested immediately after UV treatment.' However which dose goes with which figure is not always clear. This is especially important in Figure legend 1 that sets the bar for the rest of the study. And later in Figure 3 where they are making direct comparisons to mutation frequencies -can one dose be used to extrapolate to mutation frequencies?"

RESPONSE:
To clarify, most CPD-seq experiments were conducted with a dose of 100 J/m2 UV-C. The only experiment that used 20 J/m2 is shown in Supplemental Fig. S2B. We have modified the sentence mentioned in the Methods section (page 21) to make this more explicit. We've also clarified the UV doses used in the legend to Figure 1 and in the results text. Data in Figure 3 are used to highlight the similar trends for CPDs and mutations, as both peak near the midpoint of ETS binding sites. The CPD yields generated at 100 J/m2 are not meant to be directly extrapolated to mutation frequency in sequenced melanoma tumors. We chose 100 J/m2 as a representative UV dose for CPD-seq