Squalene epoxidase is a bona fide oncogene by amplification with clinical relevance in breast cancer

SQLE encodes squalene epoxidase, a key enzyme in cholesterol synthesis. SQLE has sporadically been reported among copy-number driven transcripts in multi-omics cancer projects. Yet, its functional relevance has never been subjected to systematic analyses. Here, we assessed the correlation of SQLE copy number (CN) and gene expression (GE) across multiple cancer types, focusing on the clinico-pathological associations in breast cancer (BC). We then investigated whether any biological effect of SQLE inhibition could be observed in BC cell line models. Breast, ovarian, and colorectal cancers showed the highest CN driven GE among 8,783 cases from 22 cancer types, with BC presenting the strongest one. SQLE overexpression was more prevalent in aggressive BC, and was an independent prognostic factor of unfavorable outcome. Through SQLE pharmacological inhibition and silencing in a panel of BC cell lines portraying the diversity of SQLE CN and GE, we demonstrated that SQLE inhibition resulted in a copy-dosage correlated decrease in cell viability, and in a noticeable increase in replication time, only in lines with detectable SQLE transcript. Altogether, our results pinpoint SQLE as a bona fide metabolic oncogene by amplification, and as a therapeutic target in BC. These findings could have implications in other cancer types.

The gene SQLE, found on chromosome 8q24.13 in humans 1-3 , encodes squalene epoxidase, one of the key enzymes in the later stages of cholesterol synthesis [3][4][5][6][7] . SQLE catalyzes the oxidation of squalene to 2,3-oxidosqualene 5,6,8 , downstream of HMG-CoA reductase (HMGCR), the target of the statin class of cholesterol-lowering agents 9 . SQLE expression is almost ubiquitous in humans, with higher levels in skin, gastrointestinal mucosa, and central nervous system, and lower expression in skeletal muscle 10 . SQLE homologs can be found in several eukaryotic organisms. Indeed, fungal squalene epoxidase is inhibited by antimycotic agents, in use either topically or by oral administration since the 1990s (reviewed in 11 ). Of interest, some of those drugs also block the activity of SQLE at high concentrations in in vitro models 12,13 . Several investigators actively pursued the development of potent, selective human SQLE inhibitors as a feasible way to lower cholesterol [14][15][16] . However, reports of potential side effects in animal models, such as skin rash 18 or peripheral demyelination 19 , together with the widespread adoption and efficacy of statins 17 , halted the clinical development of SQLE inhibitors and their use in humans 11,20 , possibly prematurely and in spite of promising results 21 . As a consequence, interest in SQLE-targeted drug development progressively faded in later years and, to our knowledge, no clinical study has been conducted to this day.
With the advent of -omics technologies and high-dimensional, multilayered cancer analyses (exemplified by 22 ), several works were published identifying cancer genes with a potential for copy-number (CN) driven overexpression. In breast cancer, research results highlighted the high correlations between CN and gene expression (GE) of unsuspected and widely known oncogenes by amplification, such as ERBB2 (encoding the ERBB2 protein, also known as Her2) and MYC [23][24][25][26][27][28][29][30][31] . Scarce interest was dedicated, however, to the not infrequent appearance of SQLE among the top CN-GE correlating genes in several of those works (e.g., in 23,24,26,30 ). This was possibly due to the relative proximity of SQLE to MYC 2 , and to the then-prevailing focus on aberrations in the targetable kinome, a major topic of oncological research in the 2000s 32 . A few articles based on mid-size datasets described the potential role of SQLE overexpression in the definition of a prognostically unfavorable stage I-II estrogen receptor positive (ER+) breast cancer subgroup 33 or in African-American luminal-A breast cancer patients 34 , and the identification of clusters of breast tumors characterized, among other alterations, by SQLE CN amplification and overexpression 25 , or by its aberrant methylation and expression patterns in concomitance with MYC amplification 35 . However, no further systematic, large-scale effort was undertaken to follow up on those isolated observations.
In the present work, we set out to exhaustively analyze the correlation of SQLE CN and GE across multiple cancer histological types. Here, we provide the first systematic, large-scale assessment of the prevalence and interaction of SQLE CN amplification with its GE variation in breast cancer and other tumor types in humans. We subsequently focused our efforts on the association of clinical and pathological factors with SQLE in breast cancer, as well as on the potential prognostic relevance of SQLE in that disease. Finally, we investigated whether any biological effect of SQLE inhibition could be demonstrated in experiments performed in thoroughly characterized breast cancer cell line models.

SQLE gene expression shows a high correlation with its locus copy number in breast cancer.
To systematically assess which cancer histotypes showed an association between the CN of the chromosome 8q24.13 locus, where SQLE resides 1,2 , and SQLE gene expression, we analyzed publicly available data generated by The Cancer Genome Atlas (TCGA), limiting our study to 22 cancer types with at least 100 assessed patients (n = 8,783 cases, see Table 1), and with available SNP6 arrays and RNA-sequencing data. Gains at the SQLE locus, defined according to the GISTIC2.0 algorithm 36,37 , presented the highest frequency in ovarian cancer and the lowest in glioblastoma multiforme (76% and 12% of affected patients respectively, see Fig. 1A, leftmost panel). Gains at the SQLE locus were found in 62% of breast cancer patients. After correction for multiple testing, we found that 20 of the 22 cancer sets still presented a positive association between SQLE locus CN and SQLE GE, with a false discovery rate (FDR) < 0.001 (see Fig. 1A, rightmost panel). However, only breast, ovarian, and colorectal cancers showed a large effect size 38 for that association, with breast cancer presenting the strongest correlation (ρ = 0.71, see Fig. 1C top left) since it also represented the largest TCGA dataset, with 1,178 collected cancer specimens. Ovarian cancer had a similar SQLE CN-GE correlation coefficient (ρ = 0.71, see Fig. 1D top left), followed by colorectal cancer with ρ = 0.61. Taken together, these data suggest that SQLE CN may have a tissue-dependent role in contributing to the variability of SQLE GE, and that breast and ovarian cancers show the tightest correlation between those two parameters. As a comparison, it is worth noting that a prototypical oncogene driven by copy number amplification, ERBB2, shows a ρ = 0.63 in the same TCGA breast cancer dataset where SQLE was assessed.

SQLE and MYC are transcriptionally independent in breast and ovarian cancer, albeit residing in close proximity on chromosome 8.
Since its cloning and characterization 39 , MYC has been one of the most studied cancer-related genes. It is physically located on chromosome 8q24.21, the cytoband immediately adjacent to the SQLE locus, towards the telomeric end 2 . We therefore explored MYC and SQLE associations in terms of GE and CN in the breast and ovarian cancer TCGA datasets (see Supplementary Table S1 and Supplementary Fig. S1 [...]

Figure 1. All cancer histotypes with at least 100 cases collected by the TCGA were assessed for the presence of SQLE copy number gains and losses (blue and orange respectively, left bar chart (A)), and log2 ratios were correlated with normalized SQLE gene expression using the Spearman's correlation coefficient (ρ value on the x-axis, decreasing from highest to lowest, right bar chart (A)). SQLE copy number/gene expression correlation was invariably higher than for MYC, which is physically closely located on chromosome 8q, in both breast (topmost panels, (B)) and ovarian cancer (topmost panels, (C [...]

[...] positive lymph nodes (NPLN), histological grade, ER and Her2 status, we took advantage of the data generated by METABRIC, currently the single largest clinically annotated CN and GE breast cancer dataset publicly available 31 .
Using multiple correspondence analysis (MCA), a multivariate statistical method similar to principal component analysis, but suited for categorical data 41 , we reduced our variables of interest to the two dimensions explaining the largest fraction of the variance in the factors we considered from the METABRIC data. Clinical and pathological variables, together with SQLE expression levels, were projected as vectors in the space defined by those two dimensions (see Fig. 2). The position of the variable categories in this two-dimensional space reflects their mutual associations, with no a priori assumption on the underlying structure of the data. We observed that elevated SQLE expression lay in close proximity to high histological grade, and in the same spatial region where positive nodal status and larger (T2 or greater) T size were positioned.

[...] The largest effect sizes for those associations could be observed for histological grade and Her2 status, which indeed remained independently and significantly associated with SQLE GE in a multivariable linear regression model including all the aforementioned clinical and pathological parameters (see Table 2). In summary, by both MCA and classical statistical tests we could demonstrate that, in breast cancer, high SQLE GE characterizes clinically aggressive tumors (those with higher grade, larger size, and positive nodal status), and that SQLE tends to be overexpressed in Her2+ and ER- cases.

[...] Table 3 and Fig. 3D). We intentionally did not include the integrative clustering subgroups 31 or the PAM50 intrinsic subtypes 42 as variables in our model: due to the strong independent prognostic value of those genomic classifiers, they would be retained at the cost of excluding more traditional parameters such as grade and ER status, which were the focus of this exploratory characterization of SQLE in breast cancer.
To summarize, we demonstrated that SQLE overexpression is independently associated with an unfavorable outcome in breast cancer, even when taking into consideration classical clinical and pathological variables such as age, tumor size, nodal status, grade, ER, and Her2 status.

Breast cancer cell lines show highly variable SQLE CN and GE.
With the purpose of establishing an in vitro breast cancer cell line panel for SQLE characterization, we selected and acquired six breast cancer lines reported to present alterations in SQLE CN and GE levels from the CCLE 43 or the NCI-60 cell line panel 44 , after assessing the data available through the cBioPortal for Cancer Genomics 45 . We then characterized those lines for ploidy and absolute SQLE CN value by array-CGH, and for GE by qPCR (see Table 4, Fig. 4A,B, and Supplementary Fig. S2 online). SK-BR-3 and MCF-7 (see Fig. 4A) showed the highest levels of SQLE CN, each with more than 8 copies, through focal amplification of the chromosome 8q24.13 cytoband. Hs 578T presented the highest SQLE GE values and 5 copies of the gene through chromosome 8q gain, while MDA-MB-468 and T-47D were hypo-triploid, and were not characterized by SQLE focal alterations. MDA-MB-231 was unique, in that it behaved as a natural knockdown for SQLE, carrying only one copy of this gene and almost undetectable SQLE GE levels (see Fig. 4B). Altogether, the six breast cancer cell lines we selected were representative of the full spectrum of possible SQLE alterations, ranging from high-degree focal amplification, through arm-level gains with SQLE overexpression, to deletion with almost undetectable endogenous SQLE transcript.

SQLE pharmacological inhibition decreases breast cancer cell line viability in a copy-dosage correlated way.
Due to the association of SQLE with breast cancer aggressiveness, we hypothesized that its inhibition would lead to a decrease in cell proliferation. To control for potentially nonspecific toxic effects of cholesterol biosynthesis inhibition, we assessed in parallel cell lines with differential levels of SQLE expression. We therefore tested the viability of our six breast cell lines upon challenge with a SQLE inhibitor, terbinafine 46 . Terbinafine is used as an antifungal agent, since it inhibits fungal SQLE at plasma concentrations between 0.34 and 3.4 μM 47 . However, it also targets mammalian SQLE at higher concentrations 12,13 . Terbinafine indeed caused cancer cell demise in our breast lines, with an IC50 varying by almost an order of magnitude across the six assessed lines (see Table 4 and Fig. 4D,E). Of interest, terbinafine exerted its effect in an evident SQLE copy-dosage correlated manner (ρ = −0.81, P = 0.0499, see Fig. 4C). Moreover, Hs 578T showed a peculiar sensitivity to terbinafine, in spite of having a lower SQLE CN value than other lines. Not surprisingly, however, Hs 578T was the highest SQLE expresser in our panel, and indeed SQLE GE levels also showed a trend for correlation with terbinafine IC50 in the six-line panel (ρ = −0.71, P = 0.1361). In summary, we demonstrated that pharmacological inhibition of SQLE effectively decreases breast cancer cell line viability in a copy-dosage correlated manner, suggesting that SQLE may be a treatment target in this disease, and that an increase in SQLE CN and/or GE could be used as a predictive biomarker of sensitivity to selective inhibitors of mammalian SQLE activity.

SQLE silencing increases doubling time only in SQLE-expresser breast cancer cell lines.
Since terbinafine may act at least partially in a non-specific manner at high concentrations 48 , we induced transient SQLE silencing in two lines with high endogenous SQLE transcript levels (MCF-7 and Hs 578T), one line with low but detectable SQLE (T-47D), and the cell line carrying a deletion in SQLE, MDA-MB-231. Silencing was highly effective, since it reduced SQLE GE by more than 90% at 24 h in all three lines with detectable SQLE levels at baseline, compared to scrambled siRNA control (P < 0.001, see Fig. 4F). Upon SQLE silencing, all three SQLE-expresser cell lines showed an increase in their doubling times, by 47%, 17%, and 33% respectively, compared to cells treated with scrambled siRNA (see Table 4). MDA-MB-231, on the other hand, did not show any lengthening in its replication time. Taken together, our results demonstrate that targeting the SQLE transcript by siRNA-mediated silencing has an inhibitory biological effect only in breast cell lines that exhibit endogenous SQLE transcription, with MDA-MB-231 acting as a natural negative control for our experiments. Moreover, our findings constitute a methodologically independent confirmation that the decrease in cell viability observed in cell lines treated with terbinafine is indeed due to SQLE inhibition.

Discussion
In the present article, we assessed the presence of CN and GE aberrations of SQLE, a key enzyme in the synthesis of cholesterol 3-7 , across more than 8,000 cases from 22 cancer types made available by the TCGA. We found that SQLE CN gains are frequent in several histologies, that SQLE GE is tightly correlated with its CN values, and that this association is strongest in breast cancer, followed by ovarian and colorectal cancer. SQLE CN-GE correlation appears to be systematically stronger in breast and ovarian tumors compared to the same association calculated for MYC in those cancer types, in spite of the close proximity of the two gene loci 2 , which results in similar CN values for both genes. By exploring SQLE correlations with clinical and pathological variables in METABRIC 31 (the single largest, clinically annotated, publicly available CN/GE breast cancer dataset), we established that aggressive cases, defined by high histological grade, larger tumor size, nodal involvement, and by ER- and Her2+ disease, are characterized by SQLE overexpression. Moreover, we observed that overexpression of SQLE, but not of MYC, is independently and significantly associated with unfavorable outcome in breast cancer, even after taking into account the above-mentioned clinical and pathological parameters. Finally, through SQLE pharmacological inhibition and SQLE transcript-directed silencing experiments in a panel of breast cancer cell lines portraying the diversity of SQLE CN and GE, we demonstrated that SQLE inhibition results in a decrease in cell viability that is highly correlated with SQLE copy dosage, and that induced reduction of SQLE GE levels causes a noticeable lengthening in replication time, only in cells with detectable endogenous SQLE transcript.
Altogether, our results pinpoint SQLE as a bona fide metabolic oncogene by amplification, as well as a therapeutic target in breast, and possibly, other cancer types, since it responds to major requirements to be considered as an oncogene of therapeutic relevance 49 : first, SQLE is frequently altered by CN gains in breast cancer, and SQLE GE appears to be tightly regulated by increases in the copy dosage of its gene locus; second, SQLE is a key enzyme in the synthesis of cholesterol, and several studies have pointed toward a therapeutic effect of cholesterol lowering in cancer, possibly by decreasing cholesterol bioavailability, altering cancer cell membranes, and through other mechanisms [50][51][52] ; third, due to the CN-driven GE increase found only in cancer cells, SQLE may behave as an "oncogene by addiction" 53 (as opposed for example to HMGCR, the target of statins, for which no recurrent aberrations have been described in tumors), such that mammalian SQLE inhibition may have a high therapeutic index, leaving healthy cells relatively unharmed by SQLE blockade, whilst provoking cell demise in SQLE-amplified cells only: one such notable example is the selective activity of anti-ERBB2 agents in ERBB2-amplified breast cancer, which has led to their successful adoption in clinical practice; last but not least, researchers have already developed potent and selective inhibitors of mammalian SQLE 14-16 , albeit with an entirely different scope than the one we foresee, i.e. as anti-cancer agents.
Of interest, several genes, apart from SQLE and MYC, located in the q arm of chromosome 8 may have an even tighter CN-GE correlation than SQLE, and be endowed with essential biological properties for cancer proliferation and survival. A notable example is RAD21, found in close proximity to SQLE on chromosome 8q24.11 2 . RAD21 encodes a protein involved in DNA double-strand-break repair 54 , shows high-level amplifications in 37% of the TCGA breast cancer cases, and presents a significant transcriptional correlation with SQLE GE (ρ = 0.62) (data from http://www.cbioportal.org, accessed May 2nd, 2015). Our findings are not in contrast with a pleiotropic, multipronged impact of the amplification of this region. On the one hand, a direct role of SQLE overexpression in promoting neoplastic growth was shown by the clear in vitro evidence we generated that SQLE inhibition causes cell demise and slows replication: we therefore have a rationale to consider SQLE as an especially relevant cancer gene belonging to that region. On the other hand, it is entirely possible that several proteins encoded in the same chromosomal region cooperate to determine a more aggressive phenotype in cancer: indeed, it has been demonstrated that genes with a common final function can be physically clustered in the same genomic interval (see 55 and citations therein). We can speculate that concomitant overexpression of SQLE and RAD21 would enable a cancer cell to proliferate more effectively through a more proficient DNA damage repair system while dividing (through RAD21), and at the same time to speed up the process by more efficient membrane synthesis (courtesy of SQLE).
However, the key point of our research lies elsewhere: even if other genes found in the same physical region as SQLE have their own roles in determining an aggressive cancer phenotype, we have also shown experimentally that SQLE is an attractive bona fide target for treatment, while other co-located proteins (such as RAD21) are not readily druggable, and hence less clinically relevant for the purpose of finding novel therapeutic targets in breast cancer and other tumor types.
We have to acknowledge that our in vitro experiments are far from exhaustive: the potential for off-target effects in pharmacological SQLE inhibition experiments is still present, although terbinafine blocks the activity of mammalian SQLE, and SQLE silencing, albeit selective, could result in a cytostatic rather than cytotoxic effect on cell lines. Nonetheless, the independent nature of the two methods we used to block SQLE activity (i.e. using a chemical compound known to also target mammalian SQLE, and decreasing SQLE transcript levels by siRNA) corroborates the proof of concept that the detrimental biological effects we observed in breast cancer cell lines are indeed likely to result from the selective inhibition of SQLE activity. Moreover, the SQLE copy-dosage correlation with terbinafine activity is per se highly suggestive of an on-target biological phenomenon, worthy of further investigation. Copy dosage alone does not entirely explain the sensitivity to SQLE inhibition in our cell set: in the case of Hs 578T, the effect of terbinafine is elevated in spite of a relatively low copy dosage, while SK-BR-3 exhibits lower sensitivity to terbinafine than MCF-7, despite having an even higher copy number in the SQLE region. However, the first cell line exhibits the highest SQLE expression of the six we tested, possibly driven by epigenetic mechanisms worthy of further investigation, whereas partial resistance phenomena and other post-translational modifications 7 still to be tested could account for the discrepancy observed in the second line. Finally, SQLE inhibition may be toxic in in vitro, but not in vivo experiments, due to the limited supply of cholesterol in the employed media: however, fetal bovine serum provides on average 310 μg/mL of this molecule 56 , thus providing cultured cancer cell lines with an external source of cholesterol upon inhibition of de novo synthesis.
We are not the first investigators to point out that SQLE may have a biological relevance in breast cancer. Other researchers observed SQLE overexpression in an adverse prognosis group of ER+, stage I/II breast cancer cases 33 , described aberrant methylation patterns in the 8q12.1-q24.22 chromosomal region 35 , or reported that SQLE amplification and increased transcription were enriched (together with other genes) in a distinctive cluster of triple negative breast tumors or in specific ethnic populations 25,34 . Again, we have to remark that we provided here the first systematic, large-scale assessment of SQLE CN amplification and overexpression in breast cancer and other tumor types in humans. Our analyses have allowed us to identify the tendency of aggressive breast tumors, especially the Her2-positive, higher-grade ones, to overexpress SQLE, and its independent role in determining an unfavorable outcome in breast neoplasms. Finally, to the best of our knowledge, no one had so far explored the potential anti-cancer properties of SQLE inhibition in SQLE-amplified breast cancer models.
To conclude, we believe our present research has shed light on a neglected metabolic cancer gene by amplification in breast cancer, and possibly in other tumor types. Our findings may thus pave the way to additional studies in the clinical setting, to assess the relevance of SQLE inhibition as a novel cancer treatment option.

Methods
Datasets. GISTIC aberration calls, log2 ratio CN intensities, and RNA-sequencing pre-processed GE intensities generated by the TCGA were downloaded from the Broad Institute TCGA GDAC repository (version 2015/02/04). Only datasets with more than 100 collected cases were considered in our analyses, for a total of 8,783 cases and 22 cancer types. The METABRIC dataset was downloaded from the Sage Bionetworks Synapse repository (last accessed March 3rd, 2015). In our analyses, we only considered patients from the METABRIC dataset with complete information for age, tumor size, NPLN (categorized as 0, 1-3, 4 or more), histological grade, ER and Her2 status by bimodal gene expression assessment, as well as follow-up status. We also excluded cases with no intrinsic subtype classification, as well as those classified as "normal-like", in light of the controversy over whether this class of breast cancers may reflect too low a cellularity to obtain meaningful CN and GE data 57 . We were therefore left with 1,633 patients to conduct our downstream tests. For aberrations in the chromosome 8q24.13-q24.21 region (whose log2 ratios were calculated as the means of SQLE and MYC loci values), we categorized intensities as "gains" if the log2 ratio for a given sample was > 0.32 and ≤ 0.81, and as "amplifications" if > 0.81. Those empirical thresholds would correspond to three and five gene copies respectively, in a fully clonal region of a near-diploid cancer genome with ~50% tumor cell fraction (see formula (1) [...]

[...] Manufacturer's instructions. DNA was quantified with a NanoDrop ® ND-1000 spectrophotometer (Thermo Scientific, Inc.). DNA copy number aberrations were determined using high-resolution arrays (SurePrint G3 Human CGH Microarrays, 4 × 180K) (Agilent Technologies, Palo Alto, CA, USA).
For DNA labeling and assessment of DNA labeling efficiency, 0.8 μg of amplified test and reference DNA (female normal genomic DNA from Promega, Madison, WI) were labeled using the Sure Tag DNA labeling kit (Agilent Technologies, Palo Alto, CA, USA) with Cy5-dUTP and Cy3-dUTP respectively, according to the CGH Enzymatic Labeling Kit Protocol v.7.3 (Agilent Technologies, Palo Alto, CA, USA). Unincorporated nucleotides were then removed using centrifugal filters (Amicon Ultra 0.5 ml, Merck Millipore, Merck KGaA, Darmstadt, Germany) according to the Manufacturer's protocol. Quality analysis and quantification of labeled DNA were performed with a NanoDrop ® ND-1000 spectrophotometer (Thermo Scientific, Inc.), measuring A260 (for DNA), A550 (for Cy3) and A649 (for Cy5) to evaluate yield, degree of labeling, and specific activity. To perform array hybridization and scanning, Cy5-labeled DNA from cell lines was mixed with an equivalent amount of Cy3-labeled reference DNA. Repetitive sequences were blocked with human Cot-1 DNA (Invitrogen TM , Thermo Scientific, Inc.) and samples were hybridized with the Oligo aCGH/ChIP-on-chip Hybridization Kit onto the microarray slides, according to the Manufacturer's specifications. Following hybridization at 65 °C for 24 h in a rotating oven (Agilent Technologies, Palo Alto, CA, USA) at 20 rpm, slides were washed and scanned using an Agilent Microarray Scanner (G2505C). Resulting images were then processed and quality-checked using the Feature Extraction software v11.01.1 (Agilent Technologies, Palo Alto, CA, USA), and exported into .txt files for further analyses.
qPCR. Total RNA was extracted from cells using the RNeasy minikit (Qiagen S.r.l., Milan, Italy) according to the Manufacturer's specifications. Concentration and integrity were checked using an Agilent 2100 Bioanalyzer system (Agilent Technologies, Palo Alto, CA, USA). One μg of RNA was reverse-transcribed in a final volume of 50 μl using the High Capacity cDNA Reverse Transcription kit (Invitrogen). 5 μl of the resulting cDNA was used for qPCR, performed in triplicate using a 7900 HT Fast real-time PCR system (Applied Biosystems by Invitrogen) with TaqMan ® Gene Expression Assays for human RPLP0 (Hs99999902_m1), GAPDH (Hs99999905_m1), SQLE (Hs01123768_m1), and TaqMan ® Universal PCR Master Mix. SQLE GE was normalized to housekeeping GE (geometric mean of GAPDH and RPLP0). Comparisons in GE were calculated using the 2^−ΔΔCt method.
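The normalization just described can be sketched as follows. This is only an illustration, not the authors' pipeline: it assumes 100% amplification efficiency, the function name `relative_expression` is hypothetical, and the Ct values are invented. Because Ct values lie on a log2 scale, taking the arithmetic mean of the housekeeping Ct values is equivalent to taking the geometric mean of the housekeeping expression levels.

```python
from statistics import mean


def relative_expression(ct_target, ct_refs, ct_target_cal, ct_refs_cal):
    """Fold change of a target gene versus a calibrator sample (2^-ddCt).

    ct_target / ct_target_cal: Ct of the target gene in the sample of
    interest and in the calibrator (e.g. scrambled siRNA control).
    ct_refs / ct_refs_cal: Ct values of the housekeeping genes in the same
    two samples. Their arithmetic mean corresponds to the geometric mean
    of the housekeeping expression levels.
    Assumes 100% PCR efficiency (one doubling per cycle).
    """
    d_ct_sample = ct_target - mean(ct_refs)
    d_ct_calibrator = ct_target_cal - mean(ct_refs_cal)
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Invented Ct values, for illustration only (not measured data).
fold = relative_expression(
    ct_target=24.0, ct_refs=[18.0, 20.0],          # sample: target, two housekeepers
    ct_target_cal=26.0, ct_refs_cal=[18.5, 19.5],  # calibrator
)
print(fold)
```

With these invented numbers the sample ΔCt is 5 and the calibrator ΔCt is 7, so the target is four-fold higher in the sample than in the calibrator.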
Array-CGH data analyses. Log10 ratios from Agilent feature extraction .txt files were imported into R (http://www.R-project.org/) using the data.table package, averaged over probe replicates using the limma BioConductor package 58 , and back-transformed to linear scale before conversion to log2 ratios. After mapping probe location to the NCBI37/hg19 build of the human genome using the UCSC liftOver utility (https://genome.ucsc.edu/cgi-bin/hgLiftOver), data were preprocessed by outlier winsorization with the copynumber package 59 using default options, and segmented by penalized least square regression using a heuristically chosen value of γ = 100, which optimized the number of segments per sample, while not leading to excessive information loss. Segmented data were then used to compute cancer cell line mean ploidy and absolute copy number values for SQLE using ABSOLUTE 60 with the copy_num_type argument set to "total", and following the recommendations from the companion website (http://www.broadinstitute.org/cancer/cga/ABSOLUTE).
Once the most likely ploidy and predominant clone cell fraction for a given cell line were selected, the absolute copy number of the SQLE locus could be calculated by solving the following equation for n:

$$\mathrm{LRR}_{\alpha,\beta} = \log_2 \frac{\alpha n + 2(1-\alpha)}{\alpha \beta + 2(1-\alpha)} \qquad (1)$$

where α is the predominant clone fraction in a cell mixture, β is the mean ploidy returned by ABSOLUTE, n is the integer copy number of a segment, and LRR_{α,β} is the measured log2 ratio for that segment, given α and β. Raw and log2 normalized data are available in GEO (www.ncbi.nih.gov/projects/geo) under the accession number GSE71395.
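The relation between log2 ratio, clone fraction, ploidy, and copy number can be illustrated with a minimal sketch. It assumes the usual two-population mixture model (a tumor clone at fraction α carrying n copies, normal cells carrying 2 copies, and normalization by the mean ploidy of the mixture); the function names are hypothetical and are not taken from ABSOLUTE or from the authors' code. Under this assumption, the "gain" and "amplification" thresholds stated in the Datasets section (0.32 and 0.81) are recovered for three and five copies in a near-diploid genome at 50% tumor cell fraction.

```python
import math


def expected_log2_ratio(n, alpha, beta):
    """Expected log2 ratio of a segment with n copies in the tumor clone.

    alpha: predominant clone fraction (0 < alpha <= 1).
    beta:  mean ploidy of the tumor clone.
    Normal cells are assumed to contribute 2 copies (assumption).
    """
    observed = alpha * n + 2.0 * (1.0 - alpha)
    baseline = alpha * beta + 2.0 * (1.0 - alpha)
    return math.log2(observed / baseline)


def absolute_copy_number(lrr, alpha, beta):
    """Invert the mixture model: copy number n for a measured log2 ratio."""
    baseline = alpha * beta + 2.0 * (1.0 - alpha)
    return ((2.0 ** lrr) * baseline - 2.0 * (1.0 - alpha)) / alpha


def classify_log2_ratio(lrr):
    """Categories from the Datasets section: gain if 0.32 < lrr <= 0.81."""
    if lrr > 0.81:
        return "amplification"
    if lrr > 0.32:
        return "gain"
    return "no gain"


# Near-diploid genome (beta = 2) with 50% tumor cell fraction (alpha = 0.5).
three_copies = expected_log2_ratio(3, alpha=0.5, beta=2)
five_copies = expected_log2_ratio(5, alpha=0.5, beta=2)
print(round(three_copies, 2), round(five_copies, 2))
```

For a pure cell line (α = 1) the model reduces to log2(n/β), so the solved n is simply β multiplied by the linear-scale ratio; rounding n to the nearest integer is left to the caller.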
Statistics. Continuous-association tests were carried out using the Spearman's correlation coefficient, which does not hold a priori assumptions over the distribution of the data (normality being systematically violated by log2 ratio CN measures). FDRs were calculated with the Benjamini-Hochberg multiple testing correction. MCA was performed using the package FactoMineR (http://factominer.free.fr/contact/index.html), after categorizing SQLE expression intensities in tertiles (as for MYC) in the METABRIC dataset, and results were represented with ggplot2 61 . Linear regression was employed to analyze the associations of multiple variables with SQLE expression, whereas t tests or one-way ANOVA tests were used for comparison of SQLE expression between categorical variables with two or more levels respectively, using the Tukey honest significant difference method to compare levels in the latter case. Survival curves were plotted using the Kaplan-Meier estimators, generated with the package survcomp 62 , and P values were calculated with the log-rank test. For survival analyses including more than one variable, a stepwise backward-forward Cox proportional hazards regression model was employed, starting from all clinical and pathological variables described above, as well as MYC and SQLE expression levels categorized in tertiles, until minimization of the Akaike Information Criterion was achieved (package MASS 63 ). For Cox regression, the P value was calculated using the Wald test. The forest plot in Fig. 3D was generated using the package rms. All the aforementioned analyses were conducted in R, as were the related figures. For terbinafine experiments, Hill slope curves, IC50 concentrations, and 95% CIs were calculated by fitting a nonlinear function to the quintuplicate data points, whereas silencing efficiency on cell lines was tested using one-way ANOVA with contrasts, followed by the Bonferroni correction for multiple testing.
These tests and the corresponding plots were generated using GraphPad PRISM 6. All statistical tests were two-tailed, and null-hypotheses were rejected with P values < 0.05. Adobe Illustrator CS6 was used to finalize the illustrations. No data was altered for graphical representation.
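As an illustration of the two statistics used for the pan-cancer CN-GE screen, the following sketch computes a Spearman correlation coefficient (Pearson correlation of average ranks) and Benjamini-Hochberg adjusted P values in plain Python. The analyses in the article were conducted in R; this is a didactic re-implementation with hypothetical function names and invented inputs, not the authors' code.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # from the largest P value downwards
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented log2 ratios and expression values, for illustration only.
rho = spearman_rho([0.1, 0.4, 0.9, 1.3], [5.0, 6.1, 7.5, 9.0])
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(rho, fdr)
```

In a screen like the one described, one P value per cancer type would be passed to the correction step, and the adjusted values compared against the chosen FDR threshold.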