An ALYREF-MYCN coactivator complex drives neuroblastoma tumorigenesis through effects on USP3 and MYCN stability

To achieve the very high oncoprotein levels required to drive the malignant state, cancer cells utilise the ubiquitin proteasome system to upregulate transcription factor levels. Here, our analyses identify ALYREF, expressed from the most common genetic copy number variation in neuroblastoma, chromosome 17q21-ter gain, as a key regulator of MYCN protein turnover. We show strong co-operativity between ALYREF and MYCN from transgenic models of neuroblastoma in vitro and in vivo. The two proteins form a nuclear coactivator complex which stimulates transcription of the ubiquitin specific peptidase 3, USP3. We show that increased USP3 levels reduce K-48- and K-63-linked ubiquitination of MYCN, thus driving up MYCN protein stability. In the MYCN-ALYREF-USP3 signal, ALYREF is required both for the effects of MYCN on the malignant phenotype and for the effect of USP3 on MYCN stability. These data define a MYCN oncoprotein dependency state which provides a rationale for future pharmacological studies.

For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested
A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code

Data collection
No custom algorithms or software were used for data collection. All other software is described and cited in the manuscript. Whole genome sequencing (WGS) data were obtained through the TARGET data matrix (https://ocg.cancer.gov/programs/target/data-matrix) and further processed using the R statistical language and RStudio (1.1.456). We utilised public data resources produced by the Cancer Cell Line Encyclopedia (CCLE) and Project Achilles via the Cancer Dependency Map portal (DepMap, 20Q1). Gene expression and copy number data were first obtained from DepMap and then filtered using R/RStudio (1.1.456).
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

October 2018
Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability

The ALYREF ChIP-seq data have been deposited at the Gene Expression Omnibus (GEO) website under series accession GSE150303 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE150303). To complement our ALYREF ChIP-seq data, we also obtained publicly available ChIP-seq datasets for MYCN, RNA Polymerase II, BRD4, H3K27ac, H3K4me3 and H3K27me3 (GSE80151; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE80151); the fastq files were obtained directly from the European Nucleotide Archive (ENA) under the study accession PRJNA318044 (https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA318044). Gene expression and relevant patient prognosis information in the TARGET, SEQC and Kocak neuroblastoma patient datasets were downloaded from the R2 platform (http://r2.amc.nl). Whole genome sequencing (WGS) data were obtained through the TARGET data matrix (https://ocg.cancer.gov/programs/target/data-matrix). RNA-seq data with paired WGS data were also obtained from the TARGET data matrix. RNA-seq data for the SEQC neuroblastoma cohort were obtained from GEO under the accession GSE62564 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE62564). We utilised public data resources produced by the Cancer Cell Line Encyclopedia (CCLE) and Project Achilles via the Cancer Dependency Map (DepMap, 20Q1) portal (https://depmap.org/portal/). Uncropped and unprocessed immunoblot scans as well as colony formation assay and PCR agarose gel pictures for all main figures are provided as Supplementary Information.
All other relevant data are available from the corresponding authors on request.
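As a convenience for readers retrieving the public datasets, the accessions cited in the data availability statement can be mapped back to their landing pages. The following is a minimal Python sketch, not part of the authors' workflow (which used R/RStudio); the helper name `accession_url` and the `ACCESSIONS` table are hypothetical, and the URL patterns are simply those that appear in the statement itself.

```python
"""Build landing-page URLs for the accessions cited in the statement."""

# URL patterns copied from the links given in the data availability statement.
GEO_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc={acc}"
BIOPROJECT_URL = "https://www.ncbi.nlm.nih.gov/bioproject/?term={acc}"

# Accessions named in the statement, keyed by a short description.
# (The labels are illustrative; only the accession strings come from the text.)
ACCESSIONS = {
    "ALYREF ChIP-seq": "GSE150303",
    "Public ChIP-seq (MYCN, Pol II, BRD4, histone marks)": "GSE80151",
    "SEQC neuroblastoma RNA-seq": "GSE62564",
    "ChIP-seq fastq files (study accession)": "PRJNA318044",
}


def accession_url(acc: str) -> str:
    """Return the landing-page URL for a GEO series or BioProject accession."""
    acc = acc.strip().upper()
    if acc.startswith("GSE"):
        return GEO_URL.format(acc=acc)
    if acc.startswith("PRJ"):
        return BIOPROJECT_URL.format(acc=acc)
    # Anything else is outside the two patterns used in the statement.
    raise ValueError(f"Unrecognised accession format: {acc!r}")


if __name__ == "__main__":
    for label, acc in ACCESSIONS.items():
        print(f"{label}: {accession_url(acc)}")
```

The helper only formats strings and performs no network access, so it makes no claim about whether a given record is currently available.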
Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Data exclusions
Animals were excluded from analyses only when they had been excluded from the experiments because they did not develop tumors after xenografting with neuroblastoma cells.

Replication
All in vitro experiments (except for ChIP-seq and rescue MYCN ubiquitination) were performed at least 3 times and we were able to confirm the reproducibility of our results. All attempts at replication were successful and are included in the data analyses. All experiments were independently repeated to ensure the findings are reproducible. For cellular and molecular experiments, each single measurement was performed at least in triplicate and the results were consistently reproducible. In vivo experiments were performed in 8 mice per experimental group.
Randomization
All mice for in vivo data and cells for in vitro experiments were randomly assigned to experimental and treatment groups.

Blinding
Investigators were not blinded during gene expression analysis in human tumor tissues because sample information and corresponding clinical data had to be known to investigators to complete the analyses correctly. Investigators were not blinded during in vivo studies, as the doxycycline treatment was given by cage.

Reporting for specific materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.

Validation
All primary antibodies were previously validated by the manufacturers and in published papers listed on the manufacturers' websites. We also validated all antibodies in our own experiments for both endogenous and exogenous expression (when applicable) with proper molecular weight markers and positive/negative controls. A series of antibody dilutions was tested, and the optimized dilutions are provided in the Methods section. When dilution is not applicable (immunoprecipitation and ChIP experiments), antibody concentrations are provided in the Methods section.