Chimeric EWSR1-FLI1 regulates the Ewing sarcoma susceptibility gene EGR2 via a GGAA microsatellite

Journal name: Nature Genetics
Volume: 47
Pages: 1073–1078
DOI: doi:10.1038/ng.3363

Deciphering the ways in which somatic mutations and germline susceptibility variants cooperate to promote cancer is challenging. Ewing sarcoma is characterized by fusions between EWSR1 and members of the ETS gene family, usually EWSR1-FLI1, leading to the generation of oncogenic transcription factors that bind DNA at GGAA motifs1, 2, 3. A recent genome-wide association study4 identified susceptibility variants near EGR2. Here we found that EGR2 knockdown inhibited proliferation, clonogenicity and spheroidal growth in vitro and induced regression of Ewing sarcoma xenografts. Targeted germline deep sequencing of the EGR2 locus in affected subjects and controls identified 291 Ewing-associated SNPs. At rs79965208, the A risk allele connected adjacent GGAA repeats by converting an interspaced GGAT motif into a GGAA motif, thereby increasing the number of consecutive GGAA motifs and thus the EWSR1-FLI1–dependent enhancer activity of this sequence, with epigenetic characteristics of an active regulatory element. EWSR1-FLI1 preferentially bound to the A risk allele, which increased global and allele-specific EGR2 expression. Collectively, our findings establish cooperation between a dominant oncogene and a susceptibility variant that regulates a major driver of Ewing sarcomagenesis.
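The effect of the A risk allele described above can be illustrated with a minimal sketch. It assumes a made-up toy sequence (not the real mSat2 sequence) in which two GGAA stretches are separated by a single GGAT motif, and a hypothetical helper `longest_ggaa_run` that counts the longest run of consecutive GGAA motifs.

```python
import re


def longest_ggaa_run(seq: str) -> int:
    """Return the largest number of consecutive GGAA motifs found in seq."""
    runs = re.findall(r"(?:GGAA)+", seq.upper())
    return max((len(run) // 4 for run in runs), default=0)


# Hypothetical toy sequence for illustration only (not the mSat2 reference):
# three GGAA repeats, one interspaced GGAT motif, then two GGAA repeats.
t_allele = "GGAA" * 3 + "GGAT" + "GGAA" * 2

# The T>A substitution turns the interspaced GGAT into GGAA,
# joining the two flanking repeat stretches into one.
a_allele = t_allele.replace("GGAT", "GGAA")

print(longest_ggaa_run(t_allele))  # 3
print(longest_ggaa_run(a_allele))  # 6
```

In this toy case the single-base change doubles the longest uninterrupted GGAA stretch, which is the property the abstract links to stronger EWSR1-FLI1–dependent enhancer activity.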

Figures

  1. Figure 1: EGR2 overexpression is mediated by EWSR1-FLI1.

    (a) EGR2 and ADO expression levels in Ewing sarcoma (EwS, GSE34620) and normal tissue (GSE3526). The normal-body atlas consisted of 353 microarrays representing 63 individual tissue types (Supplementary Fig. 1). Data are shown as medians (horizontal bars) with ranges for the 25th–75th percentile (box) and 10th–90th percentile (whiskers). P values determined via two-tailed unpaired Student's t-test with Welch's correction. (b) Between-group analysis. Genes (gray dots) and tumor samples (colored spheres) are separated along three axes. EwS, Ewing sarcoma (n = 279); RMS, rhabdomyosarcoma (n = 121); OS, osteosarcoma (n = 25); DSRCT, desmoplastic small-round-cell tumor (n = 32); MB, medulloblastoma (n = 52); NB, neuroblastoma (n = 64); MRT, malignant rhabdoid tumor (n = 35). The main genes specifically overexpressed in Ewing sarcoma are indicated. (c) Quantitative real-time PCR analysis of EGR2 and ADO expression in human MSC lines L87 and V54-2 after ectopic EWSR1-FLI1 expression (pEWSR1-FLI1) as compared with empty vector (pControl). Data are shown as the mean and s.e.m.; n ≥ 9 independent experiments. The EWSR1-FLI1 targets NR0B1 and PRKCB served as positive controls17, 35. EWSR1-FLI1 expression was confirmed by immunoblot (loading control: β-actin).

  2. Figure 2: EGR2 is critical for the growth and tumorigenicity of Ewing sarcoma.

    (a) xCELLigence proliferation kinetics of A673 cells. Data shown are the mean ± s.e.m. of results obtained with two different siRNAs against EGR2 and three different siRNAs against ADO; n ≥ 6 technical replicates. EGR2 or ADO knockdown was confirmed at 48 h by quantitative real-time PCR (mean ± s.e.m., n ≥ 4 independent experiments) and immunoblot (loading control: β-actin). (b) Validation of xCELLigence results by cell counting (including supernatant) 96 h after transfection of A673, SK-N-MC, EW7 and POE cells. Data are mean and s.e.m. of results obtained with two different siRNAs against EGR2 and three different siRNAs against ADO; n ≥ 3 independent experiments. (c) Left, phase-contrast images of sphere-formation assays (scale bars, 1 mm). Right, mean and s.e.m. of n ≥ 3 independent experiments performed with SK-N-MC and POE containing a doxycycline-inducible shRNA against EGR2 (shEGR2_4 or shEGR2_5). Also shown is a representative EGR2 immunoblot for POE cells (96-h doxycycline treatment; loading control, β-actin). (d) Growth curves for subcutaneously xenografted POE or SK-N-MC cells in mice (shControl and shEGR2_4). When tumors reached a volume of 75–100 mm³, doxycycline and sucrose (Dox +) or sucrose alone (Dox −) was added to the drinking water (treatment). Mean ± s.e.m.; n ≥ 6 mice per group. P values determined via two-tailed unpaired Student's t-test. (e) Size-proportional Venn diagrams of up- and downregulated genes 48 h after knockdown of EWSR1-FLI1 (siEF1) or EGR2 (siEGR2) in A673 and SK-N-MC cells (minimum log2 fold change ± 0.5, Benjamini-Hochberg–corrected P < 0.05). Fisher's exact test.

  3. Figure 3: Fine-mapping and epigenetic profiling revealed candidate EGR2 regulatory elements.

    (a) Top, Manhattan plot of 1,440 SNPs identified by targeted deep sequencing within the chr10 susceptibility locus and flanking haplotype blocks. rs10995305 was the SNP most significantly associated with Ewing sarcoma at this locus (false discovery rate (FDR)-corrected P = 1.27 × 10⁻⁴). The blue lines indicate the recombination-rate estimates from the HapMap project36. Middle, LD plot of the chr10 susceptibility locus hotspot (chr10:64,449,549–64,756,872) based on the analysis of 290 significant Ewing sarcoma–associated SNPs in 343 affected subjects (a subset of the original GWAS cohort4) and 251 controls. Bottom, epigenetic profile of the chr10 susceptibility locus hotspot in the Ewing sarcoma cell lines SK-N-MC, A673 and EW502. Displayed are signals from published ChIP-Seq or DNase-Seq data for RNA polymerase II (pol II), DNaseI hypersensitivity (HS), EWSR1-FLI1 (EF1), H3K4me1 and H3K27ac in Ewing sarcoma cells transfected with either a control shRNA (shGFP) or a specific shRNA against EWSR1-FLI1 (shEF1), and FAIRE8, 26, 27. The read count is given on the left. mSat1 and mSat2 are GGAA microsatellites (Supplementary Fig. 8). (b,c) Normalized luciferase reporter signals in A673-TR-shEF1 and SK-N-MC–TR–shEF1 cells containing a doxycycline-inducible shRNA against EWSR1-FLI1. EWSR1-FLI1 knockdown was confirmed by quantitative real-time PCR and immunoblot (loading control: α-tubulin). Data are shown as means and s.e.m.; n ≥ 5 independent experiments.

  4. Figure 4: Germline variation at mSat2 modulates EWSR1-FLI1–dependent EGR2 expression.

    (a) Coordinates, epigenetic profile and sequence of the mSat2 locus. Consistent with previous studies, H3K4me1 and H3K27ac signals peaked adjacent to the repetitive GGAA mSat8, 27. The P value reported for rs6479860 reflects the significance of its association with Ewing sarcoma. (b) Luciferase reporter signals of mSat2 with the T or A allele at rs79965208. Data are mean and s.e.m.; n ≥ 6 independent experiments. P values determined via two-tailed unpaired Student's t-test. (c) EGR2 expression measured by quantitative real-time PCR in 117 Ewing samples (103 primary tumors and 14 cell lines). EGR2 expression was normalized to that of RPLP0 and is displayed as expression relative to that of the median sample (set as 1). Horizontal bars represent means, and whiskers represent the 95% confidence interval boundaries. P value determined via linear regression. (d) Allele fraction of reads mapping to rs79965208 generated in a ChIP-MiSeq experiment in the A/T Ewing cell line MHH-ES1 (Supplementary Fig. 10 and Supplementary Data). (e) Left, representative Integrative Genomics Viewer37 pile-up of reads covering the EGR2 3′ UTR rs61865883 in matched constitutional or tumor DNA and tumor-derived RNA. The sample EW012 exhibited transcriptional allelic imbalance of EGR2, whereas EW577 did not. Right, raw rs61865883 allele fractions of targeted RNA deep sequencing in 45 Ewing sarcomas heterozygous (A/T) for the transcribed EGR2 3′ UTR allelic marker rs61865883. Horizontal bars represent means, and whiskers show the 95% confidence interval boundaries. P values determined via parametric two-tailed Student's t-test. (f) Regulatory model of EWSR1-FLI1 and mSat2 controlling EGR2 expression and proliferation of Ewing sarcoma cells in convergence with the FGF pathway.

  5. Supplementary Fig. 1: Expression pattern of EGR2 and ADO in normal tissues relative to Ewing sarcoma.

    The normal-body atlas consisted of 353 microarrays representing 63 individual tissue types (GSE3526). Gene expression levels are shown as the mean and s.e.m. of n ≥ 3 samples per tissue type. The normal-body atlas and the Ewing sarcoma (EwS) data (GSE34620) were normalized simultaneously by RMA using custom brainarray CDF (v18, ENTREZG). All microarray data were generated on Affymetrix HG-U133Plus2.0 arrays.

  6. Supplementary Fig. 2: eQTL analyses across tissue types identify Ewing sarcoma-specific correlations of EGR2 and ADO expression with the risk-allele (G) at rs1848797.

    Data are shown as medians (horizontal bars) with ranges for the 25th−75th percentile (box) and 10th−90th percentile (whiskers). Outliers are depicted as dots. P values determined via linear regressions. EwS, Ewing sarcoma; AML, acute myeloid leukemia; LCL, lymphoblastoid cell lines.

  7. Supplementary Fig. 3: Analysis of EGR2 and ADO expression in Ewing sarcoma cell lines after knockdown of EWSR1-FLI1.

    A673, SK-N-MC, EW7 and POE cells were transfected either with non-targeting control siRNA or siRNA targeting EWSR1-FLI1. Gene expression was assessed 48 h thereafter by qRT-PCR. FC, fold change. Data are shown as the mean and s.e.m.; n ≥ 5 independent experiments. Two-tailed unpaired Student’s t-test; ns, not significant; * P < 0.05, ** P < 0.01, *** P < 0.001.

  8. Supplementary Fig. 4: Knockdown of EGR2 reduces clonogenic growth, cell cycle progression, and viability of Ewing sarcoma cells.

    (a) Analysis of EGR2 protein expression by immunoblot in Ewing sarcoma cell lines (loading control: β-actin). The neuroblastoma cell line SK-N-SH served as negative control. (b) Analysis of EGR2 protein expression by immunohistochemistry in tumors of xenografted Ewing sarcoma cell lines (A673, TC-71 and SK-ES1), the alveolar rhabdomyosarcoma cell line SJ-RH30, and the neuroblastoma cell line IMR-32, and comparison with EGR2 mRNA expression levels as determined by Affymetrix HG-U133Plus2.0 arrays (GSE36133). Scale bars = 200 µm. (c) qRT-PCR analysis of knockdown efficacy of siRNAs used to silence EGR2 or ADO (48 h after transfection). Data are shown as the mean and s.e.m.; n ≥ 4 independent experiments. (d) Analysis of clonogenic growth after seeding at low-density and serial re-transfection with siRNA every four days. Cells were fixed 9–14 days after seeding and individual colonies were colorized with crystal violet. (e) Analysis of cell cycle phases by PI staining 96 h after transfection with siRNA. EGR2 knockdown reduces the percentage of cells in S phase (P < 0.01 for A673, SK-N-MC, and POE; P = 0.12 for EW7), while increasing the percentage in sub G1 phase (all P < 0.0001). (f) Analysis of apoptosis by Annexin-V-staining 96 h after transfection with siRNA. Data in d-f are shown as the mean and s.e.m. of results obtained with two different siRNAs for EGR2 and three different siRNAs for ADO as displayed in c; n ≥ 3 independent experiments. Two-tailed unpaired Student’s t-test; ns, not significant; *** P < 0.001.

  9. Supplementary Fig. 5: EGR2 is a downstream component of the FGF pathway in Ewing sarcoma.

    (a) Scatter-dot plot and medians (horizontal bars) of EGFRs (gray) and FGFRs (red) expression levels in n = 117 primary Ewing sarcoma tumors (GSE34620). (b) Analysis of proliferation of Ewing sarcoma cell lines with a Resazurin assay. Cells were seeded in RPMI 1640 medium with 0.5–1% FCS and stimulated with the indicated growth factor (GF) concentrations for 72 h. Data are shown as the mean and s.e.m.; n ≥ 6 experiments. (c) Analysis of EGR2 induction by qRT-PCR in Ewing sarcoma cell lines after incubation with either EGF or bFGF (both 10 ng/ml). The EGFR expressing MDA-MB-231 breast cancer cell line served as a positive control for EGF action. Data are shown as the mean and s.e.m.; n ≥ 3 independent experiments. Two-tailed unpaired Student’s t-test, ** P < 0.01, *** P < 0.001.

  10. Supplementary Fig. 6: Targeted germline deep sequencing of principal-component analysis (PCA)-matched Ewing sarcoma cases and controls.

    (a) PCA-clustering of the selected core population (dashed box) for sequencing. (b) Analysis of target-region coverage after mapping and base quality filtering of reads (MQ20 and BQ20). (c) Analysis of average nucleotide coverage (high-quality reads) of the chr10 target-region across all samples. The median nucleotide coverage is reported (217X).

  11. Supplementary Fig. 7: Schematic illustration of the workflow for next-generation sequencing and variant analysis.

    AAR, alternative allele ratio.

  12. Supplementary Fig. 8: Genomic coordinates, evolutionary sequence conservation, aligned DNA sequences and homology of EGR2 enhancer elements.

    Mammal Cons, 46 vertebrates basewise conservation from the UCSC genome browser60; GERP, Genomic Evolutionary Rate Profiling of 30 vertebrate species from the UCSC genome browser; MultiZ Align, multiple sequence alignment from the UCSC Genome browser; MSE, Myelinating Schwann cell Enhancer; BoneE, Bone Enhancer. Sequence alignment of murine and human DNA sequences was carried out using Clustal (v1.2.0)45. Asterisks indicate homologous nucleotides.

  13. Supplementary Fig. 9: Genomic coordinates, epigenetic profile and reference sequence of the mSat1 locus.

    GGAA repeats are underlined by arrows. The reported numbers of GGAA motifs correspond to the reference sequence (hg19).

  14. Supplementary Fig. 10: Validation of ChIP efficiency by qRT-PCR in the Ewing sarcoma cell line MHH-ES1 (A/T at rs79965208).

    A described CCND1 EWSR1-FLI1 binding site was used as positive control47, and an intronic CCND1 locus (intron 2) served as negative control. Data are shown as the mean and s.d.

  15. Supplementary Fig. 11: Conditional association analysis suggests that rs79965208 within mSat2 is the major EGR2 regulatory variant at the chr10 susceptibility locus.
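
The gene-selection thresholds quoted in the Figure 2e legend (minimum log2 fold change ± 0.5, Benjamini-Hochberg–corrected P < 0.05) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names, the tuple layout, the inclusive fold-change cutoff and the input values are all hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differentially_expressed(genes, min_abs_log2fc=0.5, alpha=0.05):
    """Filter (name, log2_fold_change, raw_p) tuples by both thresholds.

    Assumption: the fold-change cutoff is treated as inclusive (>= 0.5).
    """
    adjusted = benjamini_hochberg([p for _, _, p in genes])
    return [
        name
        for (name, log2fc, _), q in zip(genes, adjusted)
        if abs(log2fc) >= min_abs_log2fc and q < alpha
    ]


# Made-up values for illustration only; not data from the study.
toy = [
    ("geneA", -1.2, 0.001),
    ("geneB", 0.3, 0.002),
    ("geneC", 0.8, 0.04),
    ("geneD", -0.6, 0.5),
]
print(differentially_expressed(toy))  # ['geneA']
```

In the toy input, geneB fails the fold-change cutoff and geneC fails after correction (adjusted P is about 0.053), so only geneA passes both filters.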

Accession codes

Primary accessions

Gene Expression Omnibus

References

  1. Delattre, O. et al. Gene fusion with an ETS DNA-binding domain caused by chromosome translocation in human tumours. Nature 359, 162–165 (1992).
  2. Gangwal, K. et al. Microsatellites as EWS/FLI response elements in Ewing's sarcoma. Proc. Natl. Acad. Sci. USA 105, 10149–10154 (2008).
  3. Guillon, N. et al. The oncogenic EWS-FLI1 protein binds in vivo GGAA microsatellite sequences with potential transcriptional activation function. PLoS One 4, e4932 (2009).
  4. Postel-Vinay, S. et al. Common variants near TARDBP and EGR2 are associated with susceptibility to Ewing sarcoma. Nat. Genet. 44, 323–327 (2012).
  5. von Levetzow, C. et al. Modeling initiation of Ewing sarcoma in human neural crest cells. PLoS One 6, e19305 (2011).
  6. Tirode, F. et al. Mesenchymal stem cell features of Ewing tumors. Cancer Cell 11, 421–429 (2007).
  7. Delattre, O. et al. The Ewing family of tumors—a subgroup of small-round-cell tumors defined by specific chimeric transcripts. N. Engl. J. Med. 331, 294–299 (1994).
  8. Patel, M. et al. Tumor-specific retargeting of an oncogenic transcription factor chimera results in dysregulation of chromatin and transcription. Genome Res. 22, 259–270 (2012).
  9. Brohl, A.S. et al. The genomic landscape of the Ewing sarcoma family of tumors reveals recurrent STAG2 mutation. PLoS Genet. 10, e1004475 (2014).
  10. Crompton, B.D. et al. The genomic landscape of pediatric Ewing sarcoma. Cancer Discov. 4, 1326–1341 (2014).
  11. Tirode, F. et al. Genomic landscape of Ewing sarcoma defines an aggressive subtype with co-association of STAG2 and TP53 mutations. Cancer Discov. 4, 1342–1353 (2014).
  12. Worch, J. et al. Racial differences in the incidence of mesenchymal tumors associated with EWSR1 translocation. Cancer Epidemiol. Biomarkers Prev. 20, 449–453 (2011).
  13. Chung, C.C. & Chanock, S.J. Current status of genome-wide association studies in cancer. Hum. Genet. 130, 59–78 (2011).
  14. Dominy, J.E. Jr. et al. Discovery and characterization of a second mammalian thiol dioxygenase, cysteamine dioxygenase. J. Biol. Chem. 282, 25189–25198 (2007).
  15. Chandra, A., Lan, S., Zhu, J., Siclari, V.A. & Qin, L. Epidermal growth factor receptor (EGFR) signaling promotes proliferation and survival in osteoprogenitors by increasing early growth response 2 (EGR2) expression. J. Biol. Chem. 288, 20488–20498 (2013).
  16. Topilko, P. et al. Krox-20 controls myelination in the peripheral nervous system. Nature 371, 796–799 (1994).
  17. Mackintosh, C., Madoz-Gúrpide, J., Ordóñez, J.L., Osuna, D. & Herrero-Martín, D. The molecular pathogenesis of Ewing's sarcoma. Cancer Biol. Ther. 9, 655–667 (2010).
  18. Gao, C. et al. HEFT: eQTL analysis of many thousands of expressed genes while simultaneously controlling for hidden factors. Bioinformatics 30, 369–376 (2014).
  19. Radtke, I. et al. Genomic analysis reveals few genetic alterations in pediatric acute myeloid leukemia. Proc. Natl. Acad. Sci. USA 106, 12944–12949 (2009).
  20. Moffatt, M.F. et al. Genetic variants regulating ORMDL3 expression contribute to the risk of childhood asthma. Nature 448, 470–473 (2007).
  21. Northcott, P.A. et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 488, 49–56 (2012).
  22. Wang, K. et al. Integrative genomics identifies LMO1 as a neuroblastoma oncogene. Nature 469, 216–220 (2011).
  23. GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
  24. Labalette, C. et al. Hindbrain patterning requires fine-tuning of early krox20 transcription by Sprouty 4. Development 138, 317–326 (2011).
  25. Weisinger, K., Kayam, G., Missulawin-Drillman, T. & Sela-Donenfeld, D. Analysis of expression and function of FGF-MAPK signaling components in the hindbrain reveals a central role for FGF3 in the regulation of Krox20, mediated by Pea3. Dev. Biol. 344, 881–895 (2010).
  26. ENCODE Project Consortium. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
  27. Riggi, N. et al. EWS-FLI1 utilizes divergent chromatin remodeling mechanisms to directly activate or repress enhancer elements in Ewing sarcoma. Cancer Cell 26, 668–681 (2014).
  28. Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).
  29. Maurano, M.T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).
  30. Ghislain, J. et al. Characterisation of cis-acting sequences reveals a biphasic, axon-dependent regulation of Krox20 during Schwann cell development. Development 129, 155–166 (2002).
  31. 1000 Genomes Project Consortium. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
  32. Robison, L.L. et al. The Childhood Cancer Survivor Study: a National Cancer Institute–supported resource for outcome and intervention research. J. Clin. Oncol. 27, 2308–2318 (2009).
  33. Edwards, S.L., Beesley, J., French, J.D. & Dunning, A.M. Beyond GWASs: illuminating the dark road from association to function. Am. J. Hum. Genet. 93, 779–797 (2013).
  34. Faye, L.L., Machiela, M.J., Kraft, P., Bull, S.B. & Sun, L. Re-ranking sequencing variants in the post-GWAS era for accurate causal variant identification. PLoS Genet. 9, e1003609 (2013).
  35. Surdez, D. et al. Targeting the EWSR1–FLI1 oncogene-induced protein kinase PKC-β abolishes Ewing sarcoma growth. Cancer Res. 72, 4494–4503 (2012).
  36. International HapMap 3 Consortium. et al. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010).
  37. Robinson, J.T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
  38. Carrillo, J. et al. Cholecystokinin down-regulation by RNA interference impairs Ewing tumor growth. Clin. Cancer Res. 13, 2429–2440 (2007).
  39. Conrad, C., Gottgens, B., Kinston, S., Ellwart, J. & Huss, R. GATA transcription in a small rhodamine 123(low)CD34(+) subpopulation of a peripheral blood-derived CD34(–)CD105(+) mesenchymal cell line. Exp. Hematol. 30, 887–895 (2002).
  40. Thalmeier, K. et al. Establishment of two permanent human bone marrow stromal cell lines with long-term post irradiation feeder capacity. Blood 83, 1799–1807 (1994).
  41. Wiederschain, D. et al. Single-vector inducible lentiviral RNAi system for oncology target validation. Cell Cycle 8, 498–504 (2009).
  42. Dai, M. et al. Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data. Nucleic Acids Res. 33, e175 (2005).
  43. Culhane, A.C., Thioulouse, J., Perrière, G. & Higgins, D.G. MADE4: an R package for multivariate analysis of gene expression data. Bioinformatics 21, 2789–2790 (2005).
  44. Melot, T. et al. Production and characterization of mouse monoclonal antibodies to wild-type and oncogenic FLI-1 proteins. Hybridoma 16, 457–464 (1997).
  45. Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011).
  46. Franken, N.A.P., Rodermond, H.M., Stap, J., Haveman, J. & van Bree, C. Clonogenic assay of cells in vitro. Nat. Protoc. 1, 2315–2319 (2006).
  47. Boeva, V. et al. De novo motif identification improves the accuracy of predicting transcription factor binding sites in ChIP-Seq data analysis. Nucleic Acids Res. 38, e126 (2010).
  48. Yeager, M. et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat. Genet. 39, 645–649 (2007).
  49. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
  50. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
  51. Pruim, R.J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337 (2010).
  52. Barrett, J.C., Fry, B., Maller, J. & Daly, M.J. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265 (2005).
  53. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
  54. Gabriel, S.B. et al. The structure of haplotype blocks in the human genome. Science 296, 2225–2229 (2002).
  55. Clavel-Chapelon, F. et al. E3N, a French cohort study on cancer risk factors. E3N Group. Etude Epidémiologique auprès de femmes de l'Education Nationale. Eur. J. Cancer Prev. 6, 473–478 (1997).
  56. Wang, Z. et al. Improved imputation of common and uncommon SNPs with a new reference set. Nat. Genet. 44, 6–7 (2012).
  57. Howie, B.N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).
  58. Boeva, V., Lermine, A., Barette, C., Guillouf, C. & Barillot, E. Nebula—a web-server for advanced ChIP-seq data analysis. Bioinformatics 28, 2517–2519 (2012).
  59. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
  60. Meyer, L.R. et al. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res. 41, D64–D69 (2013).

Author information

  1. Present address: Laboratory for Pediatric Sarcoma Biology, Institute of Pathology, LMU Munich, Munich, Germany.

    • Thomas G P Grünewald

Affiliations

  1. Genetics and Biology of Cancers Unit, Institut Curie, PSL Research University, Paris, France.

    • Thomas G P Grünewald,
    • Virginie Raynal,
    • Didier Surdez,
    • Marie-Ming Aynaud,
    • Olivier Mirabeau,
    • Franck Tirode,
    • Sakina Zaidi,
    • Anneliene H Jonker,
    • Carlo Lucchesi &
    • Olivier Delattre
  2. INSERM U830, Institut Curie Research Center, Paris, France.

    • Thomas G P Grünewald,
    • Virginie Raynal,
    • Didier Surdez,
    • Marie-Ming Aynaud,
    • Olivier Mirabeau,
    • Franck Tirode,
    • Sakina Zaidi,
    • Anneliene H Jonker,
    • Carlo Lucchesi &
    • Olivier Delattre
  3. Institut Curie Genomics of Excellence (ICGex) Platform, Institut Curie Research Center, Paris, France.

    • Virginie Bernard,
    • Virginie Raynal,
    • Thomas Rio Frio &
    • Olivier Delattre
  4. École Normale Supérieure (ENS), Institut de Biologie de l'ENS (IBENS), INSERM U1024, CNRS UMR8197, Paris, France.

    • Pascale Gilardi-Hebenstreit &
    • Patrick Charnay
  5. Instituto de Investigación de Enfermedades Raras, Instituto de Salud Carlos III, Madrid, Spain.

    • Florencia Cidre-Aranaz &
    • Javier Alonso
  6. INSERM U916 Biology of Sarcomas, Institut Bergonié, Bordeaux, France.

    • Gaëlle Perot
  7. Département d'Epidémiologie et de Biostatistiques, Institut Gustave Roussy, Villejuif, France.

    • Marie-Cécile Le Deley
  8. Département de Pédiatrie, Institut Gustave Roussy, Villejuif, France.

    • Odile Oberlin
  9. Institute for Pediatric Hematology and Oncology, Leon-Bérard Cancer Center, University of Lyon, Lyon, France.

    • Perrine Marec-Bérard
  10. INSERM U1052, Léon-Bérard Cancer Centre, Cancer Research Center of Lyon, Lyon, France.

    • Amélie S Véron &
    • David G Cox
  11. Unité Génétique Somatique (UGS), Institut Curie Centre Hospitalier, Paris, France.

    • Stephanie Reynaud,
    • Eve Lapouble,
    • Gaëlle Pierron &
    • Olivier Delattre
  12. INSERM U900, Bioinformatics, Biostatistics, Epidemiology and Computational Systems Biology of Cancer, Institut Curie Research Center, Paris, France.

    • Valentina Boeva
  13. Mines ParisTech, Fontainebleau, France.

    • Valentina Boeva
  14. Institute for Cancer Outcomes and Survivorship, School of Medicine, University of Alabama, Birmingham, Alabama, USA.

    • Smita Bhatia
  15. Centre de Recherche sur les Pathologies Prostatiques (CeRePP)–Laboratory for Urology, Research Team 2, UPMC, Hôpital Tenon, Paris, France.

    • Geraldine Cancel-Tassin &
    • Olivier Cussenot
  16. Division of Cancer Epidemiology and Genetics (DCEG), National Cancer Institute (NCI), Bethesda, Maryland, USA.

    • Lindsay M Morton,
    • Mitchell J Machiela &
    • Stephen J Chanock

Contributions

T.G.P.G. coordinated and designed the study, performed all functional experiments, analyzed the sequencing data, performed bioinformatic analyses, wrote the paper, designed the figures and helped with grant applications. V. Bernard processed the sequencing data and performed bioinformatic analyses. P.G.-H. participated in the study design, cloned enhancer elements and contributed to data analysis. V.R. performed all sequencing experiments and helped analyze the bioinformatic data. D.S. contributed to the in vivo experiments, performed the ChIP-MiSeq experiments and provided experimental protocols. M.-M.A., F.C.-A., A.H.J. and S.Z. helped with functional experiments. O.M., F.T. and C.L. provided statistical advice and helped in the bioinformatic analyses. G. Perot assisted in generation of the shRNA constructs. M.-C.L.D., O.O. and P.M.-B. provided Ewing sarcoma samples and annotation. G. Pierron, S.R. and E.L. provided and prepared Ewing sarcoma samples. T.R.F. coordinated and supervised sequencing experiments. V. Boeva helped analyze ChIP-Seq data. J.A. provided the A673-TR-shEF1 and SK-N-MC–TR–shEF1 cell lines. A.S.V. performed principal-component analysis clustering of Ewing sarcoma subjects and controls. G.C.-T. and O.C. provided DNA from healthy controls. D.G.C. and S.J.C. provided genetic and statistical guidance. S.B., L.M.M., M.J.M. and S.J.C. provided data for the CCSS replication cohort and performed imputation analyses. P.C. provided advice on analyses concerning EGR2 and laboratory infrastructure. O.D. initiated, designed and supervised the study; provided biological and genetic guidance; analyzed the data; wrote the paper together with T.G.P.G.; and provided laboratory infrastructure and financial support. All authors read and approved the final manuscript.

Competing financial interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Figures

  1. Supplementary Figure 1: Expression pattern of EGR2 and ADO in normal tissues relative to Ewing sarcoma. (197 KB)

    The normal-body atlas consisted of 353 microarrays representing 63 individual tissue types (GSE3526). Gene expression levels are shown as the mean and s.e.m. of n ≥ 3 samples per tissue type. The normal-body atlas and the Ewing sarcoma (EwS) data (GSE34620) were normalized simultaneously by RMA using custom brainarray CDF (v18, ENTREZG). All microarray data were generated on Affymetrix HG-U133Plus2.0 arrays.

  2. Supplementary Figure 2: eQTL analyses across tissue types identify Ewing sarcoma-specific correlations of EGR2 and ADO expression with the risk-allele (G) at rs1848797. (444 KB)

    Data are shown as medians (horizontal bars) with ranges for the 25th−75th percentile (box) and 10th−90th percentile (whiskers). Outliers are depicted as dots. P values determined via linear regressions. EwS, Ewing sarcoma; AML, acute myeloid leukemia; LCL, lymphoblastoid cell lines.

  3. Supplementary Figure 3: Analysis of EGR2 and ADO expression in Ewing sarcoma cell lines after knockdown of EWSR1-FLI1. (84 KB)

    A673, SK-N-MC, EW7 and POE cells were transfected either with non-targeting control siRNA or siRNA targeting EWSR1-FLI1. Gene expression was assessed 48 h thereafter by qRT-PCR. FC, fold change. Data are shown as the mean and s.e.m.; n ≥ 5 independent experiments. Two-tailed unpaired Student’s t-test; ns, not significant; * P < 0.05, ** P < 0.01, *** P < 0.001.

  4. Supplementary Figure 4: Knockdown of EGR2 reduces clonogenic growth, cell cycle progression, and viability of Ewing sarcoma cells. (290 KB)

    (a) Analysis of EGR2 protein expression by immunoblot in Ewing sarcoma cell lines (loading control: β-actin). The neuroblastoma cell line SK-N-SH served as negative control. (b) Analysis of EGR2 protein expression by immunohistochemistry in tumors of xenografted Ewing sarcoma cell lines (A673, TC-71 and SK-ES1), the alveolar rhabdomyosarcoma cell line SJ-RH30, and the neuroblastoma cell line IMR-32, and comparison with EGR2 mRNA expression levels as determined by Affymetrix HG-U133Plus2.0 arrays (GSE36133). Scale bars = 200 µm. (c) qRT-PCR analysis of knockdown efficacy of siRNAs used to silence EGR2 or ADO (48 h after transfection). Data are shown as the mean and s.e.m.; n ≥ 4 independent experiments. (d) Analysis of clonogenic growth after seeding at low-density and serial re-transfection with siRNA every four days. Cells were fixed 9–14 days after seeding and individual colonies were colorized with crystal violet. (e) Analysis of cell cycle phases by PI staining 96 h after transfection with siRNA. EGR2 knockdown reduces the percentage of cells in S phase (P < 0.01 for A673, SK-N-MC, and POE; P = 0.12 for EW7), while increasing the percentage in sub G1 phase (all P < 0.0001). (f) Analysis of apoptosis by Annexin-V-staining 96 h after transfection with siRNA. Data in d-f are shown as the mean and s.e.m. of results obtained with two different siRNAs for EGR2 and three different siRNAs for ADO as displayed in c; n ≥ 3 independent experiments. Two-tailed unpaired Student’s t-test; ns, not significant; *** P < 0.001.

  5. Supplementary Figure 5: EGR2 is a downstream component of the FGF pathway in Ewing sarcoma. (121 KB)

    (a) Scatter-dot plot and medians (horizontal bars) of expression levels of EGFRs (gray) and FGFRs (red) in n = 117 primary Ewing sarcoma tumors (GSE34620). (b) Analysis of proliferation of Ewing sarcoma cell lines with a resazurin assay. Cells were seeded in RPMI 1640 medium with 0.5–1% FCS and stimulated with the indicated growth factor (GF) concentrations for 72 h. Data are shown as the mean and s.e.m.; n ≥ 6 experiments. (c) Analysis of EGR2 induction by qRT-PCR in Ewing sarcoma cell lines after incubation with either EGF or bFGF (both 10 ng/ml). The EGFR-expressing MDA-MB-231 breast cancer cell line served as a positive control for EGF action. Data are shown as the mean and s.e.m.; n ≥ 3 independent experiments. Two-tailed unpaired Student’s t-test; ** P < 0.01, *** P < 0.001.

  6. Supplementary Figure 6: Targeted germline deep sequencing of principal-component analysis (PCA)-matched Ewing sarcoma cases and controls. (177 KB)

    (a) PCA clustering of the selected core population (dashed box) for sequencing. (b) Analysis of target-region coverage after mapping and base quality filtering of reads (MQ20 and BQ20). (c) Analysis of average nucleotide coverage (high-quality reads) of the chr10 target region across all samples. The median nucleotide coverage is reported (217X).

  7. Supplementary Figure 7: Schematic illustration of the workflow for next-generation sequencing and variant analysis. (299 KB)

    AAR, alternative allele ratio.

  8. Supplementary Figure 8: Genomic coordinates, evolutionary sequence conservation, aligned DNA sequences and homology of EGR2 enhancer elements. (456 KB)

    Mammal Cons, basewise conservation across 46 vertebrates from the UCSC genome browser60; GERP, Genomic Evolutionary Rate Profiling of 30 vertebrate species from the UCSC genome browser; MultiZ Align, multiple sequence alignment from the UCSC genome browser; MSE, Myelinating Schwann cell Enhancer; BoneE, Bone Enhancer. Sequence alignment of murine and human DNA sequences was carried out using Clustal (v1.2.0)45. Asterisks indicate homologous nucleotides.

  9. Supplementary Figure 9: Genomic coordinates, epigenetic profile and reference sequence of the mSat1 locus. (103 KB)

    GGAA repeats are underlined by arrows. The reported numbers of GGAA motifs correspond to the reference sequence (hg19).

  10. Supplementary Figure 10: Validation of ChIP efficiency by qRT-PCR in the Ewing sarcoma cell line MHH-ES1 (A/T at rs79965208). (67 KB)

    A previously described EWSR1-FLI1 binding site at CCND1 was used as positive control47, and an intronic CCND1 locus (intron 2) served as negative control. Data are shown as the mean and s.d.

  11. Supplementary Figure 11: Conditional association analysis suggests that rs79965208 within mSat2 is the major EGR2 regulatory variant at the chr10 susceptibility locus. (143 KB)

PDF files

  1. Supplementary Figures (2,386 KB)

    Supplementary Figures 1–11

Excel files

  1. Supplementary Data (259 KB)

Additional data