Establishment and genomic characterization of the new chordoma cell line Chor-IN-1

Chordomas are rare, slowly growing tumors with high medical need, arising in the axial skeleton from notochord remnants. The transcription factor “brachyury” represents a distinctive molecular marker and a key oncogenic driver of chordomas. Tyrosine kinase receptors are also expressed, but so far kinase inhibitors have not shown clear clinical efficacy in chordoma patients. The need for effective therapies is extremely high, but the paucity of established chordoma cell lines has limited preclinical research. Here we describe the isolation of the new Chor-IN-1 cell line from a recurrent sacral chordoma and its characterization as compared to other chordoma cell lines. Chor-IN-1 displays genomic identity to the tumor of origin and has morphological features, growth characteristics and chromosomal abnormalities typical of chordoma, with expression of brachyury and other relevant biomarkers. Chor-IN-1 gene variants, copy number alterations and kinome gene expression were analyzed in comparison to four other chordoma cell lines, generating large-scale DNA and mRNA genomic data that can be exploited for the identification of novel pharmacological targets and candidate predictive biomarkers of drug sensitivity in chordoma. The establishment of this new, well-characterized chordoma cell line provides a useful tool for the identification of drugs active in chordoma.

Very few bona fide chordoma cell lines have been available until recently, limiting the identification of relevant targets and the development of effective drugs. Validated chordoma cell lines available from the Chordoma Foundation (a global chordoma patient advocacy group, www.chordomafoundation.org) include the prototype cell lines U-CH1 and U-CH2 and a few other more recently established cell lines [15][16][17] .
Here we describe the generation and characterization of the new Chor-IN-1 chordoma cell line, established from a surgical sample of a sacral chordoma. The Chor-IN-1 cell line was shown to display the morphological and growth features of chordoma and to express brachyury, as well as other key relevant markers associated with chordoma diagnosis. This newly established cell line was characterized in parallel with U-CH1 15 , U-CH2 15 , MUG-Chor1 16 and JHC7 17 chordoma cell lines by Next Generation Sequencing (NGS), in order to compare genomic profiles and to evaluate the expression of kinases which might represent potential new therapeutic targets.

Results
Chor-IN-1 cell line establishment. The original surgical sample was obtained from a 55-year-old man diagnosed with a locally advanced sacral chordoma. A sacral nodule 2 cm in diameter, macroscopically invading the surrounding soft tissues, was surgically excised. Histological diagnosis of chordoma was made according to the WHO classification (2013). Morphologically, the tumor recapitulated the features of conventional chordoma, exhibiting lobulated growth of round epithelioid cells separated by fibrous septa. The cells, arranged in ribbons and nests, showed eosinophilic and/or vacuolated cytoplasm (physaliferous morphology) and were embedded in abundant extracellular matrix. Immunophenotyping revealed expression of vimentin, S100, brachyury and EMA (Fig. 1a).
The new cell line was established by mechanical and enzymatic disaggregation of the fresh aseptic surgical chordoma sample followed by seeding of the resulting cell suspension in collagen-coated tissue culture plates. Once stabilized in culture, the Chor-IN-1 cell line was subjected to detailed characterization. The morphology was repeatedly monitored over passages, confirming that most of the cells displayed the typical physaliferous phenotype (Fig. 1b) of chordoma cells. This physaliferous morphology was less evident immediately after seeding but involved most of the cells 2-3 days later and is likely associated with non-actively dividing cells, as previously reported for the MUG-Chor1 cell line 18 . The cell line was confirmed to express brachyury and EMA by immunocytochemistry (Fig. 1b).
The Chor-IN-1 cell line displayed a doubling time of about seven days (Fig. 1c), in agreement with the very slow growth rate typical of chordoma cell lines that faithfully represent the disease 15 .
The karyotype of the cell line was analyzed using Cytovision® software and found to be: 45, XY, add(1)(p13), [...]

Brachyury protein was expressed at comparable levels in all cell lines (Fig. 2a). Interestingly, RT-qPCR analysis revealed that Chor-IN-1 expresses mRNA levels of the T gene, encoding brachyury, comparable to U-CH1 and U-CH2. Conversely, the MUG-Chor1 and JHC7 cell lines showed higher mRNA expression levels which, however, do not translate into higher protein levels, likely indicating a strict cellular control on the protein levels of this transcription factor (Suppl. Fig. 1). Cytokeratin 19 and vimentin were also confirmed to be expressed by RT-qPCR analysis, as required for comprehensive chordoma cell line validation (Suppl. Fig. 1). Finally, FACS analysis confirmed the expression of CD24 and CAM5.2 membrane antigens in the Chor-IN-1 cell line, similar to U-CH2, used as the control cell line (Fig. 2b).
The Chor-IN-1 cell line was authenticated by Short Tandem Repeat (STR) analysis in parallel with the parental tumor sample. The identity of the other chordoma cell lines was also confirmed by comparison with the published STR profiles (Suppl. Fig. 2).
We were primarily interested in the identification of gene variants of clinical relevance, thus we used the Illumina TruSight One "clinical exome" panel (Illumina, San Diego, CA, USA), which allows the complete sequencing of the exonic regions of a subset of 4,813 genes harbouring disease-causing mutations, covering genes reported in the Human Gene Mutation Database (HGMD), in the Online Mendelian Inheritance in Man (OMIM) catalog and most of the genes currently reviewed in clinical research. Variants (SNVs and short InDels) detected in the Chor-IN-1 cell line and in the original tumor were highly consistent (>80%). All but four of these variants were also present in the DNA extracted from the non-tumor tissue component, therefore indicating that they represent germline variants. Tumor unique variants were identified in the MUC1, KEL, TECTA and SART3 genes (Suppl. Fig. 3). We also manually inspected the raw data for the 9 variants found uniquely in the normal sample. These variants could not be detected in the tumor sample and its derived cell line with the imposed coverage threshold, however they were called just below the filter threshold, due to their location in regions of lower coverage. Also, the overall Chor-IN-1 genomic profile was compared to the original tumor, revealing a complete overlap, with several major genomic alterations that were not present in the non-tumor tissue component (Fig. 3).
In summary, this extensive molecular characterization confirms that the Chor-IN-1 cell line faithfully recapitulates the molecular features of the tumor of origin.
In order to highlight similarities and differences to other chordoma lines, the Chor-IN-1 variants were compared to those identified by TruSight One sequencing performed on the other four cell lines (Suppl. Fig. 4). Only a few variants common to all chordoma cell lines were identified. However, they could not be considered peculiar to chordoma, since they were also present in other unrelated tumor cell lines. Also, the four Chor-IN-1 tumor-specific variants were not present in the other chordoma cell lines. Therefore, we were unable to identify somatic mutations distinctive of chordoma, in line with what has already been published in the literature 19,20 .
Chromosomal alterations shared by all cell lines but JHC7 included the monosomy of chr1p, the trisomy of chr7 and a biallelic deletion of 9p21. A monosomy of chr22 could be observed in the Chor-IN-1, U-CH1 and MUG-Chor1 cell lines. A monosomy of chr2q was observed in the Chor-IN-1 cell line, while a partial terminal deletion of chr2q was present in U-CH2 and JHC7. Also, a gain of the whole chromosome X was present in U-CH1, while a partial duplication of Xq and a monosomy of Xpter-q21.33 were detectable in U-CH2. Furthermore, only Chor-IN-1 and U-CH1 cell lines shared a monosomy of chr1p and a partial deletion of a chr13q region. A duplication of the 6q22.1-qter region could be appreciated in U-CH2. A gain of the 6q27 region, including the brachyury gene, could be clearly detected in the JHC7 cell line and to a lesser extent in MUG-Chor1, where a complex rearrangement including the amplification of the T locus is present, likely accounting for the higher levels of T gene mRNA in these two cell lines (Fig. 4 and Suppl. Fig. 1).
The genomic similarity between chordoma cell lines and clinical samples of chordoma was then investigated by calculating a consensus profile of the chordoma cell lines based on the NGS data and comparing it to the consensus profile calculated by Progenetix (www.progenetix.org) on 50 chordoma tumor samples profiled by aCGH 22 (Fig. 5). Consensus copy number aberrations observed in the cell lines were generally consistent with those [...]
Table 1.
Results were displayed using Kohonen maps, a neural network-based data analysis and visualization technique for multidimensional quantitative and qualitative data comparison (https://cran.r-project.org/web/packages/kohonen/index.html). This analysis showed a very consistent kinase expression profile among chordoma cell lines, which differs from that of control placenta tissue (Fig. 6a). As shown in Fig. 6a, in the chordoma panel, about 75% of kinases are expressed, more than half at high levels, while 25% are expressed at very low levels or not expressed.
We focused our analysis particularly on receptor tyrosine kinases (RTKs), which are involved in cell regulation processes and frequently dysregulated in tumors. We found that out of 56 RTKs, 15 were expressed at high levels in all cell lines. These include EGFR and MET, as expected, while PDGFR-β was expressed in all cell lines except U-CH1 (Fig. 6b). A few kinases showed a cell-line distinctive profile. Therefore we further analyzed the kinases specifi-

Discussion
Chordoma is a rare disease with a high unmet medical need. Preclinical research on chordoma has been hampered by the limited number of available cell lines, which were only recently expanded beyond the two prototype U-CH1 and U-CH2 cell lines. The availability of well characterized, newly-established cell lines can have a high impact on understanding the complex biology of this tumor, potentially contributing to the identification of new therapeutic targets.
Here we present the establishment of a novel chordoma cell line from a sacral tumor sample, which was named Chor-IN-1. We also provide an extensive molecular characterization of this cell line in comparison to the original tumor and to a panel of representative chordoma cell lines.
In keeping with the indolent nature of this tumor, chordoma cell lines display very long doubling times in culture and require several months to reach stabilization. The Chor-IN-1 cell line has been maintained in vitro for >50 population doublings. This is a required feature for consideration as a stabilized chordoma cell line, since chordoma primary cell cultures tend to enter a growth crisis after 25-30 passages in vitro (www.chordomafoundation.org).
We decided to characterize Chor-IN-1 in parallel with four other cell lines of sacral origin, including the most widely used U-CH1 and U-CH2, as well as the more recently established MUG-Chor1 and JHC7 cell lines.
In general, the TruSight One analysis did not identify gene variants common to the different cell lines, in agreement with literature data 19 . Also, the availability of patient DNA from the non tumoral tissue highlighted that the SNVs identified in the tumor tissue and Chor-IN-1 cell line were mainly germline, with four exceptions involving TECTA, SART3, KEL and MUC1 genes. Interestingly, MUC1 (Mucin 1, Cell Surface Associated) is the gene encoding EMA, a transmembrane glycoprotein, commonly used as a biomarker for chordoma diagnosis 17,23 . This variant lies within an insertion region of unknown significance, located within the signal peptide present in a subset of MUC1 isoforms. The biological significance of the mutations identified in these genes will require further studies.
Next we investigated gross chromosomal rearrangements by means of aCGH and NGS analysis. Several chromosomal alterations shared among the chordoma cell lines were identified. In particular, all chordoma cell lines apart from JHC7 share a biallelic loss of 9p21. This region includes the loci for the CDKN2A and CDKN2B tumor suppressor genes and is reported to be frequently altered in chordoma tumor samples (~59%) 24 . Similarly, trisomy of chr7, also clearly evidenced by karyotype analysis of Chor-IN-1, is a feature shared by all the chordoma cell lines with the exception of JHC7, which bears a partial chr7 trisomy involving only the long arm. This trisomy was previously reported in a significant fraction of chordoma clinical samples 25,26 . Interestingly, the EGFR gene is located in the short arm of chr7 and is therefore not affected by the partial trisomy in the JHC7 cell line. In general, the major chromosomal abnormalities common to the chordoma cell lines reported here closely recapitulate the most common alterations reported in the Progenetix database for chordoma clinical samples.
Besides these features, typical of chordoma, the Chor-IN-1 genome harbors peculiar chromosomal alterations in regions containing bone dysmorphism-associated genes, including DACH1, a gene involved in regulation of gene expression and cell fate determination during development, reported to downregulate EGFR and cyclin D1 and associated with osteosarcoma development 27 ; LIG4, whose mutations are responsible for LIG4 syndrome, a disease characterized by dysmorphic features and microcephaly 28 and WWP1, an E3 ubiquitin ligase involved in the regulation of osteoblast functions 29 . Of interest, an evident amplification of the MALAT1 locus, associated with the proliferation and metastasis of tumor cells, was observed.
Chor-IN-1 major alterations were also detected by karyotype analysis. In particular, this analysis revealed the presence of an anomalous submetacentric C-like chromosome (chromosome A), possibly originating from the rearrangement of different portions of other chromosomes. This might account for the presence of the short arm of chr2 and of part of the long arm of chr13 observed in aCGH analysis and not in the karyotype. Interestingly, chordomas were described as being among tumors that undergo "chromothripsis", a peculiar event involving massive genomic rearrangements of one or few chromosomes 30 . It is tempting to speculate that the newly identified "chromosome A" might have been assembled from portions of different chromosomes following a catastrophic chromothriptic event that occurred in the development of this tumor.
We next focused on the characterization of protein kinase gene expression in chordoma cell lines. Kinases constitute one of the largest known families of enzymes that control different cellular functions and whose deregulation plays a causal role in cancer. A number of receptor tyrosine kinases have been reported to be implicated in chordoma pathogenesis, including EGFR, PDGFRβ, and c-MET 13,14,31 . We reasoned that detection of abundantly/differentially expressed kinases in chordoma cell lines might represent a convenient strategy for the identification of potential new pharmacological targets in this disease.
We provide here data on kinome gene expression in the different chordoma cell lines, which can be exploited to investigate the molecular basis for the sensitivity of each cell line to the different kinase inhibitors. Several among the most widely expressed kinases are inhibited by drugs currently undergoing clinical development, which may represent a new therapeutic option also in chordoma. We recently investigated in depth the role of EGFR inhibitors, providing a rationale to start a clinical trial with afatinib in this setting (Magnaghi P. et al., manuscript submitted). The current analysis highlighted AXL, DDR1, DDR2, EPHA4, EPHB4, EPHA2, FGFR1, FGFR2, MET and ERBB2 as other interesting kinases that deserve further investigation as potential biologically relevant targets in chordoma.
Comparative analysis focused on Chor-IN-1 showed that this cell line expresses high levels of ULK4, a member of the unc-51-like serine/threonine kinase (STK) family. Although little is known about ULK4 function, its role in chordoma deserves further investigation since the other members of the family have been implicated in autophagic pathways. CDKL4 (Cyclin Dependent Kinase Like 4), a member of the CDK family which includes CDK4 and CDK6, is also highly expressed. Interestingly, the use of CDK4/6-specific inhibitor palbociclib was reported to efficiently inhibit tumor cell growth in vitro in chordoma cell line models 32 , providing the rationale for clinical trials evaluating the efficacy of palbociclib in chordoma.
In conclusion, we generated and extensively characterized Chor-IN-1, a new chordoma cell line that represents a valuable contribution to the preclinical research in the field, in view of the paucity of current cell models and of the heterogeneity of chordoma tumors. The Chor-IN-1 cell line will be made available to be used for preclinical studies aimed at further understanding the pathogenesis of chordoma and its sensitivity to available drugs. Moreover, we have generated genomic data at the DNA and RNA level that can be exploited for the identification of biomarkers of drug sensitivity, and for the identification of novel pharmacological targets in chordoma.

Methods
Case report. The surgical sample was obtained from a patient initially diagnosed with sacral chordoma. The patient refused surgery and received imatinib and radiotherapy. After three years, the patient experienced local subcutaneous progression at the lumbosacral level, again received imatinib and, upon further progression, also metformin. A tumor biopsy was performed when the patient finally underwent surgery. A sacral nodule 2 cm in diameter, macroscopically invading the surrounding soft tissues, was surgically excised. All methods were performed in accordance with the relevant guidelines and regulations.
The patient gave his informed consent for the study of his tumor, and the Ethical Committee of the Fondazione IRCCS Istituto Nazionale dei Tumori (Milan, Italy) subsequently approved the genotypic and phenotypic characterization of the cell line to confirm its identity with the original tumor.
Cell culture. A fresh aseptic surgical chordoma sample was minced and incubated with collagenase (Cat. No. C6885, Sigma) for three hours, and the resulting cell suspension was filtered through a 45 µm nylon mesh, washed with PBS/FCS 0.5% and seeded on collagen-coated plates using 70% DMEM 30% IMDM2 supplemented with 10% fetal bovine serum. When confluence was reached, cells were carefully detached and passed into new flasks at 20,000 cells/cm². Cells were monitored daily to evaluate growth and morphology, and were carefully detached and re-seeded upon reaching confluence. After three months in culture the resulting cell line, named Chor-IN-1, displayed stable morphological and growth features and was further maintained in culture up to 50 population doublings, which required fourteen months. For doubling time calculation, Chor-IN-1 cells were seeded into 12-well plates (13,000 cells/cm²) in 1 ml of culture medium. Cells were carefully detached from two wells every day and counted using a Multisizer 3 (Beckman Coulter). Doubling time (DT, in hours) was calculated on the exponential part of the growth curve as the elapsed time divided by the number of population doublings, $\mathrm{DT} = T_x / (\log_2 N_{T_x} - \log_2 N_{T_0})$, where $N_{T_0}$ and $N_{T_x}$ are the cell numbers at the start and after $T_x$ hours.
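The doubling-time calculation can be illustrated with a minimal Python sketch. It assumes two cell counts taken within the exponential phase and takes the doubling time as the elapsed time divided by the number of population doublings. The function name and the example counts are hypothetical and are not data from this study.

```python
import math


def doubling_time(n0: float, nx: float, elapsed_hours: float) -> float:
    """Return the doubling time in hours from two exponential-phase counts.

    n0 is the cell number at the start, nx the cell number after
    elapsed_hours. Doubling time = elapsed time / number of doublings.
    """
    if n0 <= 0 or nx <= n0 or elapsed_hours <= 0:
        raise ValueError("need nx > n0 > 0 and a positive elapsed time")
    doublings = math.log2(nx) - math.log2(n0)
    return elapsed_hours / doublings


# Hypothetical counts: 50,000 cells growing to 100,000 cells in 168 h.
dt_hours = doubling_time(50_000, 100_000, 168.0)
print(f"{dt_hours:.1f} h = {dt_hours / 24:.1f} days")
```

In practice a fit over all daily counts in the exponential window would be less sensitive to counting noise than a two-point estimate; the two-point form is used here only because it mirrors the formula.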
Karyotype. The karyotype was obtained according to standard procedures: chromosome preparations were obtained from semi-confluent Petri dishes at the seventh passage by hypotonic treatment (KCl, HEPES, EDTA) followed by Carnoy's fixative (methanol:acetic acid = 3:1). Slides were G-banded with Wright staining and analyzed at 100× magnification with a Leica DM6000D microscope.
Flow cytometry analysis of chordoma-specific surface antigens. U-CH2 and Chor-IN-1 cells were collected by trypsinization and recovered in their culture medium at 37 °C and 5% CO₂ for 30 minutes, then washed with PBS containing 1% FCS (staining buffer) and counted. 0.5 × 10⁶ cells of each cell line were resuspended in 100 µl staining buffer and stained with 10 µl PE-conjugated mouse anti-human CD24 or its corresponding PE-conjugated isotype control (BD Biosciences, San Jose, CA, USA) for 20 minutes at room temperature. Samples were washed with staining buffer, fixed with 1% formaldehyde for 10 minutes at 37 °C and permeabilized with 90% methanol for 30 minutes on ice. After washing with staining buffer, 10 µl of FITC-conjugated mouse anti-cytokeratin (CAM5.2) or its corresponding FITC-conjugated isotype control (BD Biosciences, San Jose, CA, USA) were added and incubated for 20 minutes at room temperature. Samples were washed with staining buffer, then acquired and analyzed with a FACSCalibur cytometer and CellQuest software (BD Biosciences, San Jose, CA, USA). Analysis was performed on 10,000 events, gating out debris and doublets.

Primers for PCR amplification and sequencing were designed using the freely available Primer3 software (http://primer3.ut.ee/) and synthesized using an Applied Biosystems 3900 Synthesizer (Thermo Fisher Scientific, Waltham, MA, USA).
For direct sequencing, the obtained PCR product was electrophoresed in agarose gel, then purified using NucleoSpin Gel and PCR clean-up (Macherey-Nagel, Düren, Germany), according to the manufacturer's protocol, and subjected to Sanger sequencing with an ABI 3100 Genetic Analyzer instrument (Thermo Fisher Scientific, Waltham, MA, USA) with the same primers used for PCR amplification.

Next Generation Sequencing (NGS) characterization using TruSight One (Illumina). Chordoma clinical samples and cell lines were profiled using the TruSight One (TSO) sequencing panel kit (Illumina, San Diego, CA, USA), following the manufacturer's instructions. The protocol allows the detection of single nucleotide variants and short InDels over the entire coding sequence of 4,813 genes (http://www.illumina.com/products/trusight-one-sequencing-panel.html). Briefly, 50 ng of input DNA were used for each sample for library preparation. Pooled libraries were quantified with the Qubit 2.0 Fluorometer System using the Qubit dsDNA HS assay kit (Thermo Fisher Scientific, Waltham, MA, USA). The libraries were sequenced in paired-end mode on the MiSeq platform (Illumina, San Diego, CA, USA) using the v3 reagent kit (2 × 150 bp). A minimum mean coverage depth of 100× was achieved for all the profiled samples.
The FASTQ files obtained from the instrument were analyzed following GATK gold standard procedures. Briefly, the sequences were mapped against the human reference genome (hg19) using BWA (v. 0.7.7) and the gene variants were identified using the Unified Genotyper (GATK v1.6) variant caller. Synonymous variants, variants with minor allele frequency >2% (reported in dbSNP) and low coverage variants (DP < 20, ADV [...]

TREx NGS-targeted kinase sequencing. A custom panel interrogating 487 human kinases was selected from the available pre-designed assays for the TruSeq Targeted RNA expression kit (TREx, Illumina, San Diego, CA, USA) using the Design Studio software (Illumina, San Diego, CA, USA). Custom TREx is based on amplicon technology generating PCR products with an average insert size between 70 and 80 bp. Library preparation was performed starting from 200 ng RNA input. To assess RNA quality, the RIN parameter (RNA Integrity Number) was evaluated using a 2100 Bioanalyzer instrument with the RNA 6000 Nano kit (Agilent Technologies, Santa Clara, CA, USA). All the analyzed samples had a RIN value >8. The libraries were sequenced in single-end mode on the MiSeq platform (Illumina, San Diego, CA, USA) with sequencing reagent kit v3 (1 × 50 bp). FASTQ files were aligned to the human reference genome (hg19) using STAR (v. 2.5.1b) 34 . Raw count quantification was performed using the Bedtools Coverage tool (v. 2.22.0) 35 , starting from the BAM files. Normalization was performed using DESeq2 (v. 1.12.4) 36 with default parameters and data were log2 transformed. Due to the similar length of the produced fragments, a length normalization step was not performed. R was used to calculate the distance matrix (distance method = maximum) between the kinases in the chordoma cell lines and "complete linkage" clustering was applied (hclust function) to generate 3 groups: high-, medium- and low-expressed kinases.
These clusters were then used to generate a plot of kinase expression pattern in chordoma cell lines by means of Kohonen maps, used for visualization.
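The clustering step that precedes the Kohonen map visualization can be sketched as follows. This is an illustrative pure-Python re-implementation, not the R code used in the study: it assumes already normalized, log2-transformed expression values, uses the maximum (Chebyshev) distance with complete linkage, and assumes that the three groups are obtained by stopping the agglomeration at three clusters. The kinase names and expression values are hypothetical placeholders.

```python
from itertools import combinations


def chebyshev(a, b):
    """Maximum absolute difference between two expression profiles."""
    return max(abs(x - y) for x, y in zip(a, b))


def complete_linkage(profiles, k):
    """Merge the two closest clusters until k clusters remain.

    The distance between two clusters is the largest pairwise distance
    between their members (complete linkage).
    """
    clusters = [{name} for name in profiles]

    def cluster_distance(c1, c2):
        return max(chebyshev(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] |= clusters[j]  # i < j, so deleting j keeps index i valid
        del clusters[j]
    return clusters


def label_by_mean(clusters, profiles):
    """Name three clusters low/medium/high by their mean expression."""
    def mean_expression(cluster):
        values = [v for name in cluster for v in profiles[name]]
        return sum(values) / len(values)

    return dict(zip(("low", "medium", "high"), sorted(clusters, key=mean_expression)))


# Hypothetical log2 expression of six placeholder kinases in five cell lines.
profiles = {
    "KIN_A": [12.1, 11.8, 12.4, 12.0, 11.9],
    "KIN_B": [11.5, 12.0, 11.7, 11.9, 12.2],
    "KIN_C": [7.0, 7.4, 6.8, 7.1, 7.3],
    "KIN_D": [6.5, 7.0, 7.2, 6.9, 6.7],
    "KIN_E": [1.0, 0.5, 1.2, 0.8, 0.9],
    "KIN_F": [0.2, 0.9, 0.4, 0.6, 1.1],
}
groups = label_by_mean(complete_linkage(profiles, 3), profiles)
for level in ("high", "medium", "low"):
    print(level, sorted(groups[level]))
```

The naive search over all cluster pairs is adequate for a panel of a few hundred kinases; for larger inputs an established hierarchical clustering implementation would be preferable.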
Real Time quantitative PCR. Real Time quantitative PCR (RT-qPCR) was carried out using SYBR green technology; specific primers were designed for the genes of interest using the freely available Primer3 software (http://primer3.ut.ee/) and synthesized using an Applied Biosystems 3900 Synthesizer.
RT-qPCR data were obtained using TaqMan Reverse Transcription Reagents (Thermo Fisher Scientific, Waltham, MA, USA) and random hexamer priming to reverse transcribe RNA to complementary DNA (cDNA), according to the manufacturer's instructions. Real Time quantitative PCR (qPCR) was carried out on an ABI Prism 7900HT Applied Biosystems Sequence Detector (Thermo Fisher Scientific, Waltham, MA, USA), with reagents and materials from Applied Biosystems/Thermo Fisher Scientific (Power SYBR® Green PCR Master Mix), according to the manufacturer's instructions, in a volume of 12.5 μl per reaction, each containing approximately 10-12 ng cDNA (diluted in TE buffer 1×), 300 nM primers for the different targets and for 3 endogenous reference controls (glucuronidase beta (GUSB), peptidylprolyl isomerase A (PPIA), and 18S ribosomal RNA), as detailed in Suppl. Table 2. Each sample was assayed in duplicate qPCR reactions. Relative quantification of expression levels was calculated following the manufacturer-suggested ΔΔCt method using the average of the 3 above reference controls 37 .
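A minimal Python sketch of ΔΔCt relative quantification with averaged reference genes is given below. It assumes that duplicate Ct values are averaged per gene, that the three reference genes are averaged with equal weight, that amplification efficiency is 100% (base 2), and that expression is reported relative to a calibrator sample; the choice of calibrator is not stated in the text and is an assumption here. Function names and Ct values are hypothetical.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """ΔCt = mean target Ct minus the average of the reference-gene mean Cts.

    target_cts is a list of replicate Ct values for the target gene;
    reference_cts maps each reference gene to its replicate Ct values.
    """
    reference_mean = mean([mean(replicates) for replicates in reference_cts.values()])
    return mean(target_cts) - reference_mean


def relative_expression(sample_target, sample_refs, calib_target, calib_refs):
    """Fold change of the sample versus the calibrator, 2^(-ΔΔCt)."""
    ddct = delta_ct(sample_target, sample_refs) - delta_ct(calib_target, calib_refs)
    return 2 ** (-ddct)


# Hypothetical duplicate Ct values for one target gene and three references.
refs = {"GUSB": [25.0, 25.0], "PPIA": [20.0, 20.0], "18S": [12.0, 12.0]}
fold = relative_expression(
    sample_target=[22.0, 22.0], sample_refs=refs,
    calib_target=[24.0, 24.0], calib_refs=refs,
)
print(f"fold change vs calibrator: {fold:.2f}")
```

Averaging several reference genes reduces the influence of any single unstable control on the normalized value.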