Establishment and characterization of 18 human colorectal cancer cell lines.

Colorectal cancer (CRC) is the third most frequently diagnosed malignancy worldwide and the second most common cause of cancer-related death in Korea. Because of the disease's aggressive behavior, the 5-year survival rate for CRC patients remains poor. Well-characterized cell lines have been used as biological models for studying cancer biology and developing novel therapeutics. To assist in vitro studies, 18 CRC cell lines (SNU-1566, SNU-1983, SNU-2172, SNU-2297, SNU-2303, SNU-2353B, SNU-2359, SNU-2373B, SNU-2407, SNU-2423, SNU-2431, SNU-2465, SNU-2493, SNU-2536C, SNU-2621B, SNU-NCC-61, SNU-NCC-376, and SNU-NCC-377) derived from Korean patients were established and characterized in the present study. General characteristics of each cell line, including doubling time, in vitro morphology, mutational profiles, and protein expression of CRC-related genes, are described. Whole exome sequencing was performed on each cell line to determine its mutational profile. Single nucleotide variations, frameshifts, in-frame deletions and insertions, start codon deletions, and splice stop codon mutations in various genes were found and classified according to their reported pathogenicity. In addition, cell viability was assayed to measure sensitivity to 24 anti-cancer drugs, including anti-metabolites, kinase inhibitors, histone deacetylase inhibitors, alkylating agents, and topoisomerase inhibitors, all widely used for various cancers. Five CRC cell lines showed microsatellite instability (MSI), and in these the MLH1 or MSH6 gene was mutated. These newly established CRC cell lines can be used to investigate the biological characteristics of CRC, particularly gene alterations associated with CRC.


Growth properties and morphology in vitro.
Cell growth rate was measured with the same method as described previously 9. For growth properties, cells were seeded into 96-well plates at a density of 2.0 × 10³ cells/well and treated with EZ-cytox (DAEIL Lab, Seoul, Korea), a water-soluble tetrazolium salt solution that is reduced by succinate-tetrazolium reductase to produce formazan dye. After incubation at 37 °C for 2 h, optical density (OD) was measured at 450 nm using a Multiskan™ GO Microplate Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). The number of cells was analyzed in triplicate at 24-hour intervals for at least 7 days. The doubling time of the cells was calculated from the growth phase. Growth curves were drawn and growth properties were calculated using GraphPad Prism software with normalized OD values. Cell morphology was assessed using an Axiovert 100 microscope at 100× magnification.
DNA fingerprinting. DNA fingerprinting analysis was performed as described before 10. Briefly, total DNA was isolated from cell pellets using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. Quantified and diluted gDNA solution was added to a reaction mixture consisting of AmpFlSTR PCR reaction mix, Taq DNA polymerase, and the AmpFlSTR Identifiler primer set (Applied Biosystems, CA, USA). DNA was amplified using a GeneAmp PCR System 9700 (Applied Biosystems) with the annealing temperature set to 59 °C. GeneScan-500 ROX standard (0.05 μl) and 9 μl of Hi-Di Formamide (Applied Biosystems) were added to 1 μl of PCR product from each cell line and denatured at 95 °C for 2 min. The mixture was then analyzed with a 3500xL Genetic Analyzer (Applied Biosystems).
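As an illustration of the doubling-time calculation mentioned under growth properties, the following minimal sketch fits a straight line to ln(OD) against time over the exponential growth phase and converts the slope to a doubling time (ln 2 / slope). This is not the GraphPad Prism procedure used in the study; the function name `doubling_time` and the OD readings are hypothetical and chosen only to show the arithmetic.

```python
import math

def doubling_time(hours, od_values):
    """Estimate doubling time (h) from a least-squares fit of ln(OD) vs. time.

    Assumes all readings come from the exponential growth phase.
    """
    if len(hours) != len(od_values) or len(hours) < 2:
        raise ValueError("need at least two paired time/OD readings")
    logs = [math.log(v) for v in od_values]
    n = len(hours)
    mean_t = sum(hours) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in hours)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(hours, logs))
    slope = sxy / sxx  # growth rate constant, per hour
    if slope <= 0:
        raise ValueError("OD does not increase over the selected interval")
    return math.log(2) / slope

# Hypothetical normalized OD values read at 24-hour intervals
hours = [24, 48, 72, 96]
od = [0.1, 0.2, 0.4, 0.8]
print(round(doubling_time(hours, od), 1))
```

Restricting the fit to the exponential phase matters: including lag-phase or plateau readings flattens the slope and overestimates the doubling time.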
Western blotting analysis. The detailed procedure was described previously 9. Cells were harvested with a cell scraper after washing with cold PBS. Whole protein was extracted with EzRIPA buffer (ATTO Co., Tokyo, Japan) supplemented with 1% protease inhibitor and 1% phosphatase inhibitor in accordance with the cell viability assay time frame. The volume of lysis buffer was adjusted to the number of cells collected in each vial. The protein concentration was determined with the SMART™ micro BCA protein assay kit (Intron Biotechnology, Gyeonggi, Korea). Equal amounts of protein were loaded on a 4-12% Bis-Tris gel (Invitrogen) and run at 50 V for 2 h. Proteins were then transferred from the gel to a PVDF membrane (Invitrogen) by electro-blotting at a constant current of 80 mA at 4 °C overnight. The membrane was blocked by incubation with 1.5% to 2.0% skim milk in 0.05% Tween 20-TBS buffer containing 1 mM MgCl2 at room temperature for 1 h. The membrane was then incubated with primary antibodies against EGFR ( , and β-actin (Santa Cruz Biotechnology, TX, USA) (1:100), followed by incubation with a peroxidase-conjugated mouse or rabbit IgG secondary antibody (Jackson Immunoresearch, PA, USA) (1:5000) matched to the primary antibody used. The membrane was then treated with the chemiluminescent working solution WESTZOL™ (Intron Biotechnology) and exposed to Fuji RX film (Fujifilm, Tokyo, Japan) for 1-5 min.
Whole exome sequencing. The detailed procedure was described previously 9. SureSelect sequencing libraries were prepared using the SureSelect Human All Exon 50 Mb Kit (Agilent) according to the manufacturer's instructions using a Bravo automated liquid handler. Three micrograms of genomic DNA were fragmented to a median size of 150 bp using a Covaris-S2 instrument (Covaris, MA, USA). Adapter-ligated DNA was amplified by PCR. PCR product quality was then assessed by capillary electrophoresis. Hybridization buffer and DNA blocker mix were incubated at 95 °C for 5 min and 65 °C for 10 min in a thermal cycler. The hybridization mixture was then added to a bead suspension and incubated at room temperature for 30 min while mixing. The beads were washed and DNA was eluted from the beads with 50 μl SureSelect elution buffer (Agilent). The flow cell was then loaded on a HiSeq 2500 sequencing system (Illumina).
MSI test. The detailed procedure was described previously 11. For microsatellite instability (MSI) analysis, BAT25 and BAT26 (two mononucleotide microsatellite markers) were evaluated using a capillary-based sequencing analysis 8. PCR was performed as described above except that forward primers were labeled with a fluorescent dye. Labeled samples were run on an ABI 3730 genetic analyzer (Applied Biosystems). GeneMapper software v4.0 (Applied Biosystems) was used to calculate the size of each fluorescent PCR product. For gel-based MSI analysis, desired fragments were amplified in the presence of [α-32P] deoxycytidine triphosphate. PCR products were denatured and separated on 6 M urea/7% polyacrylamide gels run at 60 W.
DNA fingerprinting of 18 CRC cell lines. Fifteen tetranucleotide repeat loci and the gender-determining marker amelogenin were heterogeneously distributed in each cell line, with no evidence of cross-contamination (Table 3). The profiles were also matched with the STR profiles of the cell lines at passage 0 or 1 (including the original tissue mass) to confirm that the established cell lines were not cross-contaminated with other patient material (Supplementary Table 1).

Expression levels of growth factor receptor and EMT proteins in 18 CRC cell lines. Protein expression of MLH1 and MSH2 in the newly established cell lines was analyzed in relation to their mutational profiles. Three cell lines (SNU-1983, SNU-2434 and SNU-3030) had pathogenic mutations in MLH1, and MLH1 protein expression was correspondingly low. Two cell lines (SNU-2359 and SNU-2493) harbored a benign mutation in MLH1 (c.655A>G/p.Ile219Val), and protein structure was not affected. Although no pathogenic MSH2 mutation was present in the newly established CRC cell lines, MSH2 protein expression varied, which suggests that it was determined by RNA splicing or epigenetic alterations (Fig. 2a). Four cell lines (SNU-2359, SNU-2431, SNU-2465 and SNU-NCC-61) exhibited elevated EGFR levels. SNU-2431 and SNU-2465 had increased expression of both EGFR and HER2 (Fig. 2b). Expression levels of the EMT-related proteins E-cadherin, EPCAM and vimentin were analyzed in relation to in vitro morphology (Fig. 2c). E-cadherin was significantly decreased in SNU-2423, while EPCAM was expressed in all cell lines. Vimentin was expressed exclusively in SNU-2536C and SNU-NCC-61. Both cell lines grew as monolayers of substrate-adherent cells with adherent aggregates.
Genomic analysis. Fifteen genes implicated in the development of CRC were screened in the 18 newly established CRC cell lines. Pathogenic mutations were determined using the ClinVar database (www.ncbi.nlm.nih.gov/clinvar). Results are summarized in Fig. 3, Table 4 and Supplementary Table 2. Fig. 3 includes only mutations indicated as pathogenic by the ClinVar database. Supplementary Table 2 includes all mutations, including those whose clinical significance is in question. The most common actionable alterations across the sample set were in TP53 (83%) and APC (67%). KRAS and SMAD4 mutations were also prevalent, each at 44%. The most hyper-mutated cell line was SNU-2621B (10 mutations). Genes related to DNA repair, such as POLD1, MSH6, and PMS2, were mutated in the SNU-2621B cell line. Similarly, SNU-1983 was also hyper-mutated (9 mutations), and DNA repair genes such as MLH1 and POLD1 were mutated. The truncation mutations of the MLH1 and MSH6 genes in the SNU-1566, SNU-1983 and SNU-2621B cell lines were confirmed with Sanger sequencing.
[Table fragment, column headers not recoverable: SNU-2431 (n/a, M), SNU-2621B (51, M), SNU-NCC-61 (49, M), SNU-NCC-376 (73, M), SNU-NCC-377 (64, M); remaining fields n/a or None.]
inhibitor (Belinostat, SAHA), alkylating agent (Oxaliplatin), topoisomerase inhibitor (Irinotecan), growth factor receptor inhibitor (Cetuximab, Bevacizumab), natural compounds (Resveratrol, Curcumin, Baicalein, Genistein), and miscellaneous (Leucovorin calcium, ICG-001, Olaparib), were estimated (Fig. 3). CRC cell lines were uniformly sensitive to Apitolisib, Trametinib, Belinostat, 5-FU, and Buparlisib, with the exceptions of SNU-2423 and SNU-2465, and resistant to Cetuximab, Bevacizumab, Leucovorin calcium, Olaparib, Cyclopamine, and Resveratrol.
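The per-gene percentages reported in the genomic analysis are fractions of cell lines carrying at least one pathogenic mutation in a gene. A minimal sketch of that tally follows; it is not the authors' pipeline. The record layout, the field names, the cell line names, and the choice to count only records labeled exactly "Pathogenic" are all assumptions made for illustration, and the variant records are invented.

```python
from collections import defaultdict

# Assumption: only records labeled exactly "Pathogenic" are counted.
PATHOGENIC_LABELS = {"Pathogenic"}

def gene_prevalence(variants, cell_lines):
    """Return, per gene, the fraction of cell lines with >= 1 pathogenic variant."""
    carriers = defaultdict(set)
    for v in variants:
        if v["clinical_significance"] in PATHOGENIC_LABELS:
            # A set ensures a line with several hits in one gene counts once.
            carriers[v["gene"]].add(v["cell_line"])
    total = len(cell_lines)
    return {gene: len(lines) / total for gene, lines in carriers.items()}

# Hypothetical variant records (not data from the study)
cell_lines = ["line_A", "line_B", "line_C", "line_D"]
variants = [
    {"cell_line": "line_A", "gene": "TP53", "clinical_significance": "Pathogenic"},
    {"cell_line": "line_A", "gene": "TP53", "clinical_significance": "Pathogenic"},
    {"cell_line": "line_B", "gene": "TP53", "clinical_significance": "Pathogenic"},
    {"cell_line": "line_B", "gene": "MLH1", "clinical_significance": "Benign"},
    {"cell_line": "line_C", "gene": "KRAS", "clinical_significance": "Pathogenic"},
]

for gene, fraction in sorted(gene_prevalence(variants, cell_lines).items()):
    print(f"{gene}: {fraction:.0%}")
```

Counting cell lines rather than variants keeps a hyper-mutated line from inflating a gene's prevalence.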

Discussion
New CRC cases continue to increase. At the time of detection, many CRC cases have already progressed to stage IV, which makes surgical resection unfeasible, and nearly 50% of CRC cases show recurrence or distant metastasis after primary resection 12. Although much research has been devoted to developing novel therapeutics, the molecular basis of drug response and aggressive behavior remains obscure because of the genetic complexity of CRC, and more comprehensive analysis is needed to refine regimens for treatment and prevention 13.
The importance of human CRC cell lines lies in their similarity to original tissues and their renewability, which facilitate the study of human CRC. Several CRC cell lines such as HCT-116, LoVo, SW-480, and WiDr have accelerated CRC research. Nevertheless, those accessible CRC cell lines are somewhat obsolete and may have acquired genetic alterations with passaging 14. Clinical correlation between original human materials and cancer cell lines can decrease due to the accumulation of genetic aberrations with increasing subculture numbers [15][16][17][18][19]. Therefore, novel CRC cell lines can deliver suitable biological models for investigating a broader spectrum of
Nearly 15% of sporadic CRC cases show the MSI phenotype, which is caused by inactivation of mismatch repair (MMR) genes such as MLH1, MSH2, and MSH6 20. Hereditary non-polyposis CRC, which accounts for 2-5% of all CRC cases, is also associated with germline mutations in MMR genes. Nearly 90% of reported mutations in MMR genes are found in MLH1 and MSH2 21,22. In this study, five cell lines harbored pathogenic mutations in MMR genes. MLH1 was mutated in SNU-1983, SNU-2359, SNU-2434, SNU-2493, and SNU-3030. Among these cell lines, three (SNU-1983, SNU-3030, and SNU-2434) had pathogenic mutations in MLH1, and MLH1 protein expression was correspondingly low. Interestingly, we found no pathogenic MSH2 mutation in the newly established CRC cell lines. Although Wei et al. reported different patterns of MSH2 and MLH1 mutations between Asian and Caucasian populations 23, the prevalence of MLH1 mutation in comparison with MSH2 mutation in an Asian population has not been reported. Although we found no pathogenic MSH2 mutation, MSH2 protein expression varied, which suggests that it was determined by RNA splicing or epigenetic alterations.
Two (SNU-1566, SNU-2423) of these five cell lines were derived from patients with hereditary non-polyposis CRC.
APC, KRAS, and TP53 are frequently aberrant genes in CRCs 15, and these three genes were also the most frequently mutated in the CRC cell lines characterized in this study. Most of the identified APC germline alterations are nonsense mutations or frameshift mutations near the 5' end of the gene, which truncate the protein 24. We considered APC mutations pathogenic when they were reported in ClinVar (https://www.ncbi.nlm.nih.gov/clinvar), and the types of pathogenic APC mutations we identified in this study were also nonsense or frameshift.
KRAS serves as a fundamental mediator in the transduction of several growth or differentiation factor stimuli 25. Most aberrations in KRAS occur at codons 12, 13, 59, and 61 26. In this study, KRAS mutations were found in 8 of 18 cell lines (44%). Two cell lines (SNU-1566 and SNU-2423) had a mutation at codon 13, and six lines had a mutation at codon 12. Mutation types were G-to-A transitions or G-to-T transversions.
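Under the standard definition, a substitution between two purines (A, G) or between two pyrimidines (C, T) is a transition, and a purine-pyrimidine exchange is a transversion, so G-to-A is a transition and G-to-T is a transversion. The short sketch below encodes that rule; the function name `substitution_class` is hypothetical and is included only to make the classification explicit.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_class(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

for ref, alt in [("G", "A"), ("G", "T")]:
    print(f"{ref}>{alt}: {substitution_class(ref, alt)}")
```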