Elucidation of molecular and functional heterogeneity through differential expression network analyses of discrete tumor subsets

Intratumor heterogeneity presents a major hurdle in cancer therapy. Most current research studies consider tumors as single entities and overlook molecular diversity between heterogeneous state(s) of different cells assumed to be homogenous. The present approach was designed for fluorescence-activated cell sorting-based resolution of heterogeneity arising from cancer stem cell (CSC) hierarchies and genetic instability in ovarian tumors, followed by microarray-based expression profiling of sorted fractions. Through weighted gene correlation network analyses, we could assign enriched modules of co-regulated genes to each fraction. Such gene modules often correlate with biological functions; one such specific association was the enrichment of CD53 expression in CSCs, functional validation indicated CD53 to be a tumor-initiating cell- rather than quiescent CSC-marker. Another association defined a state of poise for stress-induced metastases in aneuploid cells. Our results thus emphasize the need for studying cell-specific functionalities relevant to regeneration, drug resistance and disease progression in discrete tumor cell fractions.

often culminates in therapeutic failure 17 . Aneuploidy leads to selective adaptive changes ensuring tumor survival under stress 18 . The genetic variance imposed by aneuploidy is also recognized to be a major player in tumor dormancy, yet has been a challenge to elucidate [19][20][21] . In the present study, towards further molecular understanding of the dynamics of tumor heterogeneity, we resolved and characterized discrete cellular fractions based on the regenerative hierarchy and genetic instability by combining flow sorting with gene expression microarrays in a xenograft model of ovarian cancer. Further analyses and functional validation generated knowledge relating to regeneration associated markers and molecular pathways of drug resistance and residual disease that could contribute to improvement of present day therapy.

Methods
Cells culture, xenograft generation, flow cytometry and sorting. A4 cells used in the study were established from malignant ascites of a patient diagnosed with grade 4 serous ovarian adenocarcinoma and cultured in MEM medium with 5% FBS and 1% NEA 22 . 2.5 × 10 6 A4 cells were stained with 2 μ M PKH67 (PKH67-GL; Sigma-Aldrich) for 7 minutes, washed with ice-cold MEM(E) medium, and injected subcutaneously in NOD/SCID mice. Mice were bred and maintained at the NCCS Experimental Animal Facility. All experimental procedures and protocols performed in accordance with the laws and policies and approved by the NCCS Institutional Animal Ethics Committee. Tumors were harvested after 4 weeks and processed for flow cytometry analysis and sorting. Tumor fractions were sorted at 99% purity based on label intensity as PKH hi , PKH lo , PKH neg . Fractions were gated considering freshly labeled and unlabeled tumor cells were used as positive and negative controls respectively 15 . Additionally, tumor cells were stained with Hoechst (45 mins) and Pyronin Y (45 mins) at 37 °C before sorting cell fractions based on DNA-RNA content and cell cycle phases, where fractions were defined as euploid (G0 + G1) and aneuploid (G0 + SG2M) since segregation of entire fraction required intracellular staining of cyclin B1. Paclitaxel was administered intraperitoneally 25 mg/kg body weight of mice in two regimes as 3 doses (3D) at an interval of 48 hrs followed by 1 recovery cycle) or 6D (2 cycles of 3D) 16 . Seven aneuploid clones from paclitaxel-treated A4 xenografts (5 doses of 25 mg/kg body weight mice at an interval of 7 days) were established following cell sorting and culture. Cell sorting, data acquisition and analysis were achieved using BD FACS Aria II Sorp (BD Biosciences) and FACS Diva software version 6.0.
Microarray data analysis. Microarray hybridization (Agilent Human Whole Genome 8 × 60 K Array) and data acquisition were carried out using standard procedures. Briefly, samples were labeled using the Agilent Low Input RNA amplification kit and hybridized to Agilent Human Whole Genome 8 × 60 k Array using Agilent In situ Hybridization kit. Triplicate datasets of each sorted fraction (PKH hi , PKH lo , PKH neg , euploid and aneuploid) were analyzed using Agilent Genomic Work Bench. Each data were normalized using median over entire array normalization method in BRB array tools v3.8.0 23 . Genes were filtered by class comparison method using criteria of p-value > 0.05 of univariate test, minimum fold change in which 20% of the expression data values have at least 1.5 fold change. Class comparison derived differential genes were subjected to two-way hierarchical clustering using MeV_4_8_1 software. Class comparison derived 5216 differential genes across 5 tumor sorted fractions (PKH hi , PKH lo , PKH neg , euploid and aneuploid) each in triplicate i.e 15 sample dataset were subjected to Weighted Gene Correlation Network Analysis (WGCNA) for derivation of functionally correlated genes. Pearson correlation coefficient was calculated between all pairs of genes across 5 tumor sorted microarray samples. The correlation matrix generated was raised to power 6, producing weighted correlation matrix. Weighted correlation matrix converted into topological overlap (TO), which not only measure the correlation between two genes but also extent of overlap across the weighted network. Genes were clustered based on TO values and dendrogram were plotted using Dynamic tree cut algorithm. Representative expression profile (module eigengene values) for each module was generated based on first principle component of genes containing thatparticular module. The resulting MEs were used to identify representative signature genes for each module, based on correlation criteria. Unsupervised Principle Component Analyses (PCA) was carried out for resolution of overlap between tumor fractions across 3 samples of five tumor fractions 24,25 . AracNe (Accurate Cellular Networks) was used to predict the interaction network the MI was fixed at 0.20; data processing inequality was set at 0.10, networks were visualized using Enrichment map in Cytoscape_v2.8.2 26 . Pathway analysis was done using DAVID Bioinformatics Resource v6.7 27 .
RNA isolation, cDNA synthesis and qPCR quantification. Sorted cells were subjected to standard trizol-based RNA isolation and cDNA synthesis protocols. qPCR analyses with specific gene primers were carried out with Step one plus in 96-well plate format using SYBR Green Mix (Life Technologies). Changes in threshold cycle (CT) values were calculated as: Δ CT = CT (test) − CT (control); fold difference was calculated as: fold difference = 2− Δ (Δ CT). GAPDH expression was used for normalization; non-template controls accounted for possible contaminating DNA in reaction mixtures. Primer sequences are available on request.
Functional assays for assessment of proliferative, regenerative and migratory potential.
Clonogenecity. Tumor sorted fractions were plated in 96 well plate with cells (500 cells/well). Assays were terminated on Day 14 and plates subjected to crystal violet staining. Cells were washed with PBS, fixed with chilled 3% paraformaldehyde and incubated with 0.05% crystal violet dye for 2 hours. Images were captured by camera (Olympus). Clonogenic potential was counted by using Image J software. For drug treated fractions relative colony counts were established after normalizing with those from corresponding A4 control tumor fractions.
Soft agar assay. In vitro tumorigenecity and anchorage independent growth was assayed using standard methods. For colony formation, tumor fractions were sorted, suspended in 0.5% agarose at a concentration of 5000 Scientific RepoRts | 6:25261 | DOI: 10.1038/srep25261 cells and seeded in 35 mm plates containing a basal layer of 1% agarose. Colonies were counted on Day 14 under Olympus IX71 microscope at 10X magnification.
Spheroid formation assay. Each tumor sorted fractions was seeded in ultra low attachment plate at a density of 5 × 104 cells/well and cultured in MEM containing 1% serum. Spheroids were counted on Day 14 and images were captured on Olympus IX71 microscope.
Scratch assay. Tumor sorted fractions were plated in 96 well plate at a density of 1000 cells/well, media was changed every 48 h and cells grown till confluency. Wound was inflicted with a pipette tip, washed twice with 1XPBS to remove floating cells and serum-free MEM medium was added. Wound healing was monitored for 72 hours; Images were captured on Olympus IX71 microscope and analyzed by TScratch Software.
Matrigel Migration assay. 5000 sorted cells were seeded in a matrigel chamber (Bd Biosciences) in 24 well plate. Media containing serum (MEM + 5% FBS) was used as chemoattractant in the lower well. Cells migrating and attaching to the bottom of the plate were stained with crystal violet and quantified after 48 hours.
Limiting dilution assay. Tumor-initiating potential was evaluated by injecting tumor cells (5000, 10000 or 20000 cells; 1:1 -matrigel:1XPBS) subcutaneously in NOD/SCID mice. Tumor formation was monitored for up to 1 month after injections. Stem cell frequency/tumor-initiating potential was calculated using the ELDA software 28 .
Immunoflorescence (IF) and Immunohistochemical (IHC) staining. 2 × 10 3 sorted euploid and aneuploid cells from A4 xenograft tumors were seeded on cover-slips and subjected to scratch assay as described earlier 24 . After 48 hours cells were fixed with 4% paraformaldehyde, blocked with 10% goat serum and subjected to IF staining. Thus, cells were incubated with primary antibody against Moesin (1:100; Abcam) for 1 hour followed by corresponding secondary antibody, and subsequently with Falloidin. Images were acquired and analyzed on confocal microscope (Leica, Germany). Standard protocols were used for IHC as described earlier 24 . Briefly, 5 μ thick sections of A4 xenografts embedded in paraffin were deparaffinized and hydrated using an alcohol -water gradient. Subsequently sections were washed twice in 1XPBS, blocked with 10% goat serum for 30 min and incubated with primary Moesin/Slug antibody (1:100 dilution) overnight. Internal peroxidase activity was neutralized using 0.3% H 2 O 2 followed by incubation with horseradish peroxidase (HRP) conjugated secondary antibody for 30 minutes. Color was developed using DAB (3,3′-diaminobenzidine) and sections counterstained with Hematoxylin. Finally, slides were dehydrated, xylene cleared, mounted and imaged using CellD software for Olympus IX71. Analysis and quantification of frequency of cells associated with protein expression was performed with ImageJ software.

Statistical analysis.
Unless mentioned otherwise, all experiments were carried out in triplicate; data is expressed as mean ± SEM of at least three independent experiments. The significance of difference in the mean values was determined using two-tailed Student's t test *p < 0.05; **p < 0.01; ***p < 0.001.

Results
Genes-modules-pathway correlations with tumor fractions. The regenerative hierarchy in xenografts demarcated on a basis of quenching kinetics of the membrane fluorophore PKH, identified three fractions as described above. Briefly, the PKH hi fraction represents CSCs, PKH lo includes tumor progenitors and PKH neg cells represent differentiated component of the tumor. Alternatively, variability in the cellular DNA content identified two discrete tumor fractions representing euploid and aneuploid cells (Fig. 1a,b-left panels). Class comparison of gene expression datasets from each of these fractions indicated association of definitive expression patterns (Fig. 1a,b-right panels). 5216 differentially expressed genes were identified across the five cell fractions that could be clustered in 8 modules using Weighted Gene Correlation Network Analysis (WGCNA; Fig. 1c); each module represents a set of strongly correlating genes that do not correlate with the genes in other modules (Supplementary Table 1).
Unsupervised, Principal Component Analysis of the top, most significant 50 WGCNA module genes (n = 439) distinctly segregated euploid from other data; aneuploid datasets exhibited low distance similarity with two PKH lo datasets, while data of the PKH hi fractions clustered closely with that of PKH neg (Supplementary Fig. 1a). Very importantly, the WGCNA modules were also observed to be differentially enriched across various cell fractions ( Fig. 1d; Supplementary Fig. 1b) suggesting an association of distinct molecular expression profile with each fraction. With a view to identify such differentially expressed gene signatures across distinct cellular subsets, we first eliminated the most unlikely associations based on our earlier studies 15,16 (Fig. 1e) that suggest an overlapping, heterogeneous nature of the PKH and DNA content based subsets. The most important and relevant considerations thus identified include-(i) PKH hi and PKH neg are almost entirely euploid and constitute non-cycling states, and (ii) dormant aneuploid cells are confined to the PKH lo fraction (Fig. 1e).
Thus associations predicted from our analyses that seemed unlikely included enrichment of modules M3, M7, M8 in PKH hi and PKH neg but not in euploid data; M1b and M4 in euploid fractions but not in PKH hi or PKH neg data. On the other hand, the most likely associations predicted included enrichment of module M2 expression in the PKH hi fraction, and an anti-correlative association of M5 and M6 module genes in euploid and aneuploid cells respectively. To further establish unequivocal associations, gene expression intensities of M2 and M5 module genes were compared across tumor fractions. Distinct up-regulation affirmed association of M2 and M5 genes with PKH hi and aneuploid fractions respectively (Fig. 2a). Pearson analysis to identify highly correlating genes within modules M2 and M5 further revealed 3 distinct clusters in each module, that suggest an association with specific functionalities (Fig. 2b,c; Supplementary Table 2). Thus, within the PKH hi datasets, DAVID pathway analyses 27 assigned the strong correlation between M2 associated gene clusters Cl1 & Cl3 due to shared functions of G-protein coupled signaling, although Cl1 is additionally involved in meiotic cell cycle and Cl3 in transporter activity pathways. Strikingly, the smaller Cl2 cluster negatively correlates with Cl1 & Cl3, predicatively to maintain homeostasis of G-protein signaling. Of the three module M5 clusters in aneuploid datasets, clusters Cl2 and Cl3 positively correlated with each other due to shared functionalities associated with cytoskeletal proteins and cell projection assembly components in cell division, and negatively correlated with Cl1 (cytoskeletal remodelling by ERM proteins). Intriguingly, this suggests a decoupling of cell division from cell migration.
Module M2 analysis and validation assigns CD53 functions. Literature-based functional annotation assigned further relevance to expression of M2 genes in the PKH hi /CSC fraction through their known involvement in stem cell functioning and differentiation, cell survival, cell adhesion and migration (Supplementary Table 3). Network analyses of these genes revealed the cell surface tetraspanin molecule CD53 to be a major hub with several interacting partners (Fig. 2d). In validating expression of some of the M2 genes, CD53 exhibited a striking exclusive association with the PKH hi fraction, although the protein was associated not only with PKH hi but also with euploid and G2M-growth arrested aneuploid PKH lo fractions (Fig. 3a,b; Supplementary Fig. 2a-c). CD53 was also expressed in xenografts derived from other serous ovarian cancer cell lines including CAOV3, OVMZ6, OVCAR3 and OV90 that suggests a pan-ovarian cancer association (Supplementary Fig. 2d).
In further probing for functional correlates of this expression, we observed differences in regenerative potential of CD53 expressing (CD53 pos ) vs. CD53 lacking (CD53 neg ) cells when assayed for spheroid formation, in vitro tumorigenecity and in vivo limiting dilution assays for tumorigenecity potential (Fig. 3c-e). Similar evaluation of PKH derived fractions in the latter assay indicated PKH hi cells to exhibit maximal regenerative ability [estimated as 1 in ELDA -Extreme Limiting Diluting Assay software 26 that is useful in comparing stem cell frequencies], while PKH lo cells have a much lower potential (1/94648), and PKH neg cells are devoid of regenerative abilities. Since CD53 expression is associated with only the PKH hi and PKH lo cell fractions of tumors (Fig. 3b, middle panel) and exhibits an intermediate stem cell frequency (1/16097), this strongly suggests an association with regenerative functions within a tumor. Thereby, these results suggest CD53 to be a putative marker for tumor initiating/regenerative potential.
Aneuploid cells harbor latent metastatic potential. WGCNA also established an association between module M5 expression and aneuploidy (Fig. 1c). Further detailed literature annotation assigned several functions including migration, invasion and metastases through EMT, change in morphology through cytoskeletal remodeling, enzymatic penetration of basal lamina, cell cycle, immune evasion to overcome host resistances and re-acquisition of 'stemness' features to ensure long-term cell survival and regeneration (Supplementary Table 4). Gene-gene interaction network analyses identified cytoskeletal remodeling genes as a major hub with several interacting partners (Supplementary Fig. 3a-c). Validation of core hub genes affirmed the association with aneuploid over euploid cells (Fig. 4a).
Towards a deeper study, we attempted establishment of single cell sorted aneuploid clones from xenografts. Interestingly, in vitro culture led to rapid genome stabilizion within 5-6 passages in cultured aneuploid cells, wherein all clones evolved to a near-euploid state indicating an inability to sustain the replicative stress imposed by aneuploidy (Supplementary Fig. 4a,b-i,b-ii). These aneuploid clones exhibited improved viability but not tumorigenic potential as compared to parental A4 cells, unless challenged with paclitaxel exposure ( Supplementary  Fig. 4c,d). These results suggest a role for dynamic niche signaling in maintenance of genetic instability besides  Fig. 1a; (c) Graphical representation of functional potential of sorted populations (CD53 neg ; CD53 pos ,) in assays for: spheroid formation; (d) -anchorage-independent clonogenecity; (e) Limiting dilution assay for in vivo tumorigenic potential of tumor sorted CD53 neg ; CD53 pos , PKH hi , PKH lo and PKH neg fractions, where Stem cell frequency was estimated based on number of regenerated tumors in limiting dilution assay using ELDA software 28 , $-p-values based on Students t-test between CD53 pos vs CD53 neg and PKH hi vs PKH neg .
Scientific RepoRts | 6:25261 | DOI: 10.1038/srep25261 confirming our earlier observations that the aneuploid fraction in naïve, untreated tumors is growth-arrested and exposure to stress and selective pressure triggers cell cycling and possibly selection 15 . In a separate study, we have also also shown that such aneuploid cells have potential to generate parallel drug resistant hierarchies within the same tumor 16 , that also emphasized microenvironmental stress to be essential for examining predicted functional correlates of aneuploid cells. We thus examined and compared the functional dynamics of aneuploid cells from naïve tumors with those under three-or six-doses of paclitaxel. Drug exposure not only enhanced frequency of aneuploidy ( Supplementary Fig. 5a), but significantly contributed to enhanced in vitro regenerative potential as assayed by colony and spheroid forming assays (Fig. 4b-d; Supplementary Fig. 5b). Profiling of M5 genes in paclitaxel treated tumors also identified their upregulation in a dose-dependent manner (Fig. 4e) suggesting contributions to tumor cell survival. Most striking, expression of moesin and flotillin (known to contribute to centrosomal abnormilities, cytoskeletal remodelling and migration) were further enriched. This led us to hypothesize that aneuploid cells in naïve tumors are 'poised' for metastases, which could be triggered by stress. On testing this conjecture through functional wound healing (migration) and matrigel invasion assays, aneuploid cells from paclitaxel treated xenografts indeed validated this predicted function ( Fig. 4e-g;  Supplementary Fig. 5c,d). In further quantifying cells with these enhanced capabilities, it was observed that the frequency of moesin, flotillin, and F-actin-moesin co-expressing cells (most likely to undergo metastases) were -(i) distinctly higher in aneuploid than euploid fractions in naïve tumors, and (ii) showed further enhanced expression following exposure to six doses of paclitaxel and (iii) represented the entire aneuploid population in treated tumors (Fig. 4h).
Concurrently acquisition of cellular plasticity is also evident with increased frequency of cells co-expressing fibroblast associated protein (FAP) and E-cadherin, indicating a propensity towards migration which is supported by the functional assays. A final visual evidence for cytoskeletal remodeling was provided by immunofluorescence profiling of F-actin and moesin expression in sorted euploid and aneuploid tumor cells subjected to scratch/ wound healing assays (Fig. 4i). Presence and extensive re-arrangement of actin stress fibers and increased moesin expression was observed in aneuploid over euploid cells and provided visual evidence of active actin cytoskeletal remodeling.
Further, immunohistochemistry staining and analysis of A4 xenograft paraffin sections for moesin and Slug expression (a transcription factor involved in epithelial to mesenchymal transition -EMT; Fig. 4j-i,j-ii,j-iii) was also carried out along with detection of giant nuclei harboring cells that are likely to be aneuploid; expression quantification was carried out across 10 random fields at the migratory edge as well as at the core of tumors. This identified Slug and Moesin expression in aneuploid cells that were also more frequent at the migratory edge than in the tumor core ( Supplementary Fig. 5e). Together, this affirms a latent metastatic potential in aneuploid cells, that may be activated by stress conditions through dynamic cell plasticity and modulation of the actin cytoskeleton.

Discussion
From a wider perspective, our study provides some reasons for failure of biomarker and molecular target identification studies based on entire tumor profiling, which are also relevant to achieving complete elimination of tumor cells 29,30 . Recent approaches involving molecular profiling of mesenchymal/stromal cells, immune cells including macrophages and lymphocytes [31][32][33] identify effects of inter-tumor heterogeneity by examining differences in tumors between different individuals. Cell sorting and differential fraction profiling has also been attempted in the context of CSCs vs. non-CSCs in tumors that unfortunately overlooks the role of progenitors and genetic instability in long-term cell survival and regeneration 34,35 . Closing the loop back to analyses of multiple-fraction sorted cells, we demonstrate how one can overcome limitations imposed by averaged measures to perform "reverse engineering" with molecular profiles of different cell types and identify their relevance to tumor behavior.
No direct correlation of CD53 expression with CSC activity is reported, yet our network analyses identified CD53 as a major hub in sorted CSC fractions. Apart from its conventional role in infiltration, modulating adhesion and migration of neoplastic B cells 36 , CD53 is also known to be associated with glioblastoma, ovarian, breast, colon, kidney, lung, uterine and rectum cancer 31 wherein its expression correlates with radioresistance, matrix degradation, inflammation, regulating apoptosis to trigger tumor cell survival 37,38 and results in poor patient prognosis 39,40 . Our study is possibly the first to identify and validate a strong correlation of CD53 with tumor regenerative capabilities. It also became apparent that cues for long-term regeneration and metastases are well established within the primary tumor as observed from activation of the actin cytoskeleton in quiescent, non-proliferative aneuploid cells by therapeutic stress. The latter also suggests that decoupling of moesin-flotillin-cytoskeletal remodeling from cell division may provide some advantages to migratory cells 41 .
Taken together and from a disease perspective, tumor initiating potential of CD53 expressing CSCs and progenitors could define tumor dormancy and recurrent disease, while aneuploid cells are likely to contribute not only through enhanced adaptation capabilities potentiated by their genetic diversity, but harbor a capability to escape from a restrictive or hostile microenvironment. Concurrently, with gain of regenerative potential this can lead to drug resistance, tumor metastases and disease relapse in ovarian cancer.