Introduction

It has long been understood that cancer results from sequentially evolving genetic events. Solid tumor malignancies are viewed as a collection of diseases that are heterogeneous in their genomics, transcriptomic variations and clinical outcomes. Recent evidence supports the claim that cancers possessing stem-like properties typically have a worse prognosis1,2. Other studies showed that overexpression of epithelial-mesenchymal transition transcription factors enhanced stem-like properties and increased the aggressiveness of tumor cells3,4,5. We have established several panels of tumor stem-like cells (TSLCs) from head and neck2,6, brain7 and breast8 tumors. Here, we focus on lung adenocarcinomas (LACs), for which we have also previously established a panel of lung TSLCs9. Lung cancer is one of the leading causes of cancer-related deaths worldwide10. Its highly invasive and metastatic phenotypes are the major reasons for treatment failure and poor prognosis11.

The aim of this study is to identify a critical target regulating both lung cancer survival and the stem-like properties of lung TSLCs. Recently, epigenetic regulators such as chromatin modifiers and polycomb group proteins were shown to be important players in cell fate decisions and reprogramming12. Nuclear perturbation is also known to play an important role in cancer biology13. Given the noisy and scarce nature of TSLCs, we first consolidated a consensus gene signature of low variation and consistent gene activities across the panel of TSLCs from different tissues of origin. Such a gene signature is important because it can distinguish TSLCs from the parental tumor cells (PTCs) and from human embryonic stem cells (hESCs). In this study, signaling pathways of DNA methylation and of establishment and/or maintenance of chromatin architecture were found to be enriched in the consensus TSLC networks generated from the consensus gene signature. Based on the lung TSLC-specific gene signature, we further built the lung TSLC network model for survival prediction. CBX5, a chromatin regulator in the polycomb group, was identified as a significant target in lung cancer survival analysis through a scalable network-based target identification process. Importantly, we verified that CBX5 was also essential for the maintenance of aggressiveness and stem-like properties in lung TSLCs. We believe that our work strengthens the case for an epigenetic roadmap toward the future understanding of lung cancer and lung TSLCs.

Our network models are derived from gene expression signatures of the tumor stem-like state. The topologically weighted signal mechanics incorporated in the network model are designed to dissect a possible role of functional noise. Several lines of evidence showed that stochastic fluctuations in gene expression in embryonic stem cells lead to different lineages and cell fates14,15,16. Researchers have also integrated electrical potentials in neurons and functional magnetic resonance images of the brain17 into relational network-based models to understand noisy signals. We therefore propose a network-based, topologically weighted signal model to estimate individual cancer survival time.

In summary, network-based models built on the TSLC panels were developed in this work to help understand the underlying biological perturbation leading to the variable survival time of lung cancer patients. We expect that such network-based models can further elucidate the regulatory mechanisms leading to tumor invasion and metastasis. Data are available at GEO GSE35603. The R code for all analyses can be accessed at https://sites.google.com/site/nwtoposignalincancer/.

Results

From our previous experience with TSLCs, we have found that panels of TSLCs are quite heterogeneous, depending on the experimental cultivation procedures and/or the original tumor samples. In addition, because TSLCs are scarce, it was difficult to obtain a comprehensive transcriptome of TSLCs within a single tumor type. In this study, we therefore first aimed to establish the soundness and importance of commonality across the panel of TSLCs. We then proceeded to develop an application model for lung cancer survival based on the common and consistent gene expression profiles in lung CD133+-TSLCs.

Characterization of lung TSLCs

Recent studies showed that expression of CD133 in lung cancers represents high tumorigenicity and resistance to cytotoxic therapy18,19. We previously reported greater chemoradioresistance of CD133+-TSLCs isolated from non-small cell lung cancers (NSCLCs) compared to CD133-NSCLCs9. Here, we isolated CD133+-TSLCs from 7 NSCLCs (Fig. 1a). Isolated lung CD133+-TSLCs could form floating spheroid-like bodies in serum-free medium more easily than CD133-NSCLCs. Quantitative RT-PCR results showed higher transcript levels of stemness genes (Oct4, Sox2 and Nanog) and drug-resistance genes (MDR1, ABCG2) in lung CD133+-TSLCs (Fig. 1b). Lung CD133+-TSLCs displayed not only higher invasion activity and enhanced foci formation, but also resistance to cisplatin, doxorubicin and taxol (Fig. 1c–e). In vivo, transplants of lung CD133+-TSLCs exhibited more aggressive tumorigenicity in the lungs (Table 1).

Table 1 Characterization of tumorigenicity and effects of sh-CBX5 RNAi in lung CD133+-TSLCs from 7 non-small cell lung cancer (NSCLC) patients
Figure 1

Characterization of lung TSLCs.

(a) Lung CD133+-TSLCs were sorted and characterized by FACS assay and cultured in bFGF and EGF with DMEM serum-free medium. (b) The relative mRNA levels of Oct4, Sox2, Nanog, MDR1 and ABCG2 were measured for the PTCs, CD133-NSCLCs and lung CD133+-TSLCs. (c–e) Evaluation of the cell viability under treatment of cisplatin, doxorubicin and taxol for the PTCs, CD133-NSCLCs and lung CD133+-TSLCs. (*P < 0.05) All results shown are means of three independent experiments ± SD. Samples were isolated from No.1 patient listed in Table 1.

Distinct transcriptional patterns of the inter-modular hubs of the consensus TSLC networks

First, we performed differential expression analysis of TSLCs vs. PTCs for each tumor tissue type or experimental technique, which produced eight gene lists. These lists, of similar length after setting 87.5% of the absolute value of fold changes as the filtering threshold, were merged as geneset A (Fig. S1, Table S1). Second, we ranked the top 500 probes with minimal transcriptional variability within TSLCs. A nonredundant geneset B was then compiled from geneset A together with 459 probes of low transcriptional variability in TSLCs. Third, we filtered out gene signatures whose activities in TSLCs compared with PTCs were inconsistent, leaving the concordant geneset C. We identified a consensus gene list of 64 probes characterized by low variation and commonality (lv_com). These probes not only had concordant gene activities in at least two tumor types/experimental conditions, but were also either differentially expressed in at least two tumor types/conditions (n = 18) or showed minimal transcriptional variation in TSLCs (n = 46). In clustering analysis, the lv_com consensus gene signature gave the best separation across the three panels of cells. Hierarchical clustering and non-metric multi-dimensional scaling of TSLCs, PTCs and hESCs based on lv_com are displayed in Fig. S2. Using lv_com as input to the Ingenuity Pathways Analysis (IPA), we found that DNA methylation and transcriptional repression signaling pathways were significantly enriched (Fisher exact test, −log(P-value) = 4.17). The output literature networks were merged with human protein-protein interactions (PPIs). We consolidated links in the merged networks by setting a threshold on gene-gene co-expression within TSLCs. Of the 161 links in the merged networks, 77 were co-expressed among all TSLCs with abs(Pearson correlation coefficient; PCC) > 0.4 (Fig. S3; Table S2&3). The PCCs of the 77 links were mostly positive, with a maximum of 0.92 between HNRNPD and ILF3.
The consensus TSLC networks contain 49 genes, topologically categorized as 12 inter-modular hubs, 22 intra-modular hubs and 15 periphery genes. Of note, in the group of inter-modular hubs, the averaged gene activities (Exprs) and SNR of TSLCs were statistically different from those of hESCs or those of PTCs (Fig. S2). DNMT3A was the only gene that distinguished all three panels (Fig. S4). Gene Ontology (GO) functional annotation revealed differences in the biological processes characterized by these two kinds of hubs (Table 2). All intra-modular hubs were annotated to membrane-bound intracellular organelles, and 43% of them participated in the establishment and/or maintenance of chromatin architecture. All of the inter-modular hubs had the molecular function of protein binding.

Table 2 Gene Ontology Functional Enrichment Annotation of consensus TSLC network genes with different gene grouping according to the topological characteristics

Network signaling in lung-TSLC networks

There are 96 genes - 25 inter-modular hubs, 44 intra-modular hubs and 27 periphery genes - and 144 links in the lung-TSLC networks (Fig. S5; Table S4&5). Network-based survival analyses were conducted using two different sets of member genes, i.e. hub genes only (Nj = 69) or all genes in the lung-TSLC networks (Nj = 96), as well as using different combination of weights and weighting genes (Ni): (1) intra-modular hubs weighted by degrees; (2) inter-modular hubs weighted by degrees; (3) inter-modular hubs weighted by focality; (4) intra- and inter-modular hubs weighted by degrees; and (5) intra-modular hubs weighted by degrees plus inter-modular hubs weighted by focality. We calculated the measurements of Exprs, wt.Exprs, Mag, Spec and SNR derived from each combinatory network model (Ni vs. Nj) and tested them in the survival analyses. By grouping lung cancer patients into quartiles given the network-based measurements, we found that at least one type of measurement could significantly rank patients into 2 to 4 risk groups (Table S6). To eliminate the possibility of the sample size in each dataset being too small, we further conducted meta-analyses with the pooled metastasis-free survivals (MFS; n = 374) and overall survivals (OS; n = 828). Exprs values of inter-modular hubs were consistently significant predictors of OS and MFS. With regard to the MFS, measurements of Exprs, wt.Exprs, Spec and SNR of both the intra- and inter-modular hubs demonstrated trend-like significance (Table S6). We speculated that the lung-TSLC network model might be more sensitive to tumor progression.

Identifying a regulatory role of CBX5 in lung-TSLC networks modulating variation of lung cancer survivals

To identify potential targets in the lung-TSLC networks, we tried out each single gene as the weighting gene in the survival analyses (Ni = 1; Nj = 69). Network-based measurements were calculated and first tested by survival analyses based on quartiles and then tested by linear model fit with survival times. Genes showing statistical significance in multiple tests were identified into 3 groups: (1) OS-related; MLH1 and SMAD1; (2) MFS-related; CBX5, CPSF1, DNMT1, HNF1, IRS1, KPNA2, MSH2 and RASA1; and (3) OS/MFS-related; CDC2, COL18A1, RACGAP1 and SHC1.

CBX5 was chosen as a target for further validation because of its potential role in lung cancer survival as well as in lung TSLCs, based on the MFS analyses using the public lung cancer transcriptome. Foremost, the quartile levels of Exprs, Spec and SNR of CBX5 all showed significant dosage-like effects. Moreover, levels of Exprs and SNR of CBX5 were significantly correlated with the MFS time in the metastasis-free group (Fig. 2a–c). In addition, a general linear model fit existed between Spec of CBX5 and the reciprocal of MFS time in the metastasis group (Fig. 2b). It is worth noting that CBX5 was originally found to be differentially expressed in TSLCs of atypical teratoid/rhabdoid tumor (AT/RT-TSLCs) and was consistently induced in lung-TSLCs. The topological characteristics of CBX5 in the network models might explain its role in TSLCs. CBX5 was an intra-modular hub connected with RB1 and E2F1 in the lung-TSLC networks as well as an inter-modular hub connected with DNMT3A in the consensus TSLC networks.

Figure 2

Network-topologically-based measurements of CBX5 in lung-TSLC network model modulated variability of lung cancer survivals.

(a–c) MFS analyses of quartile groups of Exprs, Spec and SNR of CBX5. Xy plots with general linear model (GLM) fits for Exprs and SNR of CBX5 vs. MFS time. Xy plots with GLM fit for Spec of CBX5 vs. the reciprocal of MFS time. Metastasis patients colored red and metastasis-free yellow. (*P < 0.05) (d) Levels of CBX5 mRNA by quantitative real-time PCR from 20 pairs of primary LAC vs. adjacent non-tumorous lung tissues. Levels of CBX5 mRNA between local lung vs. metastatic lesions of 10 patient-pairs were also shown. Results are means of 3 independent experiments ± SD. (e) Representative results of immunohistochemical staining for CBX5 in LAC patients at different grades (left, low-grade; right, high-grade). Overall survival analysis according to the CBX5 expression levels in 125 Taiwanese LAC patients.

In order to determine whether CBX5 participated in lung tumorigenesis, we examined the levels of CBX5 in 20 pairs of LAC samples (T) vs. the corresponding controls (N) by qRT-PCR analysis. RNA transcripts of CBX5 were significantly higher in the tumor samples as well as in the metastatic lesions (Fig. 2d). We further collected another Taiwanese validation cohort of LAC patients for immunohistochemical staining (Table S7). The results supported that the CBX5-positive LAC cases were associated with worse overall survivals (Fig. 2e).

Validation of CBX5 in regulating self-renewal of lung TSLCs

We tried to verify the significance of CBX5 in the tumorigenicity and invasiveness of lung cancers by sh-RNAi knockdown of CBX5 in lung TSLCs (Fig. 3a). We showed that the capabilities of sphere formation, colony formation and migration/invasiveness of CD133+-TSLCs treated by sh-CBX5 RNAi were indeed significantly inhibited (Fig. 3b–d). Additionally, the percentages of CD133+-TSLCs and side population (SP) cells treated by sh-CBX5 RNAi were dramatically decreased (Fig. 3e–g).

Figure 3

Knockdown of CBX5 with sh-RNAi in vitro.

(a) Western blot of knockdown of CBX5 in lung CD133+-TSLCs derived from two patients (No.1&2). The abilities of (b) sphere formation, (c) colony formation and (d) migration in CD133+-TSLCs treated with sh-CBX5 RNAi were decreased. The percentages of (e–f) CD133+-TSLCs and (g) SP cells were significantly reduced. (h) Using RT-PCR, we measured suppression of expression levels of BIRC5, SMAD1, MSH2, DNMT1, E2F1, RB1, TAF5, ESR1, MLH1 and SIN3A in the sh-CBX5 RNAi treated lung CD133+-TSLCs. These genes were identified by statistically significant correlation with CBX5 in lung cancer survival analysis. All data shown are the mean ± SD of 3 experiments. (*P < 0.05)

To understand the gene-gene interplay of CBX5 in the lung-TSLC networks, we calculated the pair-wise correlations between the levels of Exprs, Mag, Spec and SNR of CBX5 and those of the survival-significant genes using the lung cancer transcriptome. We found that CBX5 was significantly correlated with survival-significant genes such as BIRC5 and DNMT1 among the metastasis-free patients. The correlations between the Spec or SNR levels were higher than those of the Exprs or Mag. These results indicated that gene-gene regulatory controls were indeed synchronized under such a network-based model, especially when taking into account gene membership, network topology and signal stochasticity. We further sought to experimentally validate the identified correlated synchronization between CBX5 and the survival-significant genes using in vitro sh-CBX5 RNAi inhibition in lung CD133+-TSLCs. Using RT-PCR, we detected decreased mRNA expression levels of BIRC5, DNMT1, E2F1, ESR1, MLH1, MSH2, RB1, SMAD1, SIN3A and TAF5 in the lung CD133+-TSLCs treated with sh-CBX5 RNAi (Fig. 3h). Our results suggested that knockdown of CBX5 in lung-TSLCs could simultaneously inhibit these correlated survival-significant genes.

Validation of CBX5 in modulating tumorigenicity and aggressiveness of lung carcinoma in vivo

In vivo models were used to further examine the effect of sh-CBX5 RNAi knockdown. Eight weeks after tail-vein injection of 2×10⁵ lung CD133+-TSLCs treated with sh-CBX5 RNAi or sh-Luc controls, we found that tumorigenic engraftment, tumor growth rate (Fig. 4a) and metastatic tendency to the lung (Fig. 4b&c) were prominently blocked by sh-CBX5 RNAi knockdown. For lung cancers, surgery is the current standard-of-care treatment. However, locally advanced lung tumors (stage 3b or above) that cannot be surgically removed are treated with combined radiation and chemotherapy to improve survival. Therefore, given the aggressive nature of lung CD133+-TSLCs, we further tested and showed that sh-CBX5 RNAi treatment also significantly increased the radiosensitivity of CD133+-TSLCs in vitro (Fig. 4d). Mice transplanted with the sh-CBX5 RNAi-treated lung-TSLCs also had significantly prolonged survival (data not shown).

Figure 4

Inhibition of CBX5 lessened tumorigenicity and aggressiveness of lung TSLCs in vivo.

2×10⁵ lung CD133+-TSLCs treated with sh-CBX5 RNAi or sh-Luc controls were injected through tail vein of NOD-SCID mice (n = 6 per group). The animals were sacrificed and examined 8 weeks after injection. (a) Tumor volume and (b) the number of metastatic foci (arrows) in the lungs were analyzed by ex vivo GFP imaging and histological examination. (c) The effects of IR of 2, 4, 6, 8 and 10 Gy were evaluated in CD133+-TSLCs treated with sh-Luc or sh-CBX5 RNAi. Data shown are the mean ± SD of 3 experiments. (d) Knockdown of CBX5 in lung CD133+-TSLCs effectively reduced the number of lung metastases in transplanted mice. Treatment of sh-CBX5 RNAi combined with 4 Gy IR further enhanced the anti-tumor efficacy (*P < 0.05).

Discussion

We have identified CBX5 as a potential target regulating lung cancer survivals and the stem-like properties of lung CD133+-TSLCs. Moreover, interplays of CBX5 with other genes in the lung-TSLC network model were statistically tested and experimentally validated. Lung cancer patients of higher CBX5 gene activities were of poorer prognosis and the knockdown of CBX5 with sh-RNAi in lung CD133+-TSLCs demonstrated lessened aggressiveness in vivo. We demonstrated that a scalable and predictable target identification approach was feasible, given the context of network topology and signaling mechanics.

CBX5, a highly conserved nonhistone protein containing chromatin organization modifier domain, i.e. chromodomain, belongs to the heterochromatin protein family20. In rodent and D. melanogaster cells, CBX5 was found to interact with H3K9me3 or colocalized with H3K9me to the heterochromatin regions21. The role of heterochromatin in transcriptional gene silencing and long-range chromatin interactions has been well-established. However, in mammalian cells, CBX5 and H3K9me were found to associate with coding regions of activated genes, although the possible mechanism was unclear. Evidence also showed that CBX5 served as a common gene expression signature shared by human mature oocytes and embryonic stem cells22. Recently, Wong and colleagues identified that ATRX working together with H3.3 and CBX5 might be a key regulator of ES-cell telomere chromatin23. Here, CBX5 was identified in both the consensus TSLC and the lung-TSLC network models. Our findings further supported a recent finding24 that CBX5 was essential in the maintenance of leukemia stem cells (LSCs). To date, we are the first to report CBX5 playing an essential role in the regulatory control of lung-TSLCs, as well as in malignant lung carcinomas. Survival significant genes identified from our analysis, specifically the identified target gene CBX5, again highlighted the importance of epigenetic regulatory controls. Thus, the lung-TSLC network model provided a link between experimentally cultivated lung-TSLCs and clinical lung cancer survival times, with statistical significance and mechanistic understandings.

Recent work by Bröske and colleagues demonstrated the indispensability of DNMT1 for the cell-autonomous survival of hematopoietic stem cells (HSCs) and LSCs25. De novo methylation by DNMT3A and DNMT3B was also shown to be essential for HSC renewal but not for differentiation26. Our findings of DNMT3A in the TSLC-consensus network and of DNMT1 synchronized with CBX5 in the lung-TSLC networks are compatible with the above-mentioned reports. We recognize that the network models were built on known gene interactions and knowledge. Nevertheless, high co-expression in TSLCs lends better support to their validity. The PCC for co-expression of DNMT1 with E2F1 was 0.98 and with BIRC5 0.87; the PCC of DNMT3A with CBX5 was 0.72, with EED 0.62 and with MYC 0.48. Collectively, we support and extend the importance of epigenetic regulation of TSLCs. However, fully understanding the underlying regulation remains an open question.

Network-based survival models have been developed for breast cancer27 and glioblastoma28. We are the first to address the stochastic gene expression activities embedded in biological networks by summarizing them in the noise-like Spec, as well as SNR, in malignant lung carcinomas. This approach provides each patient a unique estimated profile to summarize the variable transcriptional signature within the same set of genes in a network model. In conclusion, we demonstrated that stochastic element of transcriptional profiles of lung cancers, given the relational model based on the lung-TSLC networks, could be useful in estimating the prognostic survival time. Last, the methodology is generic and future exploitation in other research areas will establish the validity of its robustness and applicability.

Methods

Microarray data

(1) PTCs and TSLCs: The cultivated TSLCs came from six tissues of origin: breast, lung, colon, head and neck, glioblastoma and AT/RT, whose parental cells were MCF7, A549, SW480 and HT29, FaDu and SAS, PT1 (primary culture) and U87, ATRT, BT, BT6 and BT12, respectively. TSLCs were cultivated by methods described elsewhere2,3,6,7,8,9. For lung TSLCs, we isolated CD133+-TSLCs from tissue samples of NSCLC patients using the magnetic bead method of FACS assay. We extracted and purified total RNA according to procedures published elsewhere2. RNA was hybridized on Affymetrix GeneChip HG-U133 plus 2.0 microarrays at the genomic core facilities of the National Yang-Ming University Genome Research Center. (2) hESCs: Oocytes (GSE12034, N = 3) and human embryonic stem cells (hESCs) microarrays were downloaded from the Gene Expression Omnibus (GEO) of NCBI (GSE7879: VUB01, N = 3; SA01, N = 3; and Sheff4 hESCs, N = 3. GSE9440: T3ES, N = 3. H1 hESCs, N = 4, from GSE9196 and GSE9510. H9 hESCs, N = 6, from GSE9196, GSE9510 and GSE9940). Please see Table S1 for details. Gene expression data of all three panels can be downloaded from GEO GSE35603. (3) Lung cancer transcriptome: Eight datasets with lung cancer survival data were downloaded from GEO and supplements as described29,30,31,32.

A precompiled gene set

A gene list was compiled (5812 GeneIDs) to incorporate properties of migration33,34, stemness1,35, calcium-related processes36 and cancer-specific transcript variants37,38 for gene expression analysis.

Gene expression analysis

Please see Fig. S1 for the analysis flow chart. All CEL files were pre-processed and standardized to a mean of zero and an SD of 1. We used R/Bioconductor software for the analysis. Differential gene expression analysis was controlled at FDR < 0.05 (ref. 39) with a threshold on the fold changes of TSLCs vs. PTCs, to generate geneset A. The coefficient of variation40 was calculated to rank the top 500 probes of lowest transcriptional variability in TSLCs. We combined geneset A and the low-variability gene signatures (as geneset B) and further excluded those gene signatures with inconsistent activities of TSLCs compared with PTCs (as geneset C). Among gene signatures with concordant gene activities in at least two tumor types/experimental conditions, we identified the first list of 64 probes, either differentially expressed in at least two tumor types/experimental conditions or from the low-variability genes in TSLCs, denoted as a consensus gene list characterized by low variation and commonality (lv_com). From geneset C, we found the second list of 145 probes (including 14 probes from lv_com) with consistent gene activities in lung TSLCs compared with those of lung PTCs.
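The low-variability ranking step described above can be illustrated with a minimal sketch. It assumes the coefficient of variation is the sample SD divided by the absolute mean, and it ranks probes with a zero mean last; how the original analysis handled near-zero means after standardization is not stated. The function names and the toy expression values are hypothetical and are not taken from the study's R code.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample SD divided by the absolute mean of one probe across samples."""
    m = mean(values)
    if m == 0:
        # CV is undefined at zero mean; rank such probes last (assumption).
        return float("inf")
    return stdev(values) / abs(m)


def lowest_variability_probes(expression, top_n=500):
    """Return the top_n probe IDs with the smallest CV across TSLC samples."""
    ranked = sorted(
        expression,
        key=lambda probe: coefficient_of_variation(expression[probe]),
    )
    return ranked[:top_n]


# Toy values for three hypothetical probes measured in three TSLC samples.
toy_expression = {
    "probe_a": [10.0, 10.1, 9.9],
    "probe_b": [5.0, 9.0, 1.0],
    "probe_c": [8.0, 8.4, 7.6],
}
stable_probes = lowest_variability_probes(toy_expression, top_n=2)
```

With real data, top_n would be 500 as in the text and the dictionary would hold one list of TSLC expression values per probe.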

Network construction from literature knowledge base, human protein-protein interactions (PPIs) and co-expression profiles of cultivated TSLCs

Literature networks (svg files) generated by the IPA, using lv_com and the lung-TSLC concordant gene signatures as inputs, were extracted, parsed and compiled41. Please visit the supplementary website for the Perl script and R code. Correlated output genes generated by the IPA, as well as the input genes, were further mapped onto the human PPIs downloaded from the NCBI (HPRD, BioGrid and BIND). A PPI was retrieved if and only if both of its interacting partners were among the queried genes. The IPA-generated networks were then merged with the mapped human PPIs. To consolidate the network models, we calculated the co-expression Pearson correlation coefficient (PCC) of every gene-gene interaction in the merged networks, using all TSLCs and the lung TSLCs only. For the consensus TSLC networks, the absolute values of the PCCs were calculated using all TSLCs and 0.4 was set as the cut-off threshold. For the lung-TSLC networks, we set the cut-off threshold at abs(PCC in lung TSLCs) > 0.8. The thresholds were chosen such that the number of nodes in the final networks would be less than 100. Functional annotation clustering of genes in the TSLC-consensus networks was analyzed by DAVID (Database for Annotation, Visualization and Integrated Discovery, NIH)42.
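The co-expression filter on network links can be sketched as follows. This is an illustration under stated assumptions, not the authors' Perl/R pipeline: the helper names, the gene labels and the toy expression profiles are hypothetical, and a constant expression profile is treated as uncorrelated.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: treated as uncorrelated (assumption)
    return sxy / sqrt(sxx * syy)


def filter_links(links, expression, threshold):
    """Keep links whose absolute co-expression PCC exceeds the threshold."""
    kept = []
    for gene_a, gene_b in links:
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) > threshold:
            kept.append((gene_a, gene_b, r))
    return kept


# Toy profiles for four hypothetical genes across four TSLC samples.
toy_expression = {
    "GENE_X": [1.0, 2.0, 3.0, 4.0],
    "GENE_Y": [2.0, 4.0, 6.0, 8.5],
    "GENE_Z": [3.0, 1.0, 4.0, 2.0],
    "GENE_W": [4.0, 3.0, 2.0, 1.0],
}
toy_links = [("GENE_X", "GENE_Y"), ("GENE_X", "GENE_Z"), ("GENE_X", "GENE_W")]
kept_links = filter_links(toy_links, toy_expression, threshold=0.4)
```

Because the filter uses the absolute value, strongly anti-correlated pairs are retained along with positively correlated ones; raising the threshold to 0.8 would mirror the stricter cut-off used for the lung-TSLC networks.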

Network topological analysis and predictive measurements derived from signal processing mechanics

Network topological analyses and classification of genes were performed according to methods previously published41. We developed five measurements to describe the network signal processing mechanics: expression level (Exprs); topologically weighted expression level (wt.Exprs); the 0-order magnitude (Mag), i.e. the amplitude of the transcriptional signal; the 1-order property spectrum (Spec), i.e. the pair-wise relative transcriptional noise; and the signal-to-noise ratio (SNR). In the network model, there are Ni weighting genes and Nj member genes, both of scalable size. For each gene, g, we assigned a measure of topological property, wt_g. Importantly, wt_g differs according to the topological grouping: zero for the periphery genes; the degree (number of nodes connected) for the intra-modular hubs; and either the degree or the estimated effect of perturbation (focality)41 for the inter-modular hubs. Then, for a single weighting gene, g, in a model with Nj member genes, the expression value (Exprs) was θ_g; wt.Exprs was the value of wt_g·θ_g; Mag would be [equation missing]; Spec would be calculated as [equation missing]; and SNR would be defined as [equation missing]. For a group of Ni weighting genes, Exprs, wt.Exprs and Mag would be the averaged values. Spec would be calculated as [equation missing] and SNR unchanged.
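The topological weighting rule can be sketched in a few lines. The sketch covers only Exprs and wt.Exprs; it assumes wt.Exprs is the product of the topological weight and the expression value, which is our reading of "topologically weighted expression level". Mag, Spec and SNR are not sketched because their formulas are not given in this text. The category labels, function names and toy values are hypothetical.

```python
def topological_weight(category, degree, focality=None):
    """Weight of a gene by topological group, following the rule in the text.

    Periphery genes get zero, intra-modular hubs get their degree, and
    inter-modular hubs get focality when supplied, otherwise their degree.
    """
    if category == "periphery":
        return 0.0
    if category == "intra":
        return float(degree)
    if category == "inter":
        return float(focality) if focality is not None else float(degree)
    raise ValueError(f"unknown topological category: {category}")


def group_summary(genes):
    """Average Exprs and wt.Exprs over a group of weighting genes.

    wt.Exprs per gene is taken as weight * expression (assumption).
    """
    exprs = [g["theta"] for g in genes]
    wt_exprs = [
        topological_weight(g["category"], g["degree"], g.get("focality")) * g["theta"]
        for g in genes
    ]
    n = len(genes)
    return sum(exprs) / n, sum(wt_exprs) / n


# Toy weighting genes; theta is the standardized expression value.
toy_genes = [
    {"category": "intra", "degree": 4, "theta": 1.5},
    {"category": "inter", "degree": 3, "focality": 0.5, "theta": 2.0},
    {"category": "periphery", "degree": 1, "theta": 3.0},
]
mean_exprs, mean_wt_exprs = group_summary(toy_genes)
```

Omitting the focality entry for an inter-modular hub switches that gene to degree weighting, which corresponds to the alternative weighting schemes compared in the survival analyses.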

Statistical and survival analyses

Student's t test and the bootstrap Kolmogorov-Smirnov test were used to determine the statistical significance of means or distributions. Kaplan-Meier survival curves based on quartiles of network-based predictive measurements were tested by log-rank tests. We also fitted Cox proportional hazards regression models and used Wald test statistics to determine a trend of gene dosage effects. Statistical significance was set at P < 0.05. Please visit the supplementary website for the R code used in the survival analysis.
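As an illustration of the quartile-based survival analysis, the sketch below assigns patients to quartile groups by rank and computes a Kaplan-Meier estimate for one group. It is a simplified stand-in under stated assumptions: ties in the measurement are broken by input order, the log-rank and Cox/Wald tests are not implemented, and all names and toy data are hypothetical rather than taken from the study's R code.

```python
def quartile_groups(values):
    """Assign each value to quartile group 1..4 by rank (ties by input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    groups = [0] * n
    for rank, index in enumerate(order):
        groups[index] = rank * 4 // n + 1
    return groups


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events uses 1 for an observed event and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every observation sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy data: a network-based measurement for eight hypothetical patients.
toy_measure = [0.3, 1.2, -0.5, 0.9, 2.1, -1.0, 0.1, 1.7]
toy_groups = quartile_groups(toy_measure)

# Toy follow-up times (months) and event indicators for one group.
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

In practice one curve would be estimated per quartile group and the groups compared with a log-rank test, for which an established survival analysis library is the safer choice.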

Clonogenic assay

For a clonogenic assay, cells were exposed to different chemotherapeutic agents (cisplatin, doxorubicin and taxol)(10 μg/ml). After incubation for 10 days, colonies (>50 cells per colony) were fixed and stained for 20 min with a solution containing crystal violet and methanol. Cell survival was determined by a colony formation assay. The plating efficiency (PE) and survival fraction (SF) were calculated as follows: PE = (colony number/number of inoculated cells) × 100%. SF = colonies counted/(cells seeded x (PE/100)).
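The two formulas above translate directly into code. In this minimal sketch the plating efficiency is taken from untreated control plates and then used to normalize the treated plates; that pairing is a common convention and an assumption here, and the function names and colony counts are hypothetical.

```python
def plating_efficiency(colonies, cells_inoculated):
    """PE = (colony number / number of inoculated cells) x 100%."""
    return colonies / cells_inoculated * 100.0


def survival_fraction(colonies, cells_seeded, pe_percent):
    """SF = colonies counted / (cells seeded x (PE / 100))."""
    return colonies / (cells_seeded * (pe_percent / 100.0))


# Toy counts: untreated control plate, then a drug-treated plate.
control_pe = plating_efficiency(colonies=80, cells_inoculated=200)
treated_sf = survival_fraction(colonies=30, cells_seeded=500, pe_percent=control_pe)
```

Here the control plate gives a PE of 40%, so 30 colonies from 500 seeded treated cells correspond to a survival fraction of 0.15.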

Western blot assay

Fifteen microliters of sample were boiled at 95°C for 5 min and separated by 10% SDS-PAGE. The proteins were transferred to Hybond-ECL nitrocellulose paper (Amersham) by a wet-transfer system. The primary antibody used was rabbit anti-human CBX5 (Cell Signaling Technology). The reactive protein bands were detected by the ECL detection system (Amersham).

In vitro cell invasion analysis and soft agar assay

The 24-well plate Transwell® system with a polycarbonate filter membrane was used (8 µm pore size; Corning, United Kingdom). Cell suspensions were seeded in the upper compartment of the Transwell chamber at a density of 1×10⁵ cells in 100 µL of serum-free medium. The opposite surface of the filter membrane facing the lower chamber was stained with Hoechst 33342 for 3 min and migrating cells were visualized under an inverted microscope. For the soft agar assay, the bottom of each well (35 mm) of a six-well culture dish was coated with 2 mL of an agar mixture (DMEM, 10% (v/v) FCS, 0.6% (w/v) agar). After the bottom layer solidified, 2 mL of a top agar-medium mixture (DMEM, 10% (v/v) FCS, 0.3% (w/v) agar) containing 2×10⁴ cells was added and incubated at 37°C for 4 weeks. The plates were stained with crystal violet. The number of colonies was counted using a dissecting microscope.

Quantitative real-time RT-PCR and patients and tissue samples

Lung adenocarcinomas (LACs) and adjacent noncancerous tissues were obtained at the time of surgery from 20 patients in Taipei Veterans General Hospital. All patients gave their informed consent under institution's approval. Total RNA was extracted from tissue samples using TRIzol according to the manufacturer's protocol (Invitrogen, Carlsbad, CA). The amplification and PCR reaction were carried out (Roche, Alameda, CA). Standard curves (cycle threshold values versus template concentration) were prepared for each target gene and for the endogenous reference (GAPDH) in each sample.
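The standard-curve quantification can be sketched as a least-squares fit of cycle threshold (Ct) against template concentration, followed by normalization to the GAPDH reference. The sketch assumes the curve is linear in log10 concentration, which is the usual convention but is not stated in the text; the function names, dilution series and Ct values are hypothetical.

```python
from math import log10


def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template quantity from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def relative_expression(target_quantity, reference_quantity):
    """Target quantity normalized to the endogenous reference (e.g. GAPDH)."""
    return target_quantity / reference_quantity


# Toy ten-fold dilution series (arbitrary units) with matching Ct values.
toy_slope, toy_intercept = fit_standard_curve(
    [1.0, 10.0, 100.0, 1000.0], [30.0, 26.7, 23.4, 20.1]
)
toy_quantity = quantity_from_ct(25.05, toy_slope, toy_intercept)
```

A separate curve would be fitted for each target gene and for GAPDH, and the two interpolated quantities passed to relative_expression for each sample.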

Patient subjects and immunohistochemistry (IHC)

Between 1996 and 2009, 125 patients with operable LAC, without histories of radiation or chemotherapy, underwent surgery at Taipei Veterans General Hospital (Table S6). All samples were obtained after informed consent according to the tenets of the Declaration of Helsinki. Tissue samples were spotted on glass slides for IHC staining, deparaffinised, rehydrated, processed with antigen retrieval by 1X Trilogy diluted in H2O (Biogenics), immersed in 3% H2O2 for 10 min and washed with PBS 3 times. The tissue sections were then blocked with serum (Vectastain Elite ABC kit, Vector Laboratories, Burlingame, CA) for 30 min, incubated with the primary antibody rabbit anti-human CBX5 (Cell Signaling Technology) in PBS solution at room temperature for 2 hr, washed with PBS 3 times, incubated with biotin-labeled secondary antibody for 30 min, incubated with streptavidin-horseradish peroxidase conjugates for 30 min, washed with PBS 3 times and immersed in chromogen 3,3′-diaminobenzidine plus H2O2 substrate solution (Vector® DAB/Ni substrate kit, SK-4100, Vector Laboratories, Burlingame, CA) for 10 min. Hematoxylin was applied for counter-staining (Sigma Chemical Co., USA). Study pathologists, blinded to the clinical data, examined and scored the IHC staining. The interpretation was done in five high-power views for each slide and 100 cells per view were counted for analysis.

Xenograft tumorigenicity assay

All procedures involving animals were in accordance with the institutional animal welfare guidelines and the experiment was approved by Taipei Veterans General Hospital. Virus-infected lung TSLCs were harvested, washed with PBS and re-suspended in normal culture medium. Lung TSLCs (2×10⁵) infected with sh-CBX5 RNAi or control vector were injected through the tail vein of 8-week-old male NOD-SCID mice. All mice were anesthetized and killed on day 56 (8 weeks) after injection. The number of tumor nodules and the tumor volume in the lungs of the transplanted mice were measured by ex vivo and H&E survey. Ionizing radiation (IR) was delivered by a cobalt unit (Theratronic International, Inc., Ottawa, Canada) at a dose rate of 1.1 Gy/min (source-to-surface distance = 57.5 cm). Lung CD133+-TSLCs treated with sh-CBX5 RNAi were exposed to radiation doses of 2, 4, 6, 8 and 10 Gy.