Prognostic Impact of Modulators of G proteins in Circulating Tumor Cells from Patients with Metastatic Colorectal Cancer

The consequence of a loss of balance between G-protein activation and deactivation in cancers has been interrogated by studying infrequently occurring mutants of trimeric G-protein α-subunits and GPCRs. Prior studies on members of a newly identified family of non-receptor guanine nucleotide exchange factors (GEFs), GIV/Girdin, Daple, NUCB1 and NUCB2, have revealed that GPCR-independent hyperactivation of trimeric G proteins can fuel metastatic progression in a variety of cancers. Here we report that elevated expression of each GEF in circulating tumor cells (CTCs) isolated from the peripheral circulation of patients with metastatic colorectal cancer is associated with a shorter progression-free survival (PFS). The GEFs were stronger prognostic markers than two other markers of cancer progression, S100A4 and MACC1, and clustering of all GEFs together improved the prognostic accuracy of the individual family members; PFS was significantly shorter in the high-GEFs versus the low-GEFs group [HR = 5.20 (95% CI, 2.15–12.57)]. Because nucleotide exchange is the rate-limiting step in cyclical activation of G-proteins, the poor prognosis conferred by these GEFs in CTCs implies that hyperactivation of G-protein signaling by these GEFs is an important event during metastatic progression, and may be more frequently encountered than mutations in G-proteins and/or GPCRs.

Cytoskeleton associated guanine nucleotide exchange factor for trimeric G protein, Giα that modulates growth factor signaling.

CCDC88C
Dvl-associating protein with a high frequency of leucine residues (Daple). Guanine nucleotide exchange factor for trimeric G protein, Giα that enhances noncanonical Wnt signaling.
Tumor suppressor in the normal epithelium; proinvasive role in cancer cells. See Table S1.
Colon, Gastric

NUCB1
Nucleobindin1/Calnuc. EF-hand containing calcium binding protein and a guanine nucleotide exchange factor for trimeric G protein, Giα that is required for unfolded protein response. The role of its GEF function remains unknown.
Possible role in survival via regulation of UPR. See Table S1.
Colon, Gastric

NUCB2
Nucleobindin2/Nesfatin-1/NEFA
Increases migration, proliferation and invasion. See Table S1.

Figure 2. (a) Levels of mRNA expression of a panel of genes were analyzed in the invasive front and the corresponding non-invasive central areas of the same tumor (n = 13) by qPCR. Box plots show the fold change in levels (Y axis) of expression normalized to non-invasive tumor tissue. The statistical significance of the differences for individual genes in both tumor areas was calculated by applying a non-parametric Wilcoxon signed-rank test. (b) Levels of mRNA expression of a panel of genes were analyzed in an independent set of metastatic tissue (n = 14, 7 lung metastases and 7 liver metastases) and compared to the mean levels of expression of each gene in the non-invasive (N.I) area of primary colorectal tumors. Box plots display the fold change in levels (Y axis) of expression normalized to non-invasive tumor tissue. Statistical significance was analyzed as in (a). Multiple comparison adjustment was performed. *p < 0.05; **p < 0.01; ***p < 0.001.

compared these to mRNA levels in the non-invasive central cores of 13 primary tumors. Expression levels of all genes, i.e., the GEFs and the positive controls S100A4 and MACC1, were elevated in metastatic lesions compared to primary tumors; the fold increase was highest in the case of S100A4 (Fig. 2b). These findings confirmed the previously defined roles of MACC1 and S100A4 in metastasis, and provided evidence for the involvement of GEF-related genes in CRC progression. These findings also underscore the limitations of biomarker studies, i.e., primary and metastatic tumors are composed of a variety of different cellular subtypes that confer on them a high degree of heterogeneity; such heterogeneity is spatially (non-invasive core vs invasive periphery) and temporally variable, and is altered by the administration of anticancer drugs 18.
Expression of GEFs is increased in CTCs from patients with metastatic CRC compared to healthy volunteers. To overcome the limitations of analyzing tissue samples from primary tumors, e.g., sampling biases arising from tumor heterogeneity and the restricted number, quantity and location of biopsies 19, we chose to study circulating tumor cells (CTCs). CTCs are believed to be central players in tumor dissemination [20][21][22][23][24]. Despite their heterogeneity and low frequency of appearance in circulation 25, the ability to analyze CTCs has been likened to a 'liquid biopsy' for prognostication and prediction, allowing repeated temporal access and spatial sampling of the whole tumor [26][27][28][29][30]. To investigate whether elevated expression of GEFs in CTCs might be used as a prognostic measure, we compared GEF expression in EpCAM-isolated CTCs from the peripheral blood of 51 patients with metastatic CRC (Table 2) to similarly treated samples from 24 healthy donors. For each of the 4 GEFs, the expression level was higher in the patient samples than in the healthy donors (Fig. 3), with GIV (CCDC88A), NUCB1 and NUCB2 displaying the largest differences in expression. We assessed the ability of each gene to discriminate between patients and controls using the area under the receiver operating characteristic curve (AUC). These analyses showed that CCDC88A was better able to discriminate patients from controls than the other genes analyzed (AUC = 0.80, p < 0.001; Fig. 3). Considering the known prognostic markers S100A4 and MACC1, we found a significantly higher expression of MACC1 (p < 0.001) in samples obtained from patients with metastatic CRC compared to controls. The expression level of S100A4 showed a similar trend, but the difference was not significant (p = 0.14).
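The AUC used above has a simple rank interpretation: it is the probability that a randomly chosen patient sample has a higher expression value than a randomly chosen control sample, with ties counted as one half. A minimal Python sketch of that calculation follows. The function name and the expression values are hypothetical illustrations, not study data, and the AUCs in this study were computed with GraphPad Prism rather than with this code.

```python
def auc_from_scores(patients, controls):
    """Rank-based AUC: fraction of (patient, control) pairs in which the
    patient value is higher; tied pairs contribute 0.5.
    Equivalent to the Mann-Whitney U statistic divided by n1 * n2."""
    if not patients or not controls:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for p in patients:
        for c in controls:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patients) * len(controls))


# Hypothetical normalized expression values, for illustration only.
patient_scores = [0.9, 0.8, 0.7, 0.4]
control_scores = [0.5, 0.3, 0.2, 0.1]
auc = auc_from_scores(patient_scores, control_scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to no discrimination between the two groups, and an AUC of 1.0 to perfect separation.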
These results indicate that the background mRNA levels of the various GEF family members in the peripheral blood of healthy volunteers are relatively low, and that these transcripts may serve as useful tools for CTC detection in the peripheral blood of patients with colorectal cancer.
High expression of GEFs in CTCs is associated with shorter progression-free survival. Next we investigated whether the expression level of GEFs in the CTCs was associated with disease progression or survival. We constructed Kaplan-Meier survival curves for both progression-free survival (PFS) and overall survival (OS) for each marker. Patients were grouped into "high" or "low" expression groups depending on whether the level of expression was above or below the 75th percentile cutoff value for each independent marker, as previously described 31. All GEFs showed significant prognostic value for PFS (Table 3, left columns); the median time to progression was significantly shorter in patients with high expression in CTCs compared to patients with low expression. NUCB1 had the strongest association with PFS; median PFS was twice as long in patients with low levels of NUCB1 compared to those with high levels of NUCB1 (10.6 vs 5.2 months, p < 0.001) (Table 3). In the case of OS, for all GEFs the median time to death was shorter in patients with high expression compared to those with low expression (Table 3, right columns); however, only Daple (CCDC88C and CCDC88Cfl) and NUCB1 reached statistical significance (Table 3, right columns). In the case of our positive controls, MACC1 and S100A4, although high expression was associated with shorter survival, surprisingly, only S100A4 was significantly associated with PFS, and neither was significant for OS (Table 3). We used univariate Cox regression to compare the potential prognostic performance of the GEFs with that of the standard clinical parameters (Table 4). Among the clinical parameters analyzed, only the presence or absence of lung metastases had a significant impact on both PFS and OS (Table 4), whereas the number of metastatic sites (≤ 2 vs > 2) and ECOG performance status were significantly associated with OS alone (Table 4).
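The grouping and survival steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis code used in the study: the function names and all numbers are hypothetical, and the percentile is computed with the default interpolation of `statistics.quantiles`, which may differ slightly from the method used by the statistical packages named in the Methods.

```python
import statistics


def split_by_p75(levels):
    """Label each value 'high' if it lies above the 75th percentile of the
    cohort, otherwise 'low'. Returns (cutoff, labels)."""
    cutoff = statistics.quantiles(levels, n=4)[2]  # third quartile
    return cutoff, ["high" if x > cutoff else "low" for x in levels]


def km_median(times, events):
    """Kaplan-Meier median survival: the first event time at which the
    estimated survival probability drops to 0.5 or below.
    events[i] is 1 for an observed event and 0 for a censored patient.
    Returns None if the curve never reaches 0.5."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data if tt == t]
        n_events = sum(same_time)
        if n_events:
            surv *= 1.0 - n_events / at_risk
            if surv <= 0.5:
                return t
        at_risk -= len(same_time)
        i += len(same_time)
    return None


# Hypothetical marker levels and follow-up data, for illustration only.
levels = [1, 2, 3, 4, 5, 6, 7, 8]
cutoff, groups = split_by_p75(levels)
median_pfs = km_median([5.2, 3.0, 10.6, 7.0, 12.0], [1, 1, 0, 1, 1])
print(cutoff, groups, median_pfs)
```

In practice the median would be computed separately for the "high" and "low" groups of each marker and the two curves compared with a log-rank test, which is omitted here for brevity.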
Consistent with prior findings 27, the serum levels of carcinoembryonic antigen (CEA) failed to show an association with survival. By contrast, each member of the GEF family showed a strong and significant association with PFS, with hazard ratios (HRs) ranging from 2.51 for GIV (CCDC88A) to 3.62 for NUCB1 (Table 4). Consistent with the Kaplan-Meier analyses, only Daple (CCDC88C) and NUCB1 were significantly associated with OS, with HRs of 2.88 for Daple and 3.01 for NUCB1. The HRs for MACC1 and S100A4 were smaller than those for the GEF family members, and did not reach statistical significance in most cases (Table 4). Taken together, these results demonstrate the potential of individual members of the GEF family to be prognostic tools in CTCs, as high expression of each GEF conveyed a significantly worse prognosis.
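For readers less familiar with hazard ratios, the quantities reported above can be read through the standard textbook form of the Cox proportional hazards model, shown here for a single binary covariate. The notation is an assumption introduced for illustration (x = 1 for high expression, x = 0 for low expression) and is not taken from the study's tables.

```latex
h(t \mid x) = h_0(t)\,\exp(\beta x),
\qquad
\mathrm{HR} = \frac{h(t \mid x = 1)}{h(t \mid x = 0)} = e^{\beta}
```

Under this model, the HR of 3.62 reported for NUCB1 means that, at any time during follow-up, the estimated instantaneous risk of progression in the high-expression group is about 3.6 times that in the low-expression group.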
Grouping related genes into clusters improves the prognostic value of GEFs. Because each member of the GEF family showed prognostic value individually, we asked if different combinations of GEFs, i.e., gene clusters, might improve the prognostic strength of the individual markers alone. Clusters were defined based on the degree of similarity between members, e.g., the CCDC88 cluster included the two closely related paralogues GIV and Daple (CCDC88A, CCDC88C and CCDC88Cfl) and the NUCB cluster included the two closely related paralogues NUCB1/Calnuc and NUCB2. Patients were classified as high/low CCDC88 when 2 of the 3 CCDC88A/C probes were in agreement that levels of GIV and/or Daple were above/below the previously chosen cutoff. For the NUCB cluster, patients were classified as high when either NUCB1 or NUCB2 was above the previously chosen cutoff. No improvement in prognostic power was seen in either the CCDC88 or NUCB cluster when compared to individual markers alone, in both Kaplan-Meier and Cox survival analyses (Fig. 4a-d). However, when all GEFs were clustered together (CCDC88A, CCDC88C, CCDC88Cfl, NUCB1 and NUCB2), classifying patients as high GEF expression when at least 3 markers were expressed at levels higher than the cutoff, we saw an improvement in the prediction of PFS (Fig. 4e,f). The median PFS was 10.3 months among patients classified as low GEF, whereas it was reduced by half, to 5.2 months, among patients classified as high GEF [HR of 3.68 (p < 0.001)] (Fig. 4e,f). Of note, clustering of the two unrelated genes S100A4 and MACC1, our two positive controls, did not show any improvement in prognostic power, and this cluster continued to perform poorly compared to the GEFs (Fig. 4g,h).
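The three clustering rules described above (at least 2 of 3 CCDC88 probes, either NUCB gene, at least 3 of all 5 GEF markers) amount to counting how many markers exceed their cutoffs. A minimal Python sketch is given below. The function names, data layout and numeric values are hypothetical; only the marker names and the counting thresholds come from the text.

```python
CCDC88_PROBES = ("CCDC88A", "CCDC88C", "CCDC88Cfl")
NUCB_GENES = ("NUCB1", "NUCB2")
ALL_GEFS = CCDC88_PROBES + NUCB_GENES


def count_high(levels, cutoffs, markers):
    """Number of the given markers whose level exceeds its own cutoff."""
    return sum(1 for m in markers if levels[m] > cutoffs[m])


def classify_clusters(levels, cutoffs):
    """Apply the three clustering rules to one patient.
    levels and cutoffs are dicts keyed by marker name."""
    def label(markers, minimum):
        return "high" if count_high(levels, cutoffs, markers) >= minimum else "low"

    return {
        "CCDC88": label(CCDC88_PROBES, 2),  # 2 of 3 probes agree
        "NUCB": label(NUCB_GENES, 1),       # either NUCB1 or NUCB2
        "GEF": label(ALL_GEFS, 3),          # at least 3 of 5 markers
    }


# Hypothetical cutoffs and patient values, for illustration only.
cutoffs = {m: 1.0 for m in ALL_GEFS}
patient = {"CCDC88A": 2.0, "CCDC88C": 1.5, "CCDC88Cfl": 0.5,
           "NUCB1": 3.0, "NUCB2": 0.3}
print(classify_clusters(patient, cutoffs))
```

Because the CCDC88 cluster has three probes, a majority always exists, so the "2 of 3 in agreement" rule assigns every patient to exactly one group.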
Finally, we asked if the GEF and the S100A4/MACC1 clusters might show additive value in accurately classifying patients into good and poor prognosis groups. Patients were classified into four groups depending on the expression levels (low or high) of the GEF and S100A4/MACC1 clusters (Fig. 5, upper panel). Patients with low expression levels for both clusters (n = 24) had the best PFS and OS among all the patients (Fig. 5, lower panel; magenta line). We found that in ~30% (16/51) of patients there was no agreement between the GEF cluster and the S100A4/MACC1 cluster; 14 patients had high expression of the S100A4/MACC1 cluster but low expression of the GEF cluster (Fig. 5, yellow line), whereas 2 patients had high GEF expression but low expression of S100A4/MACC1 (Fig. 5, green line). A Kaplan-Meier analysis confirmed that the presence of high levels of GEF was an overriding prognostic factor despite low levels of S100A4/MACC1, for both PFS and OS, i.e., the patients with the high-GEF-low-S100A4/MACC1 signature had shorter survival (Fig. 5, green line) than the patients with the low-GEF-high-S100A4/MACC1 signature (Fig. 5, yellow line). Taken together, these findings indicate that the GEF cluster is strongly associated with survival, and suggest that it adds significant information beyond the currently available markers S100A4 and MACC1.
To investigate the independent prognostic value of the GEF cluster, we used multivariate Cox regression (Table 5). In addition to the GEF and S100A4/MACC1 clusters, we included the three clinical variables that previously showed a significant univariate association: 1) the presence of lung metastases, 2) ECOG performance status, and 3) the number of metastases (Table 4). In this multivariate model, the GEF cluster remained an independent significant prognostic factor for PFS [HR: 5.20, p < 0.001; Table 5] after adjusting for the effects of the clinical variables and the S100A4/MACC1 cluster. However, the S100A4/MACC1 cluster was no longer statistically significant (likelihood ratio test p = 0.86). Removing S100A4/MACC1.

Conclusions
The major finding in this work is the demonstration of the individual and combined prognostic impact of members of a new family of GEFs in CTCs isolated from patients with advanced CRC. The usefulness of CTCs as a direct indicator of patient prognosis and therapy response has gained traction in recent years, with incorporation of CTC enumeration as a parameter to guide treatment plans in the clinical setting for colon 28,29, prostate 32 and breast 33,34 cancers. There has even been speculation that CTC evaluation may potentially become a test in oncology that is on a par with blood glucose measurements in diabetics 35. Despite this progress, there is substantial agreement that analysis should incorporate molecular profiling of CTCs, not just enumeration, in order to assess their metastatic potential and to predict tumor progression, detect relapse, or monitor response to specific therapies 35. Prior studies have demonstrated the ability of a panel of markers to improve the overall prognostic impact compared to individual targets 31,[45][46][47][48][49], and the ability of molecular profiling of CTCs from different tumor types, for gene expression as well as mutational status of key cancer-related genes (KRAS 36, BRAF 37, PIK3CA 38,39, EGFR 40, etc.), to provide valuable insights into the biology and behavior of the primary tumor [41][42][43][44]. In the present study, clustering of GEFs together improved the prognostic accuracy of the individual family members. Surprisingly, the GEFs fared better as prognostic markers than two established markers of cancer progression, S100A4 and MACC1. The fact that MACC1 did not demonstrate a significant prognostic effect in our analysis could suggest that the prognostic/predictive impact of MACC1 is limited to cell-free RNA in the peripheral circulation, as shown previously 16.
Another insight gained from this work is that increased expression of each of the 4 known members of this family is individually associated with poor outcome. Because the only shared module of all 4 molecules is a G protein regulatory motif which exerts GEF activity, it is possible that their elevated expression may synergistically contribute to, and serve as a surrogate measure of, elevated G protein activation during cancer progression. Because all the members of the GEF family are widely relevant in the metastatic progression of a variety of cancers [summarized in Tables 1 and S1], we speculate that the prognostic utility of this panel of markers in CTCs will extend to cancers other than CRC.
The current study also has implications for understanding G protein biology. The contribution of hyperactivated G protein signaling in cancers is currently interrogated using a genomics approach to identify and investigate infrequent oncogenic mutations in G proteins and GPCRs in the primary tumor tissue. One major limitation of such an approach is that it ignores the impact of deregulated expression of genes other than GPCRs which coordinately function within the G protein regulatory network to maintain finiteness of G protein signaling, e.g., non-receptor GEFs which can also activate G proteins in a GPCR-independent manner, GTPase accelerating proteins (GAPs) which terminate G protein signaling, and guanine nucleotide dissociation inhibitors (GDIs) which maintain G proteins in an inactive GDP-bound state. The current study, which evaluated an entire family of non-receptor GEFs, shows that aberrant expression of this network of regulatory proteins may contribute to hyperactivation of G proteins relatively more frequently than mutations in G proteins/GPCRs. By revealing the prognostic impact of elevated expression of individual as well as clusters of non-receptor GEFs on survival, this work reveals the benefit of transcriptome analysis of G protein regulatory proteins in cancer biology.
Our study has several limitations. While EpCAM, also known as HEA or BerEP4, is one of the most commonly used markers for positive isolation and detection of CTCs from patient blood, its use has limitations. The occurrence of EMT in tumor cells leads to downregulation of epithelial markers including EpCAM and reduces the sensitivity of CTC detection 50. However, it has been demonstrated that at least a subpopulation of CTCs might reflect a partial mesenchymal phenotype, in that they simultaneously express both epithelial markers (like EpCAM) and mesenchymal markers (that are upregulated during EMT) 35.
It is perhaps for this reason that several studies have detected EMT markers in EpCAM-isolated CTCs and shown that they have prognostic value (reviewed in Bednarz-Knoll N et al., Cancer Metastasis Reviews, 2012). Furthermore, because increased expression of each of the 4 GEFs, i.e., GIV, Daple, Calnuc (NUCB1) and NUCB2, is associated with increased invasiveness and/or an EMT-like phenotype [see Table S1], it is likely that the use of EpCAM to isolate CTCs results in a significant underestimation of the abundance of CTCs that overexpress one or more of these GEFs and display EMT. Another limitation of this study is the relatively small cohort of patients, recruited from a single center. Multicenter trials on larger cohorts, either using the same analysis methodology or incorporating these markers into existing technologies, are essential to fully realize the potential of these markers.

Methods
Gene expression analysis in primary tumors and metastatic tissue. Primary colorectal carcinomas (n = 13) and metastases (liver metastases, n = 7; lung metastases, n = 7) were processed by the Tissue Biobank, Pathology Department, Complexo Hospitalario Universitario of Santiago de Compostela. Non-invasive and invasive areas of primary tumors were identified by H&E staining and macroscopically dissected by an experienced pathologist, ensuring similar tumor cell percentages. RNA was purified (TRIZOL reagent, Invitrogen; RNeasy kit, Qiagen), cDNA was synthesized (MuLV reverse transcriptase, Life Technologies), and gene expression was evaluated using hydrolysis probes (Life Technologies) (see Table S2 for probe details). Data were represented as fold change relative to the expression in the non-invasive area. GAPDH, ACTB and RPLP0 were used as reference genes.
CTC Study design. 51 51 , disease progression was defined as an increase in the number of metastatic lesions, growth of preexisting distant tumors by more than 20% of the initial size, or both. Deaths that occurred during the follow-up period without a CT evaluation were also counted as progression events, once it had been verified that the death was disease-related. One 10 ml EDTA blood tube was collected from each patient at baseline (before the start of therapy). At the same time, the same amount of blood was collected from 24 age- and sex-matched healthy controls. The experimental protocols outlined above were approved by the Ethical Committee of the Complexo Hospitalario Universitario of Santiago de Compostela (institutional code of approval: 2009/289). All methods were carried out in accordance with the approved guidelines. All participants signed an informed consent form specifically approved for this study.
CTC isolation and gene expression analysis. Sample processing procedures have been previously described 52. Briefly, CTCs were enriched from 7.5 ml of whole blood using anti-EpCAM coated magnetic beads (CELLection epithelial enrich, Life Technologies), and the EpCAM-positive isolated cells from each patient were pooled together. RNA was extracted with a methodology optimized for low concentration samples (QIAamp Viral, Qiagen) and cDNA was synthesized using SuperScript III reverse transcriptase (Life Technologies). To optimize target detection, samples were first preamplified (PreAmp Master Mix kit, Life Technologies). mRNA levels of the CD45, CCDC88A, CCDC88C, NUCB1, NUCB2, S100A4 and MACC1 genes were quantified by quantitative real-time PCR using hydrolysis probe chemistry (Life Technologies) in a StepOne Plus thermocycler (Life Technologies). Probe characteristics are detailed in Supplementary Table S2. Each sample was run in duplicate for each gene, and appropriate negative controls were included in each qPCR reaction plate. Cq values (defined as the cycle number at which the fluorescence reached a fixed threshold value) for each transcript were subtracted from 40 (the maximum number of cycles), and the resulting 40-Cq value was normalized to the 40-Cq value for CD45 (40-CqCD45), which was used as a reference gene because it detects nonspecifically isolated hematopoietic cells.
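The Cq normalization described above can be illustrated with a short Python sketch. Two points are assumptions rather than statements from the text: duplicate wells are averaged before normalization, and "normalized to" is implemented as a ratio by default, with a difference offered as an alternative because the exact operation is not specified here. All names and numbers are hypothetical.

```python
from statistics import mean

MAX_CYCLES = 40  # maximum number of qPCR cycles, as stated in the text


def normalized_expression(cq_target, cq_cd45, mode="ratio"):
    """Convert duplicate Cq readings to 40 - Cq and normalize to CD45.

    cq_target, cq_cd45: lists of duplicate Cq values for one sample.
    mode: 'ratio' (assumed default) or 'difference'; the source text does
    not say which operation was used.
    """
    target = MAX_CYCLES - mean(cq_target)
    reference = MAX_CYCLES - mean(cq_cd45)
    if mode == "ratio":
        if reference == 0:
            raise ValueError("CD45 was not detected; ratio is undefined")
        return target / reference
    if mode == "difference":
        return target - reference
    raise ValueError("mode must be 'ratio' or 'difference'")


# Hypothetical duplicate Cq readings, for illustration only.
value = normalized_expression([30.0, 30.0], [20.0, 20.0])
print(value)
```

Subtracting Cq from 40 turns the scale around so that larger numbers mean more transcript, and a transcript that is never detected (Cq = 40) maps to zero.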
Statistical analysis. OS and PFS were defined as the time from start of treatment to death, or to the earlier of disease progression or death, respectively. Marker levels were classified as high or low when they were above or below the 75th percentile in these 51 patients. Kaplan-Meier (KM) curves and Cox proportional hazards regression were used to study associations between marker levels and PFS/OS. Likelihood ratio tests were used to compare nested models. Cox models were evaluated using Harrell's concordance index (c-index) and compared using the dependent sample t-test. Differences in gene expression between controls and patients were analyzed using the non-parametric Mann-Whitney test. Tests were performed with SPSS v20.0, GraphPad Prism v5 or R v3.1.3 software, at the 5% significance level. AUCs were computed using GraphPad Prism v5.
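As an illustration of the concordance index mentioned above, the following Python sketch computes Harrell's c-index for a set of risk scores. It is a simplified version written for this explanation, not the routine used in the study: pairs with tied survival times are skipped, and the function name and data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's c-index: among comparable pairs, the fraction in which the
    patient with the shorter observed survival has the higher risk score.
    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) strictly before time j. Tied risk scores count 0.5."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (months), event flags and a binary
# high (1) / low (0) marker used as the risk score.
c_index = harrell_c_index([5, 10, 3, 8], [1, 0, 1, 1], [1, 0, 1, 0])
print(round(c_index, 3))
```

A c-index of 0.5 indicates that the risk score orders patients no better than chance, and 1.0 indicates that it orders every comparable pair correctly.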