Main

Lymphovascular invasion (LVI) is an indicator of metastatic potential and poor outcome in breast cancer (BC) (Lee et al, 1990; Quinlan, 1993; Altman and Bland, 1994; Lauria et al, 1995; Choi et al, 2003; Truong et al, 2005; Debled et al, 2010; Song et al, 2011; Ugras et al, 2014). In a previous study (Quinlan, 1993) we demonstrated that, in a lymph node-negative BC cohort, LVI could be used as a high-risk criterion conferring a survival disadvantage equivalent to that provided by involvement of one or two lymph nodes and to that provided by one higher size category (pT1 to pT2). Despite its recognised prognostic role, the molecular mechanisms underlying the development of LVI in BC, and the genes driving this process that could be used as potential therapeutic targets, remain largely unknown (reviewed in Aleskandarany et al, 2015). This reflects not only the complexity of the mechanisms involved in the development of LVI as part of the invasion-metastasis process and the complex interaction between tumour cells and their microenvironment, but also the difficulty of study design and the subjectivity of identifying true LVI-negative and LVI-positive BC. Here we hypothesised that large-scale transcriptomic and genomic profiling of a cohort with strictly defined LVI status could yield candidate key/driver genes. This approach is likely to narrow down the number of potential genes that can be further investigated using in vitro and in vivo models with the aim of identifying novel therapeutic targets.

In this study, the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) data were used to explore genes associated with LVI in invasive BC through extensive genome-wide analysis of gene expression and copy number aberrations (CNAs) (Curtis et al, 2012). LVI was defined using strict criteria with the aid of immunohistochemistry (IHC). Genes with significant differential expression were ranked according to the strength of their association with LVI. The CNAs of these genes were determined. One of the top genes showing differential expression together with CNAs was the Rho-GTPase activating-protein 18 (ARHGAP18), which is currently gaining interest as a therapeutic target (Rao et al, 2015a). The prognostic impact of ARHGAP18 gene expression was externally validated using online BC gene expression data sets. Protein expression of ARHGAP18 was assessed using IHC on a large clinically annotated cohort of invasive BC with long-term follow-up. The associations of ARHGAP18 with LVI, clinicopathologic criteria and patients’ outcome were explored.

Materials and methods

Gene identification using METABRIC cohort

To identify putative LVI driver genes, the METABRIC cohort data were explored for differences in gene expression and CNA between cases with definite LVI and cases with no evidence of LVI. Details of tissue processing, analysis and interpretation of the findings of the METABRIC study were previously described (Curtis et al, 2012). Briefly, the extracted and purified DNA probes were hybridised to Affymetrix SNP 6.0 arrays (Affymetrix, Santa Clara, CA, USA) at AROS Applied Biotechnology (Aarhus, Denmark). RNA was amplified using the Illumina Totalprep RNA amplification kit (Ambion, Warrington, UK) and then hybridised onto Illumina Human HT-12 v3 Expression Beadchips from the same manufacturer. For this purpose, the discovery set was formed of cases from the Nottingham primary series of invasive BC included in the METABRIC study (n=328). Tumours in this discovery set were treated using standardised methods of specimen fixation, sampling and processing as previously described (Rakha et al, 2012). The clinicopathological data including LVI status, axillary nodal status and other variables were available. The LVI status of these cases was determined using the following criteria: cases were defined as LVI positive (LVI+) when LVI was reported in routine practice as positive based on H&E sections of the whole tumours and was confirmed by IHC for CD34 and/or D2–40 (Rakha et al, 2012) on one full-face tumour section from the surgical specimens. LVI-negative (LVI−) cases were defined by negativity of LVI on both H&E sections and IHC. Importantly, cases in the discovery set with positive lymph nodes in the LVI− subgroup were excluded to avoid bias caused by potentially undetectable foci of LVI in the primary tumour. This discovery set was used to identify genes differentially expressed between the two LVI subgroups that are likely to be related to LVI.
The validation set (Addenbrookes cases; n=914) was used to validate the expression of the differentially expressed genes, to ensure that genes were differentially expressed in both the Nottingham and Addenbrookes cohorts. To test for genes/transcripts significantly differentially expressed between LVI+ and LVI− cases, a supervised differential gene expression analysis was followed. This was performed using the Linear Model for Microarray and RNA-seq data (LIMMA) software package that is compatible with the Affymetrix data. The Affymetrix SNP 6.0 array data were further analysed for CNAs of each transcript in the list of genes most strongly associated with LVI using the Affymetrix SNP6 Copy Number Inference Pipeline (Cancer Genomics Computation Analysis group of the Broad Institute, USA). The top differentially expressed genes were ranked based on their P-value of association with LVI. Subsequently the CNAs of the top differentially expressed genes were determined. ARHGAP18 was identified among the top five differentially expressed genes.
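The ranking step described above can be illustrated with a minimal sketch. It is not LIMMA and does not reproduce its moderated statistics; as a stand-in, it ranks genes by the absolute value of Welch's t-statistic between the two LVI subgroups. The function names, gene names and expression values are hypothetical and are not taken from the METABRIC data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def rank_genes(expr_pos, expr_neg):
    """Rank genes by |t| between LVI+ and LVI- samples, largest first.

    expr_pos / expr_neg map a gene name to its list of expression values
    in LVI+ and LVI- cases respectively.
    """
    stats = {gene: welch_t(expr_pos[gene], expr_neg[gene]) for gene in expr_pos}
    return sorted(stats.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data (illustration only, not study data)
lvi_pos = {"GENE_A": [5.1, 5.3, 4.9, 5.0], "GENE_B": [7.9, 8.2, 8.0, 8.1]}
lvi_neg = {"GENE_A": [6.8, 7.1, 6.9, 7.0], "GENE_B": [8.0, 8.1, 7.9, 8.2]}
ranking = rank_genes(lvi_pos, lvi_neg)
```

In this toy example GENE_A, which is lower in the LVI+ group, ranks first with a negative statistic, while GENE_B shows no difference between groups. A real analysis would additionally need P-values and correction for multiple testing across all transcripts.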

ARHGAP18 gene expression

ARHGAP18 gene expression was evaluated in the METABRIC cohort of 1980 BC samples (Curtis et al, 2012). In this cohort, patients with ER-positive and/or lymph node-negative tumours did not receive adjuvant chemotherapy, while those with ER-negative and/or lymph node-positive tumours received adjuvant chemotherapy. Dichotomisation of ARHGAP18 mRNA expression was performed at the median mRNA expression of the studied cohort.
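Dichotomisation at the cohort median can be sketched as follows. The handling of values exactly equal to the median is not stated in the text; assigning them to the low group here is an assumption, and the function name and expression values are hypothetical.

```python
from statistics import median


def dichotomise_at_median(values):
    """Return the cohort median and a 'high'/'low' label for each value.

    Assumption: values equal to the median are labelled 'low'.
    """
    cut = median(values)
    labels = ["high" if v > cut else "low" for v in values]
    return cut, labels


# Hypothetical mRNA expression values (illustration only)
expression = [6.2, 7.5, 5.9, 8.1, 7.0, 6.6]
cut_off, groups = dichotomise_at_median(expression)
```

With an even number of samples, `statistics.median` returns the mean of the two central values, so the two groups are of equal size unless ties occur at the cut-off.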

External validation

To further assess the prognostic significance of ARHGAP18 mRNA expression, the bc-GenExMiner v4.0 (Breast Cancer Gene-Expression Miner v4.0) online data set (http://bcgenex.centregauducheau.fr) was used. This is composed of two statistical mining modules: the ‘prognostic module’, offering the possibility to evaluate the in vivo prognostic impact of candidate genes in BC, and the ‘correlation module’, to compute correlation coefficients between gene expressions or to find lists of correlated genes in BC. The prognostic module was used in this external validation, where Cox model, Kaplan–Meier and forest plot analyses were performed (Jezequel et al, 2012). Prognostic significance was used as the end point because LVI status was not available for these cohorts.

ARHGAP18 protein expression

ARHGAP18 protein expression was assessed in formalin-fixed paraffin-embedded tissue samples from BC cases enrolled into the Nottingham Tenovus Breast Carcinoma Series: a well-characterised series of primary operable invasive BC presenting between 1987 and 1998 at Nottingham City Hospital (n=959). Clinical and pathological data had been recorded, and patients’ outcome data are regularly updated and prospectively maintained. Outcome data recorded included survival status, mean survival in months, recurrence of disease (including distant metastases) and cause of death. Disease-free interval (DFI) was defined as months from the date of primary surgery to appearance of local, regional, or distant recurrence; breast cancer-specific survival (BCSS) was defined as months from primary surgery until BC-specific death; and time to distant metastasis (distant metastasis-free interval, DMFI) was defined as months from primary surgery to occurrence of first distant metastasis/recurrence. Adjuvant therapy was based upon tumour prognostic and predictive factors including the Nottingham Prognostic Index (NPI), lymph node (LN) status, menopausal status and ER positivity/negativity, and included hormonal therapy and CMF (cyclophosphamide, methotrexate and 5-fluorouracil).

Expression of a large number of molecular biomarkers relevant to BC tumourigenesis and progression, including oestrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), p53, Ki67 and others, has been previously studied (Abd El-Rehim et al, 2005). The tumours were subtyped into luminal (ER+/PR+/HER2−), HER2+, and triple negative (PR−/ER−/HER2−) subtypes (Rakha et al, 2009). Table 1 summarises the clinicopathological data of this cohort. The median and mean age of the study population was 54 years (range: 24–70). The median overall survival was 168 months, while the median DFI was 109 months. This work was approved by Nottingham Research Ethics Committee 2 under the title: Development of a molecular genetic classification of breast cancer. All tissue samples included in this study were from patients who gave consent prior to inclusion in the study cohort.

Table 1 Summary of patient demographics of the Nottingham primary BC series used in this study

Validation of primary antibody specificity using western blotting

Prior to IHC, the specificity of the anti-ARHGAP18 antibody (ab175970, Abcam, UK) was validated using western blotting performed on whole-cell lysates of cell lines with high ARHGAP18 mRNA expression, according to the manufacturer’s protocol. For this purpose, lysates of human cervical cancer (HeLa CCL-2, American Type Culture Collection (ATCC), USA) and human breast adenocarcinoma (MDA-MB-231, ATCC HTB-26, USA) cell lines were used. This was performed using a 1 : 1000 dilution of the primary antibody (ab175970, Abcam, UK) and a 1 : 15 000 dilution of the horseradish peroxidase-labelled secondary anti-rabbit antibody, with β-actin (diluted 1 : 2000) used as a loading control. Bovine serum albumin was used for blocking. Enhanced chemiluminescence was used to visualise the membrane as previously described (Mruk and Cheng, 2011). The images were developed via chemiluminescence using an Odyssey Fc (LI-COR Biosciences, Lincoln, NE, USA).

Immunohistochemistry

Heat-assisted antigen retrieval was performed in accordance with the manufacturer’s recommendations (citrate buffer pH 6.0 at 1000 W for 20 min using a microwave). The Novolink Max Polymer Detection System was used to visualise the reaction (RE7280-k, Leica, Newcastle, UK). Endogenous peroxidase activity was blocked by the addition of a peroxidase block (Novolink Peroxidase Block) for 5 min. To block non-specific binding of the primary antibody, a protein block (Novolink Protein Block) was used. The slides were washed and incubated with the anti-ARHGAP18 primary antibody (ab175970, Abcam, UK, diluted 1 : 75 in Bond Primary Antibody Diluent, Leica, Germany) for 30 min at room temperature. This working dilution was chosen after repeated optimisation using different dilutions, guided by the manufacturer’s recommendation, in order to achieve specific staining with minimal background. The positive control, β2 microglobulin (Polyclonal Rabbit Anti-Human Beta-2-Microglobulin, Dako, Glostrup, Denmark), was incubated at 1 : 2000 dilution. DAB (3,3′-diaminobenzidine) in a buffered stabiliser solution (Novolink DAB Chromogen; 1 : 20 dilution) was added to the slides for 5 min. The slides were counterstained with haematoxylin for 6 min (Novolink Haematoxylin). Dehydration, clearing, mounting and cover-slipping were performed as previously described. Full-face tissue sections from 25 BC excision specimens were also stained for ARHGAP18 to assess the distribution of staining and hence the suitability of TMAs. Negative and positive controls (omission of the primary antibody, and human liver tissues, respectively) were included in each staining run.

Scoring of TMA cores

Stained TMA slides were digitally scanned (NanoZoomer, Hamamatsu Photonics, Tokyo, Japan) to high resolution images and viewed by Aperio ImageScope (Aperio Technologies, Milton Keynes, UK) at × 20 magnification. Cores/images were individually evaluated and assigned an ‘H-Score’ by identification of the percentage of the malignant cells that were stained (0–100) and the staining intensity (0 – negative, 1 –weak, 2 – moderate and 3 – strong), and the final H-Score was calculated as previously described (McCarty et al, 1985). Scoring was performed by a single observer (RS) and a subset of cases (20%) was scored by an independent observer (MA) to assess inter-rater reliability. All identifying patient information and clinical variables were completely anonymised during scoring.
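As a minimal sketch of the H-Score calculation, the conventional formulation sums the percentage of cells at each staining intensity weighted by that intensity, giving a range of 0 to 300. The function name and the example percentages are hypothetical; how percentages were apportioned across intensities in individual cores is not detailed in the text.

```python
def h_score(pct_weak, pct_moderate, pct_strong):
    """H-Score = 1 x %weak + 2 x %moderate + 3 x %strong (range 0-300).

    Unstained cells (intensity 0) contribute nothing and are implied by
    the remainder up to 100%.
    """
    percentages = (pct_weak, pct_moderate, pct_strong)
    if any(p < 0 for p in percentages) or sum(percentages) > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


# Hypothetical core: 20% weak, 30% moderate, 10% strong, 40% unstained
example_score = h_score(20, 30, 10)  # 20 + 60 + 30 = 110
```

A core in which every malignant cell stains strongly scores 300, and a completely negative core scores 0.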

Statistical analysis

The H-Scores were linked to anonymised patient codes to match all other variables with biomarker expression. Analysis was performed in the statistical software package SPSS (IBM SPSS statistics, Version 22). The data were exported to X-Tile (X-Tile Bioinformatics Software, Yale University, version 3.6.1) to determine an unbiased optimally significant cut off point based on patients’ outcomes. Kappa statistic was used to test scoring reproducibility between observers (inter-observer agreement). Analyses were performed for nuclear and cytoplasmic H scores independently. Univariate analysis was performed using the chi-squared test to evaluate the significance of the association between expression of the biomarkers and the clinicopathological parameters of the data, as well as other previously investigated biomarkers. Kaplan–Meier analysis was performed to assess BCSS and DFI survival differences. Multivariate Cox Regression analysis with adjustment of co-variates was fitted to test independence from standard prognostic factors. A P-value of <0.05 (two tailed) was considered significant.
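The survival analyses in this study were run in SPSS. For readers unfamiliar with the method, the following is a minimal plain-Python sketch of the Kaplan–Meier product-limit estimator; it is an illustration, not the software used. The function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up in months
    events : 1 if the event (e.g. BC-specific death) occurred, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored cases leave the risk set after time t
        at_risk -= removed
    return curve


# Hypothetical follow-up data for five patients (illustration only)
km_curve = kaplan_meier([6, 12, 12, 20, 30], [1, 1, 0, 1, 0])
```

Censored patients do not lower the curve themselves but reduce the number at risk for later event times, which is why the estimate drops more steeply as follow-up thins out.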

Results

Gene identification using the METABRIC study

Analysis of the METABRIC transcriptomic data in the discovery and validation sets revealed 514 genes differentially expressed with regard to LVI status. These genes were ranked based on the P-value of expression. Biological functions of the top differentially expressed genes were also assessed using published literature. In addition, the CNAs of these genes were analysed to identify those genes whose expression was associated with their gene CNAs (cis acting). ARHGAP18 was among the top five genes that showed differential expression associated with CNAs with regard to LVI, and this gene was subject to further analysis in this study.

Using the median expression as a cut-off point, high ARHGAP18 mRNA expression was significantly associated with variables of good prognosis including smaller size (P<0.04), early stage (P<0.001), tumours of tubular subtype (P<0.001) and HER2− tumours (P=0.029). When comparing the levels of ARHGAP18 mRNA expression in the intrinsic (PAM50) subtypes, significant differences were observed (P=0.006), with the HER2+ subtype showing the lowest expression levels. Similarly, significant differences in expression levels were observed between the METABRIC Integrative Clusters (P=0.021), with cluster 5 (ERBB2 amplified) and cluster 9 (Luminal B subgroup) showing the lowest ARHGAP18 mRNA expression. A trend towards improved patients’ survival with ARHGAP18 mRNA over-expression was also identified, although it did not reach significance (P=0.082).

ARHGAP18 mRNA expression levels in the external validation cohorts

The prognostic impact of ARHGAP18 mRNA expression was subsequently assessed using the bc-GenExMiner v4.0 (Breast Cancer Gene-Expression Miner v4.0) online data set as external validation cohorts (n=21 data sets, 4177 patients). Of these, 12 data sets provided information on DMFI for ARHGAP18 mRNA expression. As shown in the forest plot (Supplementary Figure 1), in 4 out of 12 studies high ARHGAP18 mRNA expression was significantly associated with improved DMFI; the remaining eight studies did not show significant associations. When the data were pooled together (n=2016), high ARHGAP18 mRNA expression was significantly associated with longer DMFI (P<0.001, hazard ratio (HR)=0.82, 95% CI 0.75–0.90; Supplementary Figure 2). Interestingly, exhaustive prognostic analysis, which screens the prognostic impact of ARHGAP18 mRNA across all possible combinations of populations, showed significantly improved outcome of patients in 9 out of 18 studies (Supplementary Table 1).
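The relationship between the pooled hazard ratio, its confidence interval and the reported P-value can be checked with a short sketch. It assumes the interval is a symmetric Wald interval on the log scale and that the published figures are rounded, so the result is approximate; the function name is hypothetical.

```python
from math import erfc, log, sqrt


def p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Two-sided Wald P-value implied by a hazard ratio and its 95% CI.

    Assumes the CI is symmetric about log(HR) on the log scale.
    """
    se = (log(upper) - log(lower)) / (2 * z_crit)  # standard error of log(HR)
    z = log(hr) / se
    p = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Pooled DMFI estimate reported in the text
z_stat, p_value = p_from_hr_ci(0.82, 0.75, 0.90)
```

Under these assumptions the implied P-value falls below 0.001, in line with the reported significance of the pooled estimate.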

Immunohistochemical expression of ARHGAP18

Validation of the antibody specificity using western blot showed a single specific band at the predicted size (75 kDa) of ARHGAP18 protein, confirming the specificity of the antibody, Figure 1A. Full-face sections stained with ARHGAP18 antibody showed homogenous staining distribution throughout stained sections; therefore, validating the use of TMA. IHC expression showed ARHGAP18 protein was localised in the nuclei and cytoplasm of tumour cells with varying intensities, Figure 1B–D. In addition, positivity was observed in the cells of ductal carcinoma in situ and in the normal ductal epithelial cells entrapped in some cores. Overall, the expression was reduced in some invasive tumours compared to the cells of the in situ component or the normal ductal cells.

Figure 1

Western blot and immunohistochemical expression of ARHGAP18 in BC. (A) Western blot of ARHGAP18 and the housekeeping β-actin. Western blotting was performed on whole cell lysates of HeLa CCL-2 and MDA-MB-231 (lanes 1 and 2, respectively). Primary antibody (anti-ARHGAP18, 1 : 1000 dilution) and HRP-labelled secondary anti-rabbit antibody (1 : 15 000) were used, with β-actin (1 : 2000, lanes 3 and 4) as a loading control. The images were developed via chemiluminescence using an Odyssey Fc (LI-COR Biosciences, USA). (B–D) Immunohistochemical expression of ARHGAP18 in invasive BC: (B) negative, (C) invasive BC case showing positive cytoplasmic expression, and (D) invasive BC case showing both nuclear and cytoplasmic expression.

Cases re-scored by an independent observer showed very good agreement between the two scorers (Kappa=0.83). The H-Scores of both nuclear and cytoplasmic expression did not follow a normal distribution. X-Tile determined optimal cut-off values of H-Score 140 for cytoplasmic staining and 65 for nuclear staining. At these cut-offs, 848 out of 959 (88.4%) cases showed negative/low nuclear expression and 111 out of 959 (11.6%) showed positive/high nuclear expression. For cytoplasmic staining, 784 out of 959 (81.8%) cases showed negative/low expression and 175 out of 959 (18.2%) showed positive/high expression.
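Inter-observer agreement of this kind can be illustrated with a minimal sketch of Cohen's kappa for two raters. It assumes that agreement is assessed on categorical (here dichotomised) scores, which the text does not specify; the function name and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same cases."""
    n = len(rater_a)
    # Proportion of cases on which the two raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    # Agreement expected by chance from each rater's marginal frequencies
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical dichotomised scores for ten cores (illustration only)
scorer_1 = ["high", "high", "low", "low", "low", "low", "low", "high", "low", "low"]
scorer_2 = ["high", "high", "low", "low", "low", "low", "low", "low", "low", "low"]
kappa = cohens_kappa(scorer_1, scorer_2)
```

Kappa corrects raw agreement for the agreement expected by chance, so two raters who agree on 9 of 10 cores obtain a value noticeably below 0.9 when one category dominates.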

ARHGAP18 protein expression and clinicopathological variables

Cytoplasmic expression of ARHGAP18 showed an inverse association with LVI (P=0.006) and Nottingham Prognostic Index (NPI, P=0.010; Table 2). Loss of its expression was associated with epithelial mesenchymal transition, with loss of E-cadherin and overexpression of N-cadherin, and with HER2 overexpression. Consistent with mRNA expression, ARHGAP18 cytoplasmic protein expression showed an association with IHC-defined molecular subtypes, being less expressed in the HER2+ and TNBC classes (P=0.035; Table 3). Nuclear expression of ARHGAP18 was associated with lower grade (P=0.022), with lower mitotic counts and less nuclear pleomorphism, smaller tumour size (P=0.002), the better prognostic NPI subgroup (P=0.001) and histological types of excellent/good prognosis (P=0.002), but not with LVI (Table 4). Positive nuclear expression showed significant associations with HER2 negativity (P=0.003), negative/low Ki67 (P=0.001), luminal-A subtype (P=0.001), and negative/low expression of N-cadherin, transforming growth factor β (TGF-β1), PIK3CA and C-terminal tensin-like (CTEN) (P<0.05; Table 5).

Table 2 Statistical association of cytoplasmic H score expression of ARHGAP18 and the clinicopathological parameters of the studied series
Table 3 Statistical association of cytoplasmic H score expression of ARHGAP18 and the expression of other biomarkers
Table 4 Associations of nuclear H scores for ARHGAP18 in relation to the clinicopathological parameters
Table 5 Association of nuclear ARHGAP18 expression and the expression of other biomarkers

Similar statistical associations were observed when the expression of ARHGAP18 was taken into account irrespective of the localisation (i.e. negative versus positive cytoplasmic or nuclear expression).

Expression of ARHGAP18 and patient outcome

Increased cytoplasmic expression of ARHGAP18 was significantly associated with improved outcome in terms of longer BCSS (P=0.004) and DMFI (P=0.012; Figure 2). Using multivariate Cox regression analysis, this association was independent of tumour size, tumour grade, nodal stage and BC molecular subtype (P=0.017, HR=0.68, 95% CI 0.49–0.93, and P=0.031, HR=0.72, 95% CI 0.54–0.97, for BCSS and DMFI, respectively). High nuclear expression was also significantly associated with longer BCSS (P=0.001) and DMFI (P=0.003; Figure 3). Again, similar statistical associations with patients’ outcome were observed when the expression of ARHGAP18 was taken into account irrespective of the localisation. These associations were independent of the standard prognostic factors including tumour size, nodal stage and tumour grade using Cox proportional hazards multivariate analysis (P=0.023, HR=0.62, 95% CI 0.41–0.94, and P=0.047, HR=0.69, 95% CI 0.47–0.10, for BCSS and DMFI, respectively).

Figure 2

Kaplan–Meier plots of the association between cytoplasmic ARHGAP18 expression and BCSS and time to distant metastasis during the follow-up period.

Figure 3

Kaplan–Meier plots of the association between nuclear ARHGAP18 expression and BCSS and time to distant metastasis during the follow-up period.

Discussion

The ARHGAP genes encode a family of at least 32 members of RhoGAP proteins with variable functions (Katoh and Katoh, 2004). The downstream target of the ARHGAPs is the Rho GTPase protein family; therefore, they are logical candidates for investigation in BC. Members of the Rho GTPase family have numerous cellular functions including maintenance and control of the actin cytoskeleton, angiogenesis, cell proliferation, differentiation, and survival (Peck et al, 2002). Consequently, deregulation of these proteins has been implicated in multiple types of tumours. In this study, using an integrative bioinformatics analysis of gene expression and CNA data associated with LVI, coupled with strict criteria for the definition of LVI in BC, we identified ARHGAP18 as one of the top differentially expressed genes, with expression associated with its gene CNA. ARHGAP18 is located on chromosome 6q22.33 and is a member of the ARHGAP family (Potkin et al, 2008). After confirmation of its prognostic value in the METABRIC cohort and in external validation series, we sought to assess the protein expression of ARHGAP18 in a large unselected annotated series of BC with long-term follow-up.

Studies exploring the impact of tissue-based ARHGAP18 protein expression in cancers including BC are lacking, and most studies so far have been performed using in vitro tissue culture platforms. In the current study, ARHGAP18 IHC staining was detected in the cytoplasm and in the nuclei of invasive BC cells with variable percentage and intensity, in the cells of ductal carcinoma in situ and in the normal ductal epithelial cells. An overall tendency towards reduced expression was observed in invasive tumours compared to the non-invasive and normal ductal cells, with the latter showing the highest expression; these observations may denote a tumour suppressor role.

Cytoplasmic expression of ARHGAP18 showed a significant negative association with LVI. Recently, Chang et al reported on roles of ARHGAP18 in limiting proangiogenic signalling and promoting vascular stability through limiting endothelial cell sprouting and stabilising junctional integrity (Rao et al, 2015b). Moreover, knockdown of ARHGAP18 has recently been reported to promote endothelial cell migration in a scratch wound assay, with a protrusive and irregular migratory front and disrupted cell junctions at the proximal edge of the leading cell (Chang et al, 2014). Sustained angiogenesis is one of the hallmarks of cancer and is essential for tumour progression. Moreover, in the current study, reduced expression was associated with high Ki67 and with epithelial mesenchymal transition, with expression of N-cadherin and TGFβ1 and reduced expression of E-cadherin. Similar significance was also identified with nuclear expression, which showed positive associations with other prognostic variables and outcome. A large proportion of cases strongly expressing ARHGAP18 were of the excellent prognosis histological subtypes including tubular, invasive cribriform and mucinous carcinomas. Nuclear expression was also significantly associated with lower tumour grade, lower mitotic scores, less nuclear pleomorphism and smaller tumour size. Such associations suggest ARHGAP18 may be a marker related to tumour differentiation.

Investigations specifically pertaining to the subcellular localisation of ARHGAP18 are lacking. However, similar to findings of the current study with ARHGAP18 in BC it has been reported that ARHGAP21 is localised both in the cytoplasm and nuclei of prostatic carcinoma (Barcellos et al, 2013) and nuclear and perinuclear localisation seen in glioblastoma-derived cell lines (Bigarella et al, 2009). We suggest ARHGAP18 might have a similar trafficking from the cytoplasm, where synthesis occurs, to the nucleus, possibly upon occasion of tumourigenic changes.

The associations of both nuclear and cytoplasmic ARHGAP18 with the cadherin switch in the current data (E-cadherin loss and N-cadherin gain), together with the negative associations with the EMT triggers TGFβ1 and PIK3CA (Zhao et al, 2006) and with the migration-associated CTEN (Albasri et al, 2009), reflect its potential role in controlling cellular cytoskeletal dynamics and migration. The latter is attributed to the recognised roles of Rho GTPases in controlling the actin cytoskeleton. ARHGAP18 has been postulated to suppress the function of RhoA and cause a disruption in the production of stress fibres, the contractile bundles of actin found in non-muscle cells (Maeda et al, 2011). These functions may suggest a mechanism through which ARHGAP18 is involved in limiting spread and metastasis of cancer cells, and may therefore provide some explanation for its good prognostic value.

High ARHGAP18 mRNA expression within the METABRIC data set was significantly associated with variables of good prognosis including the PAM50 and Integrative Clusters of good prognostic classes. Although analysis of the METABRIC study revealed a trend towards improved patients’ outcome in cases overexpressing ARHGAP18, pooled analysis of the publicly available data sets, using the bc-GenExMiner v4, showed significant association between ARHGAP18 expression and better outcome. Although gene expression data reflect the overall expression levels, they show concordance with our protein expression data where increased ARHGAP18 expression in both the nucleus and cytoplasm was significantly associated with better outcome.

In conclusion, the results presented herein suggest ARHGAP18 may be acting, directly or indirectly, as a putative metastasis suppressor gene, based on its association with favourable prognostic features including negative association with LVI and longer survival at both protein and mRNA expression levels. The general tumour suppressor effect probably indicates that the relationship to LVI forms part of a coordinated set of events in BC development and progression. These results also indicate the validity of our approach to identifying novel biomarkers associated with complex biological processes, which relies not only on the investigative techniques but also on robust histological characterisation. Further investigation of this biomarker, as well as of other genes differentially expressed with regard to LVI, is warranted to decipher the mechanisms underlying the development of LVI and to identify potential therapeutic targets.