Introduction

Gastro-oesophageal carcinoma (GOC) represents the third leading cause of cancer death worldwide,1 with a rising incidence of junctional cancers2 as well as of tumours developing among younger individuals.3 For this tumour, identifying new targets for effective treatment remains an unmet need.4 To date, the human epidermal growth factor receptor 2 (HER2) remains the only available biomarker for personalised treatment in metastatic GOC. Trastuzumab demonstrated a significant overall survival benefit when added to first-line chemotherapy5 and is now approved for patients with HER2 overexpression at immunohistochemistry (IHC) or gene amplification by in situ hybridisation (ISH) in equivocal cases.6

Globally, the rate of HER2 positivity in GOC ranges from less than 10% to almost 50%, depending on the samples used for analysis and the laboratory techniques applied.7 Screening data from the ToGA trial have shown that several clinical and pathologic variables are associated with a higher rate of HER2 positivity, such as the location of the primary tumour in the gastro-oesophageal tract and Lauren’s subtype, with HER2 positive cases being more prevalent among upper lesions and in cases with intestinal histology.8 Moreover, the authors reported a slightly higher rate of HER2 positive status for biopsies compared with surgical specimens.8 More recently, a meta-analysis reported that male gender, well or moderately differentiated grade and intestinal histology were all associated with higher rates of HER2 positivity in GOC.9 Other authors recently reviewed the available literature, partially confirming previous reports but also underlining that large heterogeneity exists regarding the association between single clinico-pathologic markers and HER2 status.10

Considering the relevance of determining HER2 status in metastatic GOC for optimal treatment selection, the limitations linked to sample adequacy and to the methodology (type of test) used, and the need for rapid HER2 reporting, we aimed to develop and validate a nomogram, based on easily accessible clinical or pathologic characteristics, that estimates the probability of HER2 positive disease before direct tumour assessment by IHC and ISH.

Materials and methods

Sample identification and data collection

Consecutive GOC cases, analysed from January 2011 to December 2017 for HER2 status by IHC and ISH at the Department of Surgical, Medical, Molecular Pathology and Critical Area of the University of Pisa (Italy), were used as the development cohort. Findings in this subset were then confirmed in a separate, independent validation cohort of GOC cases analysed at other Italian institutions since January 2011.

The following clinical and pathologic data were collected in both cohorts: gender (male versus female); primary tumour location (oesophago-gastric junction versus gastric body versus gastric fundus versus not specified); histologic subtype according to Lauren’s classification (intestinal versus diffuse versus not specified); tumour grading (G1 versus G2 versus G3 versus not specified); site of sampling (primary tumour versus metastasis); adequacy of pathologic material for analysis (inadequate material versus adequate material versus surgical specimen). Adequacy of biopsy for HER2 assessment was defined according to available evidence and recommendations, i.e. at least 6 biopsies were considered optimal for analysis.11,12,13,14

Pathologic features collected in the analysed datasets (tumour grading and Lauren’s subtypes) were evaluated by pathologists with high expertise in GOC. Tumour grading followed the WHO 2010 classification.15 According to this scale, G1 (well-differentiated) tumours are composed of well-formed glands, sometimes resembling metaplastic intestinal epithelium, whereas G3 (poorly-differentiated) tumours are composed of glands with highly irregular and atypical features that are recognised with difficulty. G2 (moderately-differentiated) tumours are neoplasms with features intermediate between well- and poorly-differentiated carcinomas.

HER2 status assessment

HER2 status was initially assessed by IHC and, in equivocal cases (i.e., 2+ at IHC), by ISH assays. HER2 positivity was defined as an IHC score of 3+, or of 2+ with a positive ISH result.
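The decision rule above can be written as a short function. This is a minimal Python sketch for illustration, not the study's own code: the function name, the string encoding of IHC scores ("0", "1+", "2+", "3+") and the handling of a 2+ case without an ISH result are assumptions.

```python
from typing import Optional


def her2_status(ihc_score: str, ish_positive: Optional[bool] = None) -> str:
    """Classify HER2 status from an IHC score and, for 2+ cases, an ISH result.

    Assumption: IHC scores are encoded as the strings "0", "1+", "2+", "3+".
    """
    if ihc_score == "3+":
        # IHC 3+ is positive without further testing.
        return "positive"
    if ihc_score == "2+":
        # IHC 2+ is equivocal: the call depends on the ISH result.
        if ish_positive is None:
            return "equivocal (ISH required)"
        return "positive" if ish_positive else "negative"
    if ihc_score in ("0", "1+"):
        return "negative"
    raise ValueError(f"unknown IHC score: {ihc_score!r}")
```

For example, `her2_status("2+", ish_positive=True)` returns `"positive"`, while `her2_status("2+")` flags the case as still requiring ISH.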

IHC was performed on formalin-fixed, paraffin-embedded tumour sections using the commercial antibody PATHWAY anti-HER-2/neu (4B5) rabbit monoclonal (Roche-Ventana Medical Systems, Tucson, AZ, USA). Sections were stained using an automated slide stainer (Benchmark ULTRA, Roche-Ventana Medical Systems, Tucson, AZ, USA). A dedicated scoring system,14 also adopted by the College of American Pathologists and regulatory authorities,11 specific to gastric tumours and accounting for the type of specimen used (biopsy or surgical tissue), was applied for HER2 evaluation.

ISH assays were performed using the HER2 FISH pharmDx™ kit (DAKO/Agilent, Santa Clara, CA, USA) as per the manufacturer’s instructions. Gene amplification by fluorescence ISH (FISH) was expressed as the ratio between the number of copies of the HER2 gene and the number of copies of chromosome 17 within the nucleus, counted in at least 20 cancer cells. FISH was considered positive at a HER2:chromosome 17 ratio of ≥2.0. Entire specimens were screened for amplified regions at ×20 magnification. In borderline amplification cases (ratio within the range 1.8–2.2), 20 additional cells were counted.11,14,16
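As an illustration of the FISH criteria just described, the following Python sketch computes the HER2:chromosome 17 ratio and flags borderline cases. The function name, the argument names and the inclusive treatment of the 1.8 and 2.2 boundaries are assumptions; how the final call is made after the additional count is not specified here and is therefore not modelled.

```python
def fish_result(her2_signals: int, chr17_signals: int, cells_counted: int) -> dict:
    """Summarise a FISH count as ratio, amplification call and recount flag.

    her2_signals / chr17_signals are total signal counts over the scored nuclei.
    """
    if cells_counted < 20:
        raise ValueError("at least 20 cancer cells must be counted")
    if chr17_signals <= 0:
        raise ValueError("chromosome 17 signal count must be positive")

    ratio = her2_signals / chr17_signals
    return {
        "ratio": ratio,
        # Positivity threshold stated in the text: ratio >= 2.0.
        "amplified": ratio >= 2.0,
        # Borderline range 1.8-2.2 (boundaries assumed inclusive):
        # 20 additional cells should be counted.
        "needs_additional_count": 1.8 <= ratio <= 2.2,
    }
```

A count of 80 HER2 signals and 40 chromosome 17 signals over 20 cells gives a ratio of 2.0, which meets the positivity threshold but also falls in the borderline range, so the sketch flags it for an additional count.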

Statistical analyses

Data from the 413 patients in the development cohort were used to build an unconstrained logistic regression model to predict HER2 positivity. First, all the variables were tested in univariate models. All statistically significant variables were then used to build multivariable models. Variables that were not significant, but with strong literature support for an association with HER2 status, were also considered for inclusion. Both backward and forward selection methods were used. Collinearity was evaluated using Fisher’s test, t-test and ANOVA, depending on the nature of the covariates, and the variance inflation factor (VIF). Global fit was evaluated with Nagelkerke’s R², Somers’ D and the model log-likelihood ratio chi-square. The final model was selected considering the statistical significance of the covariates, the percentage of models in which they remained significant, their clinical plausibility and the global fit. A nomogram was then developed from the final model. Validation and calibration were performed on an external, independent dataset from three different Italian centres. The C-index, visual inspection of the calibration plot, the Brier score and the Spiegelhalter z-test were used to assess the performance of the nomogram. The 95% confidence intervals of the C-index were calculated with the bootstrap method. A ROC curve was built with data from the validation set to assess the sensitivity and specificity of the test at different cut-offs of predicted probabilities.
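To make the discrimination metrics concrete, the sketch below computes a C-index, the corresponding Somers' D (D = 2 × (C − 0.5)) and a bootstrap confidence interval. It is written in plain Python for illustration and is not the R code used in the study. The percentile bootstrap, the number of resamples and all function names are assumptions, since the text does not specify the bootstrap variant.

```python
import random


def c_index(y_true, y_prob):
    """Probability that a random positive case has a higher prediction than a
    random negative case (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    score = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                score += 1.0
            elif pp == pn:
                score += 0.5
    return score / (len(pos) * len(neg))


def somers_d(c):
    """Somers' D (Dxy) derived from the C-index."""
    return 2.0 * (c - 0.5)


def bootstrap_ci(y_true, y_prob, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the C-index (assumed variant)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class; draw again
        stats.append(c_index(ys, [y_prob[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[max(0, int((1 - alpha / 2) * n_boot) - 1)]
    return lower, upper
```

The relation between the two metrics means that a C-index of 0.75 corresponds to a Somers' D of 0.5.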

The R packages ‘rms’ and ‘pROC’ were used for all the analyses.

Results

Sample characteristics

Characteristics of the samples included in the two datasets are listed in Table 1. A total of 723 cases were included in the study, with 413 cases in the development cohort and 310 cases in the validation cohort. There were 70 (16.9%) and 56 (18.1%) HER2 positive cases in the development and validation cohorts, respectively. We observed significant differences between the two cohorts regarding the pathologic material used for analysis (i.e., a higher number of biopsies compared to surgical specimens in the validation cohort) and primary tumour location (i.e., a higher percentage of cases located in the gastric fundus in the validation cohort), whereas no differences were present with respect to all other collected characteristics. Since mixed tumours represented a minority (<3% of all cases) in both cohorts, tumour samples were reviewed and labelled as either intestinal or diffuse type according to the prevalent histotype in each case.

Table 1 Study cohort’s characteristics

Association of investigated variables with HER2 status and nomogram development

When tested for association with HER2 positive status, the following features were confirmed as significant at multivariate analysis in the development cohort: tumour grading (analysed as a continuous variable), histotype (diffuse versus intestinal) and site of sampling (primary versus metastases). Type of pathologic material was not significant but, given the amount of data about its correlation with HER2 status, we retained this parameter in the multivariable model. A detailed list of univariate and multivariate analyses is given in Table 2. Site of primary cancer was also significant at univariate analysis but was excluded from the final model due to excessive collinearity with grading and histologic subtype. Exclusion of this variable did not affect the global fit, which remained good: R² was 0.19, Somers’ D was 0.5, C-index was 0.75 and the likelihood ratio chi-squared was 41.12 (P < 0.0001). Notably, we detected collinearity between grade and histologic type (P < 0.001). A slight collinearity was also present between pathologic material and site of sampling: for primary cancers, the diagnosis was made by biopsy in 35.7% of cases, whereas for metastatic lesions the proportion was 66.7%. However, a formal Fisher test was negative (P = 0.144) and the VIF was always lower than 3 for all the variables tested, so it is unlikely that significant bias was introduced. No first-order interaction was significant. The definitive nomogram is depicted in Fig. 1.
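A nomogram is a graphical rescaling of a logistic regression model: each covariate's contribution to the log-odds is converted to points, and the total points are mapped back to a probability. The Python sketch below illustrates this mechanism under loud assumptions. The intercept and coefficients are HYPOTHETICAL placeholders invented for the example and are not the estimates reported in Table 2. The covariate coding is simplified (pathologic material is reduced to a binary biopsy indicator), and the scaling convention (the widest effect spans 0 to 100 points) is a common choice that may differ from the one used to draw Fig. 1.

```python
import math

# HYPOTHETICAL log-odds coefficients, for illustration only (not the study's estimates).
INTERCEPT = -1.5
COEFS = {"grade": -0.7, "diffuse": -1.1, "metastasis": 0.8, "biopsy": 0.5}
# Assumed covariate ranges: grade 1-3 treated as continuous, the others coded 0/1.
RANGES = {"grade": (1, 3), "diffuse": (0, 1), "metastasis": (0, 1), "biopsy": (0, 1)}


def _min_contribution(name):
    """Smallest log-odds contribution a covariate can make over its range."""
    lo, hi = RANGES[name]
    return min(COEFS[name] * lo, COEFS[name] * hi)


# The covariate with the widest effect on the log-odds spans 0 to 100 points.
MAX_SPAN = max(abs(COEFS[n] * (RANGES[n][1] - RANGES[n][0])) for n in COEFS)


def points(name, value):
    """Points assigned to one covariate value on the 'Points' axis."""
    return 100.0 * (COEFS[name] * value - _min_contribution(name)) / MAX_SPAN


def probability_from_total_points(total):
    """Map 'Total Points' back to the predicted probability."""
    baseline = INTERCEPT + sum(_min_contribution(n) for n in COEFS)
    linear_predictor = baseline + total * MAX_SPAN / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def direct_probability(patient):
    """Probability computed directly from the logistic model, for comparison."""
    linear_predictor = INTERCEPT + sum(COEFS[n] * patient[n] for n in COEFS)
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

Summing `points(name, value)` over a patient's covariates and passing the total to `probability_from_total_points` gives the same probability as `direct_probability`, which is the property that lets a nomogram be read by hand.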

Table 2 Association of clinical and pathologic factors with HER2 status: univariate and multivariate analyses
Fig. 1 HER2 nomogram. Legend: Each variable is assigned a score on the ‘Points’ axis. The sum of all the single scores is located on the ‘Total Points’ axis; a line drawn downwards from that point to the ‘Probability of HER2 overexpression’ axis gives the probability of HER2 overexpression estimated by the nomogram

HER2 nomogram: external validation

Probabilities predicted by the nomogram were tested against those observed in the validation set. The discriminative ability of the nomogram was very good, with a C-index of 0.827 (95% CI 0.741–0.913). The Brier score was 0.101 and the Spiegelhalter z-test was not significant (P = 0.319). Visual inspection of the calibration plot showed a very good overlap between predicted and observed probabilities, with no relevant overestimations or underestimations (Fig. 2).
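For readers less familiar with these calibration measures, the following Python sketch shows how a Brier score and a Spiegelhalter z-statistic can be computed from observed outcomes and predicted probabilities. It is an illustration with assumed function names, not the study's R code; the two-sided P-value is taken from the standard normal distribution.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between outcomes (0/1) and predicted probabilities."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def spiegelhalter_z(y_true, y_prob):
    """Spiegelhalter z-statistic and two-sided P-value for calibration.

    z = sum((y - p) * (1 - 2p)) / sqrt(sum((1 - 2p)^2 * p * (1 - p)))
    A non-significant result gives no evidence of miscalibration.
    """
    numerator = sum((y - p) * (1 - 2 * p) for y, p in zip(y_true, y_prob))
    variance = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in y_prob)
    if variance == 0:
        raise ValueError("statistic undefined: all predictions are 0, 0.5 or 1")
    z = numerator / math.sqrt(variance)
    # Two-sided P-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A lower Brier score indicates predictions closer to the observed outcomes; a constant prediction of 0.5 always scores 0.25, which is a useful reference point.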

Fig. 2 Calibration plot for external validation of the nomogram. Legend: Curves represent observed versus predicted probabilities, with the grey line representing an ideal model (i.e., observed and predicted probabilities overlapping) and the black and dotted lines representing the observed results. Brier: Brier score; C (ROC): C-index; Dxy: Somers’ D; S:p: P-value of the Spiegelhalter z-test; S:z: z-value of the Spiegelhalter z-test

We developed a ROC curve on the validation cohort to calculate specificity and sensitivity of the test to detect the presence of HER2 positivity at different probabilities, as given by the model (Fig. 3). At a cut-off probability of 0.2, the model had 82% sensitivity and 74% specificity to detect the presence of HER2 positivity.
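The sensitivity and specificity at a given cut-off can be reproduced from any set of predicted probabilities, as in the Python sketch below. The function name, the toy data and the convention that a case is called positive when its predicted probability is at or above the cut-off are assumptions made for illustration.

```python
def sensitivity_specificity(y_true, y_prob, cutoff):
    """Sensitivity and specificity when predictions >= cutoff are called positive."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= cutoff
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome classes are required")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (invented): 3 HER2 positive and 4 HER2 negative cases.
toy_status = [1, 1, 1, 0, 0, 0, 0]
toy_probs = [0.9, 0.3, 0.1, 0.6, 0.15, 0.1, 0.05]
```

On the toy data, a cut-off of 0.2 yields a sensitivity of 2/3 and a specificity of 0.75, whereas raising the cut-off to 0.7 lowers sensitivity to 1/3 and raises specificity to 1.0, showing the trade-off that a ROC curve traces.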

Fig. 3 ROC curve between probabilities of HER2 positivity as predicted by the nomogram and actual HER2 status in the validation cohort. Legend: The curve represents the sensitivities and the specificities observed in the validation cohort at different cut-off values of the probabilities predicted by the nomogram. Higher cut-off values correspond to lower sensitivity but higher specificity, and vice versa

Discussion

Defining HER2 status is crucial in the management of advanced GOC patients, as HER2 positive cases by IHC (and ISH, when required) may benefit from trastuzumab in combination with first-line chemotherapy.5,6,17 Missing HER2 positive tumours could therefore result in missed treatment opportunities, considering also that enrolment in clinical trials with novel anti-HER2 agents may be a suitable option. Therefore, in view of the relevance of this target (currently the only available validated predictive biomarker in GOC) and the challenges related to laboratory assessment of tumour samples in everyday practice,18 we developed and validated an easy-to-use and inexpensive nomogram, which predicts with high accuracy the chance of HER2 positivity before approved molecular diagnostics such as IHC and ISH are performed.11 Using a nomogram-derived cut-off probability of 0.2, the test has a sensitivity of 0.82 and a specificity of 0.74 to detect the presence of HER2 positivity.

The variables included in the nomogram had all been associated with HER2 positive status in the available literature: lower tumour grading,19 intestinal histology,19,20 adequacy of analysed samples,21 and site of collection. Regarding the site of sampling (primary tumour versus metastases), larger heterogeneity is reported across studies,22,23,24,25 ultimately confirming that specific clinico-pathologic features are not able to predict HER2 status when considered separately.10 In our series, tumour grading was the strongest predictor of HER2 status in both cohorts. Both pathologic material and site of sampling were included in the model, based on the evidence that biopsies are associated with a higher probability of HER2 positive status compared with surgical specimens in our series. Similar findings were also reported by the ToGA trial investigators,8 and may be explained by the different cut-off values used to define HER2 positivity. Another plausible explanation is that biopsies could be a surrogate variable, linked to deeper biological alterations associated with HER2 positivity. In addition, there could be a potential overlap between biopsies and metastatic lesions, which are more likely to be HER2 positive. Indeed, we found in the development set that a greater percentage of metastases are diagnosed by biopsy, introducing a slight collinearity in the model. However, the degree of such collinearity is small and unlikely to fully explain these findings. Our analyses also revealed a partial overlap between grading and Lauren’s subtype. However, the VIF for each parameter was always lower than 3, strongly suggesting that the model is not affected by relevant collinearity.

Adequate pathologic material for analysis is the key element of any quality-controlled laboratory procedure.11,12,21,26 Considering the heterogeneity in HER2 expression across tumour cells in GOC,14 a minimum of 5 biopsies is required according to literature data,12,13,21 with 6 to 8 specimens considered the optimal threshold for adequate and reliable HER2 assessment in GOC by available recommendations.11 We therefore set 6 as the reference number to categorise biopsy sample adequacy. Unfortunately, this recommendation has not been fully implemented in routine practice, and our study confirms that a significant percentage of biopsies used for HER2 status assessment should indeed have been considered inadequate for appropriate evaluation. In our opinion, this nomogram could prompt clinicians to perform tumour re-biopsy in cases that are initially HER2 negative at molecular diagnostics but have a high anticipated probability of HER2 positive status and insufficient tumour tissue for molecular analyses.26 As re-biopsy translates into increased risks for the patient and greater costs for health services, the tool we developed could be shared with all the specialties involved in the multidisciplinary management of GOC patients in order to raise awareness about the relevance of adequate sampling for optimal medical management.

Obviously, this nomogram cannot replace direct HER2 evaluation by IHC and ISH, according to approved diagnostics and guidelines.11,14 However, molecular tests are time consuming, may not be promptly accessible in all institutions and are subject to strict requirements with regard to the quality of the biologic samples to be analysed. Therefore, clinicians could be interested in predicting HER2 status in the single patient at first assessment, in order to immediately evaluate different treatment options or study proposals in patients with a higher probability of HER2 positive disease. This could also avoid delaying treatment initiation in patients with a very low pre-test probability of harbouring a HER2 positive tumour. These patients could benefit from alternative chemotherapy regimens compared to the cisplatin plus fluoropyrimidine combination used in ToGA5: reasonable options are represented by a docetaxel-containing triplet, in order to increase activity and potentially efficacy,27 or oxaliplatin-based doublets, in order to improve safety.28

To conclude, HER2 status might be accurately and rapidly predicted by a simple nomogram based on four validated clinico-pathologic parameters. This tool could be easily implemented in clinical practice during the first assessment of the patient to add information for prompt case management.