Introduction

Whenever metastatic carcinoma is suspected in organs such as lung, liver, bone, and brain in women, metastatic breast cancer is always part of the differential diagnosis, even in the absence of a documented clinical history; this is because breast cancer is the most common malignancy in women, and ~20–30% of all breast cancer cases become metastatic [1, 2]. Along with the patient’s tumor morphology and clinical history, a sensitive and specific tumor marker is essential for identifying the primary site of carcinoma.

Breast cancer can be categorized into four major clinical surrogate subtypes based on the presence or absence of specific molecular markers: estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). These subtypes include luminal A and luminal B types (ER/PR-positive; 70% of patients), HER2-positive type (15%), and triple-negative breast cancer (TNBC, 15%) which expresses none of the three molecular markers and typically overlaps with intrinsic basal-type breast cancer [3,4,5]. Among these breast cancer subtypes, TNBC is the most aggressive with a high relapse incidence and early metastasis, and the most undifferentiated phenotype [6]; moreover, metastatic poorly differentiated TNBC frequently has no significant tumor markers for its breast origin.

In current clinical practice, GATA binding protein 3 (GATA3), gross cystic disease fluid protein 15 (GCDFP-15), and mammaglobin are the commonly used breast-specific immunohistochemical (IHC) markers in the workup of tumors of unknown primary. Among the three breast cancer markers, GATA3 is most widely used for clinical diagnosis, not only because it is reliable and its nuclear staining is more easily read than the cytoplasmic staining of GCDFP-15 and mammaglobin, but also because it has better sensitivity than the other markers have in detecting breast carcinoma. Previous studies have shown that the sensitivity of GATA3 for all breast cancer ranges from 70 to 90%, which is higher than that of GCDFP-15 at 40–60% and of mammaglobin at 50–80%. However, the expression of all three markers was significantly lower in TNBC than in ER- or HER2-positive breast cancers. GATA3 expression in TNBC ranged from 15 to 60% using varied cutoffs, compared with more than 90% in ER-positive breast cancer; GCDFP-15 and mammaglobin had even lower sensitivity (10–30%) in TNBC [7,8,9,10,11,12].

To find a more sensitive marker for TNBC with expression comparable to that in ER- or HER2-positive breast cancer, we first analyzed the TCGA (The Cancer Genome Atlas) database and found that trichorhinophalangeal syndrome type 1 (TRPS1) is a potential marker for all types of breast cancer, including basal-type/TNBC. We then systematically evaluated TRPS1 expression in 479 cases of various types of breast cancer and compared it with GATA3 expression in the same cases. In addition, we examined the expression of TRPS1 in 1234 cases of other tumor types, including urothelial carcinoma, ovarian cancer, lung cancer, pancreatic and gastric adenocarcinomas, colon adenocarcinoma, salivary duct carcinoma, thyroid carcinoma, renal cell carcinoma, and melanoma.

Materials and methods

The Cancer Genome Atlas (TCGA) database analysis

TCGA gene expression data for breast cancer and 30 other tumor types were downloaded from the Broad GDAC website (https://gdac.broadinstitute.org/). Molecular subtypes of breast cancers were determined by consensus K-means clustering of the 5000 most variably expressed genes using GenePattern from the Broad Institute (https://www.genepattern.org/). This categorized breast cancer samples into four subtypes (luminal A, luminal B, HER2-positive, and basal type). We used the following filter settings to search for potential breast cancer marker genes. A gene had to meet all of these criteria: (1) the overall median mRNA expression level in breast cancers is greater than 4 (log2 TPM value) and at least twofold higher than the median expression in other tissues; (2) the median nonzero expression level in breast cancer is at least 1.5-fold higher than that in other tissues, where zero expression is defined as log2 TPM below −8.8; and (3) the lower bound of the top 80% of expression values in breast cancer is higher than the lower bound of the top 20% of expression values in other tumors, allowing at most two exceptions.
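The three filter criteria above can be sketched in code. The following is a minimal illustration only, not the pipeline used in the study: the function names, the data layout (lists of log2 TPM values per tumor type), the pooling of all non-breast samples for the median comparisons, and the reading of the third criterion as "the 20th percentile in breast cancer exceeds the 80th percentile in each other tumor type, with at most two tumor types as exceptions" are all assumptions made for the example.

```python
import math
from statistics import median

ZERO_THRESHOLD = -8.8  # log2 TPM below this value is treated as zero expression


def percentile(values, q):
    """Linearly interpolated q-th percentile (0-100) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def nonzero_median(values):
    """Median of the values that count as expressed (>= ZERO_THRESHOLD)."""
    expressed = [v for v in values if v >= ZERO_THRESHOLD]
    return median(expressed) if expressed else float("-inf")


def is_candidate_marker(breast, others, max_exceptions=2):
    """breast: log2 TPM values; others: dict of tumor type -> log2 TPM values."""
    pooled = [v for vals in others.values() for v in vals]  # assumption: pool all
    breast_median = median(breast)
    # Criterion 1: median > 4 and at least twofold (1 log2 unit) above other tissues.
    if not (breast_median > 4 and breast_median - median(pooled) >= 1.0):
        return False
    # Criterion 2: nonzero median at least 1.5-fold higher (log2(1.5) in log space).
    if nonzero_median(breast) - nonzero_median(pooled) < math.log2(1.5):
        return False
    # Criterion 3 (assumed reading): breast 20th percentile above the 80th
    # percentile of each other tumor type, allowing max_exceptions tumor types.
    breast_low = percentile(breast, 20)
    exceptions = sum(1 for vals in others.values() if percentile(vals, 80) >= breast_low)
    return exceptions <= max_exceptions
```

Because the values are on a log2 scale, fold changes become differences; a twofold difference corresponds to 1 log2 unit.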

Human tumor samples

Tissue microarrays (TMAs) of 479 cases of breast carcinoma, 199 cases of lung cancer, 251 of ovarian cancer, 42 of urothelial carcinoma, 173 of salivary duct carcinoma, 144 of pancreatic adenocarcinoma, 38 of gastric adenocarcinoma, 40 of melanoma, and 112 of renal cell carcinoma were made in the Department of Pathology and Translational Molecular Pathology at The University of Texas MD Anderson Cancer Center. All of these cases had been diagnosed previously at MD Anderson, and this study was approved by MD Anderson’s Institutional Review Board. One additional urothelial carcinoma TMA (BL805a, containing 73 cases), one thyroid carcinoma TMA (TH8010a, containing 70 cases), and one colon adenocarcinoma TMA (CO1021a, containing 92 cases) were purchased from US Biomax Inc (Rockville, MD).

The American Society of Clinical Oncology/College of American Pathologists guideline recommendations [13, 14] were used as a reference for categorizing ER, PR, and HER2 status of breast carcinoma as part of the routine pathologic evaluation. In the current study, breast cancer patients were categorized as follows on the basis of receptor status: positive for ER and PR but negative for HER2 (ER/PR-positive group); HER2-positive regardless of ER and PR status (HER2 group); and negative for ER, PR, and HER2 (TNBC group).

IHC analysis

IHC staining was performed with a rabbit polyclonal antibody against human TRPS1 (PA5-84874 from Invitrogen/ThermoFisher, Waltham, MA) or a mouse monoclonal antibody against human GATA3 (L50-823 from Cell Marque, Rocklin, CA) in a Leica Bond Max autostainer system (Leica Biosystems, Nussloch, Germany) according to standard automated protocols. Briefly, 4-µm-thick formalin-fixed paraffin-embedded tumor tissue sections of the different TMAs were deparaffinized and rehydrated according to the Leica Bond protocol. Antigen retrieval was performed with Bond Solution #1 (Leica Biosystems, equivalent to citrate buffer, pH 6.0) at 100 °C for 20 min. Primary antibody (TRPS1, or GATA3 with dilution 1:100) was incubated for 25 min at room temperature. The primary antibody was detected with the Bond Polymer Refine Detection kit (Leica Biosystems, Cat# DS9800) and diaminobenzidine (DAB) chromogen. DAB enhancer (Leica Biosystems, Cat# AR9432) was applied to the slide for 5 min after DAB incubation and counterstaining with hematoxylin. Breast tissue samples were used as positive (incubated with primary antibody) and negative (incubated with antibody diluent) controls.

TRPS1 and GATA3 immunostains in breast carcinoma cases were reviewed by pathologists QD, DA, FY, LH, HC, and AS; TRPS1 expression in all other cancer cases was reviewed by pathologists QD, DA, and AS. Difficult and discrepant cases were resolved by discussion and review by at least two pathologists. For TRPS1 and GATA3, only nuclear staining was counted as positive; rare cases with membranous expression were counted as negative. Immunoreactivity scores were calculated by multiplying a number representing the percentage of immunoreactive cells (0, <1%; 1, 1–10%; 2, 11–50%; 3, 51–100%) by a number representing staining intensity (0, negative; 1, weak; 2, moderate; 3, strong). The immunoreactivity scores were considered negative (0–1), low positive (2), intermediate positive (3–4), or high positive (6 and 9) for TRPS1 and GATA3 expression.
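As an illustration of the scoring scheme described above, the sketch below computes an immunoreactivity score from a percentage of immunoreactive cells and an intensity grade, then maps it to the four categories. The function names are hypothetical, and the handling of percentages that fall between the stated bins (for example 10.5%) is an assumption, since the text defines the bins only at whole percentages.

```python
def percentage_score(percent_positive):
    """Map percentage of immunoreactive cells to 0-3 (0: <1%, 1: 1-10%, 2: 11-50%, 3: 51-100%)."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive < 1:
        return 0
    if percent_positive <= 10:
        return 1
    if percent_positive <= 50:  # assumption: values between 10 and 11 fall here
        return 2
    return 3


def immunoreactivity_score(percent_positive, intensity):
    """Multiply the percentage score by intensity (0 negative, 1 weak, 2 moderate, 3 strong)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return percentage_score(percent_positive) * intensity


def classify(score):
    """Map a score to the categories used for TRPS1 and GATA3."""
    if score in (0, 1):
        return "negative"
    if score == 2:
        return "low positive"
    if score in (3, 4):
        return "intermediate positive"
    if score in (6, 9):
        return "high positive"
    raise ValueError("score is not a valid product of two 0-3 values")
```

For example, a case with 30% of nuclei staining at moderate intensity scores 2 × 2 = 4 and is classified as intermediate positive. Because the score is a product of two values from 0 to 3, only 0, 1, 2, 3, 4, 6, and 9 can occur.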

Results

TRPS1 and GATA3 mRNA levels in breast cancer and 30 other tumor types

To identify potential breast-specific tumor markers, we systematically analyzed mRNA expression data in 31 types of solid tumors using the TCGA database. As shown in Fig. 1, among the 31 types of solid tumors, a high mRNA level of TRPS1 was observed in breast carcinoma only; moreover, TRPS1 mRNA was equally high in all four subtypes of breast carcinoma, including ER/PR-positive luminal A and B types, HER2-positive, and basal types (which are predominantly TNBC). In contrast, a high mRNA level of GATA3 was found in luminal A/B and HER2-positive types but not in basal-type breast cancer; in addition, a high mRNA level of GATA3 was observed in urothelial carcinoma. The high mRNA levels of GATA3 in specific subtypes of breast cancer and in urothelial carcinoma are consistent with its reported high protein level in these tumors.

Fig. 1: mRNA levels of TRPS1 and GATA3 in four subtypes of breast cancer and 30 other tumor types.

Breast carcinomas (BRCA) are ordered as basal, HER2+, luminal B, and luminal A (red to dark green), and the purple line is the top 80% expression in BRCA. Analyzed TCGA tumor types include ACC adrenocortical carcinoma, BLCA urothelial carcinoma, BRCA breast invasive carcinoma, CESC cervical squamous cell carcinoma and endocervical adenocarcinoma, CHOL cholangiocarcinoma, COAD colon adenocarcinoma, ESCA esophageal carcinoma, GBM glioblastoma, HNSC head-neck squamous cell carcinoma, KICH chromophobe renal cell carcinoma, KIRC clear cell renal cell carcinoma, KIRP renal papillary carcinoma, LGG low-grade glioma, LIHC hepatocellular carcinoma, LUAD lung adenocarcinoma, LUSC lung squamous cell carcinoma, MESO mesothelioma, OV ovarian serous cystadenocarcinoma, PAAD pancreatic adenocarcinoma, PCPG pheochromocytoma, PRAD prostate adenocarcinoma, READ rectal adenocarcinoma, SARC sarcoma, SKCM skin cutaneous melanoma, STAD stomach adenocarcinoma, TGCT testicular germ cell tumor, THCA thyroid carcinoma, THYM thymoma, UCEC uterine corpus endometrial carcinoma, UCS uterine carcinosarcoma, UVM uveal melanoma.

TRPS1 and GATA3 expression in breast cancers

We next assessed TRPS1 and GATA3 protein levels in breast cancer by IHC staining and found that TRPS1 was positive in 91% of 479 examined breast cancers and GATA3 was positive in 69% of 476 examined breast cancers. TRPS1 and GATA3 had comparably high expression rates and staining strength in ER/PR-positive breast cancers. TRPS1 was positive in 98% of 176 ER/PR-positive breast cancers, with intermediate or high expression in 95% of cases. GATA3 was positive in 96% of the ER/PR-positive breast cancers, with intermediate or high expression in 91% of cases (Table 1, Fig. 2). Almost all GATA3-positive cases were also positive for TRPS1, except for one case that was positive for GATA3 but negative for TRPS1.

Table 1 TRPS1 and GATA3 expression in breast cancers.
Fig. 2: TRPS1 and GATA3 expression in representative ER/PR+ breast cancer cases.

Case 1 shows a low-grade breast carcinoma with high expression of both TRPS1 and GATA3. Case 2 shows a high-grade breast carcinoma with high expression of TRPS1 and low expression of GATA3.

TRPS1 and GATA3 were also highly expressed in HER2-positive breast cancers, with comparable expression rates and strength, although levels were slightly lower than in ER/PR-positive breast cancer. TRPS1 was expressed in 87% of 67 HER2-positive breast cancers, with intermediate or high expression in 79%. GATA3 was detected in 88% of HER2-positive breast cancers, with intermediate or high expression in 77% (Table 1, Fig. 3). Three cases were negative for GATA3 but intermediate to high positive for TRPS1, and two cases were negative for TRPS1 but intermediate to high positive for GATA3.

Fig. 3: TRPS1 and GATA3 expression in representative HER2+ breast cancer cases.

Case 1 shows an invasive ductal carcinoma with high expression of both TRPS1 and GATA3. Case 2 shows an invasive carcinoma with high expression of TRPS1 and negative GATA3.

Importantly, TRPS1 expression in both metaplastic and nonmetaplastic TNBC was significantly higher than GATA3 expression (p < 0.001). GATA3 was expressed in only 51% of nonmetaplastic TNBC and in 21% of metaplastic TNBC, with the majority of the latter having low expression (7/11, 64%). In contrast, TRPS1 maintained a considerably high expression rate and expression strength in both metaplastic and nonmetaplastic TNBC, with 86% positivity in each group; intermediate and high expression were seen in 81% of all positive cases (Table 1). In nonmetaplastic TNBC, 69 cases had negative GATA3 but intermediate or high TRPS1, and only 6 cases had negative TRPS1 but intermediate or high GATA3 (Fig. 4). In metaplastic TNBC, no cases had negative TRPS1 and positive GATA3, but 31 cases had intermediate or high expression of TRPS1 and negative GATA3 (Fig. 5).

Fig. 4: TRPS1 and GATA3 expression in representative nonmetaplastic TNBC cases.

Case 1 shows a poorly differentiated carcinoma with high expression of TRPS1 and intermediate to high expression of GATA3. Case 2 shows a poorly differentiated carcinoma with high expression of TRPS1 and negative GATA3.

Fig. 5: TRPS1 and GATA3 expression in representative metaplastic breast cancer cases.

Case 1 shows a metaplastic carcinoma with chondroid differentiation with high expression of TRPS1 and negative GATA3. Case 2 shows the giant/polymorphic nuclei of a metaplastic carcinoma with high expression of TRPS1 and negative GATA3. Case 3 shows a high-grade spindle/sarcomatous metaplastic carcinoma with high expression of TRPS1 and negative GATA3. Case 4 shows a metaplastic squamous cell carcinoma with high expression of TRPS1 and low expression of GATA3.

TRPS1 expression in other tumor types

TRPS1 expression in urothelial carcinoma is a rare event. Of 115 urothelial carcinoma cases, only two had low/faint nuclear expression of TRPS1 (Table 2, Fig. 6a).

Table 2 TRPS1 expression in malignancies of multiple organs.
Fig. 6: TRPS1 expression in other tumor types.

a Low expression in one urothelial carcinoma. b Low expression in one lung adenocarcinoma. c Intermediate expression in one ovarian serous carcinoma. d Intermediate expression in one salivary duct carcinoma. e Low expression in one pancreatic adenocarcinoma. f Low expression in one melanoma.

TRPS1 was weakly expressed in a small percentage of lung and ovarian carcinoma cases. TRPS1 positivity was rare in lung adenocarcinoma, with only three positive cases (1 intermediate and 2 low) among 122 cases (2.5%) (Table 2, Fig. 6b). However, a considerable proportion of lung squamous cell carcinoma cases were positive for TRPS1: 19 of 77 cases (25%), of which 4 had intermediate to high and 15 had low expression. Similarly, TRPS1 was positive in 23 (6 intermediate to high and 17 low, 14%) of 165 cases of ovarian serous carcinoma, and in 7 (3 intermediate to high and 4 low, 8%) of 86 cases of ovarian non-serous carcinoma (Table 2, Fig. 6c).

Among the non-mammary adenocarcinomas, salivary duct carcinoma had relatively high expression of TRPS1: it was expressed in 41 (23 intermediate to high and 18 low, 24%) of 173 cases of salivary duct carcinoma (Table 2, Fig. 6d).

TRPS1 positivity was extremely rare in the other tumor types, which included 496 cases; only one pancreatic adenocarcinoma and one melanoma showed weak nuclear staining (Table 2, Fig. 6e, f). Colon and gastric adenocarcinomas, renal cell carcinomas, and thyroid carcinomas were all negative for TRPS1 expression (Table 2).
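The positivity rates quoted in this section follow directly from the case counts. The short sketch below recomputes them from the (positive, total) pairs reported in the text; the dictionary layout and function name are illustrative assumptions, and the counts are copied from the paragraphs above rather than from Table 2 itself.

```python
# (TRPS1-positive cases, total cases) as reported in the text above
TRPS1_COUNTS = {
    "urothelial carcinoma": (2, 115),
    "lung adenocarcinoma": (3, 122),
    "lung squamous cell carcinoma": (19, 77),
    "ovarian serous carcinoma": (23, 165),
    "ovarian non-serous carcinoma": (7, 86),
    "salivary duct carcinoma": (41, 173),
    "pancreatic adenocarcinoma": (1, 144),
    "melanoma": (1, 40),
}


def positivity_rate(positive, total):
    """Return the percentage of positive cases."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * positive / total


for tumor, (positive, total) in TRPS1_COUNTS.items():
    print(f"{tumor}: {positive}/{total} = {positivity_rate(positive, total):.1f}%")
```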

Discussion

The original GATA transcription factor family includes six members, GATA1 to GATA6, which play an essential role in heart development and in erythrocyte and lymphocyte differentiation. In addition, GATA3 is the most abundant transcription factor in luminal epithelial cells and is considered a “master regulator,” cooperating with ERs in the development of the normal mammary gland and in breast ductal epithelial differentiation [15,16,17,18,19]. TRPS1, named for its association with the autosomal dominant genetic disorder trichorhinophalangeal syndrome type 1, has been found to be a critical modulator of mesenchymal-to-epithelial transition during the development and differentiation of several types of tissue, including cartilage, bone, kidney, and hair follicle. Recently, TRPS1 was identified as a novel GATA transcription factor that functions as an essential regulator of growth and differentiation of normal mammary epithelial cells and is possibly involved in the development of breast cancer [20,21,22,23,24]. A microarray analysis identified a panel of 15 upregulated genes in 54 breast cancer mRNA samples, which included TRPS1, and a limited study showed that TRPS1 is overexpressed in breast carcinoma samples [25,26,27]. However, a systematic study of TRPS1 expression in the four subtypes of breast cancer, especially in metaplastic and nonmetaplastic TNBC, has not been performed.

Because the currently used breast markers GATA3, GCDFP-15, and mammaglobin have relatively good sensitivity only in ER-positive breast cancer and not in TNBC, the primary focus of the current study was to find a reliable marker for all types of breast cancer, especially TNBC. Through data mining, we found that TRPS1 is a potential target, as it is highly expressed in breast cancer only and shows equally high expression in all four subtypes of breast carcinoma. We first tested two commercial TRPS1 antibodies from Invitrogen/Thermo Fisher Scientific: PA5-36002 and PA5-84874. Both exhibited excellent immunoreactivity in staining nuclear TRPS1, but PA5-84874 showed a cleaner background under the same conditions. Thus, we used the PA5-84874 antibody for IHC staining in all TMAs.

More than 60 of the 479 cases in the breast carcinoma TMAs contained benign breast ducts. TRPS1 and GATA3 were expressed in almost all benign breast ducts with various strengths of nuclear staining, and both were expressed in ductal epithelial cells but not in myoepithelial cells (Fig. 7). In the current study, TRPS1 and GATA3 exhibited extremely high sensitivity in ER/PR-positive breast cancer and excellent, although slightly lower, sensitivity in HER2-positive breast carcinoma. Furthermore, in most cases TRPS1 and GATA3 were expressed extensively and with strong intensity in duplicate or triplicate TMA spots, which demonstrates that both are excellent and reliable molecular markers for detecting ER- and HER2-positive breast carcinomas. In addition, we noticed rare cases with membranous expression of TRPS1 in HER2-positive breast cancers, which was also observed in rare cases of urothelial carcinoma, lung adenocarcinoma, and salivary duct carcinoma; we will further investigate this abnormal expression.

Fig. 7: TRPS1 and GATA3 expression in benign breast ducts.

TRPS1 and GATA3 are expressed in the ductal epithelial cells of benign breast ducts.

Consistent with previous studies, only 21% of metaplastic and 51% of nonmetaplastic TNBC cases expressed GATA3, and weak positivity accounted for a large portion of the total positivity. This indicates that GATA3 is not an ideal marker for TNBC, as GATA3 is lost when tumors become more high-grade or develop a metaplastic phenotype. In contrast, TRPS1 maintained high sensitivity in both metaplastic and nonmetaplastic TNBC, significantly higher than that of GATA3 (86% vs. 51% in nonmetaplastic, and 86% vs. 21% in metaplastic TNBC). In particular, high/strong TRPS1 expression was observed in most high-grade metaplastic carcinomas, including high-grade spindle/sarcomatous, squamous, and polymorphic/giant cell carcinomas and carcinoma with heterologous mesenchymal differentiation. Furthermore, TRPS1 was highly/strongly expressed in most positive cases: only 11 of 203 positive TNBC cases showed low/weak expression, and almost all GATA3-positive TNBC cases were TRPS1-positive. All of these data suggest that TRPS1 is an excellent marker for TNBC and can help confirm the breast origin of metastases with a smaller IHC workup on limited biopsy tissue, especially when the primary tumor is not available for comparison. This also addresses the current need in pathology to preserve tissue for next-generation sequencing and for companion diagnostics for immune checkpoint inhibitors, which are required for personalized medicine or clinical trial enrollment.

GATA3 is not only the most widely used marker in breast cancer but also the most sensitive and specific marker for urothelial carcinoma. It is well documented that GATA3 is expressed in 70–100% of all urothelial carcinomas [11, 28, 29]. In the current study, we used two urothelial carcinoma TMAs: one generated from urothelial carcinoma patient samples at MD Anderson Cancer Center and one commercial TMA from US Biomax, Inc. Only 2 of 115 urothelial carcinoma cases showed weak expression of TRPS1, indicating that TRPS1 can be used to differentiate breast cancer from urothelial carcinoma. Although this clinical problem is rare, it does occur, for example in a GATA3-positive metastatic carcinoma in the lung of a patient with a history of both breast and bladder cancers.

Differentiating lung adenocarcinoma from breast carcinoma is one of the problems most commonly encountered by surgical pathologists. A newly identified, poorly differentiated adenocarcinoma in the lung of a female patient with a history of breast cancer always raises the question of whether it is a primary lung cancer or metastatic breast cancer, especially when tumor-specific markers such as GATA3, TTF-1, and ER are negative; this is because TTF-1-negative lung adenocarcinomas and GATA3-negative breast cancers have been frequently reported, and poorly differentiated adenocarcinomas of various organs can be morphologically similar. In the current study, TRPS1 was expressed in 3 (2 low and 1 intermediate) of 122 lung adenocarcinomas (2.5%), indicating that TRPS1 can be used to differentiate lung adenocarcinoma from breast cancer.

High-grade ovarian serous carcinoma is the most common non-mammary metastasis to the breast and axilla [30] and is the one most frequently misdiagnosed as a primary breast carcinoma because of overlapping morphology and immunophenotype. In addition, when working up a carcinoma of unknown origin with papillary architecture and a CK7+, ER+ immunoprofile, both a breast primary and a gynecologic primary, especially serous carcinoma, should be considered. In this study, TRPS1 was expressed in 14% of ovarian serous carcinomas, although in most of these cases (17 of 23, 74%) expression was weak. Recently, the gynecologic tumor marker PAX8 has been reported to be expressed in some high-grade metastatic breast cancers [31], and TRPS1 by itself cannot completely differentiate breast carcinoma from serous carcinoma; therefore, differentiating these carcinomas effectively requires the joint application of TRPS1 and PAX8.

Salivary duct carcinoma and pancreatic adenocarcinoma are the non-mammary adenocarcinomas most commonly positive for GATA3, with positivity in 43–100% of salivary duct carcinomas and 37% of pancreatic adenocarcinomas [11, 32]. In the current study, TRPS1 expression in pancreatic adenocarcinoma was extremely rare, with only 1 weakly positive case among 144 cases. TRPS1 was detected in 24% (41 of 173) of salivary duct carcinomas. Although this positive percentage was not high, a considerable portion of salivary duct carcinomas (13%, 23 of 173) expressed intermediate to high TRPS1, indicating that TRPS1 may not be reliable for differentiating breast carcinoma from salivary duct carcinoma.

Previous studies have shown that GATA3 is expressed in >50% of chromophobe renal cell carcinoma cases and in 5–10% of thyroid carcinoma cases [11]. In this study, no TRPS1-positive cases were identified among the thyroid and renal cell carcinomas, including 38 chromophobe renal cell carcinoma cases. In addition, gastrointestinal adenocarcinomas, including colon and gastric adenocarcinomas, did not express TRPS1.

In summary, through data mining and IHC staining of a large number of breast carcinomas and various other types of solid tumors, TRPS1 was found to be a highly specific and sensitive marker for all types of breast carcinoma, especially TNBC.