Protein co-expression networks identified from HOT lesions of ER+HER2–Ki-67high luminal breast carcinomas

Patients with estrogen receptor-positive/human epidermal growth factor receptor 2-negative/Ki-67-high (ER+HER2–Ki-67high) luminal breast cancer have a worse prognosis and do not respond to hormonal treatment and chemotherapy. This study sought to identify disease-related protein networks significantly associated with this subtype, by assessing in-depth proteomes of 10 lesions of high and low Ki-67 values (HOT, five; COLD, five) microdissected from the five tumors. Weighted correlation network analysis screened by over-representative analysis identified the five modules significantly associated with the HOT lesions. Pathway enrichment analysis, together with causal network analysis, revealed pathways of ribosome-associated quality controls, heat shock response by oxidative stress and hypoxia, angiogenesis, and oxidative phosphorylation. A semi-quantitative correlation of key-protein expressions, protein co-regulation analysis, and multivariate correlation analysis suggested co-regulations via network-network interaction among the four HOT-characteristic modules. Predicted highly activated master and upstream regulators were most characteristic to ER-positive breast cancer and associated with oncogenic transformation, as well as resistance to chemotherapy and endocrine therapy. Interestingly, inhibited intervention causal networks of numerous chemical inhibitors were predicted within the top 10 lists for the WM2 and WM5 modules, suggesting involvement of potential therapeutic targets in those data-driven networks. Our findings may help develop therapeutic strategies to benefit patients.


Results
Proteome datasets of ER+HER2-Ki-67high luminal breast carcinomas. MS-based proteomic analysis was performed using 10 microdissected lesions comprising five HOT and five COLD spots obtained from the five FFPE tissue specimens of this subtype. These specimens were selected for their preserved condition, tumor area, and well-clarified pathological diagnosis and molecular marker status (ER+HER2-Ki-67high (> 80%)) ( Table 1). Pre-surgical treatment was not performed in any of the cases.
A total of 1,862 proteins were significantly identified from all the HOT and COLD lesions; of these, approximately 65% were commonly expressed (Fig. S3a). The proportion of proteins unique to HOT lesions was approximately 26%, whereas that of proteins expressed in the COLD lesions was approximately 10%. Overall, 476 and 180 proteins were characteristic of the HOT and COLD lesions, respectively. Gene ontology (GO) analysis was performed on the GO Biological process, GO Molecular Function, and Protein class using the Protein Analysis THrough Evolutionary Relationships (PANTHER) version 14.1 software program (Paul D. Thomas, University of Southern California, Los Angeles, CA, USA) 14 . Results of the GO analysis were similar between the HOT and COLD groups (Fig. S3b). Proteins were abundantly associated with cellular process, localization, cellular component organization or biogenesis, biological regulation, metabolic process and response to stimulus, in biological process (GO); binding and catalytic activity in molecular function (GO); cytoskeletal protein, transporter, nucleic acid-binding protein, protein-binding activity modulator, translational protein, metabolite interconversion enzyme, protein modifying enzyme, membrane traffic protein, and chaperone in protein class (GO). The fold changes and relative abundances of all identified proteins for both HOT and COLD lesions are visualized in Fig. S4. Table 1. Clinicopathological information of patients with luminal B breast cancer. CT computed tomography; HER2 human epidermal growth factor receptor 2; PgR progesterone receptor; TNM tumor-node-metastasis. *This is a case corresponding to so-called occult cancer. That is, the primary lesion was small and the metastatic lesion was found first. pT1aN2aM0: High malignant type of Grade 3, Ki-67 high in stage IIIA. Lymph node metastases were first discovered. The surgical specimen was n (5/19), but the size of the invasive cancer (infiltration diameter) was only 2 mm and 1 mm. The distance between the two was 3 mm, but lymphatic invasion was remarkable. www.nature.com/scientificreports/ Identification of data-driven key protein networks by WGCNA. Co-expression protein networks are defined as undirected, weighted networks. Modules are clusters of highly interconnected proteins, which correspond to clusters of proteins with high absolute or positive correlations 8 . We identified 28 protein modules by constructing a weighted correlation network and clustered all the identified proteins ( Fig. 1). A soft threshold power of 10 was selected to define the adjacency matrix according to the criteria of approximate scale-free topology, with a minimum module size of 5 and a module detection sensitivity deepSplit of 4. The correlations between resultant modules and two clinical traits (HOT and COLD) were determined to identify protein modules with upregulated or downregulated expression in the HOT or COLD lesion samples. We could not identify modules which were significantly correlated with clinical traits (i.e., |r|> 0.5 and p < 0.05) throughout the 28 modules identified (Fig. S5). However, several WGCNA modules exhibited moderate correlations characteristically with the clinical trait − HOT (|r|> 0.4), which includes WM2 (light green), WM5 (cyan), WM6 (tan), and WM20 (dark grey). The WM5 (cyan) module indicated the highest correlation and significance of its member proteins within the module membership versus gene significance (Corr. = 0.96, p = 7.4 × 10 −30 ) (Fig. S5).
Statistical ORA. Trait correlation analysis between eigen components of WGCNA modules and clinical traits often tends to overlook important modules for investigating disease-related molecular networks. Regarding clinical traits which are quite close, such as HOT and COLD in this study, a difficulty in trait-module relationship would be encountered to attain a high significance in the identification of key WGCNA modules. Statistical ORA 13 would assist in evaluating potential key WGCNA modules with identified proteins uniquely expressed and upregulated in the HOT trait. Notably, 476 and 180 proteins were uniquely expressed in the HOT and COLD lesions, respectively; while 297 and 136 proteins were differentially upregulated, respectively, with |R SC |> 1 (higher than twice fold change) and p < 0.05 in G-statistics (Figs. S3a and S4). The overlaps between the WGCNA-derived protein modules and identification-based significantly expressed proteins were subsequently assessed using the over-representation test. We confirmed five important WGCNA modules that showed signifi-  www.nature.com/scientificreports/ cant overlap (maximum q-value among the groups < 0.05) with protein groups uniquely and/or highly upregulated in the HOT trait (Fig. 2a).
Functional enrichment analysis of WGCNA modules screened by ORA. Our WGCNA analysis followed by ORA-based screening identified the five protein modules significant to proteins expressed uniquely and upregulated in the HOT lesions: WM2 (light green), WM5 (cyan), WM6 (tan), WM11 (red), and WM20 (dark grey) (Fig. 2b). The biological connectivity among the proteins in each module was analyzed by mapping the module proteins in the human protein-protein interaction (PPI) network, and among the biological pathways by pathway enrichment analysis 15 . Figure 3 shows data-driven protein co-expression networks; hub proteins for respective modules identified using the cytoHubba plugin 16 and eigen-proteins are indicated in red and blue dashed circles, respectively, and pathway enrichment results identified for the HOT trait are shown. Similarly, data-driven protein co-expression networks and pathways enriched for the COLD trait are presented in Fig. S6. The enriched pathways of the WM2 (light green) module included mRNA metabolic process, signal-recognition particle-dependent co-translational protein targeting to membrane, nuclear-transcribed mRNA catabolic process, nonsense-mediated decay, ribonucleoprotein complex, and ribosome (Fig. 3b). The hub protein guanine nucleotide-binding protein subunit beta-2-like 1 (GNB2L1) is known as the receptor of activated protein C kinase 1 (RACK1), which is a component of the 40S ribosomal subunit involved in translational repression and Overlapping proteins unique to the clinical traits and/or upregulated under the HOT or COLD traits, and those from the weighted gene co-expression network analysis (WGCNA). (a) Results of identified proteins and spectral counting-based semi-quantitative comparison. Each row represents results for each protein group. The red and pink cells in the HOT and COLD columns indicate that the proteins in the group are uniquely expressed and significantly upregulated, respectively (upregulated with |Rsc|> 1 (HOT > COLD or HOT < COLD). The fourth column shows the number of proteins in each protein group. The fifth column provides notes for each protein group. The WGCNA modules with significant overlap with each protein group are listed in the sixth column ('Modules' column). (b) Overlap in proteins between the groups according to the protein expression profiles and the modules by WGCNA. Each row in the embedded table represents the overlap analysis results for each module. The first and second columns in the table represent module ID and color name of the module, respectively. The third column represents the number of proteins in each module. The fourth, fifth, sixth, and seventh columns indicate the q values for overlap in proteins between a module by WGCNA and the four protein groups. In the five columns, significant q-values are highlighted in red. The eighth column represents the value of the most significant q-value (max q-value) in each module. The nine modules with max q-values < 0.05 are listed in order. www.nature.com/scientificreports/ initiation of the ribosome quality control 17 . The module member protein ELAVL1, the RNA-binding protein HuR, is also involved in adenylate-uridylate-rich element-mediated mRNA decay. A breast cancer study using IHC of 160 primary breast carcinomas revealed that higher RACK1 expression was correlated with shorter overall survival, and suggested a correlation between RACK1 and the commonly used clinicopathological biomarkers (e.g., Ki-67 and ER) in breast cancer 18 . The enriched pathways of the WM5 (cyan) module involved peptide biosynthetic process, translation, organonitrogen compound metabolic process, and translational initiation (Fig. 3b). The hub protein elongation factor 1-alpha 1 (eEF1α1) encoded by EEF1A1 belongs to the translation factor-related (TRAFAC) class translation factor GTPase superfamily. It strongly promotes the heat shock response, which protects cancer cells from proteotoxic stress, such as oxidative stress and hypoxia 19 . EEF1A1 mRNA levels are downregulated in most breast cancers, and this low expression has been associated with poor prognosis for patients with ER-positive breast cancer 20 . In contrast, the protein eEF1α1 was overexpressed in both ER and progesterone receptor-positive and lymph node-negative ductal breast carcinomas, where it should be noted that mRNA expression and protein expression of EEF1A1 are opposite 20 .
The enriched pathways of WM6 (tan) included intracellular transport, organonitrogen compound biosynthetic process, membrane organization, and metabolism of amino acids and derivatives (Fig. 3b). The subnetwork 3-1 (Fig. 3a) is involved in the oxidation-reduction process, detoxification of reactive oxygen species (ROS), and tumor protein p53 (TP53) regulates metabolic genes, in which thioredoxin reductase 1 (TXNRD1) and peroxiredoxin 2 (PRDX2) are the main members in the network cluster of redox-active center and selenocysteine. TXNRD1, the top hub protein in this module, protects cancerous cells against oxidative stress by reducing thioredoxin 1 (TXN), which reduces oxidized cysteines in cellular proteins and scavenges peroxides by PRDX 21 . TXNRD1 is upregulated in many tumors, such as oral squamous cell carcinoma 22 , lung cancer 23 , and breast cancer 24 . The enzymatic TXN/TXNRD1 system involves a key mechanism governing S-nitrosothiol homeostasis. However, the ER-positive status of breast tumors has been associated with significantly lower levels of TXNDR1 www.nature.com/scientificreports/ protein expression 24 . Indeed, we also observed the low expression of TXNDR1 compared with TXN in this study, which may be relevant to the HER2-negative breast carcinomas. High TXN expression enhances angiogenesis through upregulation of protein expression of hypoxia-inducible factor 1alpha (HIF-1α) and vascular endothelial growth factor (VEGF). It also inhibits apoptosis by binding to the pro-apoptotic proteins, apoptosis signalregulating kinase 1 (ASK1) and phosphatase and tensin homolog (PTEN), which promote cancer cell growth 21 .
It was suggested that an impairment of the TXN/TXNRD1enzymatic system is associated with the development of resistance against hormonal therapies. The enriched pathways of the WM11 (red) module included macromolecule localization, interspecies interaction between organisms, respiratory electron transport, adenosine triphosphate synthesis by chemiosmotic coupling, and heat production by uncoupling proteins, and mRNA Splicing-Major Pathway (Fig. 3b). The subnetworks 4-1 and 4-2 are involved in oxidative phosphorylation and mRNA Splicing-Major Pathway, respectively (Fig. 3a). The hub protein cyclin-dependent kinase 1 (CDK1) plays a key role in regulating cell cycle progression, and is a potent therapeutic target for inhibitors in the treatment of cancer. MYC has been an attractive therapeutic target for the treatment of breast cancer, whereas a direct inhibition of MYC remains challenging. Kang et al. recently proposed that targeting of CDK1, but not CDK4/6 or CDK2, could be selectively lethal to MYC-dependent human breast cancer cells 25 . CDK1 is currently considered the best CDK target for breast cancer therapy 26 .
The WM20 (dark grey) module was enriched with pathways including the establishment of protein localization to organelle, protein targeting, and extracellular exosome (Fig. 3b). The top hub proteins in this module included heat shock protein (HSP) HSP90-alpha (HSP90AA1), T-complex protein 1 subunit alpha (TCP1/ CCT1), and T-complex protein 1 subunit zeta (CCT6A), which also belong to HSPs. HSPs play a crucial role as molecular chaperones under carcinogenic stress conditions. Overexpression of HSP90AA1, TCP1/CCT1, and CCT6A results in poor survival for patients with breast cancer 27, 28 .
Master and upstream regulators predicted by ingenuity pathway analysis (IPA). We conducted an analysis of causal networks and upstream regulators for the identified modules, using the IPA (http://www. ingen uity.com) software 31 . Table 2 lists in brief the top master regulators predicted to be activated or inhibited (|z-value|> 2.0) for the WGCNA modules associated with the HOT lesions. The top 10 master regulators with high values in activation or inhibition score (z-score) in causal networks significantly associated with the HOT lesions (WM2, WM5, WM6, WM11, and WM20) and COLD lesions (WM7, WM12, WM18, and WM24), are presented in Tables S1 and S2, together with their participating regulators and target molecules in the datasets.
MAPK-interacting serine/threonine-protein kinases 1 and 2 (MNK1/2), encoded by MNK1/2 (MKNK1/2), were highly activated. MNK1/2 interplays by modulating oncogene eukaryotic initiation factor 4E (EIF4E) between the two major signaling pathways, Ras/MNK and PI3K/AKT/mechanistic/mammalian target of rapamycin (PI3K/AKT/mTOR), which are important in tumorigenesis, oncogenic transformation and progression, and chemoresistance 32,33 . The PI3K/Akt/mTOR pathway is frequently activated in breast cancer; of note, PIK3CA is the most common mutation in ER-positive breast cancer. F2R, also termed proteinase-activated receptor 1 (PAR1), encodes a seven-transmembrane G-protein-coupled receptor (GPCR) family member. It was reported that F2R/PAR1, as a direct transcriptional target of Twist, enhances the tumorigenic and metastatic capacity of breast cancer cells by suppressing the Hippo pathway and activating epithelial-mesenchymal transition 34 . FGFR2 is a tyrosine-protein kinase that acts as a cell-surface receptor for fibroblast growth factors, playing a crucial role in the regulation of cell proliferation, differentiation, migration, apoptosis, etc. The FGFR2 locus was reported as the top hit with the risk variants identified repeatedly in genome-wide association studies for ER-positive breast cancer 35 . Genome-wide association studies analyzing 4,398 cases with familial breast cancer causes and 4,316 controls identified five single-nucleotide polymorphisms of FGFR2 significantly associated with breast cancer 36 . SRPK1 is a serine/arginine-rich protein-specific kinase and one of the core splicing factors 37 ; it is highly expressed in more aggressive basal breast cancer. High expression levels of SRPK1 correlated with low metastasis-free survival in patients with ER-positive, but not ER-negative, breast cancer 38 . NOX4 is a ROS-producing NOX protein which has an oncogenic function and produces ROS in the mitochondria 39 . AATF, also termed CHE-1, is involved in transcriptional regulation, cell cycle control, DNA damage responses, and cell death 40 . SLC2A4, is also termed glucose transporter type 4, insulin-responsive (GLUT4). High activation of the GLUT4 causal network most likely indicates its estrogen receptor 1 (ESR1)-mediated enhancement 41 . In addition, it may be involved in metabolic reprogramming to oxidative phosphorylation under hypoxia, which increased the activity of the mitochondrial www.nature.com/scientificreports/ oxidation of pyruvate 42 . Both BMP1 receptor and ACVRL1 are members of the BMP receptors, which may be important in breast cancer promotion and progression. BMP1 receptor comprises BMPR1A and BMPR1B, which belong to transmembrane serine/threonine kinases, the cytokine growth factor transforming growth factor beta (TGFβ) family. Overexpression of BMP receptors exhibits tumor-promoting phenotypes with increased invasion and metastasis, whereas BMP receptors act mostly as tumor suppressors 43 . ACVRL1 encodes activin receptor-like kinase 1 (ALK1), which is also a type I receptor of the TGFβ superfamily. Since inadequate efficacy of VEGF-targeted agents (including bevacizumab) has been realized, ALK1 has attracted considerable attention  www.nature.com/scientificreports/ as an alternative pathway for the regulation of neo-angiogenesis 44,45 . PAK2 encodes serine/threonine-protein kinase PAK2, which is involved in various signaling pathways, including cytoskeleton regulation, cell motility, cell cycle progression, apoptosis, or proliferation. Anti-estrogen resistance in ER-positive breast cancer is associated with activated insulin-like growth factor 1 receptor (IGF1R). Zhang et al. performed a kinome siRNA screening study that identified 10 regulators of IGF1R mediated anti-estrogen. These regulators included the tamoxifen resistance suppressors and inducers, among which PAK2 is the strongest resistance inducer 46 . NFE2L2 (or Nrf2), encoding the transcription factor Nrf-2 (nuclear factor, erythroid 2 like 2), is a master regulator of antioxidant response 47 . Overexpression of Nrf-2 may contribute to tumorigenesis and chemoresistance by upregulating its target genes 48 . It was also reported that low NFE2L2 mRNA expression levels may be an independent predictor of poor prognosis, especially in ER-positive breast cancer 49 . HMGCR encodes 3-hydroxy-3-methylglutarylcoenzyme A reductase (HMG-CoA reductase), a transmembrane glycoprotein, which is the key enzyme in cholesterol biosynthesis and the target for statin treatment. High mRNA levels of HMGCR together with other genes in the mevalonate pathway were associated with resistance to statin treatment and poor survival in breast cancer 50 . The cholesterol biosynthesis pathway was recently shown to be upregulated in ER-positive breast cancer cell lines resistant to estrogen deprivation. This finding suggested that dysregulation of cholesterol biosynthesis may be a mechanism of endocrine resistance in hormone receptor-positive breast cancer 51 . Representative causal networks associated with the HOT trait are presented in Fig. 5, together with their target molecules in the dataset. Those associated with the COLD trait are presented in Fig. S7.

Discussion
Our study identified five significant protein-network modules associated with disease mechanisms of this malignant luminal breast carcinomas using WGCNA following in-depth proteomic analysis. The WM2 module centrally involves pathways of ribosome-associated quality control of mRNA and protein including the nonsensemediated decay. Through these pathways, tumors exploit gene expression for their survival, downregulating the expression of tumor suppressor genes. The WM5 and WM6 modules were both involved in the heat shock response process and angiogenesis caused mainly by oxidative stress and hypoxia. The hub proteins PRDX2 and lactate dehydrogenase A (LDHA) in the WM6 network module are a peroxide scavenger and key enzyme in aerobic glycolysis, respectively. Their function is to protect cancer cells against both oxidative stress and hypoxia. The hub HSP proteins and highly activated NFE2L2/Nrf2 pathway enriched in the WM20 module predominantly participate in pathways of heat shock response and/or oxidative stress. Regarding the WM11 module, CDK1 plays a central role in cell cycle progression and, together with its subnetwork, is involved in oxidative phosphorylation. Collectively, most data-driven protein networks are commonly associated with activities of heat shock response, angiogenesis, and cancer cell survival. A semi-quantitative correlation of key-protein expression, protein co-regulation analysis using ProteomeHD, and multivariate correlation analysis suggested co-regulations via network-network interaction among the four key modules characteristic to the HOT lesions.
Highly activated master and upstream regulators of causal networks predicted to the five data-driven protein networks were mostly characteristic to ER-positive breast cancer, as well as associated with oncogenic transformation, resistance to chemotherapy, and endocrine therapy. The integrative networks constructed from the representative master and participating regulators predicted for the HOT-characteristic WGCNA modules exhibited a predominant activation of ESR1, MYC, nuclear receptor 4A1 (NR4A1), and NFE2L2 (Fig. S8). Interestingly, the top causal networks predicted for the WM2 and WM5 modules listed highly inhibited interventions by numerous chemical inhibitors, indicating the involvement of potential therapeutic targets. The WM2 module included 5-fluorouracil, sirolimus (rapamycin), and SGI-1776, while the WM5 module included imatinib, emodin, lovastatin, AEE788, and sitravatinib (Fig. 6).
The limitation of this study is the number of patients examined, which was attributed to only these 5 cases of Ki-67 values at HOT > 80-% available in our hospital, corresponding to 3.8% (5/133) of the total Luminal-B like cases. We plan to verify/validate the results of this study by using a larger sample size of the external cohort being accumulated in the future.
In conclusion, the WGCNA combined with ORA-based protein screening was successfully applied to clinical proteomic datasets. Our results could identify key data-driven networks and their upstream regulators characterizing the HOT lesions. The limitation of this study is the small number of patients analyzed. We are planning a larger cohort study of patient-derived samples, including genomic alteration analysis to investigate core datadriven proteogenomic networks. This approach will provide clinically important information on proteogenomic landscapes of ER+HER2-Ki-67high malignant luminal breast cancer.

Methods
FFPE tissue specimens and sample preparation. A total of 671 patients underwent surgical breast cancer resection at the Tokyo Medical University Hachioji Medical Centre between 2015 and 2019. Pathological specimens were independently reviewed by three pathologists (MW, ES, and HH) to confirm that they satisfied the 2015 WHO classification of breast tumors (histological criteria) 52 . Only 21 tumors were histologically confirmed as ER+HER2-Ki-67high (> 70%) luminal B breast cancers. Of those, five cases with high Ki-67 (> 80%) were selected for investigation in this study. FFPE tumor tissue blocks from the corresponding five surgical specimens were obtained without patient identifiers from the Tokyo Medical University Hospital and Hachioji Medical Centre. Informed consent was provided by all participating patients, and the study protocol was approved by the institutional review board of the Tokyo Medical University Hospital (approval no. H-266). The study was conducted in accordance with the tenets of the Declaration of Helsinki.
The terms HOT and COLD are defined as follows: HOT spot, the highest density of Ki-67-positive cells in the lesion (i.e., area with the highest Ki-67 value); COLD spot, the lowest density of Ki-67-positive cells (i.e., www.nature.com/scientificreports/ area with the lowest Ki-67 value). For tissue microdissection, sections (thickness: 10 μm) from the FFPE tumor blocks were cut onto DIRECTOR slides (OncoPlex Diagnostics Inc., Rockville, MD, USA). Prior to dissection, the sections were de-paraffinized and stained only with hematoxylin using standard histological methods. Microdissection was performed using a Leica LMD7 Microdissection Microscope (Leica, Wetzlar, Germany) (see Fig. S2). Typically, a total area of 5 mm 2 with approximately 19,000 tumor cells was transferred from the FFPE sections via laser dissection directly into the cap of a 200-μL low-binding tube. Proteins were extracted and digested with trypsin using Liquid Tissue MS Protein Prep kits (OncoPlex Diagnostics Inc.) according to the instructions provided by the manufacturer 53 . The procedures were previously described 6,7 . Briefly, dried microdissection pellets were suspended in 20 μL of Liquid Tissue buffer, heated at 95 °C for 90 min, and then cooled on ice, at which point 0.1 μg of trypsin was added to each tube. The tubes were then incubated at 37 °C for 18 h. The digested samples were dried and resuspended in 50 μL of 2% acetonitrile aqueous solution containing 0.1% TFA. Finally, the digested samples were frozen at − 80 °C until further processing.  , and (f) HMGCR, together with their target molecules in the dataset, which were representatively associated with the HOT trait. Node shapes indicate molecular types: triangle, kinase; square (dashed), growth factor; rectangle (horizontal), ligand-dependent nuclear receptor; rectangle (vertical), ion channel; diamond (vertical), enzyme; diamond (horizontal), peptidase; trapezoid, transporter; oval (horizontal), transcription regulator; oval (vertical), transmembrane receptor; double circle, complex; and circle, other. Red or light red colors indicate highly or moderately increased expression of a mutant protein in the dataset. Orange/light orange and blue/ light blue colors indicate the extent of confidence for predicted activation and inhibition, respectively. Lines denote predicted relationships. A solid or dashed line indicates direct or indirect interaction, respectively. Orange, leading to activation; blue, leading to inhibition; yellow, findings inconsistent with the state of a downstream molecule; grey, not predicted effect. BMP1, bone morphogenetic protein 1; F2R/PAR1, coagulation factor II thrombin receptor/ proteinase-activated receptor 1; MNK1/2, MAPK-interacting serine/threonineprotein kinases 1 and 2; NOX4, NADPH oxidase 4; PAK2, p21-activated kinase 2; HMGCR, 3-hydroxy-3methylglutaryl-CoA reductase. www.nature.com/scientificreports/ interfaced with a Q Exactive Orbitrap mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) via a nano-ESI device Dream-Spray (AMR Inc., Tokyo, Japan). The LC system consisted of an Ultimate 3000 HPLC System (Thermo Fisher Scientific), a trap-cartridge L-column (0.3 mm × 5.0 mm, CERI, Tokyo, Japan), and a capillary separation column (Zaplous column alpha-PepC18, 3 μm, 12 nm, 0.1 mm × 150 mm, AMR Inc.). An auto-sampler (HTC-PAL, CTC Analytics, Zwingen, Switzerland) loaded an aliquot of samples into the trap, which was then washed with solvent A (2% acetonitrile aqueous solution containing 0.1% trifluoroacetic acid) to concentrate the peptides in the trap and desalt them. Subsequently, the trap was connected in series to the separation column, and the peptides were eluted from the whole column with 0.1% formic acid aqueous solution and acetonitrile by linear 5-40% acetonitrile concentration gradient over 90 min at a flow-rate of 500 nL min −1 .

MS
The mass spectrometer was operated in data-dependent acquisition mode. All LC-MS/MS data were acquired using Xcalibur, version 4.0.27.19 (Thermo Fisher Scientific) in high-resolution data-driven analysis (DDA) mode, with the survey scan (MS in the mass range m/z 400-1600) acquired in the Orbitrap at 70,000 resolution (at m/z 200) in profile mode. The top ten most intense peaks from the survey scan were selected for fragmentation with higher-energy collisional dissociation with a normalized collision energy of 27 and an isolation window of m/z 1.6. The dynamic exclusion time for precursor ions selected for MS/MS fragmentation was 10 s, and the automatic gain control target values for MS and MS/MS were 1 × 10 6 and 1 × 10 5 , respectively.
Peptide sequence matching was performed using the MASCOT software ver. 2.6.0 (http://www.matri xscie nce. com) against the UniProt Homo sapiens database downloaded in January 2017. A target-decoy search strategy was employed for increased confidence in protein identifications 54 . This search considered tryptic peptide candidates, and the formylation of lysine and oxidation of methionine were considered as variable modifications. The MASCOT search engine considered a precursor mass tolerance of 10 ppm and a fragment bin tolerance of 0.02. The validity of the peptide spectrum matches was assessed using the MASCOT software. All identification results were reported with < 1% FDR, both at the peptide and protein levels.
The expression levels of identified proteins were assessed by spectral count-based protein quantification (Fig. S4). Fold changes in protein expression in the base 2 logarithmic scale (R SC ) 55 and normalized spectral abundance factors (NSAF) 56 representing relative abundances of expressed proteins were calculated using the spectral count (SpC)-that is the number of MS/MS spectra assigned to each protein.
Weighted correlation network analysis. Weighted-gene co-expression network analysis (WGCNA) 8 was used to identify systems-level differences in the protein expression pattern of the HOT and COLD spots in malignant luminal breast tissues. The similarity in protein expression patterns for all protein pairs was calculated according to their pairwise Pearson's correlation coefficient, i.e., the similarity between proteins i and j was defined as (1 − r i,j )/2, where r i,j is the Pearson's correlation coefficient of the protein expression patterns between these two proteins. We performed the network topology analysis for various soft-thresholding powers ranging www.nature.com/scientificreports/ from 1 to 100 to choose an optimal value to balance between independence and mean connectivity. The power had been set to 10, under which the network reaching to the scale-free topology. By implying a soft thresholding parameter, the weighted gene expression network emphasizes highly correlated protein pairs and filters low correlations. This reduces the noise of correlation in the adjacency matrix until the network resembles a scalefree graph. Next, to measure the connection strength between all protein pairs, the topological overlap measure (TOM) was calculated from the adjacency matrix. TOM dissimilarity matrix (1-TOM) was subsequently used to perform average linkage hierarchal clustering, which generated a protein clustering tree with modules corresponding to the branches of the tree. Dynamic tree cutting was used to trim the branches and identify protein modules. Modules were summarized by the first principal component referred to as eigen-protein in the text. Module membership, defined as the correlation between the protein expression profile and the module eigen-protein, was measured with values ranging 0-1; 0 represents a gene that is not part of the module, while 1 represents high connectivity to the module. Subsequently, the module-trait association was determined using the correlation between the module eigen-protein and the two clinical traits (HOT and COLD). WGCNA analysis was performed using the WGCNA R-package 8 , implemented as a gadget in the GARUDA PLATFORM (The Systems Biology Institute, Tokyo, Japan). PPI network construction. We used the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database (version 10.5) to construct a protein interaction network for a protein module 15 . STRING networks were calculated under the criteria for linkage only with experiments, databases, text mining, and coexpression with the default settings i.e., medium confidence score: 0.400, network depth: 0 interactions. Functional enrichment results were obtained for canonical pathways under p < 0.05. Proteins in a protein module were mapped in the protein interaction network from the STRING database, to produce the results of the enrichment analysis on the biological process (GO) and Reactome pathways.
A hub protein refers to a "highly connected protein." The proteins inside co-expression modules exhibit high connectivity, and the proteins within the same module may play similar roles. The PPI networks were reconstructed using the Cytoscape software (version 3.7.1.), followed by importation of the results obtained from the STRING PPI network analysis of eigen-proteins in each module. We identified hub proteins in each module according to their intra-modular connectivity and correlation with module eigen-proteins. The top 20 high-degree proteins were identified using the cytoHubba plugin 16 . The three top-ranked genes in each module were considered to be hub proteins.
Causal network analysis by IPA. Upstream regulators and causal networks were predicted using the IPA software 31 . Protein expression data (quantile-normalized for selected modules) were used as input datasets. Causal networks (p < 0.05) were predicted from the WGCNA network modules significantly associated with the two clinical traits (HOT and COLD), where their activation or inhibition was defined by z-values > 2.0 or < − 2.0, respectively.

Data availability
The unfiltered MS datasets generated and analyzed in this study have been deposited in the in the ProteomeXchange (http://prote omece ntral .prote omexc hange .org) and jPOST with the dataset identifiers PXD021912 and JPST000981, respectively.