Triple SILAC identified progestin-independent and dependent PRA and PRB interacting partners in breast cancer

Progesterone receptor (PR) isoforms, PRA and PRB, act in a progesterone-independent and dependent manner to differentially modulate the biology of breast cancer cells. Here we show that the differences in PRA and PRB structure facilitate the binding of common and distinct protein interacting partners, affecting the downstream signaling events of each PR isoform. Tet-inducible HA-tagged PRA or HA-tagged PRB constructs were expressed in T47DC42 (PR/ER negative) breast cancer cells. Affinity purification coupled with stable isotope labeling with amino acids in cell culture (SILAC) mass spectrometry was performed to comprehensively study PRA and PRB interacting partners in both unliganded and liganded conditions. To validate our findings, we applied both forward and reverse SILAC conditions to minimize experimental errors. These datasets will be useful in investigating PRA- and PRB-specific molecular mechanisms and as a database for subsequent experiments to identify novel PRA and PRB interacting proteins that differentially mediate biological functions in breast cancer.

www.nature.com/scientificdata
breast cancer proteomes; unliganded-PRA regulates proteins involved in the TCA cycle while unliganded-PRB regulates proteins involved in cell cycle and apoptosis 20.
Here we identified PRA and PRB interacting proteins in the presence and absence of progestin using sensitive, reliable technology. Tet-inducible PRA and PRB constructs were expressed in PR-null T47DC42 breast cancer cells using lentiviral transduction (see Materials and Methods). To specifically purify PRA and PRB complex proteins, an HA tag (YPYDVPDYA) was attached to the C-terminus of each PR isoform (Fig. 1a). Similar levels of PRA and PRB were induced with doxycycline (Dox). Treatment with synthetic progesterone (R5020) decreased PRA and PRB levels (Fig. 1b). Co-immunoprecipitation (Co-IP) of PRA and PRB complexes was successfully achieved using the HA-tag-specific monoclonal antibody (Fig. 1c). The Tet-inducible PR-isoform models were previously characterized and showed normal transcription, localization, and function 20,21.
We applied Stable Isotope Labeling with Amino acids in Cell culture (SILAC) coupled with co-immunoprecipitation (Co-IP) 22. SILAC with high-affinity purification provides a highly effective method to identify protein-protein interactions with lower nonspecific binding than traditional affinity purifications.
To identify a list of interacting proteins with high confidence, we conducted three forward and three reverse SILAC experiments in the presence and absence of progestin to minimize experimental bias and errors. In the forward SILAC experiments, PRA was labeled with the Light isotope and PRB with the Heavy isotope; in the reverse SILAC experiments, the PRA and PRB labels were swapped. As controls, we combined equivalent amounts of protein from uninduced-PRA and uninduced-PRB cells cultured in intermediate (Medium) SILAC medium to help minimize nonspecific protein binding. The experimental workflow is shown in Fig. 2.
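The label swap described above determines which SILAC channel ratio belongs to which isoform. As a minimal sketch (the function name `isoform_ratios` and the orientation strings are hypothetical, not part of the published workflow), the mapping could be written as:

```python
# Hypothetical helper: assign SILAC channel ratios to PR isoforms.
# Forward experiments: PRA = Light, PRB = Heavy.
# Reverse experiments: PRA = Heavy, PRB = Light.
# The Medium channel is the uninduced control in both orientations.

def isoform_ratios(l_over_m, h_over_m, orientation):
    """Return the isoform-vs-control ratios for one protein in one replicate."""
    if orientation == "forward":
        return {"PRA": l_over_m, "PRB": h_over_m}
    if orientation == "reverse":
        return {"PRA": h_over_m, "PRB": l_over_m}
    raise ValueError(f"unknown orientation: {orientation!r}")


# Example values are made up for illustration only.
forward = isoform_ratios(4.0, 1.0, "forward")
reverse = isoform_ratios(1.0, 4.0, "reverse")
```

With this mapping, a protein that binds PRA specifically yields a high PRA ratio in both orientations, whereas a label-dependent artifact would follow the channel rather than the isoform.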
We evaluated the correlation across replicates using Pearson's correlation and found average correlation coefficients of 0.677 and 0.712 for the progesterone-independent and dependent conditions, respectively, as shown in Figs. 3, 4. LC-MS/MS analysis identified a total of 742 proteins in the progestin-independent condition and 646 proteins in the progestin-dependent condition. In the absence of progestin, we identified 210 PRA and 202 PRB interacting partners (progestin-independent). In the presence of progestin, we identified 141 PRA and 135 PRB interacting partners (progestin-dependent). To identify high-confidence PRA and PRB interacting partners, only proteins detected in at least 4 out of 6 replicates were subjected to statistical analysis with a one-sample t-test (p-value < 0.05). Proteins with a p-value < 0.05 and a minimum fold-change of 2 (log2 SILAC ratio ≥ 1) were included in the lists of significant candidate proteins, which are provided as described in the Data Records. We found 64 PRA and 20 PRB significant interacting partners that were progestin-independent, and 31 PRA and 15 PRB significant interacting partners that were progestin-dependent. We identified known interacting partners of PRA and PRB, including HSP90, HSP70, DDX5, FKBP5, and PARP1, among others 23,24. We also identified several novel PRA and PRB interacting partners under ligand-independent and dependent conditions. Because we applied stringent criteria to rule out nonspecific binding and performed traditional immunoprecipitation without cross-linking agents, the interacting partners identified in this study are likely high-affinity, stable PRA or PRB binding proteins.
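The replicate correlation reported above can be illustrated with a short sketch. It assumes that the average is taken over all pairwise Pearson coefficients of log2 SILAC ratios and that every protein has a value in every replicate; the source does not spell out either detail, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_pairwise_correlation(replicates):
    """Average Pearson coefficient over all pairs of replicates.

    `replicates` is a list of lists of log2 SILAC ratios, one inner list per
    replicate, with proteins in the same order (an assumption of this sketch).
    """
    pairs = list(combinations(replicates, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)
```

For six replicates this averages 15 pairwise coefficients, which is one plausible reading of the "average correlation" quoted in the text.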
According to these lists, we found more PRA interacting partners in the absence of ligand, consistent with a previous study that found PRA to be a more active isoform than PRB under progestin-independent conditions 19. Moreover, the two receptors exhibit distinct conformations, and PRA contains an inhibitory domain (ID), prompting PRA to function as a strong ligand-dependent transdominant repressor of steroid hormone receptor transcriptional activity 25. Since progestin-bound receptors are phosphorylated and degraded via a proteasome-dependent pathway, and PRB is degraded more rapidly than PRA (Fig. 1b) 26,27, we found fewer progestin-dependent PRB interacting partners than PRA interacting partners. The majority of PRB interacting partners, 19 out of 20 proteins (95%) in the progesterone-independent condition and 14 out of 15 proteins (93%) in the progesterone-dependent condition, are a subset of the corresponding PRA interacting partners, as shown in the Venn diagrams in Figs. 5a, 6a. Together, our data indicate a smaller number of potential PRB interactors than PRA interactors.
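The overlap percentages quoted above follow from a simple set intersection. The sketch below uses placeholder protein identifiers (not the real candidate lists, which are in the deposited data) to reproduce the 19-of-20 arithmetic; `overlap_with` is a hypothetical helper name.

```python
def overlap_with(prb_partners, pra_partners):
    """Return (shared count, fraction of PRB partners also found for PRA)."""
    shared = prb_partners & pra_partners
    return len(shared), len(shared) / len(prb_partners)


# Placeholder identifiers: 64 PRA partners, 20 PRB partners, 19 shared.
pra = {f"P{i}" for i in range(64)}
prb = {f"P{i}" for i in range(19)} | {"Q0"}

shared_count, fraction = overlap_with(prb, pra)
print(f"{shared_count} of {len(prb)} PRB partners shared ({fraction:.0%})")
```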
Ingenuity Pathway Analysis (IPA) showed that PRA and PRB interacting proteins were enriched in similar pathways but differed in significance values, except for the splicing of mRNA pathway, which was associated only with unliganded PRA (Fig. 5b). Moreover, unique interactors of unliganded PRA are involved in gluconeogenesis and glycolysis, among others, as shown in Fig. 5c. In the presence of progestin, proteins preferentially interacting with progesterone-bound PRA and PRB were enriched in similar pathways but differed in significance values, except for proteins in the DNA damage and dysfunction of mitochondria pathways, which interacted only with progesterone-bound PRA (Fig. 6b). Moreover, unique interacting proteins of progesterone-bound PRA are involved in abnormal metabolism, excision repair of DNA, and remodeling of chromatin, as shown in Fig. 6c.
Importantly, we discovered novel PRA and PRB interacting partners in progesterone-independent and dependent conditions. Our dataset of PRA and PRB interacting partners will be useful in investigating the molecular mechanisms of PRA and PRB in breast cancer. These new PRA and PRB interactome data will serve as a molecular resource for future investigation of PRA- and PRB-mediated breast cancer progression.

Methods
Inducible HA-tag PRA and HA-tag PRB with a Tet-on lentiviral system. To identify PRA and PRB interacting partners, we applied a Tet-on lentiviral transduction technique to transduce PRA or PRB into T47DC42 (ER-, PR-) breast cancer cells, as previously described 20,28. To check the expression of PRA and PRB, 200,000 T47DC42-PRA or T47DC42-PRB cells were plated in phenol red-free DMEM (Dulbecco's modified Eagle medium) with 5% DCC-FBS (dextran-coated charcoal-stripped FBS; Gibco/Life Technologies) and 1% penicillin/streptomycin (PenStrep) in a 6-well plate and incubated overnight. The next day, cells were treated with 1000 ng/mL of Dox for 24 h or with Dox plus 10 nM R5020 for 1 h. Cells were washed once with ice-cold PBS and lysed with RIPA lysis buffer (Merck Millipore) containing proteinase inhibitor cocktail (Roche). Cells were scraped, and the lysate was collected and rotated end-over-end for 30 min at 4 °C. The supernatant was collected, and protein concentration was determined using a Bradford assay (Bio-Rad). Similar amounts of protein were separated on a 10% SDS-PAGE gel, and proteins were transferred onto PVDF membranes. Blots were probed with

Equal amounts of protein lysate from Light PRA, Heavy PRB, and Medium control cells were immunoprecipitated with HA-tag antibody conjugated to agarose beads. Repeated washes were performed to remove nonspecifically bound proteins. PR and PR-interacting protein complexes were eluted using Laemmli buffer. Eluted proteins from the light, medium, and heavy samples were mixed 1:1:1 and separated by SDS-polyacrylamide gel electrophoresis (SDS-PAGE). Gels were stained with Coomassie blue, cut into slices, and digested with trypsin before high-resolution LC-MS/MS analysis.
The Co-IP products from the light, medium, and heavy samples were then combined 1:1:1 and subjected to a short run on 10% SDS-PAGE gels for 10 min. Each gel was fixed with 50% methanol-7% acetic acid for 1 h with gentle shaking, stained with Coomassie blue for 1 h, and destained with Milli-Q water overnight. In-gel digestion was performed as previously described 30. Briefly, individual gels were cut into 1 × 1 mm pieces, then 50% acetonitrile (ACN)-50 mM NH4HCO3 was added and incubated for 10 min at room temperature to destain the gel pieces. Gel pieces were reduced and alkylated with 5 mM tributylphosphine and 20 mM acrylamide (Sigma) in 100 mM NH4HCO3 for 90 min at room temperature. The gel pieces were then dehydrated with 100% ACN before proteins were digested with trypsin (Sigma, Proteomics grade) at 37 °C overnight. After trypsin digestion, the supernatant containing the peptides was sonicated in a water bath for 10 min before collection. The sonication step was repeated after adding 100 µL of 50% ACN-0.1% formic acid, and the peptide solution was collected and combined with the previous one. The peptide volume was reduced to 30 µL by rotary evaporation, and the sample was centrifuged at 14,000 g for 10 min to remove interfering materials before LC-MS/MS analysis.

Nano LC-MS/MS. The nano LC-MS/MS was set up as previously described 31. The Acquity M-class nanoLC system (Waters, USA) was used to analyze the peptide sample. A 5 µL aliquot of the sample was loaded onto a nanoEase Symmetry C18 trapping column (180 µm × 20 mm) over a 3 min period at 15 µL/min. The sample was then washed onto a PicoFrit column (75 µm ID × 300 mm; New Objective, Woburn, MA) packed with Magic C18AQ resin (3 µm, Michrom Bioresources, Auburn, CA). The eluted peptides were introduced into the mass spectrometer (Q Exactive Plus mass spectrometer; Thermo Scientific).
The gradient program was: 5-30% MS buffer B (98% acetonitrile + 0.2% formic acid) over 90 min, then 30-80% MS buffer B over 3 min, followed by 80% MS buffer B for 2 min, and then 80-5% over a further 3 min. The eluted peptides were ionised at 2400 V. Data-dependent MS/MS (dd-MS2) acquisition was performed using a survey scan of 350-1500 Da at 70,000 resolution for peptides of charge state 2+ or higher, with an AGC target of 3e6 and a maximum injection time of 50 ms. Using an isolation window of 1.4 m/z, an AGC target of 1e5, and a maximum injection time of 100 ms, the top 12 peptides were selected and fragmented in the HCD cell. The fragments were scanned using the Orbitrap analyzer at a resolution of 17,500, and the resulting product ion masses were measured over a mass range of 120-2000 Da. The precursor peptide mass was subsequently excluded for 30 seconds.
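To make the gradient timetable concrete, the sketch below encodes the four segments and returns the buffer B percentage at a given time. It assumes linear ramps within each segment, which the text does not state explicitly; the names `GRADIENT` and `percent_b` are hypothetical.

```python
# Each segment: (duration in min, %B at segment start, %B at segment end).
# Linear interpolation within a segment is an assumption of this sketch.
GRADIENT = [
    (90.0, 5.0, 30.0),   # 5-30% B over 90 min
    (3.0, 30.0, 80.0),   # 30-80% B over 3 min
    (2.0, 80.0, 80.0),   # hold at 80% B for 2 min
    (3.0, 80.0, 5.0),    # 80-5% B over 3 min
]

TOTAL_MINUTES = sum(duration for duration, _, _ in GRADIENT)


def percent_b(t):
    """Percentage of MS buffer B at time t (minutes from gradient start)."""
    if t < 0 or t > TOTAL_MINUTES:
        raise ValueError("time outside the gradient program")
    start = 0.0
    for duration, b_start, b_end in GRADIENT:
        if t <= start + duration:
            return b_start + (b_end - b_start) * (t - start) / duration
        start += duration
    raise ValueError("time outside the gradient program")
```

Summing the segment durations gives a 98 min program, with the 90 min separation ramp accounting for most of the run.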

Data analysis.
A total of 12 raw files corresponding to three forward and three reverse SILAC experiments for both the progestin-independent and dependent conditions (Table 1) were analyzed using the MaxQuant software suite 1.6.0.16 (www.maxquant.org) 32. We processed the six replicates, comprising both forward- and reverse-labeled samples, together and searched them against an in silico tryptic digest of human proteins from the UniProt sequence database (September 1, 2017) using the Andromeda search engine. Enzyme specificity was set to trypsin, and only tryptic peptides with a minimum length of seven amino acids and a maximum of two missed cleavages were considered. A precursor mass tolerance of 20 ppm and a fragment mass tolerance of 0.5 Da, with an FDR < 0.01 at the level of proteins, peptides, and modifications, were set for mass spectra searching. Propionamide (C) was set as a fixed modification; acetylation of protein amino (N)-termini, oxidation of methionine, deamidation of asparagine and glutamine, and medium (Arg+6, Lys+4) and heavy (Arg+10, Lys+8) isotope labels were set as variable modifications. The "proteinGroups.txt" file produced by MaxQuant was further analyzed in Perseus (version 1.6.1.1). The SILAC ratios were log2-transformed, and proteins from the reverse database, proteins only identified by site, and contaminants were removed. Only proteins identified in at least four of the six replicates were retained for further analysis. Statistical analysis was performed using a one-sample t-test (p-value < 0.05), in which log2 SILAC ratios (L/M and H/M) were compared against a value of 0 (control, log2(1)). Moreover, only proteins with an average log2 SILAC ratio ≥ 1 were considered high-confidence interacting partners. Using a SILAC ratio cut-off together with a p-value < 0.05, instead of applying multiple testing correction, can help reduce false positives without excluding true-positive interacting partners in quantitative proteomics 33.
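The three selection criteria described above (detection in at least four of six replicates, a one-sample t-test against 0, and a mean log2 ratio of at least 1) can be sketched in plain Python. This is an illustration, not the Perseus implementation: the function name is hypothetical, missing values are represented as `None`, and significance is decided by comparing the t statistic with tabulated two-sided critical values at alpha = 0.05 rather than by computing an exact p-value.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided critical t values at alpha = 0.05, keyed by degrees of freedom.
# Covers 4, 5, or 6 valid replicates (df = 3, 4, 5) only.
T_CRITICAL = {3: 3.182, 4: 2.776, 5: 2.571}


def is_high_confidence(log2_ratios, min_valid=4, min_mean_log2=1.0):
    """Apply the replicate, t-test, and fold-change filters to one protein.

    `log2_ratios` holds up to six log2 SILAC ratios (isoform vs. Medium
    control); None marks a replicate in which the protein was not quantified.
    """
    values = [v for v in log2_ratios if v is not None]
    n = len(values)
    if n < min_valid:
        return False
    average = mean(values)
    if average < min_mean_log2:
        return False
    spread = stdev(values)
    if spread == 0:
        # Identical non-zero values: the t statistic is unbounded.
        return True
    t_statistic = average / (spread / sqrt(n))
    return abs(t_statistic) > T_CRITICAL[n - 1]
```

For example, `is_high_confidence([1.2, 1.5, 1.1, 1.4, None, 1.3])` passes all three filters, whereas a protein with the same mean but widely scattered ratios fails the t-test step.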
Biological functions associated with PRA and PRB interacting partners were projected using IPA software (Qiagen Inc., USA, https://www.qiagenbioinformatics.com/products/ingenuity-pathway-analysis) 34 . Fisher's exact test was performed to calculate p-value, and a p-value < 0.05 was considered statistically significant.
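The enrichment p-value mentioned above can be illustrated with a right-tailed Fisher's exact test computed from the hypergeometric distribution. This is a sketch under stated assumptions: the one-sided (over-representation) form is assumed here, the function name and argument names are hypothetical, and the code does not reproduce IPA's curated annotation background.

```python
from math import comb


def fisher_right_tail(hits, list_size, annotated, background):
    """P(X >= hits) for a hypergeometric draw.

    hits       : proteins in the candidate list annotated to the function
    list_size  : number of proteins in the candidate list
    annotated  : proteins in the background annotated to the function
    background : total number of proteins in the background set
    """
    total = comb(background, list_size)
    upper = min(list_size, annotated)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total
```

A small p-value indicates that the candidate list contains more proteins annotated to the function than expected by chance given the background.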

Data Records
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE 35 partner repository with the dataset identifier PXD023920 36. The dataset includes 12 raw files, 2 MaxQuant parameter files (mqpar.xml), and 2 result files ("proteinGroups.txt") for the progesterone-independent and dependent conditions. The 12 raw files represent 3 biological replicates of forward and 3 biological replicates of reverse SILAC from the progesterone-dependent and independent conditions (Table 1). Raw files are unprocessed outputs from the Q Exactive Plus mass spectrometer. The short-run gel images and the lists of statistically significant candidate interacting proteins of PRA and PRB under both progesterone-independent and dependent conditions are provided via the figshare repository 37. The protein interaction data have been submitted to the IMEx (http://www.imexconsortium.org) consortium through IntAct and assigned the identifier IM-28705 38.

Technical Validation
Cell lines used in our study were routinely tested for mycoplasma contamination using the MycoAlert TM mycoplasma detection kit (Lonza, Switzerland) and were free from contamination. Our cell model for PRA and PRB interacting partners was characterized in our previous study 20. In brief, we successfully induced similar amounts of PRA and PRB, as verified by Western blots (Fig. 1b). The biological functions of inducible PRA and PRB were characterized and shown to be similar to those of PRA and PRB expressed in PR-positive breast cancer cells. The Co-IP technique using the anti-HA-tag antibody was successfully optimized, as shown in Fig. 1c.
T47DC42 PR-null breast cancer cells were derived from ER/PR-positive T47D cells through long-term culture in estrogen-deprived medium, resulting in a T47D variant with low to no ER/PR expression 39. T47D has been suggested as an ideal breast cancer cell model for studying progesterone signaling because it reflects the luminal A (ER- and PR-positive) subtype, which is the most common type of breast cancer 40,41. Thus, T47DC42 cells, a T47D subclone, are suitable for re-expression of PR isoforms, as they should contain the factors required for PR function and response to progestin. Moreover, using the Tet-inducible PR expression system, we can induce and identify individual PRA and PRB interacting partners under both progestin-independent and dependent conditions. Because we re-expressed the PRA or PRB isoform individually in PR-null breast cancer cells, a potential limitation is that only interacting partners of PRA and PRB homodimers, not heterodimers, were investigated.
We applied triple SILAC labeling (Light, Medium, and Heavy) to distinguish the interacting partners of PRA and PRB from each other and from nonspecific binders by comparing the Light/Medium and Heavy/Medium ratios. To ensure the reliability of our data, we performed three forward and three reverse biological replicates of SILAC for each condition, which enhances reliability and reproducibility and corrects for experimental errors by averaging the ratio measurements of the identified interacting partners. Moreover, we used the Medium-labeled uninduced-PRA cells combined with uninduced-PRB cells as controls to reduce the nonspecific binding that often occurs in traditional IP. To reduce false identification of PR interacting proteins, only proteins that met the following criteria were considered high-confidence PR interacting partners: detection in at least 4 out of 6 replicates, a statistically significant SILAC ratio (p-value < 0.05), and a minimum fold-change of 2 (log2 SILAC ratio ≥ 1). The average SILAC ratio cut-off was based on Co-IP validation of an unliganded-PRA interacting partner, splicing factor proline and glutamine-rich (SFPQ), which showed the lowest average SILAC ratio of 1.00 (data not shown). Moreover, correlation coefficients of log2 SILAC ratios across replicates indicated reliability and reproducibility between replicates: the average correlations for the progesterone-independent and dependent conditions were 0.677 (range 0.453-0.851) and 0.712 (range 0.488-0.84), respectively, as shown in the scatter plots in Figs. 3, 4. Finally, the IPA functional annotations of PRA interacting partners (glycolysis, abnormal metabolism, and remodeling of chromatin) were similar to those of previous studies, which found that PRA-rich breast cancer cells expressed proteins involved in cell metabolism and chromatin remodeling processes 20.
Moreover, in a proteomic analysis of the mouse hypothalamus, PRA interacting partners showed enrichment of proteins involved in cell metabolism 42.