Genetically encoded protein photocrosslinker with a transferable mass spectrometry-identifiable label

Coupling photocrosslinking reagents with mass spectrometry has become a powerful tool for studying protein–protein interactions in living systems, but it still suffers from high rates of false-positive identifications and from a lack of information on the interaction interface, owing to the challenges in deciphering crosslinking peptides. Here we develop a genetically encoded photo-affinity unnatural amino acid that introduces a mass spectrometry-identifiable label (MS-label) to the captured prey proteins after photocrosslinking and prey–bait separation. This strategy, termed IMAPP (In-situ cleavage and MS-label transfer After Protein Photocrosslinking), enables direct identification of photo-captured substrate peptides that are difficult to uncover with conventional genetically encoded photocrosslinkers. By taking advantage of the MS-label, the IMAPP strategy significantly enhances the confidence of protein–protein interaction identification and enables simultaneous mapping of the binding interface under living conditions.

μM DsbA (c), or 3 μM AspG2 (d) was added, respectively, and the mixture was incubated for another 30 min. A significant increase in anisotropy was observed after the addition of the client protein, indicating that HdeA binds its client under acidic conditions. The solution was then neutralized to pH 7, which led to a significant decrease in anisotropy, indicating the release of HdeA from its client after neutralization. The FA of HdeA-S27C-bimane was monitored for 140 min or 70 min over the whole process. (The representative result from 2 replicates is shown).

Supplementary Figure 9. Identification of HdeA interaction partners at neutral pH in living cells using the IMAPP strategy. (a) DiZHSeC was incorporated at residue F35 on the HdeA dimer interface to photocrosslink with its interaction partners in living E. coli cells. SDS-PAGE analysis showed that the HdeA dimer was the dominant crosslinked complex formed. When the gel bands were excised and subjected to IMAPP analysis, only three hit proteins were identified (HdeA, FkbA, YrbC), among which HdeA was the dominant hit. (b) Relative abundance of the three IMAPP-identified proteins. HdeA was the dominant hit among the three candidates. The relative abundance was calculated using the normalized spectral abundance factor (NSAF) 3,4 . (The representative result from 2 replicates is shown).

Supplementary Figure 11. Crosslinking radius of DiZHSeC as measured from a 3D structural model. The crosslinking radius was defined as the distance from the C(α) atom to the C(13) atom (the C atom on the diazirine), which was measured as 14 Å based on a structural model generated by ChemBio Ultra 13.0 (CambridgeSoft).

Far-UV CD spectra of wild-type HdeA or HdeA mutants (0.2 mg ml⁻¹) were collected in 10 mM phosphate buffer at pH 7 in a 0.1-cm path length quartz cuvette at r.t. The similar CD spectra suggest that these HdeA-DiZHSeC mutants have structures similar to that of WT-HdeA. (The representative result from 3 replicates is shown).

Supplementary Figure 14. Photocrosslinking of the HdeA-F35DiZHSeC/WT-HdeA heterodimer at different pH. HdeA-F35DiZHSeC (carrying a C-terminal His-tag) was used to photocrosslink with WT-HdeA (carrying no tag) at either pH 7 or pH 2. Under both pH conditions, HdeA-F35DiZHSeC was able to capture WT-HdeA to form the crosslinked HdeA-F35DiZHSeC/WT-HdeA heterodimer. The heterodimers are marked with red arrows on the SDS-PAGE gel. (The representative result from 3 replicates is shown).

Supplementary Figure 15. Crosslinking sites illustrate the dynamic change of the HdeA dimer interface from pH 7 to pH 2.
DiZHSeC was incorporated at position F35 of HdeA (carrying a C-terminal His-tag) to photocrosslink with WT-HdeA (carrying no tag) at pH 7 and pH 2, respectively. The crosslinked dimer was then subjected to the IMAPP strategy to identify crosslinking peptides and sites. The crosslinking peptides and sites at pH 7 and pH 2 are listed in the respective tables. The region that is predicted to harbor the crosslinking site based on the MS/MS spectra is colored in green. The crosslinking site that could be unambiguously assigned to one specific residue based on the MS/MS spectra is colored in red. (The representative result from 3 replicates is shown).

Supplementary Figure 16. Identification of a novel HdeA/DegP interaction interface. (a) HdeA interacted with DegP mainly through the two hydrophobic regions. Top: HdeA is composed of two terminal domains (colored in red) and two hydrophobic domains in the middle (colored in blue). According to the literature 6 , the two hydrophobic domains mediate the interactions of HdeA with its substrate proteins. Bottom: Photocrosslinking of HdeA/DegP-S210A complexes with DiZHSeC incorporated at different sites. Highly efficient crosslinking was observed with DiZHSeC incorporated at residues T31, L39, V49 and V58 in the two hydrophobic regions (colored in blue), whereas low efficiency was observed when DiZHSeC was incorporated at residue A6 near the N-terminus (colored in red). (The representative result from 3 replicates is shown). (b) A sequence diagram illustrating that DegP consists of a protease domain (1-259), a PDZ1 domain (260-358) and a PDZ2 domain (359-448) 7,8 . (c-f) The crosslinking peptides and sites identified by the IMAPP strategy, with DiZHSeC incorporated at different sites in the two hydrophobic regions, are displayed on the DegP-S210A protein sequence; all are localized to the protease domain and the PDZ1 domain. (The representative result from 2 replicates is shown).
(g) Integrated view of all identified crosslinking peptides and sites. The crosslinking peptides are labeled in blue. All the peptides identified by LC-MS/MS analysis that could be assigned to DegP, including both crosslinking and non-crosslinking peptides, are colored in red. The region that is assigned to harbor a crosslinking site based on the MS/MS spectra is colored in yellow. The crosslinking site that could be unambiguously mapped to one specific residue based on the MS/MS spectra is colored in green.

Supplementary Figure 17. Multiple HdeA molecules may bind to a single DegP molecule under acidic conditions. A solution of 50 μM HdeA-V58DiZHSeC or WT-HdeA in the presence or absence of 15 μM DegP-S210A was incubated at pH 2.0 for 30 min at 37 °C. The solution was then either UV-irradiated or left unirradiated, separated by SDS-PAGE, and analyzed by Coomassie blue staining. Protein bands corresponding to the crosslinked complex with a 1:1 (HdeA/DegP) binding stoichiometry are marked with a black arrow. Protein bands corresponding to crosslinked complexes with a higher (HdeA/DegP) binding stoichiometry are marked with red arrows. The immunoblotting analysis of the same gel is shown in   3,4 .

All the identified HdeA-crosslinked proteins from this study are listed, which include 50 envelope proteins (colored in red) and 2 cytosolic proteins (resulting from non-specific crosslinking as discussed in the main text, colored in blue). Because HdeA is a periplasmic chaperone that is expected to interact only with envelope proteins (located in the periplasm or the outer and inner membranes), the abundances of all non-crosslinked envelope proteins from the native E. coli periplasmic extract are also listed for comparison (colored in black). The non-crosslinked proteins from the cytosol or with unknown location are not included in the table. All the proteins are ranked according to their native abundance (NSAF values, from high to low). ND: not detected.

The table shows that the HdeA clients identified by IMAPP span the whole range of protein abundance in the cell envelope. The majority of the identified HdeA client proteins (colored in red) were significantly enriched in the "crosslinking group", whereas most of the proteins colored in black were not enriched. Some highly abundant proteins such as Tpx and Agp were not crosslinked and had a very low abundance in the crosslinking group. Meanwhile, some identified client proteins with extremely low native abundance, such as FdoG, were efficiently enriched and showed moderate abundance in the crosslinking group. Taken together, these data indicate that the photocrosslinking results reflect the intrinsic binding preference of HdeA rather than non-specific interactions driven only by native protein abundance. (The representative result from 2 replicates is shown).
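The abundance ranking above relies on NSAF values. As a minimal sketch of how such values are commonly computed (the standard definition: spectral count divided by protein length, normalized by the sum of that ratio over all proteins in the sample), the snippet below may help. It is not taken from the authors' analysis pipeline, and the protein names, spectral counts and lengths are hypothetical placeholders.

```python
def nsaf(spectral_counts, lengths):
    """Return NSAF per protein: (SpC / L) divided by the sum of (SpC / L)."""
    # Spectral abundance factor: spectral count normalized by protein length
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra were counted for any protein")
    return {p: value / total for p, value in saf.items()}


# Hypothetical inputs, for illustration only (not data from this study)
counts = {"protA": 40, "protB": 5, "protC": 5}
lengths = {"protA": 100, "protB": 200, "protC": 250}

values = nsaf(counts, lengths)
ranked = sorted(values, key=values.get, reverse=True)
```

Because NSAF values sum to 1 within a sample, they can be compared between the native periplasmic extract and the crosslinking group as relative abundances, not as absolute amounts.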

Subcellular location
Supplementary Table 2  to the closest C atom of the crosslinking residues measured from the crystal structure of the HdeA dimer (pH 7). The region that is predicted to harbor the crosslinking site based on the MS/MS spectra is colored in green. The crosslinking site that could be unambiguously assigned to one specific residue based on the MS/MS spectra is colored in red.

Synthesis of DiZHSeC
The overall synthetic route of DiZHSeC is shown in Supplementary Fig. 1.
Compound 1 was synthesized as described previously 6 .

To a nitrogen-protected solution of selenomethionine 3 (788 mg, 4 mmol) in liquid ammonia (~30 ml) cooled in a dry ice/acetone bath was added sodium metal (276 mg, 12 mmol) in small pieces over 40 min. The solution was stirred for another 1 h at -78 °C. The bath temperature was then gradually raised to r.t., and excess ammonia was blown off with a gentle stream of nitrogen inside a well-ventilated fume hood. Residual ammonia was removed under vacuum inside a well-ventilated fume hood to give the crude product 4, which was used without further purification. The crude product was carefully dissolved in 15 ml of degassed ice-cold water under a nitrogen atmosphere. Compound 2 (1.02 g, 4 mmol) was dissolved in 9 ml of degassed ethanol and added dropwise to the solution of 4 over 10 min under a nitrogen atmosphere at 0 °C. The temperature was gradually raised to r.t., and the mixture was stirred overnight. The pH of the solution was then adjusted to 5~6 to precipitate the crude product as a light yellow solid. The solid was further purified by HPLC to give DiZHSeC as a white solid (600 mg, 42.8%).
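The reagent quantities quoted above can be cross-checked with a short calculation. The sketch below assumes standard atomic weights and the molecular formula of selenomethionine (C5H11NO2Se); the final value is only the molar mass implied by the reported yield if that yield is based on 4 mmol of limiting reagent, which is an assumption and not a figure stated in the text.

```python
# Standard atomic weights (g/mol), rounded
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
                 "Se": 78.971, "Na": 22.990}


def molar_mass(formula):
    """Molar mass (g/mol) from a dict mapping element symbol to atom count."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in formula.items())


def mmol(mass_mg, mw):
    """Amount in mmol for a mass in mg and a molar mass in g/mol."""
    return mass_mg / mw


selenomethionine = {"C": 5, "H": 11, "N": 1, "O": 2, "Se": 1}

semet_mmol = mmol(788, molar_mass(selenomethionine))  # quoted as 4 mmol
na_mmol = mmol(276, ATOMIC_WEIGHT["Na"])              # quoted as 12 mmol
na_equiv = na_mmol / semet_mmol                       # about 3 equivalents

# Assumption: the 42.8% yield refers to 4 mmol of limiting reagent
implied_product_mw = 600 / (4.0 * 0.428)              # g/mol implied by 600 mg
```

The masses agree with the quoted 4 mmol and 12 mmol to within rounding, which corresponds to roughly three equivalents of sodium per selenomethionine.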

Preparation and analysis of E. coli periplasmic proteome.
E. coli periplasmic proteins were isolated according to the protocol of the Epicentre PeriPreps™ Periplasting Kit. In brief, 10 ml of E. coli cells bearing HdeA-V58DiZHSeC were harvested by centrifugation at 4000 rpm for 10 min, and the supernatant was discarded. The pellet was re-suspended in 0.5 ml of periplasting buffer (20% sucrose, 1 mM EDTA, and 30,000 units ml⁻¹ lysozyme). The sample was incubated on ice for 5 min, followed by the rapid addition of 0.5 ml of ice-cold water. The sample was then incubated on ice for an additional 5 min and centrifuged at 12000 rpm for 2 min. The supernatant was recovered as the periplasmic fraction. The proteome was then separated by SDS-PAGE, and the corresponding protein bands were excised and cut into pieces. To obtain the crosslinking proteome, the E. coli cells were subjected to the protocol described above after photocrosslinking, followed by Ni-NTA purification. The crosslinking proteome was separated by SDS-PAGE, and the corresponding protein bands were excised and cut into pieces.

The gel pieces were dehydrated in acetonitrile, incubated in buffer I (10 mM DTT, 50 mM ammonium bicarbonate) at 56 °C for 30 min, and then incubated in buffer II (55 mM iodoacetamide, 50 mM ammonium bicarbonate) at ambient temperature for 1 h in the dark before being dehydrated again. The samples were then digested in-gel with sequencing-grade trypsin (5 ng μl⁻¹ trypsin, 50 mM ammonium bicarbonate, pH 8.0) overnight at 37 °C. The resulting peptides were extracted twice with 5% formic acid/50% acetonitrile in water and then vacuum-centrifuged to dryness. The extracted peptides were reconstituted in 0.2% formic acid, loaded onto a 100 μm × 2 cm pre-column and separated on a 75 μm × 20 cm capillary column, both of which were packed in-house with 4 μm C18 bulk material (InnosepBio, China).
An Easy nLC 1000 system (Thermo Scientific, USA) was used to generate the following HPLC gradient: 5-30% B in 120 min, 30-75% B in 4 min, then held at 75% B for 20 min (A = 0.1% formic acid in water, B = 0.1% formic acid in acetonitrile). The eluted peptides were sprayed into an LTQ-Orbitrap-Elite mass spectrometer (Thermo Scientific, USA) equipped with a nano-ESI source. The mass spectrometer was operated in data-dependent mode, with one MS scan in FT mode at a resolution of 120,000 followed by 10 CID (collision-induced dissociation) MS/MS scans in the ion trap for each cycle. Raw data files produced by the Xcalibur software (Thermo Scientific) were converted to mgf files with MSConvert and then searched with Mascot V.2.3.02 (Matrix Science) against the SwissProt 57.15 (515,203 sequences; 181,334,896 residues) Escherichia coli database (22,646 sequences).
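The gradient described above can be written out as a simple time table. The sketch below assumes that the three segments run back to back starting at t = 0 and that each ramp is linear; neither point is stated explicitly in the text, and the function name is illustrative.

```python
# (time in min, % solvent B) breakpoints: 5-30% B in 120 min,
# 30-75% B in 4 min, then held at 75% B for 20 min.
# Assumption: segments are consecutive and ramps are linear.
GRADIENT = [(0, 5), (120, 30), (124, 75), (144, 75)]


def percent_b(t_min):
    """Return % solvent B at time t_min by linear interpolation."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


total_run_min = GRADIENT[-1][0]  # 144 min under the assumptions above
midpoint_b = percent_b(60)       # halfway through the shallow ramp
```

Under these assumptions the programmed gradient lasts 144 min, with most of the separation taking place on the shallow 5-30% B ramp.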