Abstract
While photo-cross-linking (PXL) with alkyl diazirines can provide stringent distance restraints and offer insights into protein structures, ambiguous identification of cross-linked residues has kept data interpretation from reaching the level achieved with chemical cross-linking (CXL). We address this challenge by developing an in-line system with systematic modulation of light intensity and irradiation time, which allows for a quantitative evaluation of diazirine photolysis and the photo-reaction mechanism. Our results reveal a two-step pathway with mainly sequential generation of diazo and carbene intermediates. The diazo intermediate preferentially targets buried polar residues, many of which are inaccessible to known CXL probes because of the probes' limited reactivity. Moreover, we demonstrate that tuning light intensity and duration enhances selectivity towards polar residues by biasing diazo-mediated cross-linking reactions over carbene ones. This mechanistic dissection unlocks the full potential of PXL, paving the way for accurate distance mapping against protein structures and, ultimately, unveiling protein dynamic behaviors.
Introduction
Diazirine is a widely used functional group in photochemistry1,2,3,4,5, as its three-membered ring readily opens up upon irradiation6,7,8,9. In chemistry, diazirine has been used to modify small molecules and to build polymers10,11. Diazirine’s photo-reaction is triggered upon the irradiation at a wavelength of ~ 365 nm12, which is routinely employed in biological and chemical experiments13,14,15,16,17,18,19,20,21. As a result, diazirine chemistry has been employed in a multitude of biological applications, most notably, photo-labeling and photo-cross-linking (PXL) of proteins.
Photo-labeling is a process to conjugate a probe to a protein with irradiation, which has been used for receptor identification and drug discovery22,23,24,25. In PXL, a photo-cross-linker reacts with two protein residues in the vicinity26,27,28,29. PXL differs from photo-labeling as the probe has a second functional group for protein conjugation. For example, NHS-diazirine (SDA) is a common photo-cross-linker28, which contains an N-hydroxy-succinimide (NHS) group for chemical reaction with protein primary amine group and a diazirine group for photo-reaction.
The development of chemical cross-linking (CXL) has greatly contributed to the recent progress of structural proteomics. The probe used in CXL contains two chemically reactive functional groups for protein conjugation, e.g., the two NHS groups in bis-sulfo-succinimidyl suberate (BS3)30. Subsequently, the cross-linked peptides are identified by using high-resolution mass spectrometry31,32,33,34,35. For its easy implementation and standardized analysis workflow36, CXL has been increasingly used to assess the structure and dynamics of proteins and protein complexes37,38,39,40,41,42,43. However, CXL reactions typically occur in minutes, much slower than photo-reactions38. Moreover, the inter-residue distances from CXL are usually longer than those from PXL1,28,38,44,45, thus providing only weak structural constraints. Nevertheless, PXL reactions can be highly heterogeneous, yielding a mixture of products with protein residues23,46,47,48,49,50. The lack of a detailed and quantitative understanding of the photo-reaction mechanism and products has prevented the wide use of PXL for protein structure analysis.
Diazirine photo-reaction is the key to the success of photo-labeling and PXL applications18. Fluorine-substituted aryl diazirines upon irradiation have been shown to yield carbene intermediates that can be inserted into C-H or N-H bonds of adjacent residues19,51. Alkyl diazirines, in comparison, are less bulky and less hydrophobic than aryl ones. Sinz and coworkers have shown that in addition to the carbene intermediate, irradiation of alkyl diazirine can give rise to a diazo intermediate that can react with acidic residues48. More recently, Woo and coworkers have shown that diazirine-based compounds are preferentially labeled towards protein acidic patches in a pH-dependent manner47. As both diazo and carbene intermediates can be derived from diazirine upon irradiation10,52,53,54,55, how the two intermediates are generated, their relative yields and their preferences toward protein residues are yet to be established.
Commercially available lightboxes typically feature one or two broad-wavelength mercury bulbs with limited irradiation power. Moreover, the optical power density \(\left[{hv}\right]\) cannot be varied, so the relationship between photo-reaction and optical power density cannot be evaluated. These limitations hinder a deeper understanding of the photo-reaction mechanism of alkyl diazirine.
In this work, we build an instrument equipped with an array of LEDs and in-line mass spectrometry monitoring. Using this setup, we obtain detailed kinetic parameters and demonstrate distinct preferences of diazo and carbene towards protein side chains, thus allowing protein structure evaluations with residue-specific PXLs.
Results
Uncovering diazirine photolysis mechanism with a power-modulated photo-reaction system
When absorbing a photon, the diazirine group undergoes photolysis through one of the four mechanisms, as shown in Fig. 1. Alkyl diazirine A transforms to diazo intermediate B or directly to carbene intermediate C, with the kinetic rate constants of \({k}_{1}\left[{hv}\right]\) and \({k}_{3}\left[{hv}\right]\), respectively. B absorbs a second photon and transforms to C, with the kinetic rate constant of \({k}_{2}\left[{hv}\right]\). Thus, model I is a simplified version of model II, differing in the lack of a direct A-to-C process. Model III assumes that the B-to-C process occurs spontaneously without irradiation, while model IV involves no B-to-C process.
At a given \(\left[{hv}\right]\), theoretical curves for these four models can be plotted, in which the concentrations of A, B, and D exhibit time-dependent changes (Fig. 1e–h). Model I differs from model II with a lag in the buildup of D, while model IV predicts a single-exponential curve of D. Model III predicts a time-dependent change of D that is sensitive to optical power density \(\left[{hv}\right]\), and, therefore, would appear differently at different \(\left[{hv}\right]\) (see Supplementary Note).
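The qualitative difference between models I and II can be sketched numerically. The block below is a minimal illustration rather than the fitting code used in this work: it assumes that every light-driven step is first order with rate \(k\left[{hv}\right]\), that all material leaving A and B ends up in the carbene-derived pool (C and its downstream product), and the rate constants passed in are hypothetical placeholders. Setting `k3 = 0` gives model I.

```python
import math

def model_ii(t, power, k1, k2, k3, a0=1.0):
    """Analytical populations for A -> B -> C with a direct A -> C branch.

    Each step is treated as first order with rate k * power (assumption),
    mirroring the k[hv] notation in the text. Returns (A, B, C_total),
    where C_total lumps the carbene and whatever it goes on to form.
    """
    ka = (k1 + k3) * power          # total decay rate of diazirine A
    kb = k2 * power                 # decay rate of diazo B
    a = a0 * math.exp(-ka * t)
    if math.isclose(ka, kb):
        # degenerate case where both exponents coincide
        b = a0 * k1 * power * t * math.exp(-ka * t)
    else:
        b = a0 * k1 * power / (ka - kb) * (math.exp(-kb * t) - math.exp(-ka * t))
    c_total = a0 - a - b            # mass balance
    return a, b, c_total

# Hypothetical rate constants (cm2 mW-1 s-1) at 100 mW/cm2, for illustration only.
for t in (0.0, 5.0, 20.0, 60.0):
    print(t, model_ii(t, 100.0, k1=0.9e-3, k2=0.3e-3, k3=0.1e-3))
```

With `k3 = 0` the carbene-derived pool starts with zero slope, which is the lag that distinguishes model I from model II in the theoretical curves.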
To identify which model best describes the photolysis mechanism of alkyl diazirine, we built a real-time photo-reaction system coupled with in-line mass spectrometry (MS) detection (Supplementary Fig. 1a). In this setup, the sample is injected at a constant flow into a PFA tube arranged in a zigzag pattern and reacts continuously under irradiation. An array of light-emitting diodes (LEDs) emitting at 365 nm is arranged to produce a uniform irradiation field covering the PFA tube (Supplementary Fig. 1b). The reaction mixture is subsequently injected into a coupled triple-quadrupole MS for real-time multiple reaction monitoring (MRM) analysis, allowing for highly specific and sensitive label-free quantification.
A pulse-width modulation scheme was used to manipulate the photo-reaction time t, as the sample flows through the PFA tube of a fixed length at a constant flow rate (Supplementary Fig. 1c). As the photo-reaction occurs in real-time with a constant flow scheme, no internal standard is needed56,57,58. On the other hand, the optical power density \(\left[{hv}\right]\) can be adjusted, providing a second dimension for the establishment of the kinetic mechanism (Supplementary Fig. 1d). Together, a large number of experimental observables can be obtained.
Diazo, not carbene, is the main intermediate upon alkyl diazirine photolysis
To dissect the photolysis mechanism of alkyl diazirine, we used sulfo-SDA for its extra sulfonate group and excellent MRM signal in the negative ion mode. We monitored the photo-reaction in real-time with MRM-MS and found that the time-dependent increase of the MS signal for D cannot be fit to a single-exponential function (Fig. 2a), which allowed us to exclude model IV. On the other hand, increasing optical power density \(\left[{hv}\right]\) from 101 mW/cm2 to 242 mW/cm2 caused little change to the profile of D (Supplementary Fig. 2), which led to the exclusion of model III.
In either model I or model II, the diazo intermediate B, isomeric to A, is quickly converted to the carbene intermediate C with the loss of a nitrogen molecule. The production and subsequent disappearance of diazo intermediate B can be monitored with in-line NMR spectroscopy based on the characteristic peak of the methyl group (Supplementary Fig. 3). Moreover, the diazo intermediate can be captured with methyl acrylate to generate a pyrazole product that can be confirmed with MS (Supplementary Fig. 4). To further distinguish between model I and model II, we assessed the time-dependent change of [A], which should follow a single-exponential decay (Fig. 1e, f), while the exponent \(({k}_{1}+{k}_{3})\left[{hv}\right]\) could be determined by varying optical power density \(\left[{hv}\right]\) (Fig. 2b). We also determined the second exponent \({k}_{2}\left[{hv}\right]\) with linear regression over different optical power densities, and found it to be smaller than \(({k}_{1}+{k}_{3})\left[{hv}\right]\) (Fig. 2b).
We determined the values of \({k}_{1}\left[{hv}\right]\) and \({k}_{3}\left[{hv}\right]\) by varying \(\left[{hv}\right]\) and obtained the ratio of \({k}_{1}/({k}_{1}+{k}_{3})\) (Fig. 2c). The \({k}_{1}/({k}_{1}+{k}_{3})\) ratio decreases slightly from about 0.92 to 0.85 at increasing optical power density \(\left[{hv}\right]\), meaning that an A-to-B process is dominant for the consumption of alkyl diazirine. As such, model II best describes the photolysis mechanism of sulfo-SDA (Fig. 1).
Alkyl diazirine can selectively react with polar residues
Reactions were performed between SDA and AXA tripeptide (also denoted as HY, for the elementary reactions discussed below), with X representing any amino acid (Fig. 3a), at different optical power densities \(\left[{hv}\right]\) and irradiation times t. The tripeptide is N-terminally acetylated and C-terminally methylated (Supplementary Fig. 5), and therefore, the residue X largely mimics that in a protein59,60. The production of SDA-HY is mediated by either the diazo or the carbene intermediate, and comprises five elementary processes defined as \({\left[\mathbf{SDA\text{-}HY}\right]}_{\left\{\left[h\nu \right],t\right\}}={\left[\mathbf{SDA\text{-}HY}\right]}_{\left\{\left[h\nu \right],t\right\}}^{\mathrm{B}}+{\left[\mathbf{SDA\text{-}HY}\right]}_{\left\{\left[h\nu \right],t\right\}}^{\mathrm{C}}={b}_{1}{I}_{{b}_{1}}+{b}_{2}{I}_{{b}_{2}}+{b}_{3}{I}_{{b}_{3}}+{b}_{4}{I}_{{b}_{4}}+c{I}_{c}\). The reaction involving the diazo intermediate B proceeds through one of four mechanisms, in which \({I}_{{b}_{1}}\) and \({I}_{{b}_{2}}\) represent the direct reaction of B with the protonated form of the tripeptide, HY, and with the deprotonated form, Y-, respectively, whereas \({I}_{{b}_{3}}\) and \({I}_{{b}_{4}}\) represent the corresponding proton-catalyzed processes, respectively. The carbene intermediate C reacts with HY via carbene insertion, and \({I}_{c}\) is the corresponding function for the production of SDA-HY, related to optical power density \([h\nu ]\) and irradiation time t.
With a fixed irradiation time t, all four sub-reactions of B maximize at a particular optical power density. This is because a larger \(\left[{hv}\right]\) causes more conversion of B to C, thus decreasing the concentration of B. The maxima are more pronounced for the proton-catalyzed \({I}_{{b}_{3}}\) and \({I}_{{b}_{4}}\) processes than for the non-proton-catalyzed \({I}_{{b}_{1}}\) and \({I}_{{b}_{2}}\) processes (Fig. 3b). On the other hand, the carbene-mediated reaction \({I}_{c}\) increases monotonically with \(\left[{hv}\right]\). Thus, owing to their different dependence on optical power density, the relative contributions of these five elementary processes, given by the \({b}_{1}\), \({b}_{2}\), \({b}_{3}\), \({b}_{4}\), and c values, to the production of SDA-HY can be evaluated experimentally.
The experimental data and fitted curves for ASA and AIA tripeptides are shown in Fig. 3c, and those for all other residues in Supplementary Fig. 6; the fitted values of \({b}_{1}\), \({b}_{2}\), \({b}_{3}\), \({b}_{4}\), and c are provided in Supplementary Table 1. A positive value of \(({b}_{1}+{b}_{2}+{b}_{3}+{b}_{4}-c)\) indicates a predominantly diazo-mediated production of SDA-HY, while a negative value indicates a carbene mechanism (Fig. 3d). Aliphatic and non-polar residues such as Gly, Ala, Val, Leu, Ile, and Met exhibit \(({b}_{1}+{b}_{2}+{b}_{3}+{b}_{4}-c)\) values close to −1, whereas polar residues such as Ser and Thr exhibit values close to 1. Note that hydrophilic residues such as Gln and Asn also exhibit negative values due to the lack of nucleophiles. Interestingly, though the carbene intermediate can react with the Tyr side-chain, the diazo intermediate reacts preferentially either with Tyr-O- or, to a smaller extent, with Tyr-OH in a proton-catalyzed manner. We could identify the carboxylate-modified product of AEA tripeptide with one-dimensional proton NMR through the formation of a new ester bond (Supplementary Fig. 7), whereas no such discrete product could be identified for the photo-adduct with AIA tripeptide, owing to the highly heterogeneous carbene insertion (Supplementary Fig. 8).
To what extent the diazo or carbene intermediate is involved in the reaction also determines the overall yield of the photo-adduct. The yield of diazo-mediated SDA-HY maximizes at a particular optical power density \(\left[{hv}\right]\), whereas the yield of carbene-mediated adduct increases monotonically with \(\left[{hv}\right]\). Moreover, a long irradiation time t, which has been a common practice in batch PXL experiments26,61,62, would lead to excessive generation of carbene intermediate (Fig. 3b). As such, to improve the selectivity for polar residues, a relatively large optical power density \(\left[{hv}\right]\) and a relatively short irradiation time t should be used (Fig. 4 and Supplementary Fig. 9).
We then assessed the absolute yield of SDA-HY based on the relative decrease of peptide MS signal. With the irradiation optimized for polar residues, most polar residues have a yield close to 100% within 2 min of irradiation. In contrast, the conversion rates for non-polar residues are much lower, with the yield of Ile and Val ~ 50-fold lower (Fig. 3e). It should be noted that though Ala-appended AXA tripeptides are intended to mimic a protein, only enhanced cross-linking above the yield observed for AAA tripeptide can be considered for authentic contribution from the X residue. Interestingly, the yield of the ATA adduct is much lower than other polar residues, which is likely due to steric hindrance from the vicinal methyl group.
We further assessed the conversion rates of tripeptides in water-DMSO mixed solvent. Water readily reacts with either diazo or carbene intermediates, leading to a large drop in the yield of SDA-AXA. Yet, the photo-adduct with polar residues generally gives a higher yield than that with non-polar residues (Supplementary Fig. 10). Note that the yield is almost 0 for ASA adduct at 99% water content, as the Ser hydroxyl group is out-competed by water. Thus, relatively buried residues in a protein should be better placed for diazirine reactions.
Residue-specific PXL allows for accurate distance mapping of protein structure
We performed PXL experiments using SDA for nine model proteins. The reaction was carried out in two steps: first, the cross-linking reagent reacts with protein Lys residues in an amine-free buffer, and second, irradiation with 365-nm light at a set power initiates the photo-reaction between the alkyl diazirine group and an adjacent protein residue. Note that chemical cross-linking not only enhances the local effective concentration for subsequent photo-cross-linking but also partially shields the photolysis intermediate from the solvent.
We assigned the cross-linked peptide spectra using pLink263, a search engine that has been primarily used for the analysis of CXL spectra. The abundance of cross-linked Asp, Glu, and Tyr residues far exceeds their natural abundance, especially at a low optical power density \(\left[{hv}\right]\) (Fig. 5a). The number of peptide-spectrum matches (PSMs) follows a similar trend, as the percentage of PSMs for polar residues generally exceeds the corresponding occurrence in the proteins but decreases at a higher optical power density (Supplementary Fig. 11). That the specific PXLs for polar residues are mediated by the diazo intermediate was further confirmed with the identification of α ion fragments for the esterified peptide, which could be cleaved from the Lys side-chain of the cross-linked peptides (Fig. 5b and Supplementary Fig. 12).
The differential utilization of the diazirine photolysis intermediates is also evidenced by the analysis of the loop-links. The loop-links involve fewer polar residues, whose percentage is generally lower than the corresponding natural abundance (Fig. 5c and Supplementary Fig. 11). This is because many non-polar residues are simply nearby and ready for loop-link reactions, even though the carbene intermediate is short-lived and susceptible to water quenching. Interestingly, the loop-links involving Tyr are extremely rare, which not only confirms preferential utilization of the diazo intermediate but may also be due to the unfavorable geometry of the bulky Tyr side-chain.
We then calculated the Cα-Cα distances of the cross-linked residues based on the protein structures (Fig. 5d). The Cα-Cα distances involving polar residues that readily react with the diazo intermediate are generally within the range permitted by the cross-linker (Supplementary Fig. 13). In contrast, a large proportion of the cross-links involving non-polar residues give calculated Cα-Cα distances exceeding the corresponding maximum distance (Supplementary Figs. 14 and 15). Gly, Ala, and Ile residues represent some extreme cases, with the calculated Cα-Cα distances over-length by 10 Å or more (Fig. 5d). As the hydrophobic residues are poorly reactive with alkyl diazirine through the carbene mechanism, especially upon water exposure (Fig. 3e and Supplementary Fig. 9), one explanation is that these residues are incorrectly assigned by the search engine designed for CXL. Indeed, the precursor mass errors of the assigned peptides are somewhat larger for cross-links involving non-polar residues than for those involving polar residues (Supplementary Fig. 16).
The calculated distances for PXLs involving Glu, Tyr, and Asp are mostly consistent with the protein structures. In contrast, a large proportion of the PXLs involving Thr were found to be over-length (Supplementary Figs. 14 and 15). Moreover, the cross-linked Thr residues are found in highly solvent-exposed regions as compared to the average solvent exposure of Thr in the test proteins (Supplementary Fig. 17). The water-quenching experiment indicated that a solvent-exposed hydroxyl group is unlikely to out-compete water for the diazo intermediate (Supplementary Fig. 10). Even in a water-free solution, SDA photo-reaction with Thr has a much lower yield than with other polar residues (Fig. 3e). Thus, to minimize false positives, PXLs involving Thr residues are better not used as structural restraints.
Indeed, relatively buried residues are more likely to be cross-linked, affording high-quality structural restraints. On one hand, Lys residues with a relatively small solvent exposure are more shielded from the solvent, and therefore, once the Lys is chemically conjugated with the cross-linker, the diazo intermediate is more likely to cross-link to adjacent polar residues without being quenched. For example, residues K245 and K297 in BSA are both located in a helix, with K245 more buried (Fig. 5e). As a result, over twice as many PXL residue pairs and matched spectra are found for K245 as for K297. On the other hand, Tyr residues are often buried and largely shielded from water (Supplementary Fig. 17), which accounts for the high abundance of Tyr cross-links.
Together, mechanistic dissection of alkyl diazirine photo-reactions significantly expands the repertoire of cross-links, which now includes polar residues of Glu, Tyr, Ser, Asp, as well as Arg and His, though with smaller occurrence. These unique high-quality specific cross-links allow for distance mapping against the protein structure (Fig. 5f and Supplementary Fig. 18), and also allow for the identification of truly over-length cross-links arising from protein dynamics. In fact, a close inspection revealed that the over-length PXLs involving polar residues could be indicative of transient domain closure of the test proteins64.
Discussion
In this work, we have built a power-modulated photochemical reaction system in which the optical power density can be adjusted from 0.1 mW/cm2 to nearly 300 mW/cm2. For 365 nm light passing through a sample of 1 cm thickness, about 30 µM/s of photons is expected at an optical power density of 10 mW/cm2. Considering possible solvent absorption, the density of the photons should be higher than the concentration of the photo-reactive group. As such, our setup enabled us to perform in-line real-time monitoring of the photo-reaction, which cannot be done with a commercial lightbox. Using our setup, we have dissected the detailed mechanism of the photolysis of alkyl diazirine and subsequent reactions with protein residues, and obtained residue-specific PXLs that can be mapped to protein structures.
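The photon delivery rate quoted above follows from the photon energy at 365 nm. The short calculation below is a sketch of that arithmetic; the function name and the simplification that all incident photons are spread evenly over the 1 cm path are our own assumptions for illustration.

```python
H = 6.62607015e-34      # Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
N_A = 6.02214076e23     # Avogadro constant, 1/mol

def photon_delivery_uM_per_s(power_mw_cm2, wavelength_nm=365.0, path_cm=1.0):
    """Photons delivered per second, expressed as a molar rate in µM/s.

    Assumes the light entering 1 cm2 of surface is distributed over a
    column of depth path_cm (a simplification for illustration).
    """
    energy_per_photon = H * C / (wavelength_nm * 1e-9)            # J
    photons_per_cm2_s = power_mw_cm2 * 1e-3 / energy_per_photon   # 1/(cm2 s)
    mol_per_cm2_s = photons_per_cm2_s / N_A
    # 1 cm2 x path_cm is path_cm mL, i.e. path_cm * 1e-3 L
    mol_per_l_s = mol_per_cm2_s / (path_cm * 1e-3)
    return mol_per_l_s * 1e6

print(round(photon_delivery_uM_per_s(10.0), 1))
```

At 10 mW/cm2 this evaluates to roughly 30.5 µM/s, in line with the estimate in the text, and the value scales linearly with optical power density.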
We have shown that upon the irradiation of alkyl diazirine, diazo and carbene intermediates are generated largely in sequential order. The diazo intermediate preferentially and specifically reacts with protein polar residues. In comparison, the carbene intermediate reacts with non-polar residues, yielding heterogeneous products. The selectivity toward polar residues can be further enhanced with an optimal combination of optical power density \(\left[{hv}\right]\) and irradiation time t, whereas prolonged irradiation would result in an increased formation of carbene intermediate and a decreased selectivity for polar residues.
Through the dissection of the photo-reaction mechanism of alkyl diazirine, we have demonstrated that PXLs for polar residues can be quantitatively analyzed in a similar fashion to what has been established for the analysis of CXLs. Significantly, the cross-linked residues now include Glu, Asp, and Tyr with high confidence. Moreover, the cross-linked residues are relatively buried and thus are likely localized in protein-ordered regions. As CXLs preferentially involve residues in protein-disordered regions65, the PXLs can provide more stringent constraints of protein structure. On the other hand, metabolically incorporated photo-Leu residues are often deeply buried in the protein hydrophobic core and, therefore, may only react with adjacent hydrophobic residues through the carbene mechanism1. This would result in the production of heterogeneous products, as we have shown, making a quantitative interpretation of the PXLs difficult.
The correct assignment of the buried polar residues allows for distance mapping and evaluation of protein structures individually or on a proteomics scale. Residue-specific assignment of PXLs at the protein interface can also be obtained to construct protein complex structures, complementing AI-based structural modeling43,66,67. Moreover, a plethora of accurate distance restraints would cross-validate one another, thus enabling the identification of genuinely over-length PXLs incompatible with the known protein structures, which can be a manifestation of protein dynamics68,69. Lastly, as the photo-reaction can be modulated with temporal precision thanks to the fast reaction kinetics, we envision that PXL can be used for capturing time-resolved intermediate protein conformations along a reaction trajectory.
Methods
Reagents and materials
NHS-diazirine (SDA) was purchased from Bidepharm (Shanghai, China, catalog number 1239017-80-1), and sulfo-NHS-diazirine (sulfo-SDA) from Thermo Fisher Scientific (Shanghai, China, catalog number 26173). AXA tripeptides (X stands for any amino acid, with the N-terminus acetylated and the C-terminus methylated) were ordered from TGpeptide Biotechnology (Nanjing, China) for X = Ala, Asp, Asn, Cys and Lys, and from Genscript Biotech (Nanjing, China) for the others. The model proteins used include (1) non-structural protein 5, also known as the main protease, from SARS-CoV-2 (UniProt ID P0DTD1, PDB code 6XB0)70, (2) glutathione S-transferase class-mu 26 kDa isozyme (UniProt ID P08515, PDB code 1B8X), (3) p27 capsid from Rous sarcoma virus (RSV, UniProt ID P03322, PDB code 7NO0)71, (4) bovine serum albumin (UniProt ID P02769, PDB code 4F5S, purchased from Sigma Aldrich with catalog number 146897-68-9), (5) lactoferrin (UniProt ID Q6LBN7, AlphaFoldDB AF-Q6LBN7-F1, purchased from Fujifilm Wako Pure Chemical, Tokyo, Japan, with catalog number 146897-68-9), (6) monomeric ultra-stable GFP (UniProt ID P42212, PDB code 5JZK)72, (7) conalbumin from chicken egg (UniProt ID P02789, PDB code 2D3I, purchased from Sigma Aldrich with catalog number 1391-06-6)73, (8) proteasomal ubiquitin receptor ADRM1/Rpn13 (UniProt ID Q16186, PDB code 2KR0)74, and (9) adenylate kinase from E. coli (UniProt ID P69441, PDB code 1AKE)64. The non-commercially available proteins, namely proteins (1), (2), (3), (6), (8), and (9), were purified with established protocols in refs. 64,70,71,72,74,75.
Real-time automated photo-reaction with in-line monitoring
The sample was pumped using a Shimadzu Nexera XR LC-30AD pump at 0.200 mL/min. The solution was injected through an opaque PEEK tube into a transparent PFA tube (1/16-inch O.D., 0.5 mm I.D.; for a total length of 1 meter, 0.196 mL in volume) for photo-reaction. The programmable LED and adjustable constant current supply were customized by Lightwells (Shenzhen, China). The adjustable constant current supply and pulse-width modulated (PWM) signal were controlled with a Raspberry Pi 4B (https://www.raspberrypi.com). Upon photo-reaction in the PFA tube, the solution was continuously injected into a Shimadzu 8050 MS for real-time multiple-reaction monitoring (MRM) analysis (experimental parameters are provided in Supplementary Tables 2 and 3), following the design illustrated in Supplementary Fig. 1. LabSolutions (Version 5.91) from Shimadzu Corporation was used for data analysis.
Sulfo-SDA was dissolved in DMSO at 100 mM, and diluted with acetonitrile to a final concentration of 0.25 µM. The flow rate was set at 0.200 mL/min, so passage through the PFA tube takes 58.9 seconds (~ 1 min). The irradiation time t was linearly manipulated by PWM under a constant optical power density \(\left[{hv}\right]\). The reactant A and product D (Fig. 1) were monitored in the MRM mode, with the total ion chromatogram normalized before fitting. Time-dependent changes of A and D were fitted with exponential functions to obtain \(({k}_{1}+{k}_{3})\left[{hv}\right]\) and \({k}_{2}\left[{hv}\right]\); the photo-reactions and measurements were repeated at least three times.
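The tube volume and transit time quoted above can be checked from the tube geometry and flow rate. The sketch below also maps a PWM duty cycle onto an effective irradiation time; that mapping (duty cycle multiplied by transit time) is our reading of "linearly manipulated by PWM" and should be treated as an assumption, as should the function names.

```python
import math

def tube_volume_ml(inner_diameter_mm, length_m):
    """Internal volume of a cylindrical tube in mL (1 cm3 = 1 mL)."""
    radius_cm = inner_diameter_mm / 10.0 / 2.0
    length_cm = length_m * 100.0
    return math.pi * radius_cm ** 2 * length_cm

def residence_time_s(volume_ml, flow_ml_per_min):
    """Time the sample spends in the tube at a constant flow rate."""
    return volume_ml / flow_ml_per_min * 60.0

def irradiation_time_s(duty_cycle, residence_s):
    """Effective irradiation time, assuming t = duty cycle x transit time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty cycle must lie between 0 and 1")
    return duty_cycle * residence_s

volume = tube_volume_ml(0.5, 1.0)         # about 0.196 mL
transit = residence_time_s(volume, 0.200)  # about 58.9 s
print(round(volume, 3), round(transit, 1), irradiation_time_s(0.5, transit))
```

Under this assumption, a 50% duty cycle corresponds to roughly 29 s of irradiation per pass through the tube.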
The NMR data was collected on a Bruker Avance III 500 MHz Spectrometer (B1 = 500.13 MHz), equipped with a 5.0 mm Probe head (BBO 500S1 BBF-H-D-05 Z SP). NMR signal was collected in-line with the photo-reaction system; 72 transients were collected, with 57344 points in the time domain, 12 ppm in spectral width, 2.5 ppm for the transmitter frequency offset, 1.0 s in relaxation delay, 4.78 s in total acquisition time, and 64 in receiver gain.
The short-lived diazo intermediate B was captured with methyl acrylate. SDA and methyl acrylate were dissolved in deuterated DMSO to a final concentration of 10 mM. The solution was irradiated under 2.8 mW/cm2 of 365 nm light for 30 min. The captured product, methyl 5-(3-((2,5-dioxopyrrolidin-1-yl)oxy)-3-oxopropyl)-5-methyl-4,5-dihydro-1H-pyrazole-3-carboxylate, was confirmed with a Fourier transform ion cyclotron resonance mass spectrometer (Solarix XR, Bruker), affording m/z of 312.118 ([M + H]+), 334.101 ([M + Na]+), and 350.075 ([M + K]+).
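The reported m/z values can be cross-checked against the monoisotopic mass of the captured product. The sketch below assumes the elemental composition C13H17N3O6, which we derived from the systematic name given above rather than from a stated formula; the helper names are illustrative.

```python
# Monoisotopic masses (u) of the relevant isotopes and the electron.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ADDUCT = {"H": 1.00782503, "Na": 22.98976928, "K": 38.96370649}
ELECTRON = 0.00054858

def mono_mass(composition):
    """Monoisotopic mass of a neutral molecule from an element-count dict."""
    return sum(MONO[element] * count for element, count in composition.items())

def mz_singly_charged(composition, adduct):
    """m/z of [M + adduct]+ (adduct atom mass minus one electron)."""
    return mono_mass(composition) + ADDUCT[adduct] - ELECTRON

# Composition inferred from the product name (assumption): C13H17N3O6
product = {"C": 13, "H": 17, "N": 3, "O": 6}
for adduct in ("H", "Na", "K"):
    print(adduct, round(mz_singly_charged(product, adduct), 3))
```

The computed values agree with the three reported adduct peaks to within a few thousandths of an m/z unit.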
SDA and AXA tripeptides [HY] were separately dissolved in DMSO at 100 mM and diluted with acetonitrile to final concentrations of 5 µM and 1 µM, respectively, and were then mixed. Note that all the peptides can be readily dissolved. The solution was irradiated at an optical power density of 100 mW/cm2 for 2 min. The concentrations of [HY] were monitored in real-time by MRM analysis on the Shimadzu 8050 MS as described. The yield was determined from the conversion rate, i.e., the decrease of the MRM signal of HY. The measurement was repeated three times for each peptide. To assess the water quenching effect, water was mixed with DMSO to the desired water content (v/v) for photo-reaction.
Mathematical analysis for the kinetic parameters of diazirine photolysis
The normalized intensity of A (sulfo-SDA) over irradiation time t was fitted with \({I}_{A}={\left[A\right]}_{0}\exp \left(-\left({k}_{1}+{k}_{3}\right)\left[h\nu \right]t\right)+{I}_{0}\), in which \({\left[A\right]}_{0}\) is the initial concentration and \({I}_{0}\) is the offset. The detailed deduction for the production of D is provided in the Supplementary Note. At an irradiation time t ≥ 20 s, with the assumption of \({k}_{2}\) < \({k}_{1}+{k}_{3}\), the real-time concentration of D can be represented as \({I}_{D}={\left[A\right]}_{0}\left(1-{D}_{0}\exp \left(-{k}_{2}\left[h\nu \right]t\right)\right)+{I}_{0}\).
Upon obtaining \(\left({k}_{1}+{k}_{3}\right)\left[h\nu \right]\) and \({k}_{2}\left[h\nu \right]\) at each setting, \(\left({k}_{1}+{k}_{3}\right)\) and \({k}_{2}\) were determined by linear regression against the optical power density [hν].
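The two-stage analysis can be illustrated with synthetic data. The sketch below is not the fitting code used in this work: it assumes a zero offset (\({I}_{0}=0\)) so that the decay of A can be linearized by taking logarithms, and the rate constant, power levels, and time points are invented for the example.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def apparent_rate(times, intensities):
    """Apparent first-order rate of a decay with zero offset (assumption)."""
    slope, _ = fit_line(times, [math.log(i) for i in intensities])
    return -slope

# Synthetic, noise-free example; all numbers below are hypothetical.
k_true = 1.0e-3                                  # (k1 + k3), cm2 mW-1 s-1
powers = [50.0, 100.0, 150.0, 200.0, 250.0]      # mW/cm2
times = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]      # s

# Stage 1: one apparent rate (k1 + k3)[hv] per optical power density.
rates = []
for power in powers:
    decay = [math.exp(-k_true * power * t) for t in times]
    rates.append(apparent_rate(times, decay))

# Stage 2: the slope of rate versus power density gives (k1 + k3).
k_fit, intercept = fit_line(powers, rates)
print(k_fit, intercept)
```

With real data the offset would need to be fitted or subtracted first, and a non-zero intercept in the second stage would hint at a light-independent contribution.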
Quantitative analysis of photo-cross-linking (PXL) reaction with proteins
The model proteins were dissolved in SEC buffer (20 mM HEPES, 150 mM NaCl, pH 7.8) to a final concentration of 600 µg/mL, and were divided into 100 μL aliquots. For the NHS-ester reaction with primary amine, the samples were incubated in the dark after the addition of 1 μL sulfo-SDA solution. To quench the chemical cross-linking reaction, 4 μL of 1 M Tris-HCl at pH 7.8 was added to the sample. Photo-cross-linking reactions were performed with various optical power densities \(\left[{hv}\right]\) and irradiation times t. Immediately after irradiation, 1 mL of pre-chilled acetone was added to each sample, which was placed at −20 °C overnight to fully precipitate the protein; the precipitate was then collected by centrifugation at 15,000 rpm for an hour.
The precipitate was dissolved in 8 M urea and 0.1 M Tris-HCl at pH 8.5, reduced with 5 mM DTT at 25 °C for 10 min and alkylated with 10 mM iodoacetamide in the dark for 15 min. Subsequently, 3 volumes of Tris-HCl containing 1 mM CaCl2 (to suppress chymotrypsin activity) and 20 mM methylamine (to reduce carbamylation at the N-terminus of the peptide) were added to the sample. Trypsin digestion was carried out at 37 °C overnight with sequencing-grade trypsin (Promega, at a mass ratio of 1:20). The reaction was quenched with trifluoroacetic acid at a final concentration of 5%.
Trypsin-digested peptides were purified with C18 spin tips (Thermo Fisher) and were analyzed on the Orbitrap Fusion Lumos mass spectrometer (Thermo Fisher) coupled to an EASY-nLC 1200 liquid chromatography system, with a 75 μm, 2 cm Acclaim PepMapTM 100 column. The peptides were eluted using a 65 min linear gradient from 95% buffer A (water with 0.1% formic acid) to 35% buffer B (acetonitrile with 0.1% formic acid) at a flow rate of 200 nL/min. Each full MS scan (at a resolution of 70,000) was followed by 15 data-dependent MS2 scans (at a resolution of 17,000), with high-energy collisional dissociation set to 30 and an isolation window of 1.6 m/z. Precursors of charge state ≤ 3 were collected for MS2 scans in the enumerative mode; precursors of charge states of 3–6 were collected for MS2 scans in the cross-link discovery mode. Mono-isotopic precursor selection was enabled, and a dynamic exclusion window was set to 30 s.
The cross-linking data were analyzed with pLink 2 (ref. 32). The following search parameters were used: MS1 accuracy = ± 20 ppm; MS2 accuracy = ± 20 ppm; enzyme = trypsin (with full tryptic specificity but allowing ≤ 3 missed cleavages); cross-linker = SDA (with Lys as one of the cross-linked residues); fixed modifications = carbamidomethylation on cysteine; variable modifications = oxidation on methionine and acetylation at the N-terminus. A false discovery rate of < 5% was used. The α-fragment with an additional cleavage of the cross-linked peptide at the Lys isopeptide bond was identified manually.
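To make the ± 20 ppm mass accuracy concrete, the sketch below converts a ppm tolerance into an absolute m/z window and tests whether an observed value falls inside it. This is a generic illustration of ppm matching, not the internal logic of pLink 2; the function names and the example m/z values are hypothetical.

```python
def ppm_window(mz, tol_ppm=20.0):
    """Return the (low, high) m/z bounds of a symmetric ppm tolerance."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=20.0):
    """True if the observed m/z deviates from theory by at most tol_ppm."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


# Hypothetical precursor at m/z 1000: a 20 ppm tolerance spans +/- 0.02 m/z.
low, high = ppm_window(1000.0)
print(f"{low:.3f} to {high:.3f}")
print(within_tolerance(1000.015, 1000.0))  # 15 ppm deviation
print(within_tolerance(1000.030, 1000.0))  # 30 ppm deviation
```

Because the tolerance scales with m/z, the same 20 ppm setting corresponds to a wider absolute window for heavier precursors.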
Both the cross-linked residues/sites and the number of peptide-spectrum matches (PSMs) were used to compute the relative abundance of the PXLs. PXL experiments were repeated at least three times for each protein, and only cross-links that were identified in all three experiments were used for statistics. PXL data have been deposited at the ProteomeXchange Consortium (https://www.ebi.ac.uk/pride/) via the PRIDE partner repository, with the dataset identifier PXD048452.
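The replicate filter and the PSM-based abundance can be sketched as follows. The exact formula used for relative abundance is not specified here, so the sketch assumes that abundance is the fraction of retained PSMs contributed by each cross-link; the function names and the residue pairs in the example are hypothetical.

```python
from collections import Counter


def reproducible_links(replicates):
    """Sum PSM counts for cross-links observed in every replicate.

    `replicates` is a list of dicts, one per experiment, mapping a residue
    pair such as ("K6", "E24") to its PSM count in that experiment.
    """
    shared = set(replicates[0])
    for rep in replicates[1:]:
        shared &= set(rep)
    totals = Counter()
    for rep in replicates:
        for link in shared:
            totals[link] += rep[link]
    return totals


def relative_abundance(totals):
    """Fraction of retained PSMs per cross-link (assumed definition)."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {link: count / grand_total for link, count in totals.items()}


# Hypothetical PSM counts from three replicate experiments.
reps = [
    {("K6", "E24"): 4, ("K6", "L8"): 1},
    {("K6", "E24"): 5, ("K6", "L8"): 2, ("K11", "D32"): 1},
    {("K6", "E24"): 3, ("K6", "L8"): 1},
]
totals = reproducible_links(reps)
abundance = relative_abundance(totals)
print(abundance)
```

In this toy input the K11–D32 pair appears in only one replicate and is therefore excluded before abundances are computed.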
Cartesian distances between the Cα atoms of cross-linked residues in each PSM were computed from the known structures, assuming the proteins are strictly monomeric. The solvent-accessible surface area was calculated with the Python package freesasa (ref. 76). Structural figures were illustrated with PyMOL (version 3.6, Schrödinger LLC).
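A minimal sketch of the Cα–Cα distance calculation is given below. It is not the script used in this work: it assumes a PDB-format coordinate file with the standard fixed-column ATOM records and a single chain of interest (consistent with the monomer assumption), and the function names are hypothetical.

```python
import math


def ca_coordinates(pdb_text, chain="A"):
    """Return {residue number: (x, y, z)} for the CA atoms of one chain.

    Relies on the fixed-column layout of PDB ATOM records. Only the first
    CA seen per residue is kept, so alternate locations and later models
    of a multi-model file are ignored.
    """
    coords = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        if line[12:16].strip() != "CA" or line[21] != chain:
            continue
        resseq = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        coords.setdefault(resseq, xyz)
    return coords


def ca_distance(coords, res_i, res_j):
    """Euclidean CA-CA distance (in angstroms) between two residues."""
    return math.dist(coords[res_i], coords[res_j])
```

Given the text of a structure file, `ca_distance(ca_coordinates(pdb_text), i, j)` returns the separation between residues i and j, which can then be compared against the distance expected for the cross-linker.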
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The cross-link data generated in this study have been deposited in the ProteomeXchange Consortium (PRIDE) database under accession code PXD048452, and the tabulated CSV file of the PXLs identified for the test proteins upon irradiation at 35 mW/cm² for 10 min is also provided in the Source Data file. All other data are available from the corresponding author upon request. Source data are provided with this paper.
References
Stahl, K., Graziadei, A., Dau, T., Brock, O. & Rappsilber, J. Protein structure prediction with in-cell photo-crosslinking mass spectrometry and deep learning. Nat. Biotechnol. 41, 1810–1819 (2023).
Lenz, S. et al. Reliable identification of protein-protein interactions by crosslinking mass spectrometry. Nat. Commun. 12, 3564 (2021).
Bond, M. R., Zhang, H., Vu, P. D. & Kohler, J. J. Photocrosslinking of glycoconjugates using metabolically incorporated diazirine-containing sugars. Nat. Protoc. 4, 1044–1063 (2009).
Cuthbert, T. J. et al. Covalent functionalization of polypropylene filters with diazirine–photosensitizer conjugates producing visible light driven virus inactivating materials. Sci. Rep. 11, 19029 (2021).
Manzi, L. et al. Carbene footprinting accurately maps binding sites in protein–ligand and protein–protein interactions. Nat. Commun. 7, 13288 (2016).
Ahn, D.-S. et al. Mode-dependent fano resonances observed in the predissociation of diazirine in the S1 state. Angew. Chem. Int. Ed. 49, 1244–1247 (2010).
Park, Y. C., An, H., Lee, Y. S. & Baeck, K. K. Dynamic Symmetry Breaking Hidden in Fano Resonance of a Molecule: S1 State of Diazirine Using Quantum Wave Packet Propagation. J. Phys. Chem. A 120, 932–938 (2016).
Yamamoto, N. et al. Mechanism of Carbene Formation from the Excited States of Diazirine and Diazomethane: An MC-SCF Study. J. Am. Chem. Soc. 116, 2064–2074 (1994).
Procacci, B., Roy, S. S., Norcott, P., Turner, N. & Duckett, S. B. Unlocking a Diazirine Long-Lived Nuclear singlet state via photochemistry: NMR detection and lifetime of an unstabilized diazo-compound. J. Am. Chem. Soc. 140, 16855–16864 (2018).
Ollevier, T. & Carreras, V. Emerging applications of aryl trifluoromethyl diazoalkanes and diazirines in synthetic transformations. ACS Org. Inorg. Au 2, 83–98 (2022).
Li, M.-L., Yu, J.-H., Li, Y.-H., Zhu, S.-F. & Zhou, Q.-L. Highly enantioselective carbene insertion into N–H bonds of aliphatic amines. Science 366, 990–994 (2019).
Lepage, M. L. et al. A broadly applicable cross-linker for aliphatic polymers containing C–H bonds. Science 366, 875–878 (2019).
Suchanek, M., Radzikowska, A. & Thiele, C. Photo-leucine and photo-methionine allow identification of protein-protein interactions in living cells. Nat. Methods 2, 261–267 (2005).
Yang, T., Li, X.-M., Bao, X., Fung, Y. M. E. & Li, X. D. Photo-lysine captures proteins that bind lysine post-translational modifications. Nat. Chem. Biol. 12, 70–72 (2015).
Tanaka, Y. & Kohler, J. J. Photoactivatable crosslinking sugars for capturing glycoprotein interactions. J. Am. Chem. Soc. 130, 3278–3279 (2008).
Halloran, M. W. & Lumb, J. P. Recent applications of diazirines in chemical proteomics. Chem. Eur. J. 25, 4885–4898 (2019).
Das, J. Aliphatic diazirines as photoaffinity probes for proteins: recent developments. Chem. Rev. 111, 4405–4417 (2011).
Brunner, J., Senn, H. & Richards, F. M. 3-Trifluoromethyl-3-phenyldiazirine. A new carbene generating group for photolabeling reagents. J. Biol. Chem. 255, 3313–3318 (1980).
Musolino, S. F., Pei, Z., Bi, L., DiLabio, G. A. & Wulff, J. E. Structure-function relationships in aryl diazirines reveal optimal design features to maximize C-H insertion. Chem. Sci. 12, 12138–12148 (2021).
Zhang, M. et al. A genetically incorporated crosslinker reveals chaperone cooperation in acid resistance. Nat. Chem. Biol. 7, 671–677 (2011).
Li, X.-M., Huang, S. & Li, X. D. Photo-ANA enables profiling of host–bacteria protein interactions during infection. Nat. Chem. Biol. 19, 614–623 (2023).
Ruoho, A. E., Kiefer, H., Roeder, P. E. & Singer, S. J. The mechanism of photoaffinity labeling. Proc. Natl. Acad. Sci. USA 70, 2567–2571 (1973).
O’Brien, J. G. K., Jemas, A., Asare-Okai, P. N., Am Ende, C. W. & Fox, J. M. Probing the mechanism of photoaffinity labeling by dialkyldiazirines through bioorthogonal capture of diazoalkanes. Org. Lett. 22, 9415–9420 (2020).
Dubinsky, L., Krom, B. P. & Meijler, M. M. Diazirine based photoaffinity labeling. Bioorg. Med. Chem. 20, 554–570 (2012).
Bayley, H. & Knowles, J. R. in Methods Enzymol. Vol. 46 69–114 (Academic Press, 1977).
Müller, F., Graziadei, A. & Rappsilber, J. Quantitative photo-crosslinking mass spectrometry revealing protein structure response to environmental changes. Anal. Chem. 91, 9041–9048 (2019).
Belsom, A., Schneider, M., Fischer, L., Brock, O. & Rappsilber, J. Serum albumin domain structures in human blood serum by mass spectrometry and computational biology. Mol. Cell. Proteomics 15, 1105–1116 (2016).
Brodie, N. I., Makepeace, K. A. T., Petrotchenko, E. V. & Borchers, C. H. Isotopically-coded short-range hetero-bifunctional photo-reactive crosslinkers for studying protein structure. J. Proteomics 118, 12–20 (2015).
Petrotchenko, E. V., Nascimento, E. M., Witt, J. M. & Borchers, C. H. Determination of protein monoclonal–antibody epitopes by a combination of structural proteomics methods. J. Proteome Res. 22, 3096–3102 (2023).
Gong, Z. et al. Visualizing the ensemble structures of protein complexes using chemical cross-linking coupled with mass spectrometry. Biophys. Rep. 1, 127–138 (2015).
Zhang, W. et al. SpotLink enables sensitive and precise identification of site nonspecific cross-links at the proteome scale. Brief. Bioinform. 23, https://doi.org/10.1093/bib/bbac316 (2022).
Chen, Z.-L. et al. A high-speed search engine pLink 2 with systematic evaluation for proteome-scale identification of cross-linked peptides. Nat. Commun. 10, 3404 (2019).
Götze, M. et al. StavroX—A software for analyzing crosslinked products in protein interaction studies. J. Am. Soc. Mass Spectrom. 23, 76–87 (2012).
Rinner, O. et al. Identification of cross-linked peptides from large sequence databases. Nat. Methods 5, 315–318 (2008).
Petrotchenko, E. V. & Borchers, C. H. Crosslinking combined with mass spectrometry for structural proteomics. Mass Spectrom. Rev. 29, 862–876 (2010).
Petrotchenko, E. V. & Borchers, C. H. Protein chemistry combined with mass spectrometry for protein structure determination. Chem. Rev. 122, 7488–7499 (2022).
Piersimoni, L., Kastritis, P. L., Arlt, C. & Sinz, A. Cross-linking mass spectrometry for investigating protein conformations and protein–protein interactions─A method for all seasons. Chem. Rev. 122, 7500–7531 (2021).
Wang, J.-H. et al. Characterization of protein unfolding by fast cross-linking mass spectrometry using di-ortho-phthalaldehyde cross-linkers. Nat. Commun. 13, 1468 (2022).
Jian-Hua, W. et al. Fast cross-linking by DOPA2 promotes the capturing of a stereospecific protein complex over nonspecific encounter complexes. Biophys. Rep. 8, 239–252 (2022).
Kogut, M., Gong, Z., Tang, C. & Liwo, A. Pseudopotentials for coarse-grained cross-link-assisted modeling of protein structures. J. Comput. Chem. 42, 2054–2067 (2021).
Gong, Z., Ye, S.-X., Nie, Z.-F. & Tang, C. The conformational preference of chemical cross-linkers determines the cross-linking probability of reactive protein residues. J. Phys. Chem. B 124, 4446–4453 (2020).
Coffman, K. et al. Characterization of the raptor/4E-BP1 interaction by chemical cross-linking coupled with mass spectrometry analysis. J. Biol. Chem. 289, 4723–4734 (2014).
Yan, X. et al. AI-empowered integrative structural characterization of m6A methyltransferase complex. Cell Res. 32, 1124–1127 (2022).
Brodie, N. I., Petrotchenko, E. V. & Borchers, C. H. The novel isotopically coded short-range photo-reactive crosslinker 2,4,6-triazido-1,3,5-triazine (TATA) for studying protein structures. J. Proteomics 149, 69–76 (2016).
Wei, G. et al. Conformational ensemble of native α-synuclein in solution as determined by short-distance crosslinking constraint-guided discrete molecular dynamics simulations. PLoS Comput. Biol. 15, https://doi.org/10.1371/journal.pcbi.1006859 (2019).
Ziemianowicz, D. S., Bomgarden, R., Etienne, C. & Schriemer, D. C. Amino acid insertion frequencies arising from photoproducts generated using aliphatic diazirines. J. Am. Soc. Mass Spectrom. 28, 2011–2021 (2017).
West, A. V. et al. Labeling preferences of diazirines with protein biomolecules. J. Am. Chem. Soc. 143, 6691–6700 (2021).
Iacobucci, C. et al. Carboxyl-photo-reactive MS-cleavable cross-linkers: unveiling a hidden aspect of diazirine-based reagents. Anal. Chem. 90, 2805–2809 (2018).
Gutierrez, C. et al. Enabling photoactivated cross-linking mass spectrometric analysis of protein complexes by novel MS-cleavable cross-linkers. Mol. Cell. Proteomics 20, https://doi.org/10.1016/j.mcpro.2021.100084 (2021).
Hogan, J. M. et al. Residue-level characterization of antibody binding epitopes using carbene chemical footprinting. Anal. Chem. 95, 3922–3931 (2023).
Hashimoto, M. & Hatanaka, Y. Recent progress in diazirine-based photoaffinity labeling. Eur. J. Org. Chem. 2008, 2513–2523 (2008).
Zhang, Y., Burdzinski, G., Kubicki, J. & Platz, M. S. Direct observation of carbene and diazo formation from aryldiazirines by ultrafast infrared spectroscopy. J. Am. Chem. Soc. 130, 16134–16135 (2008).
Admasu, A. et al. A laser flash photolysis study of p-tolyl(trifluoromethyl)carbene. J. Chem. Soc. Perkin Trans. 2, 1093–1100 (1998).
Toscano, J. P., Platz, M. S. & Nikolaev, V. Lifetimes of simple ketocarbenes. J. Am. Chem. Soc. 117, 4712–4713 (1995).
Mix, K. A., Aronoff, M. R. & Raines, R. T. Diazo compounds: versatile tools for chemical biology. ACS Chem. Biol. 11, 3233–3244 (2016).
Cheng, S., Wu, Q., Xiao, H. & Chen, H. Online monitoring of enzymatic reactions using time-resolved desorption electrospray ionization mass spectrometry. Anal. Chem. 89, 2338–2344 (2017).
Fabry, D. C., Sugiono, E. & Rueping, M. Online monitoring and analysis for autonomous continuous flow self-optimizing reactor systems. React. Chem. Eng. 1, 129–133 (2016).
Attwood, P. V. & Geeves, M. A. Kinetics of an enzyme-catalyzed reaction measured by electrospray ionization mass spectrometry using a simple rapid mixing attachment. Anal. Biochem. 334, 382–389 (2004).
Beck, D. A. C., Alonso, D. O. V., Inoyama, D. & Daggett, V. The intrinsic conformational propensities of the 20 naturally occurring amino acids and reflection of these propensities in proteins. Proc. Natl. Acad. Sci. USA 105, 12259–12264 (2008).
Rosenberg, A. A., Yehishalom, N., Marx, A. & Bronstein, A. M. An amino-domino model described by a cross-peptide-bond Ramachandran plot defines amino acid pairs as local structural units. Proc. Natl. Acad. Sci. USA 120, e2301064120 (2023).
Belsom, A., Mudd, G., Giese, S., Auer, M. & Rappsilber, J. Complementary benzophenone cross-linking/mass spectrometry Photochemistry. Anal. Chem. 89, 5319–5324 (2017).
Iyer, L. K., Moorthy, B. S. & Topp, E. M. Photolytic cross-linking to probe protein–protein and protein–matrix interactions in lyophilized powders. Mol. Pharm. 12, 3237–3249 (2015).
Guangcan, S. et al. How to use open-pFind in deep proteomics data analysis?— A protocol for rigorous identification and quantitation of peptides and proteins from mass spectrometry data. Biophys. Rep. 7, 207–226 (2021).
Gong, Z., Gu, X.-H., Guo, D.-C., Wang, J. & Tang, C. Protein structural ensembles visualized by solvent paramagnetic relaxation enhancement. Angew. Chem. Int. Ed. 56, 1002–1006 (2017).
Zhang, B. et al. Decoding protein dynamics in cells using chemical cross-linking and hierarchical analysis. Angew. Chem. Int. Ed. 62, e202301345 (2023).
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
Pei, H.-H. et al. The δ subunit and NTPase HelD institute a two-pronged mechanism for RNA polymerase recycling. Nat. Commun. 11, 6418 (2020).
Walzthoeni, T. et al. xTract: software for characterizing conformational changes of protein complexes by quantitative cross-linking mass spectrometry. Nat. Methods 12, 1185–1190 (2015).
Ding, Y.-H. et al. Modeling protein excited-state structures from “over-length” chemical cross-links. J. Biol. Chem. 292, 1187–1196 (2017).
Jin, Z. et al. Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors. Nature 582, 289–293 (2020).
Campos-Olivas, R., Newman, J. L. & Summers, M. F. Solution structure and dynamics of the Rous sarcoma virus capsid protein and comparison with capsid proteins of other retroviruses. J. Mol. Biol. 296, 633–649 (2000).
Scott, D. J. et al. A novel ultra-stable, monomeric green fluorescent protein for direct volumetric imaging of whole organs using CLARITY. Sci. Rep. 8, 667 (2018).
Chen, X., Lee, B.-H., Finley, D. & Walters, K. J. Structure of proteasome ubiquitin receptor hRpn13 and Its activation by the scaffolding protein hRpn2. Mol. Cell 38, 404–415 (2010).
Liu, Z. et al. Structural basis for the recognition of K48-linked Ub chain by proteasomal receptor Rpn13. Cell Discovery 5, 19 (2019).
Bjorndahl, T. C., Andrew, L. C., Semenchenko, V. & Wishart, D. S. NMR Solution structures of the apo and peptide-inhibited human rhinovirus 3C protease (Serotype 14): structural and dynamic comparison. Biochemistry 46, 12945–12958 (2007).
Mitternacht, S. FreeSASA: An open source C library for solvent accessible surface area calculations. F1000Res. 5, https://doi.org/10.12688/f1000research.7931.1 (2016).
Acknowledgements
We thank the National Center for Protein Sciences at Peking University in Beijing, China, for assistance with mass spectrometry experiments. We thank Profs. Meng-Qiu Dong and Jianbo Wang for stimulating discussions, and thank Prof. Meng-Qiu Dong for the kind gift of GFP protein. This work has been supported by grants from the National Key R&D Program of China (2023YFF1204400 to C.T.) and the National Natural Science Foundation of China (92353304 to C.T. and 22161132013 to C.T.).
Author information
Contributions
Y. J. and C. T. designed the project; Y. J. and H. N. set up the MRM-MS analysis instrument; Y. J. collected the data, with assistance in NMR and MS from H. N., S. D., H. F., Xiu. Z., and L. W.; Y. J., Xinghe. Z. and J. F. analyzed the data; and Y. J. and C. T. wrote the manuscript, with comments from all other authors.
Ethics declarations
Competing interests
C.T., Y.J., and H.N. have filed a Chinese utility model patent application (ZL2024207219076) for the “Multidimensional In-line Photo-reaction Monitoring System” described in this work. All other authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Francis O'Reilly and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Jiang, Y., Zhang, X., Nie, H. et al. Dissecting diazirine photo-reaction mechanism for protein residue-specific cross-linking and distance mapping. Nat Commun 15, 6060 (2024). https://doi.org/10.1038/s41467-024-50315-y
DOI: https://doi.org/10.1038/s41467-024-50315-y