Introduction

The AAA+ ATPase p97 is an essential protein involved in numerous cellular processes and plays multiple key roles during protein homeostasis1,2. p97 consists of six identical monomers that assemble into a functional C6-symmetrical hexamer. Each monomer can be subdivided into an N-terminal domain, the Walker A and Walker B motif-containing ATPase domains D1 and D2, and an extended unstructured C-terminal tail (Fig. 1a). The ATPase functions convert the energy of ATP hydrolysis into mechanical energy required to extract ubiquitylated proteins from membranes, chromatin or macromolecular complexes1. The functional diversity of p97 is mediated through the interaction with a large number of distinct protein cofactors, which mainly interact with the N-terminal domain of p97 and also the C-terminal tail3,4. Cofactors binding to the N-domain interact with two different regions. UBX (ubiquitin regulatory X) or UBXL (UBX-like) domain-containing cofactors as well as cofactors harboring a VIM (VCP-interacting motif) or VBM (VCP-binding motif) bind into a cleft between the Nn- and Nc-subdomains of the N-domain. In contrast, cofactors with a SHP-motif interact with an extended region located in the Nc-subdomain5.

Point mutations in p97 lead to inclusion body myopathy associated with Paget’s disease of the bone and frontotemporal dementia6 and familial amyotrophic lateral sclerosis7, now collectively referred to as multisystem proteinopathy. Furthermore, due to its significant role in regulating a variety of physiological responses, especially as part of the ubiquitin-proteasome-system, p97 has emerged as a potential therapeutic target, in particular to treat cancer8. Inhibition of p97 leads to cell death in HCT116, A5499 and B-ALL10 tumor cells.

Based on their mechanism of action, known p97 inhibitors can be divided into three categories7,11,12: (i) Competitive D1 and/or D2 ATPase inhibitors (e.g., DBeQ13, ML240/24114, NMS-85915 or CB-508316); (ii) allosteric inhibitors of the ATPase function by a non-competitive mechanism (e.g. NMS 87317, UPCDC3024518 or MSC109430819); and (iii) inhibitors with an unknown mechanism of action (e.g. clotrimazol20 or eeyarestatin I21). Furthermore, inhibitors can be divided into non-covalent ligands and covalent binders, the latter mainly modifying residue C522 in the nucleotide-binding site of the D2 domain11,15. So far, only the natural product xanthohumol, occurring in hop plants, has been reported to bind to the N-domain of p9722.

As demonstrated by the failure of the CB-5083 inhibitor in clinical phase I trials12, inhibitors that competitively target the ATPase functions of p97 may exhibit limited selectivity due to the inhibition of other nucleotide-dependent enzymes. Furthermore, inhibition of the ATPase function leads to an indiscriminate loss of cellular p97 functions. In contrast, inhibition of cofactor binding to the N-domain by a protein-protein-interaction (PPI) inhibitor would enable the selective targeting of specific p97 functions. Beyond the long-term goal of using them as therapeutics, these inhibitors would also be important to dissect the molecular and cellular functions of p97 and its cofactors, thus helping to unravel how certain cofactors control p97 activity.

To identify chemical starting points for the development of PPI inhibitors targeting the N-domain of p97 (p97-N), we used a fragment-based screening approach (Fig. 1b). In contrast to screenings with drug-like compounds, the use of fragments (i.e., smaller compounds with 10–20 heavy atoms) can sample the regions of chemical space more efficiently23. A fragment-based approach is also best suited to address more challenging targets, such as interfaces of protein-protein complexes, and has already been used in the successful development of PPI inhibitors against IL-224, TNF-α or BCL-XL25. In the single hitherto reported fragment screening with p9726, six fragments binding to the N-domain were identified by surface plasmon resonance spectroscopy (SPR) and saturation transfer difference (STD)-NMR methods.

The screen reported here was conducted by biolayer-interferometry (BLI), followed by STD-NMR experiments. To map the binding sites of the identified fragments mixed-solvent molecular dynamics (MD) simulations were used as well as X-ray crystallography and 1H-15N-HSQC (heteronuclear single quantum coherence) NMR measurements to confirm the binding sites predicted by the simulations. Importantly, the first structure of a small molecule in complex with the N-domain of p97 could be obtained by X-ray crystallography, presenting a starting point for further optimization and structure-based drug design. Moreover, this study demonstrates that the BLI technique, which has been reported so far only sporadically in the literature27, is well suited for detecting fragments and is a viable option besides the commonly used SPR method.

Results

Initial biolayer interferometry screenings

As starting point for the initial screen, a BLI binding assay was developed with a p97-ND1 construct (amino acids 2-481) to allow validation with adenosine diphosphate (ADP) as a positive control. The p97-ND1 construct was enzymatically biotinylated via an Avi-Tag and Escherichia coli biotin ligase28 and loaded to high response (>5 nm) on Super Streptavidin Biosensors (SSA), yielding a homogeneous sensor surface. The affinity (KD) of ADP to p97-ND1 was measured to be 200 ± 14 nM using a Langmuir model based on steady state measurements of a dilution series. This value is in good agreement with reported affinities for ND1 constructs of similar lengths; e.g., a value of 200 nM was reported for a ND1 fragment spanning amino acids 1-458 and a value of 98 nM for a ND1 fragment spanning amino acids 1-48026,29. The kinetic analysis of the sensorgrams indicated a clear 1:1 binding model for ADP. The global exponential regression resulted in an affinity of 239 ± 9 nM, in agreement with previous results. A comparison of the kinetic rate constants determined by SPR measurements with a comparable ND1 construct29 revealed kon and koff rates in a similar range for ADP (see Supplementary Fig. 1). Another surprising result were the high responses upon ADP-binding with shifts over 0.7 nm. These high values indicate that the BLI signal is not only dependent on the molar weight as in SPR but may also reflect conformational changes induced in p97 upon ADP binding.

For hit discovery, a fragment library located at the Facility for Biophysics, Structural Biology and Screening at the University of Bergen (BiSS) containing 679 fragments was screened. Due to the lack of a positive control for p97-N, the fragment library was first screened with p97-ND1 to validate the experimental conditions. The screen resulted in an overall Z-value30 of 0.67, demonstrating its high quality (values between 0.5 and 1.0 represent excellent assays). A fraction of 82 fragments was identified above the chosen threshold (>1.0 sigma, Supplementary Fig. 2). Fragments showing abnormally high or negative signals or untypical shapes of the BLI curves were eliminated after visual inspection, leaving 48 compounds (7.0 %) for subsequent hit validation. Out of these, 29 fragments exhibited a dose-dependent response, demonstrating the functionality of the assay. Therefore, the same conditions were transferred to a screen with p97-N and the aforementioned library. 66 fragments were above the threshold (>1.0 sigma), and 42 compounds (6.5%) were selected for dose-response experiments after visual inspection of the sensorgrams (Supplementary Fig. 2).

Biolayer interferometry dose-response analysis of N-domain hits

The selected 42 fragments of the screen with the N-domain were tested in a dose-dependent manner using concentrations from 1000.0 µM to 31.3 µM in 1:1 dilution steps. 36 compounds exhibited dose-dependent response signals. While some sensorgrams displayed a behavior characteristic of unspecific binding, 22 curves displayed binding curves indicative of fragment binding (Supplementary Fig. 3). The sensorgrams of these fragments were fitted with a 1:1 Langmuir model to estimate their affinities and analyzed with respect to the following parameters: (i) The presence of a stationary phase; (ii) curve shapes according to either a homogeneous (1:1) or a heterogeneous (2:1) binding model, and (iii) justification of a 1:1 binding model via the obtained empirical constants kobs of the association phase. Based on these analyses, a ScoreBLI was calculated (for details see Supplementary Methods in the Supporting Information). The calculated ScoreBLI should be close to or above 1.0 for ligands that can be described with a 1:1 model, whereas ligands showing unspecific binding yield values << 1.0. Based on the ScoreBLI, we divided the fragments into a low-scoring (<0.70) and a high-scoring (≥0.70) group.

Confirmation of selected fragments by STD-NMR

To confirm the hits of the BLI screening, fragments with p97-N KD values below 700 µM were investigated by STD-NMR. A total of 19 fragments satisfied this affinity cutoff. The STD-NMR method provides insights into the transient binding of a small molecule to a protein in solution. Therefore, the method is suitable to eliminate false positive hits arising from the immobilization of the target protein in the BLI-based screen. However, one has to keep in mind that the strength of the STD effect does not directly correlate with the affinity of a molecule and is influenced by other factors as well31. Accordingly, the STD-NMR measurements were not used for rank-ordering or affinity-based priorization.

The NMR measurements were carried out with the biologically more relevant p97-ND1 construct in the presence of 500 µM ADP to block the nucleotide binding site. The STD effect was calculated by dividing the intensity of the difference spectrum by the intensity of the off-spectrum (Idiff/ Ioff × 100) for each signal32. For further analysis only resonances in the aromatic region of the 1H NMR spectrum of the individual ligand were considered. A mean STD effect of each fragment interacting with p97-ND1 was calculated using the sum of the individual STD effects divided by the number of signals considered. Furthermore, the maximum STD effect of each compound was determined. Three out of the 19 fragments exhibited no significant effect (S/N < 3.0) and were therefore not confirmed by the STD-NMR assay. The remaining fragments showed maximum STD effects in a range from 10.5 % (TROLL9) up to 72.5 % (TROLL6). Similar results were obtained in the previously reported fragment screen with p9726. An overview of the STD effects is shown in Supplementary Fig. 4.

Selection of the identified fragments for further analysis

For further selection of the fragments the ScoreBLI was used as criterion. All fragments showing a ScoreBLI value greater than 0.7 were further investigated, resulting in 13 molecules (Supplementary Fig. 4).

To rule out the possibility that the fragments bind to regions of the N-domain that are normally blocked by the D1-domain of p97, dose-dependent measurements were carried out with the ND1 construct, again using the BLI method. For 12 fragments, a dose-dependent binding was observed and a 1:1 Langmuir model could be applied to obtain the respective KD value. The determined KD(p97-ND1) values mostly agreed well with those previously determined for the respective fragments with the isolated N-domain. Only TROLL1 showed no response with the p97-ND1 construct, and VIK40 bound the ND1 construct with only low affinity (KD > 700 µM). TROLL1 and VIK40 were therefore not considered further.

The sensorgrams of the fragments showed surprisingly slow on and off rates as mentioned above for the binding of ADP. Therefore, based on the kinetic data the affinity was determined and plotted against the affinity deduced from the steady state analysis (Supplementary Fig. 5). 7 fragments showed comparable affinities indicating that the observed slow on and off rates result from the BLI assay. 3 fragments (TROLL13, TROLL14 and TROLL18) showed a higher discrepancy for the calculated affinities with up to 3.5 times lower affinity deduced from the kinetic data. TROLL10 showed the highest discrepancy with a 9.5 times lower affinity and a poor fit to the underlying model and was therefore rejected. The remaining 10 fragments were subjected to additional experimental characterization. An overview of these 10 fragments is given in Table 1. Figure 2 shows the sensorgrams and Langmuir regressions for four fragments of particular interest for the subsequent analyses (as described below).

Chemical space analysis

To characterize the chemical properties of the fragments identified in this study, also in relation to those reported by Chimenti et al.26, the positions of the fragments within the chemical space of the BiSS fragment library were determined. As a standard tool to analyze the chemical diversity of compound libraries33,34,35, a principal component analysis (PCA) was applied. The PCA was based on 10 molecular descriptors (as specified in Materials and Methods).

Figure 3 shows a plot of the first two principal components, which together describe 47.6% of the variance in the BiSS library. The positions of the previously reported fragments26 and the compounds identified in this work are mostly located in the same area in the lower right quartile of the plot: Eight of the 10 fragments from the present study and five of the six compounds previously reported can be found here. Only TROLL2 and its close analogue TROLL7, as well as ID2 are located in different regions of chemical space. The PCA analysis also indicates that there are additional enrichment options for further enhancements of screens against p97-N since the fragment library under investigation does not fully coincide with the chemical space where most of the p97-N ligands can be found.

Ligand-based pharmacophore models

The fragments identified here as well as those reported earlier26 were further analyzed for common or similar functional groups in terms of putative pharmacophore models. High structural agreement was found between TROLL12, TROLL14, VIK20 and ID4 regarding the position of an aromatic feature (i.e., a phenyl ring), an aromatic heterocyclic scaffold and the position of a hydrogen-bond donor function represented by a hydroxyl- or amino-group (Fig. 4, superpositions a to c). TROLL18 and ID2 share a 3-amino-pyrazole substructure in combination with a para-substituted phenyl ring (Fig. 4d). The overlay of TROLL2 or its analogue TROLL7 with TROLL8 revealed two conserved hydrogen-bond donor functions, in combination with a hydrophobic, substituted aromatic ring system (Fig. 4e). Furthermore, TROLL6 and ID3 display similar structural properties (Fig. 4f). It can be hypothesized that the identified common features may be involved in interactions with residues in their p97 binding site.

Ligand efficiencies of the identified fragments

One of the main questions in the early phase of drug discovery is the selection of compounds best suited for optimization towards lead structures. Besides an increase in affinity and specificity, especially for weakly binding molecules such as fragments, the optimization towards favorable physicochemical and ADME (Absorption, Distribution, Metabolism, and Excretion) profiles is of high importance. For this purpose, different ligand efficiency metrics were developed to guide the process of molecular optimization36.

In this work, four different ligand efficiencies were calculated for the 10 selected molecules shown in Table 1 based on their KD value for p97-N, the heavy atom count and the clogP value: the ligand efficiency (LE)37, the size-independent ligand efficiency (SILE)38, the lipophilicity corrected efficiency (LELP)39, and the modified lipophilic ligand efficiency by Mortenson and Murray (LLEAT)40. The four ligand efficiency metrics include indices for controlling the size (LE, SILE) and lipophilicity (LELP, LLEAT) of the fragments. For LE and SILE higher values indicate more promising candidates. A lower limit of 0.3 is commonly suggested for LE. LELP is calculated as (c)logP/LE and outlines the cost of the LE paid in lipophilicity; with a value of 0.3 for LE, LELP should lie between −10 and 10. The LLEAT represents a more suitable metric for fragments, taking size and lipophilicity of the fragments into account. As with LE, its suggested lower cut-off is 0.3, and fragments with higher values are more promising40. The calculated indices were used to compare our fragment screen with screening campaigns reported in the literature. Table 2 summarizes average values for selected physicochemical descriptors and ligand efficiency metrics of published hits from fragment screens36 and compares them with the values for the fragments binding to the N-domain reported by Chimenti et al.26 and the fragments identified in this study.

The average values for the 10 hits of our p97-N fragment screen are very close to literature averages of successful fragment screens. In comparison to the fragments reported by Chimenti et al., the compounds identified in this study show superior properties. Especially the LELP and LLEAT parameters (which are dependent both on clogP and affinity) indicate that binding of the fragments identified by Chimenti et al. may be dominated by hydrophobic interactions, which is not ideal for further optimization. The hits identified here appear more promising because of a better balance between affinity and lipophilicity. In particular, the hit compounds TROLL2, TROLL7, TROLL8 and TROLL12 show an LLEAT value above the cut-off of 0.3, which indicates that the binding is not dominated by hydrophobic interactions.

Identification of p97 N-domain binding sites via mixed-solvent MD simulations

Mixed-solvent MD simulations are a common tool to analyze possible interaction hot spots for small molecules in protein binding sites41,42,43,44,45. In contrast to blind docking approaches to the entire protein surface, where the protein is mostly rigid and no explicit solvent is used, mixed-solvent simulations provide a more realistic model of the protein in solution. Typically, fragments employed in this technology are even smaller in size than the fragments used in biophysical screens or the hits identified here. Nevertheless, Martinez–Rosell et al.46 showed that the mixed-solvent simulation approach can handle such larger molecules as well. Consequently, mixed-solvent simulations were carried out to identify putative binding sites. The approach used here is based on the “Site Identification by Ligand Competitive Saturation” (SILCS) methodology41, where a high fragment concentration is subjected to an all-atom explicit-solvent MD simulation. The aggregation of fragments is prevented by dummy atoms that are placed into the molecules to exert a repulsive force between the fragments.

We simulated each fragment at a concentration of 1 M and calculated occupancies of the fragments on the surface of p97-N to identify possible binding regions. For every fragment, the two sites with the highest occupancies were selected, and the binding site was defined by all amino acids within 6 Å of the fragment’s dummy atom. Most of the residues identified by this approach are in the Nc-subdomain of p97-N, especially in the area of the SHP binding region (Fig. 5), while only a few residues are located in the Nn-subdomain. This suggests the Nc-subdomain and particularly its SHP-binding region as the most druggable part of the N-domain. Based on the observations in the mixed-solvent simulations, the SHP binding region was subdivided into three regions (SHP I, SHP II and SHP III). SHP I is the region addressed by the conserved leucine residue of the SHP binding motif. SHP II is targeted by the aromatic amino acids F228 of UFD1 and W242 of Derlin-147,48. SHP III is an adjacent hydrophobic cavity not directly addressed by SHP motif containing cofactors, which harbors a solvent-exposed cysteine residue (C184).

Within the identified binding sites, the occupancies of the heavy atoms of each fragment were further inspected to obtain information about the orientation of the molecules and to distinguish between specific and unspecific binding. In case of the four fragments TROLL2, TROLL7, TROLL8 and TROLL12, the resulting occupancies indicated an orientational preference, thus allowing to predict a binding mode based on the observed densities (Fig. 5e). For TROLL2 (which was simulated as both the R- and the S-enantiomer) only the S-enantiomer was predicted to bind within a pocket formed by residues R113, H115, E167 and H183 (SHP II, Fig. 5e, i). The closely related analogue TROLL7 presumably exhibits the same interactions (Fig. 5e, ii). TROLL12 was also found in this region, but oriented differently (Fig. 5e, iv), whereas TROLL8 bound to a sub-pocket near C184 in the SHP III region (Fig. 5e, iii). Additional occupancies for the methylfurane ring of TROLL8 were also identified in the SHP II region. The binding poses of TROLL2 and TROLL7 were identified in all replicas of the MD simulations, whereas for TROLL8 and TROLL12 a binding event to the aforementioned regions could be observed in three of the four replicas.

Crystal structure of TROLL2 in complex with the N-domain of p97

To confirm the predicted binding sites, X-ray crystallographic studies were performed. As the fragment with the highest binding affinity in the BLI screen, a high-resolution structure of TROLL2 in complex with p97-ND1 could be obtained. The identification of the Br-containing TROLL2 fragment and its binding site was assisted by the anomalous signal of the halogen atom. The structure was solved by molecular replacement using PDB structure 5DYG as the search model. In analogy to Tang & Xia, the same hexagonal space group (P622) with one ND1-monomer in the asymmetric unit was obtained49.

In four data sets a significant anomalous signal corresponding to one binding site as determined by the SHELXCD pipeline50 could be identified. To confirm that the anomalous signal is not an artefact of the crystallization conditions, data from apo-crystals (n = 43) obtained under identical conditions, but without any fragment in the crystallized solution, were analyzed for anomalous signals. No anomalous signal could be identified in the same region, and any observed anomalous signal was significantly weaker (by 6σ) compared to the TROLL2 data. These data demonstrate that the observed anomalous signal is due to the presence of the TROLL2 fragment. The data set with the strongest anomalous signal was refined to a crystallographic R-factor of 19.20% and a free R-factor of 23.53% at 1.73 Å resolution with Phenix51 (see Supplementary Table 1 for data collection and refinement statistics). A superposition of PDB structure 5DYG with our structure in complex with TROLL2 showed a root mean square deviation (RMSD) of only 0.2 Å for the Cα-atoms.

The anomalous signal of the Br-atom was unambiguously assigned to the binding pocket identified in the mixed-solvent simulations (Fig. 6a). A polder omit map contoured at 3σ showed a clear density for the Br-atom and most parts of the neighboring carbon atoms of the phenyl ring. Furthermore, weak densities are visible for the oxygen and the nitrogen atom of the ligand (Fig. 6b). The hydrophobic substructure of TROLL2 interacts with a region formed by the aliphatic parts of residues R113, H115, E167 and E185. The phenyl ring of TROLL2 adopts a favorable orientation for a T-shaped interaction with H183. The charged head groups of R113, E167 and the adjacent D169 are all oriented away from the pocket, whereas the pocket itself is outlined by the alkyl chains of the corresponding side chains. To analyze whether there is an energetic binding preference for bromine or aromatic carbon atoms in this environment, a grid-based interaction potential analysis for bromine and aromatic carbon probes was conducted using MOE (Molecular Operating Environment, 2019.01, Chemical Computing Group) on the basis of the X-ray crystal structure (Supplementary Fig. 6). For both probes, interaction hot spots were predicted within this SHP II site, which is also in line with the aromatic residues of the SHP binding motif interacting with this area (Supplementary Fig. 6b).

The binding region contains a contact from an arginine (R113*) of a symmetry-related N-domain which is not part of the p97-ND1 hexamer. This arginine engages in a cation-π interaction with the phenyl ring of TROLL2. A detailed comparison of the results of the mixed-solvent MD simulations shows a high level of agreement in the orientation of the resulting occupancy densities with the experimentally obtained structure (Fig. 6c). Additionally, the simulations revealed that R113 from the same monomer can also interact with the phenyl ring of the fragment via a cation-π interaction; in solution, it should thus be able to replace the R113* interaction resulting from crystal packing.

Binding epitope mapping with STD-NMR

To obtain additional information about the putative binding mode of the selected fragments, full STD-NMR build-up curves were measured for the fragments TROLL2, TROLL7, TROLL8 and TROLL12 and the initial rates of STD build-up (STD0) were calculated. Besides measurements with the ND1 construct, additional STD measurements with the isolated N-domain were carried out, which again confirmed the binding of all four fragments to the N-domain of p97 detected in the BLI measurements. As expected, the saturation effects were higher for the ND1 construct compared to the isolated N-domain due to the stronger nuclear Overhauser effect (NOE) of the much larger protein. Because of water suppression, not all hydrogens of the selected fragments were experimentally accessible. For the remaining hydrogen atoms, different STD0 values were determined, indicating a defined orientation of the fragments upon binding to p97 (Supplementary Fig. 7). For TROLL2, TROLL7 and TROLL12 a clear dominance in the saturation of the protons at the hydrophobic aromatic parts of the molecules was visible.

To compare these observations from the STD experiments with the results of the mixed-solvent MD simulations, a representative binding pose of each fragment was analyzed with the CORCEMA-ST program to calculate theoretical STD0 values (see Supplementary Methods in the Supporting Information). For TROLL2 the theoretical STD0 values were additionally predicted based on the crystal structure. The best agreement between calculated and experimental STD0 values was found for TROLL2 measured with the isolated N-domain. The protons in the ortho position received the highest amount of saturation, followed by the protons in the meta position and the methylene group, which is in line with the predicted pose of the MD simulation (NOE R-factor of 0.34). The calculation based on the experimental crystal structure resulted in a nearly identical NOE R-factor of 0.33 (Fig. 6d) and showed high agreement with the predictions based on the MD simulation pose (NOE R-factor of 0.14).

It should be kept in mind that these weakly interacting fragments could have additional binding sites on p97 which are not covered by the conducted MD simulations. Moreover, even at a given binding site the fragments may show considerable flexibility or even occur in multiple binding modes which are not adequately represented by a single pose selected for the calculations. This is also obvious from the crystal structure, where the density shows binding but is not sharply defined for the entire fragment. Accordingly, the binding model used to predict STD effects is certainly incomplete, and a full agreement with the experimentally determined STD0 values can hardly be expected. Nonetheless, the CORCEMA predictions support the binding site of TROLL2 at the N-domain of p97. The close analog TROLL7, on the other hand, showed a higher saturation effect for the meta position, in contrast to the predicted effects based on the MD simulation, resulting in high NOE R-factors (Supplementary Table 2). An overlaid additional binding mode of TROLL7 not detected by the MD simulations represents a possible explanation. Similarly, the disagreement with the MD model for TROLL8 may indicate an additional or alternative binding site(s) (which is also suggested by the 1H-15N-HSQC measurements described in the next section). Finally, for TROLL12 the predicted binding pose shows a fair agreement in the trend of the STD effects, which is more pronounced with the measurements for the isolated N-domain. Taken together, the quantitative STD-NMR experiments confirm the binding to the N-domain of p97 and the CORCEMA-ST calculations support the predicted binding mode for TROLL2 and TROLL12, but not for TROLL7 and TROLL8. As mentioned, additional binding sites or binding poses (not atypical for fragments), which may not have been captured by the MD simulations, render quantitative predictions with CORCEMA-ST difficult.

1H-15N-HSQC NMR measurements with the N-domain of p97

To clarify the results of the quantitative STD-NMR measurements and since no crystal structures of p97-ND1 in complex with TROLL7, TROLL8 and TROLL12 could be obtained, 1H-15N-HSQC NMR measurements with the isolated N-domain of p97 were performed to further complement the mixed-solvent MD simulations. The 1H-15N-HSQC NMR experiments of all fragments showed chemical shift changes within the regions postulated by the mixed-solvent MD simulations (Fig. 6e, f and Supplementary Fig. 8). Specifically, the largest chemical shift changes could be observed for TROLL7 and TROLL2 for residues I114 and V166, which are both located in the SHP II region. Small shifts for both fragments were also detected for T168 within the SHP II site and C184 in the neighboring SHP III region. Additional chemical shifts were assigned to R95, K190 and R191. In full-length p97 these three residues are close to the D1-domain and are either not accessible or exhibit a different conformation. Due to this fact the observed chemical shifts of these amino acids were not further considered. The remaining chemical shift changes were all in or in close proximity of the SHP II region, in agreement with the binding region identified by the mixed-solvent MD simulations and the crystal structure with TROLL2. TROLL12 showed only for I114 a slight chemical shift, originating within the predicted region of the mixed-solvent MD simulations, but the data here were not as robust as they were for TROLL2 or TROLL7.

For TROLL8 a significant chemical shift was observed for C184 in the SHP III region predicted by the simulations and for E141. E141 is located between the SHP I site and the UBX/L binding region. Both regions were not identified as possible binding sites by the mixed-solvent MD simulations; however, because of the lack of a suitable pocket in the UBX/L region, a binding to the SHP I region seems to be more likely. Additional minor chemical shifts were located again within the SHP II region. Here, the simulations showed a low occupancy for the compound’s methylfurane ring.

The observed chemical shift changes were not particularly strong and caution is certainly warranted in the interpretation. It should also be noted that parts of the SHP II site and most of the SHP III site could not be assigned and so no experimental data were available for these areas. Titration experiments for affinity determination were technically not possible, due to the poor solubility of the fragments. However, the observed chemical shifts were remarkably focused to single amino acids, and their location is consistent with the presumed fragment binding in the SHP region. Interestingly, no strong chemical shift changes were observed within the N terminal subdomain, in agreement with the general outcome of the mixed-solvent MD simulations. At the very least, the results of the 1H-15N-HSQC NMR measurements confirm the addressability of the SHP binding site with small molecules. Furthermore, by indicating additional interaction sites for TROLL8, the experiments provide a possible explanation for the discrepancy of the CORCEMA-ST predictions in case of this fragment.

Discussion

BLI is a well-suited method for fragment screening

So far the BLI method has been reported only sporadically in the field of fragment-based drug design27. Here, we established BLI as a robust and well-performing fragment-screening platform. Key advantages of BLI as a fluid-free system are an easier setup and handling in comparison to other biosensor platforms. The present study demonstrates that this method is a suitable biophysical approach for detecting the binding of low molecular-weight compounds, thus providing promising candidates for a fragment-based drug design approach, also in the context of PPI inhibitors.

The N-domain of p97 as a main cofactor interaction site and potential target for PPI inhibitors was the focus of this work. Due to the absence of a specific binder as a positive control for the isolated N-domain, the quality of the screen was first assessed with the p97-ND1 construct. The screen with p97-ND1 revealed a Z-factor of 0.67 and a signal distribution that reflects a high overall quality. The optimized screening conditions for the ND1 domain were then transferred to the isolated N-domain. An overlap of 18 fragments between the two screens was detected, where most of the N-domain hits were also identified with the ND1 construct (see Supplementary Table 3). Stringent screening allowed to identify 10 high-confidence hits, which were subsequently confirmed by STD-NMR using the ND1 construct as well as the isolated N-domain and additional BLI measurements using the ND1 construct. The satisfactory performance of the screen was further demonstrated when comparing our results with the SPR-based screen of the ND1 domain conducted by Chimenti et al.26 using principal component and pharmacophore analyses. This comparison showed that the BLI-based screen was capable of identifying similar fragments in terms of their position in the chemical space, but also revealed new binders exhibiting different properties.

Nonetheless, even when promising fragment candidates have been identified for a target protein, the selection of candidates for a more detailed experimental characterization and further optimization is a challenging task52. The ScoreBLI applied in this work proved helpful for eliminating hits with unwanted properties, such as unspecific binding. With respect to ligand efficiencies the fragments identified here display superior properties compared to those reported earlier26; in particular, our hits are not dominated by hydrophobic interactions and exhibit more favorable ligand efficiencies for subsequent optimization. The use of mixed-solvent MD simulations for the further selection of fragments binding to specific regions of the target protein proved to be a highly beneficial step as the postulated binding region could essentially be confirmed by X-ray and NMR measurements. Taken together, our BLI-based fragment screening strategy in combination with subsequent in silico tools turned out to be a successful approach.

Identification of starting points for PPI inhibitors targeting the p97 SHP-motif binding site

A selective inhibiton of cofactor binding to the N-domain of p97 would represent a new and valuable avenue in the field of p97 inhibitors. The work described here represents a significant advance towards this goal. The structural relevance of the identified fragments, TROLL2, TROLL7, TROLL8 and TROLL12, as starting points for the development of a PPI inhibitor is shown in Fig. 7. A comparison of the binding sites in the crystal structures of the SHP-motif containing cofactors Derlin-1 and UFD1 with the identified target sites of the fragments reveals a high degree of overlap, in particular in the SHP-II region. The identified fragments address two important hotspots of the PPI of SHP-motif containing cofactors: (i) The postulated binding poses of TROLL7 and TROLL12 derived from the mixed-solvent MD simulations as well as the binding pose found in the crystal structure of TROLL2 mimic the binding of the aromatic amino acids F228 of UFD1 and W242 of Derlin-1 in the SHP-II region, respectively. The importance of this hotspot interaction for the binding of the SHP-motif containing cofactor UFD1 is demonstrated in the reduced affinity of the UFD1 F228A mutant towards p9747,53. (ii) Furthermore, the 1H-15N-HSQC NMR measurements indicated that TROLL8 might bind adjacent to the SHP-III region into a sub-pocket referred to as SHP-I region, which is addressed by L235 (UFD1) or L248 (Derlin-1). Like the aforementioned aromatic residues these aliphatic side chains are hotspot residues and are important for the binding to p97 as shown by their respective replacements with alanine48,53.

Taken together, our results suggest that the design of a PPI inhibitor addressing this site should be possible, since the SHP-motif binding region on p97 has the ability to bind to small molecules. Besides the potential to more comprehensively understand the regulation of p97 by its cofactors, a PPI inhibitor targeting the interaction between p97 and SHP-motif containing cofactors would yield a promising strategy to modulate the degradation of ubiquitylated proteins, with the ultimate promise of a new strategy to treat cancer.

Our successful fragment screen against the N-domain of p97 used the previously underemployed BLI approach and establishes it as a viable alternative to SPR. Importantly, the characterization of the obtained hits resulted in the first crystal structure of a small molecule in complex with the SHP-motif binding region of p97. Mixed-solvent MD simulations and NMR measurements also indicated this region as the most likely target site of the top-ranked fragments. The compounds identified in this study display favorable physicochemical properties for further drug design and provide promising starting points for the development of a PPI inhibitor addressing the SHP-motif binding site in the N-domain of p97.

Materials and methods

Cloning, protein expression and purification

Constructs of p97-N (aa 21-199, N-terminal 3 C protease-cleavable His6-tag) and p97-ND1 (aa 2-481, N-terminal TEV protease-cleavable His6-tag) harbouring a C-terminal Avi-tag were generated by inverse mutagenesis as described28. The correct nucleotide sequence of all constructs was verified by DNA sequencing (Microsynth Seqlab, Göttingen, Germany). Expression and purification were carried out as described in Hänzelmann et al.54. Biotinylation of Avi-tagged p97-N and p97-ND1 was conducted as described28, followed by size exclusion chromatography.

For crystallization p97-ND1(L198W) (residues 1-460) encoding a C-terminal His6-tag with a two amino acid linker (RS) according to the construct used by the Xia lab49 was generated by the restriction free cloning method55 and site-directed mutagenesis (QuickChange Site-Directed Mutagenesis Kit, Agilent). Expression and purification were carried out as described in Tang and Xia49.

Bacterial expression of 15N-labelled protein was performed according to a published protocol56. p97-N (aa 1-208, N-terminal TEV protease-cleavable His6-tag57) was expressed in E. coli BL21 (DE3) RIL cells. Cells were grown in LB medium until the OD600 reached a value between 0.6 and 0.7. After centrifugation, the medium was removed and the cells were resuspended in M9 minimal medium supplemented with 15NH4Cl as the sole nitrogen source, where the volume of the M9 medium corresponded to one quarter of the amount of LB medium previously used. After incubating the cells for 1 h at room temperature, p97 expression was induced with 1 mM IPTG overnight at 16 °C. The protein was purified in a buffer containing 50 mM Tris pH 8, 150 mM KCl, 5% glycerol (v/v), 5 mM MgCl2 and 5 mM β-mercaptoethanol by Ni-IDA affinity chromatography and size exclusion chromatography (Superdex 200, Cytiva). Prior to size exclusion chromatography, the protein was dialyzed overnight at 4 °C in the presence of TEV protease (1:50) followed by Ni-IDA affinity chromatography to remove uncleaved His-tagged protein.

BLI fragment screening

Fragment library

For fragment screening, the library compiled by the Facility for Biophysics, Structural Biology and Screening at the University of Bergen (BiSS) was used. This fragment library is a 679 compound subset of the OTAVA chemicals solubility library which was filtered to exclude similar compounds and compounds with unwanted functionalities58. The fragments have a mean molecular weight of 202 Da and an average clogP (calculated partition coefficient) of 1.52. The fragments contain on average two ring systems, one hydrogen-bond donor, two hydrogen-bond acceptors and two rotatable bonds. Hits from the initial screenings were ordered from OTAVA chemicals for further experimental characterizations.

BLI sensor preparation

Protein constructs for BLI measurements were enzymatically biotinylated using BirA28. Biotinylated p97 was loaded on Super Streptavidin Biosensors (SSA) obtained from Sartorius as follows. SSA-sensors were equilibrated in PBS-buffer, loaded with a protein solution containing 50 µg/mL (p97-ND1) or 100 µg/mL (p97-N) protein, blocked with biocytin and washed in PBS buffer. Reference SSA-sensors were set up by blocking them with a 10 µg/mL biocytin solution for 5 min27. The SSA-sensors were loaded to a shift of up to 7 nm with p97-ND1 and 10 nm with p97-N, respectively.

General screening setup

Screening plates were prepared by adding 10 µL of a 10 mM DMSO stock solution of library compound to 190 µL of PBS buffer (containing 0.05% TWEEN 20, 1 mM DTT and 5 mM MgCl2 [only for p97-ND1]) in 96 well plates (Greiner). The final DMSO concentration was 5% (v/v). All BLI measurements were performed on an Octet RED96e (Sartorius) device.

All fragments were measured twice at a concentration of 500 µM by screening every plate in the forward and backward direction. The assay settings were as follows: Baseline measurement 15 s; association time 60 s and dissociation time 300 s. The resulting data were processed using the double reference method of the Octet Analysis software.

Initial screening using the ND1-domain

Signals of positive (500 nM ADP) and negative controls (buffer) were collected before and after each screening plate. The signals of each sensor were corrected for partial inactivation of the protein over time and for different protein loading states of the sensors using the signals of the 500 nM ADP positive control (see Supplementary Fig. 9 for details on the correction approach).

Initial screening using the N-domain

One fragment (ID5) reported to bind to the N-domain26 could be purchased from Enamine and was tested as a positive control. The compound showed an uninterpretable behavior in the BLI assay using the ND1- as well as the N-domain and no valid dose response could be obtained. Because of the lack of another valid positive control (small molecule) for the isolated N-domain of p97, the assay conditions of the ND1 screen were transferred to the N-domain screen. Every sensor with immobilized p97-N was used for measuring only two screening plates and then discarded. Sensors showing artificial signals during the screening were replaced. The signals were normalized against the protein loading signal of each sensor. Signals with artificially high or large negative values based on visual inspection were eliminated.

Concentration dependencies

Selected fragments of the initial screenings were confirmed by dose-response-titrations using six concentrations ranging from 1000 to 31.3 µM in a 1:1 dilution series. Every concentration was measured twice. The assay settings were as follows: baseline measurement 15 s; association time 120 s and dissociation time 180 s. The double referenced data were further analyzed in Origin Pro to estimate the affinity by fitting the signals of the steady states to a 1:1 Langmuir model. For the fitting all measured data points (n = 2) were considered.

Principal component analysis (PCA) and ligand-based pharmacophore models

The PCA for fragments of the BiSS library was carried out in R (R Core Team, 2020, Vienna, Austria: R Foundation for Statistical Computing) using the prcomp function. It was based on the following descriptors: HA, number of heavy atoms; HDon, number of hydrogen-bond donors; HAcc, number of hydrogen-bond acceptors; TPSA, topological polar surface area; Nrot, number of rotatable bonds; NChir, number of chiral atoms; NRings, number of ring systems; clogP, calculated logarithmic partition coefficient; Narom, number of aromatic atoms; Fsp3, fraction of sp3 hybridized atoms. All descriptors were calculated with MOE (Molecular Operating Environment, 2019.01, Chemical Computing Group ULC, 1010 Sherbooke St. West, Suite #910, Montreal, QC, Canada, H3A2R7, 2021). A flexible molecular alignment was carried out in MOE with the selected fragments of this study and the fragments reported by Chimenti et al.26. The aligned molecules were analyzed using the 3D pharmacophore builder in MOE.

Mixed-solvent MD simulations

For the mixed-solvent MD simulations the structure of p97-N in complex with the SHP-motif of UFD1 (PDB:5B6C)47 was used. All water and buffer molecules as well as the UFD1 peptide were removed. The structure was prepared and protonated (at pH 7) in MOE using default values. For mixed-solvent simulations the Site Identification by Ligand Competitive Saturation (SILCS) method41 was used. To prevent aggregation of lipophilic fragments, dummy atoms were inserted at the center of mass of each ligand. These atoms served as virtual sites for repulsive interactions (only between fragments) using the Lennard-Jones parameters adopted from the SILCS method (ε = −0.01 kcal/mol; Rmin = 24.0 Å), but without applying a switching function. 1.14*CM1A-based charges for the ligands were retrieved from the LigParGen web server59,60,61. A cubic simulation box with a 1 M fragment concentration and 5 nm edge length was solvated with TIP3P water. The solvation box was energy-minimized for 50,000 steps using the steepest descent minimization, equilibrated under NVT and, subsequently, NPT conditions for 100 ps each, and then simulated for 1 ns (NPT) using GROMACS [2019.01]62 with the OPLS-AA force field63 in order to obtain a mixed-solvent box with a homogenous ligand concentration. Coupling of temperature and pressure was performed using the velocity rescaling thermostat and the Parrinello-Rahman barostat64. The protonated protein structure was then solvated with this mixed-solvent box, resulting in a cubic starting system with at least a 1 nm distance from the simulation box border to the protein. The system was again minimized and equilibrated as described above and then subjected to a 40 ns production run with 2 fs time steps, writing out snapshots every 10 ps. Positional restraints on protein heavy atoms were applied during the equilibration phase. Four replicas were performed for each ligand, resulting in a total simulation time of 160 ns per fragment. For TROLL2 and TROLL7 simulations of both stereoisomers were carried out. The trajectories were aligned on the Cα atoms and occupancy densities were calculated for the dummy atoms and selected heavy atoms of the fragments with a resolution of 1.0 Å using the volmap plugin in VMD65. The densities of individual replicas were added using the function volutil resulting in a sum density map for each fragment. In cases where the resulting occupancies indicated an orientational preference, thus allowing to predict a binding mode based on the observed densities, a representative binding pose of the corresponding fragment was selected as follows: For each replica, an average structure was calculated from the trajectory, starting at the time point where the binding event occurred. The snapshot showing the smallest RMSD with respect to this average structure was taken as binding pose of this replica. The binding poses from each replica of a given fragment were compared with the calculated occupancies, and the pose showing the highest agreement with the occupancies was chosen as representative binding pose of that fragment.

X-ray crystallography

The p97-ND1 L198W variant was incubated with ADP (molar ratio 1:13) and crystals were grown at 20 °C by vapor diffusion in hanging drops containing equal volumes (1:1) of protein solution (15 mg/ml) and reservoir solution consisting of 5% PEG 600 (v/v), 4.0 M sodium formate (pH 6.0), 5% glycerol (v/v) and TROLL2 in a final molar ratio of 1:10.

p97-ND1 crystals were cryo-protected by soaking in mother liquor containing increasing amounts (10–30% [v/v]) of glycerol and TROLL2 in a molar ratio of 1:10. The crystals were flash-cooled in liquid nitrogen, and data collection was performed at 100 K at the ESRF in Grenoble (beamline ID30B, wavelength 0.9184 Å). Data were processed using XDS66 and STARANISO67. The anomalous signal of the bromine atom was detected using SHELX50 and AnoDe68. Further calculations were performed using the CCP4 Suite69 and Phenix51. Phases were obtained by molecular replacement using Phaser with the ND1-domain of p97 (PDB: 5DYG49) as search model. TROLL2 was modelled based on the difference map. The structure was ultimately refined in Phenix51. TLS groups were obtained from the TLSMD-Webserver70. Restraints for TROLL2 were calculated using eLBOW in Phenix.

STD-NMR spectroscopy

All Saturation-Transfer-Difference (STD)32 measurements were performed on a Bruker Avance III 400 MHz spectrometer (Karlsruhe, Germany) operating at 400.13 MHz with a PABBI 1H/D-BB Z-GRD inverse probe at 300 K adjusted with a BSVT (Bruker) temperature control unit. Temperature calibration was carried out with 99.8% CD3OD (282–330 K). The stddiffesgp.3 pulse program from the Bruker pulseprogram library was used for data acquisition. The suppression of the HDO signal (4.703 ppm) was achieved by the excitation sculpting method. The saturation frequency for the on- and off-resonances was set to −400 Hz and −16,000 Hz, respectively. The saturation time was 3 s with a relaxation delay D1 of 4 s.

The protein stock solutions were transferred to a deuterated buffer containing 35 mM potassium phosphate and 25 mM NaCl in D2O at a pD of 7.5 using 0.5 mL Centricon devices (Merck Millipore) with a MWCO of 30 kDa by washing and centrifuging several times at 10,000 × g and 4 °C. Buffer-exchanged protein was adjusted to a concentration of 30 µM with NMR buffer. The respective fragment concentration was 428 µM with a final DMSO-d6 fraction of 5% (v/v). All measurements were performed in the presence of 500 µM ADP (in case of the ND1 domain) and 5 mM MgCl2. All data were analyzed using TopSpin 4.0.8 (Bruker Corporation, Billerica, MA, USA).

For binding epitope mapping of selected fragments, the initial growth rates approach was used as described71. For determination of the initial rate of STD build up (STD0) the STD intensities for seven saturation times (0.5, 0.75, 1.0, 1.5, 2.0, 3.0, 5.0 s) were measured for the ND1 as well as for the isolated N-domain and fitted by an exponential equation using R. The recycle delay (d1) was set to 5 s. For each fragment also a negative control without protein was measured with a saturation time of 5.0 s. Theoretical STD0 values on the basis of the obtained X-ray crystal structure as well as for representative mixed-solvent MD poses were calculated using the Complete Relaxation and Conformational Exchange Matrix Analysis of Saturation Transfer (CORCEMA-ST) software version 3.872 and ShiftX version 1.173. Predicted and experimental STD-data were evaluated using the NOE R-factor (see Supplementary Methods in the Supporting Information).

HSQC NMR measurements

15N-labelled p97-N was concentrated to 50–60 µM in 25 mM HEPES (pH 7.5), 125 mM NaCl, 5 mM DTT and 0.01% NaN3. Before the measurements were conducted, 10% D2O and 3-(trimethylsilyl)propane-1-sulfonate (DSS) at a final concentration of 0.1 mM were added to the samples. DSS was used for direct 1H chemical shift referencing as 0.00 ppm. 15N chemical shifts were indirectly referenced by the gyromagnetic ratio74.

1H-15N-HSQC spectra were recorded on a Bruker Avance III (600 MHz) spectrometer at 298 K. DMSO-dissolved ligands were added to the proteins, resulting in molar ratios of 1:80 or 1:160 (protein:ligand). As a control, equivalent volumes of pure DMSO were added. Assignments were obtained from previously recorded spectra of the p97 N-domain75,76. Spectra were plotted and analysed using CcpNMR Analysis77,78 version 3.0.4. Chemical Shift Perturbations (CSP) were calculated according to the equation:

$${CSP}=\sqrt{{\Delta \delta }_{H}^{\,2}+{(0.13\cdot {\Delta \delta }_{N})}^{2}}$$

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.