Cesium based phasing of macromolecules: a general easy to use approach for solving the phase problem

Over the last decades, the phase problem in macromolecular X-ray crystallography has become more controllable as methods and approaches have diversified and improved. However, solving the phase problem is still one of the biggest obstacles on the way to successfully determining a crystal structure. To overcome this obstacle, we have utilized the anomalous scattering properties of the heavy alkali metal cesium. We investigated the introduction of cesium in the form of cesium chloride during the three major steps of protein treatment in crystallography: purification, crystallization, and cryo-protection. We derived a step-wise procedure encompassing a “quick-soak”-only approach and a combined approach of CsCl supplement during purification and cryo-protection. This procedure was successfully applied to two different proteins: (i) lysozyme and (ii), as a proof of principle, a construct consisting of the PH domain of the TFIIH subunit p62 from Chaetomium thermophilum for de novo structure determination. Usage of CsCl thus provides a versatile, general, easy-to-use, and low-cost phasing strategy.


Results
HEWL crystallization and phasing. As a proof of concept for the feasibility of our strategy, we initiated our analysis with different approaches to the crystallization of hen egg-white lysozyme (HEWL). As outlined above, we introduced CsCl at different stages in the crystallization process. Starting with the substitution of standard buffer components like KCl or NaCl, we supplemented HEWL with 0.25 M CsCl, a salt concentration that is commonly used in protein buffers for crystallization. Additionally, we supplemented CsCl in the crystallization buffer and/or the cryo-protectant solution (see Table 1). In the case of HEWL, the addition of CsCl did not affect crystal growth in any of the evaluated approaches. Neither crystal morphology nor the space group was affected, indicating no major impact on protein quality or crystallization behaviour (Fig. 2). Subsequently, data sets were collected from the crystals obtained from these different approaches to compare feasibility and success rate. All approaches led to crystals that could be phased using the anomalous signal as described in the methods section. All cesium sites that were identified during the different approaches have been numbered and are depicted in Fig. 3.
The cesium substructure, after supplementing with CsCl at the different stages towards crystallization, is depicted in Fig. 4. The substructure for CsCl present only in the protein buffer is not shown, as no bound cesium ions could be observed. The data collection and refinement statistics and the statistics for the different steps in the structure solution process provided by the Crank2 pipeline are shown in Tables 2, 3 and 4, respectively. The occurrence and occupancies of the cesium sites for all HEWL datasets are summarized in Table 5. Table 5 shows that cesium sites could only be observed when CsCl was used in the buffer and was at least present in the cryo-protectant. The anomalous signal of the data where CsCl is present in the experiment also improves, supporting that Cs is incorporated at stable positions (Table 2). In line with this, HEWL dataset #1, in which no Cs site could be detected, shows the smallest anomalous signal as indicated by the anomalous RCR (RMS correlation ratio). However, this dataset could still be solved using the standardized approach, indicating that the sulfur signal was picked up for successful phasing. The phasing success rate and the statistics given at each step for the Crank2 pipeline (Table 4) support the observation that Cs incorporation is beneficial for phasing HEWL. We observe a clear step- and concentration-dependent effect of CsCl on the figure of merit (FOM) derived from the initial phases and after initial density modification (Tables 1 and 4), further supporting that HEWL phasing has benefited from the described procedure and that phasing statistics have improved compared to dataset #1. In addition, the number of sites shows a clear concentration-dependent effect (Table 5). When comparing datasets #3 and #4, one additional site and a higher overall occupancy sum could be observed for the latter. Compared to dataset #2, both #3 and #4 are superior with respect to sites and occupancy.
Incorporating CsCl at high concentrations in the crystallization condition and the cryo-protectant yields a high number of sites with the highest occupancy, as indicated in dataset #5. In summary, our data indicate that CsCl is a feasible phasing option and is easily incorporated into protein structures using different approaches. More importantly, we could observe that CsCl is readily interchangeable with commonly used salts like KCl and NaCl. However, one caveat in this experimental approach was that the structure of HEWL could also be solved using the standardised SAD procedure for Cs phasing in the absence of Cs sites, indicating that the anomalous signal derived from the sulfur sites also contributes to the phasing procedure for Cs, thus impairing a final judgement on the feasibility of this strategy.
As a target for the de novo phasing approach we chose a subdomain of the p62 protein from the TFIIH complex of the eukaryote Chaetomium thermophilum. We cloned the pleckstrin homology domain of p62 (p62 PH) and overexpressed it in Escherichia coli (see methods section for details). The His-tagged protein was first purified via affinity chromatography. The subsequent size exclusion chromatography (SEC) was performed either in NaCl-buffer or in CsCl-buffer to assess whether CsCl has an effect on protein quality or oligomerisation. The elution profiles are virtually identical, revealing no significant effect when cesium was utilized instead of sodium in this step of the purification process (Fig. 5). Furthermore, the presence of CsCl in the crystallization solution did not impact crystallization, as depicted in Fig. 6. Taken together, these results further support that CsCl may be highly compatible with purification and crystallization of macromolecules. The different approaches are summarised in Table 6. The final data collection and refinement statistics are provided in Table 7.
Phasing and structure solution of p62 PH. Using the strategy described above, we were able to solve and build the complete p62 PH protein model. We succeeded with four of the six employed approaches for phasing (Tables 6, 7, 8 and 9). The same phasing strategy as for HEWL was employed to obtain comparable results. The phasing statistics for the p62 PH domain improved with the stepwise addition of CsCl, indicating incorporation of Cs that can be harnessed during the phasing procedure. This is again reflected by the anomalous signal of the datasets representing the different approaches. The first two datasets, which could not be phased experimentally, show no or only a very small anomalous signal (RCR anomalous, Table 8). With the stepwise increase in CsCl concentration in the experiment, the anomalous signal increased to values between 1.5 and 1.9 as defined by the overall anomalous RMS correlation ratio given by Aimless. However, the FOM derived from the initial phases did not permit a clear distinction regarding the success rate, since only the last two datasets, containing the highest concentrations of CsCl in the experimental approach, showed better FOM values compared to the other datasets. After initial density modification the FOMs improved and ultimately led to successful automated structure solution. Datasets #3 and #4 only led to a significant solution after the automated model building routine was employed, suggesting that the signal that can be derived from Cs was very weak but could be utilized. To obtain an overview of all approaches, datasets for which de novo phasing was unsuccessful were phased via rigid body refinement against models from solved datasets. All cesium sites that were observed during the different approaches (Tables 6 and 10) have been numbered and are depicted in Fig. 7. The cesium substructure after supplementing with CsCl at different stages of the purification and crystallization process is depicted in Fig. 8.
For treatment with CsCl only during SEC (crystal #2), a low anomalous peak was observed at site 4. Compared to p62 PH without CsCl treatment (crystal #1), the anomalous peak at this site is higher for dataset #2 (Fig. 9). Thus, this site was modelled as potassium in dataset #1 and as Cs in dataset #2 (Fig. 9) and the following datasets. However, the resolution of dataset #1 was lower compared to #2 (Table 7).
The occurrence and occupancies of the final individual cesium sites for all datasets are listed in Table 10, alongside the overall occupancy sum and the average occupancy per site. Cesium site 1 poses a particular case as it lies on a special position, i.e. a crystallographic two-fold axis (Fig. 10). In this case the doubled occupancy is given. Crystals #3 and #4 display slightly different unit cell parameters (Table 7), which goes along with a disordered loop region for these datasets (Fig. 11a). As cesium site 3 is coordinated by this loop (Fig. 11b), this site is absent in crystals #3 and #4.
The analysis of supplementing with CsCl during the purification, crystallization, or cryo-protection process (Table 10) indicates an additive effect with respect to bound ions and overall occupancy, which is in line with the observations during the automated phasing procedure employed by the Crank2 pipeline. The comparison of datasets #4 and #5 reveals three additional cesium sites and a higher overall occupancy sum for the latter. A beneficial application of this result can be observed in dataset #3, where the CsCl supplement during SEC was combined with a lowered CsCl concentration in the cryo-protection step. As expected, this approach resulted in fewer occupied sites and a lower overall occupancy, yet this procedure was still sufficiently powerful to overcome the phase problem by means of SAD (Tables 8 and 9). Importantly, in contrast to HEWL, phasing for p62 PH was only possible with the CsCl approach, whereas S-SAD alone or MR was not successful. For MR, an NMR model of the human PH domain was available as
Table 5. Occurrence, occupancy and B factor of cesium sites in HEWL for the different datasets. a Occupancy/B factor of observed sites is given. Numbers correspond to sites in Fig. 3. b CsCl concentration in mol/l HEWL was dissolved in. c Supplement of CsCl to crystallization condition in mol/l. d Supplement of CsCl to cryo-protectant in mol/l. e Total number of observed sites. f Sum of occupancies of all observed sites. g Average occupancy per site.
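The three summary quantities defined in the Table 5 caption (total number of observed sites, sum of occupancies, and average occupancy per site) can be derived from a per-site occupancy list. The following is a minimal sketch; the function name and the occupancy values are hypothetical illustrations and are not taken from Table 5 or Table 10.

```python
from typing import Dict, Tuple


def summarize_sites(occupancies: Dict[int, float]) -> Tuple[int, float, float]:
    """Return (number of observed sites, occupancy sum, mean occupancy per site).

    A site with occupancy 0.0 is treated as not observed in that dataset.
    """
    observed = {site: occ for site, occ in occupancies.items() if occ > 0.0}
    n_sites = len(observed)
    total = sum(observed.values())
    mean = total / n_sites if n_sites else 0.0
    return n_sites, total, mean


# Hypothetical occupancies keyed by site number (NOT values from the paper).
example = {1: 0.50, 2: 0.25, 3: 0.0, 4: 0.75}
n_sites, total, mean = summarize_sites(example)
print(n_sites, total, mean)
```

Comparing these three numbers across datasets is one simple way to express the additive effect described in the text, since both the site count and the occupancy sum grow as CsCl is introduced at more steps.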

Discussion
CsCl was introduced during all three major steps of sample treatment in crystallography: purification, crystallization, and cryo-protection. No detrimental effects during SEC (Fig. 5), crystallization (Fig. 2, Fig. 6), or cryo-protection could be observed. Ultimately, de novo structure solution by means of SAD was successful employing our strategy for p62 PH (Tables 6 and 9), whereas the S-SAD approaches failed. Remarkably, even low incorporation, as shown for datasets #3 and #4, supports structure solution. The expected anomalous scattering from Cs compared to S at the employed wavelength (1.7712 Å) should lead to a signal for Cs that is approx. 12 times higher than for S 27, thus permitting successful phasing with one Cs site that is only partially occupied in the case of p62 PH. The expected higher signal of Cs is reduced in all cases that we analysed, since the sites were only partially occupied. However, beneficial effects for phasing can still be observed due to the much higher expected signal. The high compatibility with all three steps in protein handling renders CsCl a highly versatile compound for experimental phasing and enables a flexible adjustment of heavy atom introduction, depending on the specific needs of a particular project. Due to its compatibility with purification, crystallization, and cryo-protection, CsCl can be used in various ways. First, usage in cryo-soaks ("quick-soaks") as described for halides by Dauter et al. 21 is possible, as demonstrated for p62 PH dataset #4. Cryo-soaking with cesium provides a good alternative to soaking with halides, especially when crystals suffer from halide treatment or no bound halides can be obtained due to an unfavourable surface charge of the target protein. Here, the opposite charge of cesium can be beneficial. Second, supplementing with CsCl at an early step of protein handling, i.e. during SEC, can be combined with cryo-soaks.
For both proteins tested in this study, additive effects with respect to bound ions and overall occupancy could be observed. The boosted anomalous signal might be beneficial for difficult borderline cases. Third, this additive effect can be exploited to reduce the CsCl concentration in the cryo-protection step. This approach might be beneficial for proteins that can tolerate only limited amounts of CsCl, as this procedure would provide much milder soaking conditions. Fourth, if NaCl or KCl is present in the crystallization condition, co-crystallization with CsCl can be conducted. Substitution of NaCl or KCl with CsCl has been successfully pursued for HEWL and p62 PH, respectively.
We therefore suggest introducing CsCl into the workflow at the earliest possible stage, i.e. at the SEC step if applicable. It remains to be investigated whether our protocol can be applied successfully to cases where significantly larger proteins need to be phased with CsCl. However, given the strong anomalous signal provided by Cs at energies that can be readily accessed at most synchrotron beamlines and the high compatibility with current protein purification and crystallization strategies, application to larger proteins seems highly feasible.
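To make the signal argument above concrete, the sketch below converts the employed wavelength into a photon energy and compares the expected anomalous signal of a partially occupied Cs site with that of a sulfur atom. It uses the commonly cited Hendrickson-Teeter estimate of the Bijvoet ratio, which is not part of this study and is brought in here only as an assumption; the function names, the protein atom count, and the arbitrary unit chosen for f'' of sulfur are hypothetical, while the factor of 12 is the ratio quoted in the text.

```python
import math

HC_KEV_ANGSTROM = 12.39842  # Planck constant times speed of light, in keV * Angstrom


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in Angstrom to a photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def bijvoet_ratio(n_sites: int, f_double_prime: float, occupancy: float,
                  n_protein_atoms: int, z_eff: float = 6.7) -> float:
    """Hendrickson-Teeter style estimate of <|dF|>/<|F|> at zero scattering angle.

    Partial occupancy is modelled by scaling f'' with the occupancy.
    z_eff = 6.7 is the usual effective atomic number for protein atoms.
    """
    effective_f = occupancy * f_double_prime
    return (math.sqrt(2.0) * math.sqrt(n_sites) * effective_f
            / (math.sqrt(n_protein_atoms) * z_eff))


energy = wavelength_to_energy_kev(1.7712)  # wavelength used in this study
print(f"photon energy: {energy:.3f} keV")

# Relative comparison only: f'' of sulfur is set to 1.0 (arbitrary unit) and
# f'' of cesium to 12.0, following the approx. 12-fold ratio given in the text.
N_ATOMS = 1000  # hypothetical number of non-hydrogen protein atoms
sulfur = bijvoet_ratio(1, 1.0, 1.0, N_ATOMS)
cesium_half = bijvoet_ratio(1, 12.0, 0.5, N_ATOMS)
print(f"half-occupied Cs site vs. one S atom: {cesium_half / sulfur:.1f}x")
```

Because the estimate is linear in the occupancy-weighted f'', a Cs site retains an advantage over a fully occupied sulfur as long as its occupancy stays above roughly one twelfth under these assumptions, which is consistent with the observation that partially occupied sites still supported phasing.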
The usage of CsCl provides an elegant, easy-to-use, and low-cost phasing strategy. No special equipment is needed and the procedure can be seamlessly integrated into the common procedure of sample treatment. CsCl is broadly commercially available and much cheaper than, for example, seleno-methionine, and the potent anomalous scattering properties make cesium a very powerful agent for phasing. The phasing procedure with CsCl permits a flexible adjustment to the specific needs of a particular project and can be performed in a step-wise procedure.
The DNA sequence encoding the p62 PH domain from Chaetomium thermophilum was cloned into a pBADM-11 vector (EMBL) with an N-terminal 6 × His-tag and a TEV cleavage site. p62 PH was expressed in Arctic Express (DE3) RIL cells (Agilent). After cell harvest, the pellet was resuspended and lysed in Lysis buffer, and the protein was purified in two steps. First, IMAC was performed using Ni-TED beads (Macherey-Nagel) and bound protein was eluted with Elution buffer. Second, SEC was performed using a HiLoad 16/600 Superdex 200 pg column (Cytiva) with either NaCl-buffer or CsCl-buffer. Peak fractions were pooled and concentrated with centrifugal filter units (Merck Millipore) to 11-13 mg/ml. HEWL was purchased as dry powder (Carl Roth) and dissolved to reach a concentration of 50 mg/ml in deionized water with 0.1 M sodium acetate pH 4.5, or 0.25 M CsCl. No further purification steps were applied.

Reagents
Crystallization. Crystallization experiments were performed using the vapor diffusion method. All solutions used for crystallization were filtered through 0.2 µm filters (Sartorius Stedim Biotech) prior to use.
Crystallization of HEWL was pursued via the hanging drop method in 24-well plates (Crystalgen). 3 µl of protein solution at a concentration of 50 mg/ml was mixed with 3 µl precipitant solution and equilibrated against 1 ml of the precipitant solution. Crystals appeared within 1 or 2 days with edge lengths mostly between 200 and 500 µm. Crystallization and cryo-protectant conditions are listed in Table 1.
Crystallization trays of p62 PH were set up via the hanging drop method in 24-well plates. 1 µl of protein solution at a concentration of 11-13 mg/ml was mixed with 1 µl precipitant solution and equilibrated against 1 ml of the precipitant solution. Plate-like crystals appeared within 1 or 2 days with edge lengths mostly between 200 and 600 µm, and a thickness of 20-30 µm. Crystallization and cryo-protectant conditions are listed in Table 6.
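The drop setups above imply simple starting concentrations that are easy to check numerically. The sketch below is an illustration only: the helper names and the 10 ml buffer volume are hypothetical, the molar mass of CsCl (168.36 g/mol) is a standard value, and the assumption that the precipitant contains no CsCl applies only to conditions without a CsCl supplement.

```python
CSCL_MOLAR_MASS_G_PER_MOL = 168.36  # Cs (132.91) + Cl (35.45)


def cscl_mass_g(molarity_mol_per_l: float, volume_ml: float) -> float:
    """Mass of CsCl in grams needed for a solution of given molarity and volume."""
    return molarity_mol_per_l * (volume_ml / 1000.0) * CSCL_MOLAR_MASS_G_PER_MOL


def mixed_concentration(conc_a: float, vol_a: float,
                        conc_b: float, vol_b: float) -> float:
    """Concentration of a solute right after mixing two solutions.

    Units of concentration are preserved; volumes only need to share a unit.
    """
    return (conc_a * vol_a + conc_b * vol_b) / (vol_a + vol_b)


# CsCl needed for 10 ml (hypothetical volume) of 0.25 M protein buffer.
print(f"{cscl_mass_g(0.25, 10.0):.4f} g CsCl")

# HEWL drop: 3 ul of 50 mg/ml protein plus 3 ul precipitant without protein.
print(f"{mixed_concentration(50.0, 3.0, 0.0, 3.0):.1f} mg/ml HEWL at setup")

# CsCl carried over from a 0.25 M protein buffer into a 1:1 drop, assuming
# the precipitant itself contains no CsCl.
print(f"{mixed_concentration(0.25, 1.0, 0.0, 1.0):.3f} M CsCl at setup")
```

These are concentrations at the moment of mixing; during vapor diffusion the drop equilibrates against the reservoir, so the values in the drop change over time.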
Crystals were harvested with cryo-loops (Hampton Research) and flash frozen in liquid nitrogen.
Data collection and processing. Data were collected via the rotation method and datasets were indexed, integrated, and scaled with XDS 28. One dataset per crystal was collected comprising a full rotation of 360°, except for crystals #3 and #4 of p62 PH. For these, two datasets (2 full rotations of 360°) from one crystal were collected, combined, and brought to a common scale with XSCALE. Data were merged with Aimless 29. The HEWL data for the cesium approach were collected to resolutions similar to that of the p62 PH data to obtain more comparable data for the analysis. Data collection and processing statistics are given in Tables 2 and 7 for HEWL and p62 PH, respectively.
Structure solution and refinement. Structure solution was performed using a unified unbiased approach applying the Crank2 pipeline 30 that is part of the current CCP4 software package. We deliberately used the default workflow without any modifications, except for the number of SHELXD trials, which was raised from 2,000 to 10,000. We used the SAD pipeline that comprises the following setup: 1) substructure detection with SHELXC/SHELXD 31, 2) substructure phasing using refmac5 32, 3) hand determination using solomon and multicomb, 4) density modification with parrot and refmac5, 5) automated model building with buccaneer 33, refmac5, and parrot, and 6) model refinement using refmac5. The resolution cutoff for substructure detection that is suggested by default from SHELXC was used in all cases and ranged between 3.2 and 2.4 Å for all datasets. Phasing was performed using all data. We used 10 initial sites as an estimate for the substructure search for all datasets. The structure solution procedure for each dataset is given in Tables 3 and 8 for HEWL and p62 PH, respectively. The main indicators for the quality of each step in the phasing procedure are listed in Tables 4 and 9 for HEWL and p62 PH, respectively. The structures were completed and corrected with Coot 34. Structures were refined directly against the SAD data with refmac5. The substructure occupancy was refined as well. Model stereochemistry was analysed via the MolProbity server 35. Refinement and model statistics are given in Tables 2 and 7 for HEWL and p62 PH, respectively. Final statistics for the Cs atoms for HEWL and p62 PH are given in Tables 5 and 10, respectively.
Anomalous difference maps and final ion assignment. Anomalous difference maps were generated by directly refining against the SAD data and were used as guidance for final ion placement. Anomalous peak heights of sulfur from cysteines/methionines were used as reference to distinguish cesium from other ions. Accordingly, peaks clearly exceeding the sulfur peak heights were attributed to cesium. Chloride ions were placed based on the comparison with datasets without cesium. Potassium and chloride ions were distinguished by consideration of bonding distances 36,37.
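The assignment rule described above, in which peaks that clearly exceed the sulfur reference are attributed to cesium, can be sketched as a simple threshold test. This is an illustration under stated assumptions, not the procedure used by the authors: the function name, the margin factor of 1.5 used to express "clearly exceeding", and all peak heights are hypothetical.

```python
from typing import Dict, List


def classify_peaks(peaks: Dict[str, float], sulfur_heights: List[float],
                   margin: float = 1.5) -> Dict[str, str]:
    """Label anomalous difference map peaks relative to a sulfur reference.

    peaks:          peak label -> anomalous peak height (e.g. in sigma units)
    sulfur_heights: anomalous peak heights at cysteine/methionine sulfur atoms
    margin:         hypothetical factor by which a peak must exceed the
                    strongest sulfur peak to count as a cesium candidate
    """
    if not sulfur_heights:
        raise ValueError("at least one sulfur reference peak is required")
    threshold = margin * max(sulfur_heights)
    return {
        label: "cesium candidate" if height > threshold else "check other ions"
        for label, height in peaks.items()
    }


# Hypothetical peak heights in sigma (NOT values from the paper).
sulfur_reference = [6.0, 7.5, 8.0]
unassigned = {"A": 25.0, "B": 9.0, "C": 12.5}
for label, verdict in classify_peaks(unassigned, sulfur_reference).items():
    print(label, verdict)
```

Peaks that fall below the threshold are not assigned automatically here; as in the text, distinguishing chloride from potassium would still rely on comparison with cesium-free datasets and on bonding distances.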