Small-wedge synchrotron and serial XFEL datasets for Cysteinyl leukotriene GPCRs

Marin, Egor; Luginina, Aleksandra; Gusach, Anastasiia; Kovalev, Kirill; Bukhdruker, Sergey; Khorn, Polina; Polovinkin, Vitaly; Lyapina, Elizaveta; Rogachev, Andrey; Gordeliy, Valentin; Mishin, Alexey; Cherezov, Vadim; Borshchevskiy, Valentin

doi:10.1038/s41597-020-00729-2


Data Descriptor
Open access
Published: 12 November 2020


Scientific Data volume 7, Article number: 388 (2020)


An Author Correction to this article was published on 23 November 2020


Abstract

Structural studies of challenging targets such as G protein-coupled receptors (GPCRs) have accelerated during the last several years due to the development of new approaches, including small-wedge and serial crystallography. Here, we describe the deposition of seven datasets consisting of X-ray diffraction images acquired from lipidic cubic phase (LCP) grown microcrystals of two human GPCRs, Cysteinyl leukotriene receptors 1 and 2 (CysLT₁R and CysLT₂R), in complex with various antagonists. Five datasets were collected using small-wedge synchrotron crystallography (SWSX) at the European Synchrotron Radiation Facility with multiple crystals under cryo-conditions. Two datasets were collected using X-ray free electron laser (XFEL) serial femtosecond crystallography (SFX) at the Linac Coherent Light Source, with microcrystals delivered at room temperature into the beam within an LCP matrix by a viscous media microextrusion injector. All seven datasets have been deposited in the open-access databases Zenodo and CXIDB. Here, we describe sample preparation and annotate crystallization conditions for each partial and full dataset. We also document full processing pipelines and provide wrapper scripts for SWSX and SFX data processing.

A Correction to this paper has been published: https://doi.org/10.1038/s41597-020-00759-w

Measurement(s)	X-ray diffraction data • protein complex • protein structure data • protein crystallization
Technology Type(s)	small-wedge synchrotron crystallography • x-ray crystallography assay • X-ray free electron laser serial femtosecond crystallography
Factor Type(s)	type of G-protein-coupled receptor • type of antagonist

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.13128758


Background & Summary

Cysteinyl leukotrienes, produced from arachidonic acid via the 5-lipoxygenase pathway, are pro-inflammatory mediators that modulate vascular permeability and immune response; hence, they are involved in multiple disorders including asthma, cardiovascular diseases and cancer¹. Cysteinyl leukotrienes elicit their action through two G protein-coupled receptors (GPCRs), CysLT₁R and CysLT₂R, that share 38% sequence identity¹. CysLT₁R is mostly expressed in the lungs and immune cells, and its stimulation leads to allergic symptoms in the airways². CysLT₂R is found additionally in cardiovascular and brain tissues, with demonstrated involvement in ischemia and acute brain injuries^3,4; however, the role of this receptor remains controversial and poorly understood. Both CysLT₁R and CysLT₂R have been implicated in progression of various cancers^5,6,7,8, while the mutated form of CysLT₂R with the L129Q substitution has been associated with uveal melanoma^9,10. Thus, CysLTRs are important pharmaceutical targets¹¹, which inspired us to determine their high-resolution structures in complex with antiasthmatic drugs and other prospective antagonists.

Over the last few years, small-wedge synchrotron crystallography (SWSX) and serial femtosecond crystallography (SFX) have developed into powerful techniques, enabling high-resolution structure determination of many difficult-to-crystallize targets^12,13. Several approaches to data processing have been developed for both SFX¹³ and SWSX^{14,15,16,17,18,19,20}, and several papers have reported deposition of raw serial crystallography data for challenging targets^{21,22,23,24,25}. Many datasets can be found online on SBGrid (data.sbgrid.org)²⁶, Zenodo (zenodo.org) or CXIDB (cxidb.org)²⁷; the latter is used for SFX and other XFEL-related data deposition, whereas SBGrid and Zenodo host SWSX among other types of data.

Recently, we have determined crystal structures of CysLT₁R²⁸ (PDB codes 6RZ4, 6RZ5) and CysLT₂R²⁹ (PDB codes 6RZ6, 6RZ7, 6RZ8, 6RZ9). Here, we present fully-annotated SWSX and SFX datasets for these structures, as well as unpublished SFX data of a new crystal form of CysLT₁R. The raw diffraction data, consisting of five SWSX and two SFX datasets, represent a wide range of resolutions (2.4–3.5 Å), SWSX miniset³⁰ wedge sizes (3–180°), and space groups (6 different space groups). We carefully document crystallization conditions and harvesting details for each dataset, allowing one to investigate crystal non-isomorphism. Finally, we describe all data processing steps, provide supporting code and intermediate results, aiming for reproducibility of deposited data processing.

Methods

The preparation of CysLT₁R and CysLT₂R samples, data collection, and processing have been described previously^28,29. Here, we provide a summary for each sample.

Construct engineering, expression, purification, and crystallization of CysLT₁R and CysLT₂R

The human CysLT₁R gene (UniProt ID Q9Y271) was codon-optimized for expression in Spodoptera frugiperda (Sf9) insect cell line and modified for crystallization by a C-terminal truncation at K311 and by the insertion of a fusion protein BRIL³¹ (thermostabilized apocytochrome b₅₆₂ from Escherichia coli with mutations M7W, H102I, and R106L) in the third intracellular loop (ICL3) between K222 and K223 using the S and SG linkers on each side, respectively (Fig. 1a). For CysLT₂R, the human WT gene (UniProt ID Q9NS75) was modified by truncating amino acids 1–16 from the N-terminus and 323–346 from the C-terminus and inserting BRIL into ICL3 between residues E232 and V240. Three point mutations, W51^1.45V, D84^2.50N, and F137^3.51Y (superscripts refer to the generic Ballesteros-Weinstein numbering of residues in Class A GPCR³²), were further introduced to improve receptor surface expression as well as its stability and yield (Fig. 1b).

Each gene of interest was cloned into a modified pFastBac1 plasmid, containing a cleavable influenza hemagglutinin signal sequence (HA), a Flag tag, an AKLQTM linker, a 10 × His tag, and a Tobacco Etch Virus (TEV) protease site followed by a KpnI restriction site on the N-terminal side of the inserted gene (Fig. 1c). The plasmid was then transfected into Sf9 insect cells using the bac-to-bac expression system (Invitrogen). High-titer recombinant baculovirus (>3 × 10⁸ viral particles per ml) was obtained and used to infect Sf9 cells at a density of (2–3) × 10⁶ cells per ml culture and a multiplicity of infection of 5–10 in the presence of a ligand: 8 µM zafirlukast (Cayman Chemical) for CysLT₁R or 3 µM BayCysLT₂ (Cayman Chemical) for CysLT₂R. The protein surface expression and the virus titer were measured using flow cytometry. Cells were harvested 48–50 hours post infection by centrifugation at 2,000 × g and stored at −80 °C until use.

Protein purification was conducted at 4 °C. For each protein-ligand complex, the relevant ligand was added during purification. Cells were thawed and lysed by repetitive homogenization with a glass douncer followed by ultracentrifugation (30 min at 220,000 × g), 2 times in hypotonic buffer (10 mM HEPES pH 7.5, 20 mM KCl, and 10 mM MgCl₂) and 3 times in high osmotic buffer (10 mM HEPES pH 7.5, 20 mM KCl, 10 mM MgCl₂, and 1 M NaCl) with the addition of a protease inhibitor cocktail (500 µM 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride (Gold Biotechnology), 1 µM E-64 (Cayman Chemical), 1 µM Leupeptin (Cayman Chemical), 150 nM Aprotinin (A.G. Scientific)).

Membranes were then incubated for 30 min in 10 mM HEPES pH 7.5, 20 mM KCl, 10 mM MgCl₂, 2 mg ml⁻¹ iodoacetamide, protease inhibitor cocktail, and 25 µM ligand. Receptors were then solubilized by the addition of an equal volume of solubilisation buffer (300 mM NaCl, 2% (w/v) n-dodecyl-β-D-maltopyranoside (DDM; Avanti Polar Lipids), 0.4% (w/v) cholesteryl hemisuccinate (CHS; Sigma), 10% glycerol) and incubation for 3.5 hours. After 1-hour centrifugation (650,000 × g) to remove insoluble material, the supernatant was incubated with a TALON IMAC resin (Clontech) overnight in the presence of 10/20 mM imidazole (for CysLT₁R/CysLT₂R, respectively) and 100 mM HEPES pH 7.5, with the NaCl concentration increased to 800 mM.

The resin was then washed with 10 column volumes (CV) of wash buffer I (8 mM ATP, 100 mM HEPES pH 7.5, 10 mM MgCl₂, 500 mM NaCl, 15 mM imidazole, 10 μM ligand, 10% glycerol, 0.1% DDM, 0.02% CHS), then with 5 CV of wash buffer II (25 mM HEPES pH 7.5, 250/500 mM NaCl for CysLT₁/₂R, 30 mM imidazole, 10 μM ligand, 10% glycerol, 0.015% DDM, 0.003% CHS), then the buffer was exchanged into buffer III (25 mM HEPES pH 7.5, 250/500 mM NaCl for CysLT₁/₂R, 10 mM imidazole, 10 μM ligand, 10% glycerol, 0.05% DDM, 0.01% CHS) and the protein-containing resin was treated with PNGase F (Sigma) for 5 hours. The resin was further washed with 5 CV of wash buffer III and eluted with elution buffer (25 mM HEPES pH 7.5, 250/500 mM NaCl for CysLT₁/₂R, 300 mM imidazole, 10 μM ligand, 10% glycerol, 0.05% DDM, 0.01% CHS) in several fractions. After removing imidazole using a PD10 desalting column (GE Healthcare), the protein was incubated with 50 µM ligand and a His-tagged home-made TEV protease overnight to remove the N-terminal tags. Reverse IMAC was performed on the following day and the protein was concentrated up to 50–70 mg ml⁻¹ using a 100 kDa molecular weight cut-off centrifugal concentrator (Millipore). The protein purity was checked by SDS-PAGE, and the protein yield and monodispersity were estimated by analytical size exclusion chromatography (aSEC).

Crystals for SWSX were grown using high-throughput nanovolume LCP crystallization. The purified and concentrated protein solution was combined with a lipid mixture: 90% monoolein (Sigma), 10% cholesterol (Affymetrix) in the ratio of 2:3 v/v and homogenized using a lipid syringe mixer until a transparent gel-like LCP formed³³. Crystallisation was set up in 96-well glass sandwich LCP plates (Marienfeld), with 40 nL LCP drops and 800 nL precipitant drops, which were pipetted using an NT8-LCP robot (Formulatrix). All LCP manipulations were performed at room temperature (20–23 °C), and plates were incubated and imaged at 22 °C using an automated incubator/imager RockImager 1000 (Formulatrix).

CysLT₁R-pranlukast crystals had a needle shape (Fig. 1d) and reached their full size after 3–4 weeks; however, the best diffraction was obtained from samples incubated for 2 months. CysLT₂R crystals grew to their full size within 1–3 weeks. Crystals of CysLT₂R in complex with ligands 11a and 11b had the shape of an elongated plate with a maximal size of up to 150 µm (Fig. 1e–g). In the case of the CysLT₂R-11c complex, crystals grew as flat parallelepipeds measuring 30–50 µm along the diagonal (Fig. 1h). For the full list of crystallization conditions for crystals used in the data collection, see Table 1.

Table 1 Summary of crystallization conditions for SWSX datasets.


Microcrystals of the CysLT₁R-zafirlukast complex for SFX were grown in 100 µl gas-tight Hamilton syringes as previously described^34,35. Briefly, approximately 5 µl of protein-laden LCP was transferred through a coupler (Formulatrix) into a syringe containing 50 µl of precipitant, so that the LCP extended towards the plunger as a straight filament. For experiments conducted in 2016 (dataset CysLT1R_zafirlukast-P21), zafirlukast was added at 50 μM prior to protein concentration. Crystals grew in the following precipitant conditions: 100 mM ammonium phosphate, 31–34% v/v PEG400, 100 mM HEPES pH 7.0, 1 µM zafirlukast. For experiments conducted in 2017 (dataset CysLT1R_6RZ5), zafirlukast was added at 200 μM prior to protein concentration. Crystals grew in the following precipitant conditions: 120–200 mM sodium/potassium phosphate, 31–34% v/v PEG400, 100 mM HEPES pH 7, no zafirlukast added. Crystals grew for 1–2 weeks, reaching an average crystal size of 5 × 2 × 2 µm (Fig. 1c).

Synchrotron data collection

Crystal harvesting

Crystals were harvested directly from LCP using 50–200 µm dual thickness MicroMounts or 400–700 µm MicroMesh loops (MiTeGen) with various hole sizes and flash frozen in liquid nitrogen, as described³⁶.

Full sets data collection

Single-crystal datasets (for CysLT1R_6RZ4 and CysLT2R_6RZ8) were collected using the following procedure. First, the best diffracting position was found using automatic X-ray centring³⁷ with a microfocus beam, followed by characterization³⁷ and dose estimation using the BEST³⁸ software, and further data collection as proposed by BEST. This resulted in datasets that were over 90% complete but had a relatively low resolution (>3 Å).

Partial sets data collection

To improve resolution, SWSX partial datasets (minisets, as introduced by Basu et al.³⁰) were collected using an updated version of the raster-scanning approach³⁹. The process is illustrated in Fig. 2a. Each loop was first visually aligned and oriented with its plane perpendicular to the X-ray beam. Then, the whole loop was scanned with the beam to identify locations with diffracting crystals (shutterless mode was used on the ID29 and ID30b beamlines). Raster scans were performed using a minimal dose per image, which allowed for visual detection of diffraction spots, but was less than 1% of the total dose per dataset. The grid spacing was set around $\sqrt{2}R$, where $R$ is the beam profile radius (HWHM). The overlap between adjacent beam spots was introduced to improve accuracy in locating the best diffracting positions and to maximize the grid coverage by HWHM profiles. The grid cells showing diffraction spots were ranked by the DOZOR score³⁷ and then manually selected for further data collection. In the case of large single crystals spanning several grid cells, minisets were collected starting from the highest ranked location and then moving to the next best location along the crystal, but skipping grid cells that had a common edge with the cells already used for data collection, to avoid collecting data from previously exposed parts of the crystal. Consecutive minisets from the same crystal were collected by ensuring 1–2° overlaps in the goniometer rotation ω angle. When the goniometer rotation angle exceeded 10° from its original orientation, a new line raster scan was performed to re-align the crystal with the beam. Each miniset was collected restricting an estimated dose per diffraction location within ∼20 MGy and using 0.1–0.2° oscillation and 3–20° total wedge size.
The wedge size and the corresponding exposure time were selected based on the total number of harvested crystals from the particular condition and were adjusted by decreasing the wedge size and increasing the exposure time when preliminary data processing indicated that a complete dataset had been already collected, or in case of a weak diffraction. The beam size was chosen to match the smallest crystal dimension. A summary of miniset parameters for each SWSX entry is given in Table 2.
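The grid-spacing rule and the edge-sharing exclusion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the score dictionary are hypothetical, and the greedy selection stands in for what the text describes as a manual choice guided by the DOZOR ranking.

```python
import math


def grid_spacing(hwhm_um: float) -> float:
    """Raster step of sqrt(2) * R, where R is the beam HWHM in micrometres."""
    return math.sqrt(2.0) * hwhm_um


def pick_cells(scores: dict[tuple[int, int], float]) -> list[tuple[int, int]]:
    """Visit grid cells from best to worst score and keep a cell only if it
    shares no edge with a cell that was already selected.

    `scores` maps (row, column) to a diffraction score; the values here are
    hypothetical stand-ins for DOZOR scores.
    """
    chosen: list[tuple[int, int]] = []
    for row, col in sorted(scores, key=scores.get, reverse=True):
        edge_neighbours = {
            (row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)
        }
        if not edge_neighbours & set(chosen):
            chosen.append((row, col))
    return chosen


if __name__ == "__main__":
    print(f"step for a 5 um HWHM beam: {grid_spacing(5.0):.2f} um")
    demo = {(0, 0): 9.0, (0, 1): 8.0, (0, 2): 7.0, (1, 1): 5.0}
    print(pick_cells(demo))
```

Diagonal neighbours are allowed in this sketch because the text only excludes cells with a common edge.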

Table 2 Summary of SWSX datasets.


XFEL data collection

Loading crystals into injector

Precipitant solutions were slowly withdrawn from 3 syringes containing microcrystals of appropriate size and density through a 22s gauge Hamilton needle. The remaining LCP samples with embedded microcrystals were consolidated from these 3 syringes into one syringe using a syringe coupler (Formulatrix). An aliquot of ~10% of 7.9 MAG lipid was added to the sample to absorb the excess precipitant and to avoid LCP freezing upon extrusion in the vacuum chamber⁴⁰. A total sample volume of 15–20 µl was loaded into an LCP injector as described⁴⁰.

LCLS data collection: 2016

An overall scheme of the data collection setup is shown in Fig. 2b. SFX data of CysLT1R_Zafirlukast-P21 were collected in August 2016 at the CXI instrument of the Linac Coherent Light Source (LCLS) at the SLAC National Accelerator Laboratory, Menlo Park, California. LCLS was operated at a wavelength of 1.305 Å (9.50 keV) delivering individual X-ray pulses of 40 fs duration and 2.6 × 10¹⁰ photons per pulse focused into a spot size of ~1.5 µm in diameter using a pair of Kirkpatrick-Baez mirrors. LCP with protein microcrystals was extruded at room temperature and at a flow rate of 0.3 μl min⁻¹ inside a vacuum chamber into the beam focus region using an LCP injector⁴⁰ with a 50-μm diameter capillary. The XFEL beam was attenuated at transmission levels of 6.1% to avoid disruptions of the LCP stream. Diffraction images were collected at an XFEL pulse repetition rate of 120 Hz using a 2.3 Megapixel Cornell-SLAC Pixel Array Detector⁴¹ (CSPAD).

A total of 900,173 detector images were collected, of which 22,047 (2% of total) were identified as potential crystal hits with more than 15 Bragg peaks, with SNR = 6.0, threshold 100 and min-pix-count 3.0, using the peakfinder8 algorithm as implemented in Cheetah⁴². The overall time of data collection from a sample with a total volume of 27 μl was about 2 h 6 min.

LCLS data collection: 2017

SFX data of CysLT1R_6RZ5 were collected in August 2017 at the CXI instrument. LCLS was operated at a wavelength of 1.302 Å (9.52 keV) delivering individual X-ray pulses of 43 fs duration and 1.9 × 10¹⁰ photons per pulse focused into a spot size of ~1.5 µm in diameter using a pair of Kirkpatrick-Baez mirrors. LCP with protein microcrystals was extruded at room temperature and at a flow rate of 0.3 μl min⁻¹ inside a vacuum chamber into the beam focus region using an LCP injector⁴⁰ with a 50-μm diameter capillary. The XFEL beam was attenuated at transmission levels of 6.3–10% to avoid disruptions of the LCP stream. Diffraction images were collected at an XFEL pulse repetition rate of 120 Hz using a 2.3 Megapixel Cornell-SLAC Pixel Array Detector⁴³ (CSPAD).

A total of 390,442 detector images were collected, of which 43,417 (11% of total) were identified as potential crystal hits with more than 20 Bragg peaks, with SNR = 4.0, threshold 200 and min-pix-count 3.0, using the peakfinder8 algorithm as implemented in Cheetah⁴². The overall time of data collection from a sample with a total volume of 15 μl was about 54 min.
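The hit rates and collection times quoted for the two LCLS runs can be cross-checked from the image counts and the 120 Hz repetition rate. The short Python sketch below does that arithmetic; it assumes that one detector image was recorded per pulse without interruptions, so the computed time is a lower bound on the actual beam time. The function names are illustrative only.

```python
def hit_rate(hits: int, images: int) -> float:
    """Percentage of detector images flagged as potential crystal hits."""
    return 100.0 * hits / images


def beam_time_min(images: int, rate_hz: float = 120.0) -> float:
    """Minutes needed to record `images` frames at one frame per pulse."""
    return images / rate_hz / 60.0


# (total images, hits) as reported in the text for each run
runs = {"2016": (900_173, 22_047), "2017": (390_442, 43_417)}

if __name__ == "__main__":
    for year, (images, hits) in runs.items():
        print(
            f"{year}: hit rate {hit_rate(hits, images):.1f}%, "
            f"at least {beam_time_min(images):.0f} min of data collection"
        )
```

Both results agree with the reported figures to within rounding: about 2% and 11% hit rates, and roughly 2 h and 54 min of collection.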

Data processing

All datasets, except for the SFX dataset CysLT1R_Zafirlukast-P21 (P21 space group), have been previously indexed, integrated, sorted, and merged to solve the structures of the corresponding receptor complexes by molecular replacement, as described^28,29. Re-processing of the data with the same or better processing statistics as in the original manuscripts is described in the Technical validation section.

Data Records

SWSX data^{44,45,46,47,48,49} have been deposited to Zenodo under accession numbers provided in Table 3. Each SWSX dataset folder contains subfolders, one for each miniset collected, regardless of the angular range for data collection. Each miniset subfolder is named XXX_YY_ZZ_NN, where XXX is the sequential number of the miniset, YY is the crystallization condition ID, ZZ is the serial number of the harvesting loop within each crystallization condition, and NN is the serial number of the miniset within each loop. Each miniset subfolder contains a subfolder ‘images’ with all diffraction images in either cbf or HDF5 format. It also contains an XDS parameter file XDS.INP with the keyword NAME_TEMPLATE_OF_DATA_FRAMES pointing to files in the ‘images’ subfolder, and other parameters as used during reprocessing (see keywords for express.py below). In addition, each miniset subfolder contains all XDS-related files for this miniset (including the geometry correction files x_geo_corr.cbf and y_geo_corr.cbf for cbf images): everything up to CORRECT.LP and XDS_ASCII.HKL for successfully integrated datasets, and only COLSPOT.LP for unsuccessful ones. A summary of all SWSX entries is shown in Table 2, and a summary of the crystallization conditions for all SWSX entries is given in Table 1, with a full description provided in Supplementary Table 1.
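When iterating over a downloaded entry, the XXX_YY_ZZ_NN naming scheme can be split into its four fields. The following Python sketch shows one way to do it; it assumes only that the fields are separated by underscores and contain none themselves, and it keeps each field as a string because the text does not specify their widths or whether they are purely numeric. The example folder name is hypothetical.

```python
import re
from typing import NamedTuple


class MinisetId(NamedTuple):
    miniset: str    # XXX: sequential number of the miniset
    condition: str  # YY: crystallization condition ID
    loop: str       # ZZ: harvesting loop within the condition
    index: str      # NN: miniset number within the loop


# Four underscore-separated fields; field widths are not assumed.
_PATTERN = re.compile(r"^([^_]+)_([^_]+)_([^_]+)_([^_]+)$")


def parse_miniset_folder(name: str) -> MinisetId:
    """Split a miniset folder name of the form XXX_YY_ZZ_NN into its fields."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a miniset folder name: {name!r}")
    return MinisetId(*match.groups())


if __name__ == "__main__":
    # Hypothetical folder name, used for illustration only.
    print(parse_miniset_folder("001_07_02_03"))
```

Grouping the parsed records by `condition` or `loop` is then a convenient starting point for looking at non-isomorphism between crystallization conditions.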

Table 3 Data availability on the Internet. The GitHub gist associated with the publication contains the ‘download_all.sh’ script for Linux, which downloads all data entries described in this publication.


SFX data have been deposited to CXIDB as ID106⁵⁰ (CysLT1R_6RZ5) and ID107⁵¹ (CysLT1R_Zafirlukast-P21). Only those images identified as crystal hits by Cheetah are included in the deposited dataset. Each SFX dataset folder contains a subfolder ‘raw_data’ with all runs as written by Cheetah, their respective cheetah.ini files and cxi files with images. Also, each SFX dataset folder contains a file ‘initial.geom’ that was used during reprocessing. A summary for all SFX entries is given in Table 4.

Table 4 Summary of SFX datasets.


Technical Validation

Data processing

During the preparation of this manuscript, all data were re-processed in a consistent manner. Here we present a pipeline for data processing that results in similar or better resolution values and figures of merit compared to those reported in the original papers. Data processing statistics for all datasets are shown in Table 5.

Table 5 Crystallographic data collection statistics.


SWSX data

For SWSX data, the processing algorithm works as follows (note that the treatment of both full datasets and minisets is the same). For each dataset, initial indexing and integration are performed by XDS within the resolution range of 40–2.5 Å using the beamline-provided XDS.INP file, without specifying the unit cell parameters and the space group identity (for Dectris images, the “neggia” library was used, as described at https://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Eiger). Each integration is first run with the keywords “JOB = XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT”, then the integration parameters are updated using the output of the CORRECT step as described in the section “Final polishing: Re-INTEGRATEing with the correct spacegroup, refined geometry and fine-slicing of profiles” (https://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Optimisation#Re-INTEGRATEing_with_the_correct_spacegroup.2C_refined_geometry_and_fine-slicing_of_profiles), and integration is re-run using the keywords “JOB = DEFPIX INTEGRATE CORRECT”. After that, the algorithm attempts to scale all obtained XDS_ASCII.HKL files using XSCALE, and runs several rounds of ΔCC₁/₂ rejection of non-isomorphous minisets using the xdscc12 subprogram as described¹⁷ (until no minisets are rejected at the subsequent iteration). For most datasets, the miniset rejection was applied in two steps: first, using the low-resolution range (e.g. 30.0–10.0 Å, ΔCC₁/₂ threshold 0–2), and then using the high-resolution range (e.g. 10.0–2.5 Å, ΔCC₁/₂ threshold 1–5). All processing parameters are summarized in Table 6.

The obtained dataset is merged and used as a REFERENCE_DATA_SET during the 2nd integration attempt of all minisets (including those rejected during the previous integration attempts). If CC₁/₂ in the highest resolution shell exceeds 0.15, the RESOLUTION_RANGE is extended manually to higher resolution for the 3rd integration. Next, another round of ΔCC₁/₂ rejection of non-isomorphous minisets is performed, followed by merging to produce a final dataset. Improvements in the figures of merit for a dataset as a result of ΔCC₁/₂ rejection are shown in Fig. 3.

For the CysLT2R_6RZ8 dataset (space group I4), there is an indexing ambiguity with two indexing options available for each miniset, so some minisets have to be re-indexed using the ‘REIDX_ISET = 0 -1 0 0 -1 0 0 0 0 0 -1 0’ keyword in XSCALE. This is done by following an iterative procedure: first, the two largest minisets are merged together using the two possible indexing options for the second set, and the indexing option resulting in a smaller R_meas is chosen. Then, all other minisets are added one by one, using the indexing choice that produces a smaller R_meas for the merged dataset. For the final merged dataset, phenix.xtriage reports no significant twinning.
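The iterative resolution of the indexing ambiguity can be written down as a small greedy loop. The Python sketch below is a hedged illustration rather than the authors' script: `r_meas` is a hypothetical callable that would, in practice, write an XSCALE.INP with the given indexing choices, run XSCALE, and return the merged R_meas. The text does not state the order in which the remaining minisets are added, so descending size is assumed here.

```python
from typing import Callable, Sequence

# (miniset name, True if the alternative indexing is applied)
Choice = tuple[str, bool]


def resolve_ambiguity(
    minisets: Sequence[str],
    sizes: dict[str, int],
    r_meas: Callable[[list[Choice]], float],
) -> list[Choice]:
    """Add minisets one by one, keeping for each the indexing option that
    gives the smaller merged R_meas. The largest miniset sets the reference."""
    ordered = sorted(minisets, key=lambda name: sizes[name], reverse=True)
    if not ordered:
        return []
    chosen: list[Choice] = [(ordered[0], False)]
    for name in ordered[1:]:
        as_is = r_meas(chosen + [(name, False)])
        reindexed = r_meas(chosen + [(name, True)])
        chosen.append((name, reindexed < as_is))
    return chosen


if __name__ == "__main__":
    # Toy stand-in for XSCALE: counts minisets that disagree with the first one.
    truth = {"a": False, "b": True, "c": False}

    def toy_r_meas(choices: list[Choice]) -> float:
        reference = choices[0][1] ^ truth[choices[0][0]]
        return float(sum((flag ^ truth[n]) != reference for n, flag in choices))

    print(resolve_ambiguity(["c", "a", "b"], {"a": 100, "b": 50, "c": 10}, toy_r_meas))
```

Each added miniset costs two merging runs, so the total number of XSCALE calls grows linearly with the number of minisets.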

Table 6 SWSX data processing parameters. For CysLT2R_6RZ6 and CysLT2R_6RZ7, only full resolution range rejection was performed.


SFX data

CysLT1R_6RZ5. Previously published data²⁸ were processed using CrystFEL (v. 0.6.3 + 23ea03c7). For peak finding, peakfinder8 with min-snr = 4.5, threshold = 210 was used. For indexing, the following indexers were employed: felix, dirax, asdf, taketwo, mosflm-nolatt-cell, mosflm-nocell-latt, and xds (in that order), with the --multi option enabled. Data were merged using process_hkl, with push-res = 1.8 and max-adu = 14,000. For reprocessing, the same parameters in CrystFEL (v. 0.8.0) were used. The final reprocessed dataset included 28,900 indexed lattices (67% of the frames selected by Cheetah). Among indexers, felix was the most successful one, providing 16,717 indexed lattices (57.8% of all indexed lattices). Improvements of R_split, CC* and I/sigma are shown in Fig. 4a–d.

CysLT1R_Zafirlukast-P21

Data (previously unpublished) were processed using CrystFEL (v. 0.8.0). For peak finding, peakfinder8 with highres = 3.0, min-snr = 4.4, threshold = 20, max-res = 300 and min-res = 80 was used. For indexing, indexers dirax, taketwo, mosflm, xds, and asdf (in that order) were used. Data were merged using process_hkl, with mincc = 0.3 and push-res = 5.0. The final dataset resulted in 17,193 indexed lattices (79% of the frames selected by Cheetah). Among indexers, dirax was the most successful one, providing 14,457 indexed lattices. Improvements of Rpim, CC* and I/sigma are shown in Fig. 4e–h.
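To make the parameter list above concrete, the Python sketch below assembles an indexamajig command line from it without running anything. It is an illustration under assumptions: the option spellings follow the CrystFEL indexamajig manual as commonly documented and should be checked against `indexamajig --help` for the installed version; `files.lst` and `p21.stream` are hypothetical file names; and whether --multi was used for this dataset is not stated, so it is off by default.

```python
import shlex


def indexamajig_command(
    file_list: str,
    geometry: str,
    stream: str,
    indexers: list[str],
    peak_options: dict[str, float],
    multi: bool = False,
) -> list[str]:
    """Build (but do not run) an indexamajig argument list."""
    cmd = [
        "indexamajig",
        "-i", file_list,   # list of input image files
        "-g", geometry,    # detector geometry file
        "-o", stream,      # output stream
        "--peaks=peakfinder8",
        "--indexing=" + ",".join(indexers),  # tried in the given order
    ]
    cmd += [f"--{key}={value}" for key, value in peak_options.items()]
    if multi:
        cmd.append("--multi")
    return cmd


if __name__ == "__main__":
    cmd = indexamajig_command(
        "files.lst",     # hypothetical name
        "initial.geom",  # geometry file name as deposited
        "p21.stream",    # hypothetical name
        ["dirax", "taketwo", "mosflm", "xds", "asdf"],
        {"highres": 3.0, "min-snr": 4.4, "threshold": 20,
         "max-res": 300, "min-res": 80},
    )
    print(shlex.join(cmd))
```

Returning a list rather than a shell string keeps the command safe to pass to `subprocess.run` and easy to log alongside the resulting stream.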

Usage Notes

Downloading data

The information about downloading data is shown in Table 3. A Linux script, ‘download_all.sh’, which fetches all data using the curl utility, is provided in the GitHub gist associated with the publication. The folder for each entry is archived in a single tar.gz file for more convenient fetching.

Data processing assistance scripts

Here, a brief description of the scripts is given. A more detailed description can be found in the GitHub gist (https://gist.github.com/marinegor/96102c9b7ce87509a0832649d11ba927) associated with the publication.

1. create_xscale.inp.py: a simple script to include all existing XDS_ASCIIs in XSCALE.INP. Given the structure of folders as in the data deposited with this publication, it creates an input file for express.py in the csv format.

2. express.py: the SWSX integration pipeline. Given a list of folders with XDS.INP and a path to the respective data sets, the script runs XDS for all data sets in the list, optionally adding UNIT_CELL_CONSTANTS, SPACE_GROUP_NUMBER, INCLUDE_RESOLUTION_RANGE, setting SPOT_RANGE to be the same as DATA_RANGE, and setting REFERENCE_DATA_SET. It adds MAXIMUM_NUMBER_OF_PROCESSORS and MAXIMUM_NUMBER_OF_JOBS for processing on large clusters, and runs xscale_par afterwards.

3. xdscc.py: parsing of the XDSCC.LP logfile of the xdscc12 utility for rejection of minisets based on ΔCC₁/₂. It analyses the output of the xdscc12 utility together with the last XSCALE.INP used, providing the list of datasets with their ΔCC₁/₂ values, and saves the list of those which have ΔCC₁/₂ higher than the input threshold value.

4. reject.sh: an iterative dataset rejection script that runs “until no datasets with negative ΔCC₁/₂ are left”. It scales XDS_ASCII.HKL files in all subfolders of the current folder, then iteratively runs ΔCC₁/₂ rejection with the given resolution range and number of cycles. It saves all intermediate XSCALE.INP and XSCALE.LP files.

5. run_crystfel.sh: a wrapper for the indexamajig routine, which i) arranges all CrystFEL-related files into subfolders, ii) automatically assigns the date and time to each generated stream and respective log file, iii) links the last created stream to the ‘laststream’ link, and iv) shuffles the input file list, so that one can quickly and reliably check the indexing rate before the indexing finishes.

6. analysis.sh: a wrapper for the process_hkl, partialator, check_hkl, and compare_hkl routines, which produces an XSCALE.LP-like statistics table, counts images indexed with different indexers, produces a command-line visible histogram of the image resolution (for a simple estimation of the push-res parameter), and writes logs.
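The stopping rule described for reject.sh (repeat until no miniset is rejected) amounts to a fixed-point iteration. The Python sketch below illustrates that control flow only. `delta_cc_half` is a hypothetical callable that would, in practice, re-scale the surviving minisets with XSCALE, run xdscc12, and return a ΔCC₁/₂ value per miniset. The sign convention (values below the threshold are rejected) is an assumption based on the "negative ΔCC₁/₂" wording and should be checked against the gist.

```python
from typing import Callable


def reject_until_stable(
    minisets: list[str],
    delta_cc_half: Callable[[list[str]], dict[str, float]],
    threshold: float = 0.0,
    max_cycles: int = 50,
) -> list[str]:
    """Repeatedly drop minisets whose delta-CC1/2 is below `threshold`,
    re-evaluating the survivors each cycle, until nothing is rejected."""
    kept = list(minisets)
    for _ in range(max_cycles):
        deltas = delta_cc_half(kept)
        survivors = [name for name in kept if deltas[name] >= threshold]
        if len(survivors) == len(kept):
            break  # no rejections in this cycle: converged
        kept = survivors
    return kept


if __name__ == "__main__":
    # Toy model: values shift once the worst miniset is gone, which is why
    # a single pass is not enough.
    base = {"a": 1.0, "b": 0.5, "c": -2.0, "d": 0.1}

    def toy_delta(kept: list[str]) -> dict[str, float]:
        penalty = 0.0 if "c" in kept else 0.2
        return {name: base[name] - penalty for name in kept}

    print(reject_until_stable(list(base), toy_delta))
```

The two-step scheme from the Technical Validation section would correspond to calling this loop twice, first with a low-resolution evaluator and then with a high-resolution one, each with its own threshold.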

Code availability

The code used for data reprocessing (see the Usage Notes and Technical Validation sections) is available in the GitHub gist (https://gist.github.com/marinegor/96102c9b7ce87509a0832649d11ba927). The utility xdscc12 is available through the XDS-Wiki website (https://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xdscc12). In previous publications, for SFX data processing, CrystFEL version 0.6.3 + 23ea03c7 (available on https://stash.desy.de/projects/CRYS/repos/crystfel/commits) was used. For SWSX data processing, XDS (version BUILT = 20161101 for CysLT1R_6RZ4 and BUILT = 20161205 for CysLT2R_6RZ6–9) and XSCALE (version BUILT = 20161101 for CysLT1R_6RZ4 and BUILT = 20180319 for CysLT2R_6RZ6–9) were used in the original publication, together with the “neggia” library for reading HDF5 images, as described (https://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Eiger). For data reprocessing, XDS and XSCALE version BUILT = 20190315, and CrystFEL 0.8.0, were used.

Change history

23 November 2020
A Correction to this paper has been published: https://doi.org/10.1038/s41597-020-00759-w

References

Bäck, M. et al. International Union of Basic and Clinical Pharmacology. LXXXIV: Leukotriene Receptor Nomenclature, Distribution, and Pathophysiological Functions. Pharmacol. Rev. 63, 539–584 (2011).
Singh, R. K., Tandon, R., Dastidar, S. G. & Ray, A. A review on leukotrienes and their receptors with reference to asthma. J. Asthma 50, 922–931 (2013).
Shi, Q.-J. et al. Intracerebroventricular injection of HAMI 3379, a selective cysteinyl leukotriene receptor 2 antagonist, protects against acute brain injury after focal cerebral ischemia in rats. Brain Res. 1484, 57–67 (2012).
Colazzo, F., Gelosa, P., Tremoli, E., Sironi, L. & Castiglioni, L. Role of the Cysteinyl Leukotrienes in the Pathogenesis and Progression of Cardiovascular Diseases. Mediators Inflamm. 2017, 1–13 (2017).
Magnusson, C. et al. Low expression of CysLT1R and high expression of CysLT2R mediate good prognosis in colorectal cancer. Eur. J. Cancer 46, 826–835 (2010).
Magnusson, C. et al. Cysteinyl leukotriene receptor expression pattern affects migration of breast cancer cells and survival of breast cancer patients. Int. J. Cancer 129, 9–22 (2011).
Tsai, M.-J. et al. Cysteinyl Leukotriene Receptor Antagonists Decrease Cancer Risk in Asthma Patients. Sci. Rep. 6, 23979 (2016).
Duah, E. et al. Cysteinyl leukotriene 2 receptor promotes endothelial permeability, tumor angiogenesis, and metastasis. Proc. Natl. Acad. Sci. 116, 199–204 (2019).
Moore, A. R. et al. Recurrent activating mutations of G-protein-coupled receptor CYSLTR2 in uveal melanoma. Nat. Genet. 48, 675–680 (2016).
Ceraudo, E. et al. Uveal Melanoma Oncogene CYSLTR2 Encodes a Constitutively Active GPCR Highly Biased Toward Gq Signaling. bioRxiv 1–60, https://doi.org/10.1101/663153 (2019).
Yokomizo, T., Nakamura, M. & Shimizu, T. Leukotriene receptors as potential therapeutic targets. J. Clin. Invest. 128, 2691–2701 (2018).
Yamamoto, M. et al. Protein microcrystallography using synchrotron radiation. IUCrJ 4, 529–539 (2017).
Mishin, A. et al. An outlook on using serial femtosecond crystallography in drug discovery. Expert Opin. Drug Discov. 14, 933–945 (2019).
Zander, U. et al. Merging of synchrotron serial crystallographic data by a genetic algorithm. Acta Crystallogr. Sect. D Struct. Biol. 72, 1026–1035 (2016).
Santoni, G., Zander, U., Mueller-Dieckmann, C., Leonard, G. & Popov, A. Hierarchical clustering for multiple-crystal macromolecular crystallography experiments: the ccCluster program. J. Appl. Crystallogr. 50, 1844–1851 (2017).
Foadi, J. et al. Clustering procedures for the optimal selection of data sets from multiple crystals in macromolecular crystallography. Acta Crystallogr. Sect. D Biol. Crystallogr. 69, 1617–1632 (2013).
Assmann, G., Brehm, W. & Diederichs, K. Identification of rogue datasets in serial crystallography. J. Appl. Crystallogr. 49, 1021–1028 (2016).
Hanson, M. A. et al. Crystal Structure of a Lipid G Protein-Coupled Receptor. Science 335, 851–855 (2012).
Diederichs, K. Dissecting random and systematic differences between noisy composite data sets. Acta Crystallogr. Sect. D Struct. Biol. 73, 286–293 (2017).
Brehm, W. & Diederichs, K. Breaking the indexing ambiguity in serial crystallography. Acta Crystallogr. Sect. D Biol. Crystallogr. 70, 101–109 (2014).
Asada, H. et al. Crystal structure of the human angiotensin II type 2 receptor bound to an angiotensin II analog. Nat. Struct. Mol. Biol. 25, 570–576 (2018).
White, T. A. et al. Serial femtosecond crystallography datasets from G-protein-coupled receptors. Sci. Data 3, 160057 (2016).
Toyoda, Y. et al. Ligand binding to human prostaglandin E receptor EP4 at the lipid-bilayer interface. Nat. Chem. Biol. 15, 18–26 (2019).
Kato, H. E. et al. Structural mechanisms of selectivity and gating in anion channelrhodopsins. Nature 561, 349–354 (2018).
Kim, Y. S. et al. Crystal structure of the natural anion-conducting channelrhodopsin GtACR1. Nature 561, 343–348 (2018).
Morin, A. et al. Collaboration gets the most out of software. Elife 2, 1–6 (2013).
Maia, F. R. N. C. The coherent X-ray imaging data bank. Nature Methods 9, 854–855 (2012).
Luginina, A. et al. Structure-based mechanism of cysteinyl leukotriene receptor inhibition by antiasthmatic drugs. Sci. Adv. 5, eaax2518 (2019).
Gusach, A. et al. Structural basis of ligand selectivity and disease mutations in cysteinyl leukotriene receptors. Nat. Commun. 10, 5573 (2019).
Basu, S. et al. Automated data collection and real-time data analysis suite for serial synchrotron crystallography. J. Synchrotron Radiat. 26, 244–252 (2019).
Chun, E. et al. Fusion partner toolchest for the stabilization and crystallization of G protein-coupled receptors. Structure 20, 967–976 (2012).
Ballesteros, J. A. & Weinstein, H. Integrated methods for the construction of three-dimensional models and computational probing of structure-function relations in G protein-coupled receptors. Methods Neurosci. 25, 366–428 (1995).
Caffrey, M. & Cherezov, V. Crystallizing membrane proteins using lipidic mesophases. Nat. Protoc. 4, 706–731 (2009).
Liu, W., Ishchenko, A. & Cherezov, V. Preparation of microcrystals in lipidic cubic phase for serial femtosecond crystallography. Nat. Protoc. 9, 2123–2134 (2014).
Ishchenko, A., Cherezov, V. & Liu, W. Preparation and delivery of protein microcrystals in lipidic cubic phase for serial femtosecond crystallography. J. Vis. Exp. 2016, e54463 (2016).
Liu, W. & Cherezov, V. Crystallization of Membrane Proteins in Lipidic Mesophases. J. Vis. Exp. e2501, https://doi.org/10.3791/2501 (2011).
Svensson, O., Malbet-Monaco, S., Popov, A., Nurizzo, D. & Bowler, M. W. Fully automatic characterization and data collection from crystals of biological macromolecules. Acta Crystallogr. Sect. D Biol. Crystallogr. 71, 1757–1767 (2015).
Popov, A. N. & Bourenkov, G. P. Choice of data-collection parameters based on statistic modelling. Acta Crystallogr. Sect. D Biol. Crystallogr. 59, 1145–1153 (2003).
Cherezov, V. et al. Rastering strategy for screening and centring of microcrystal samples of human membrane proteins with a sub-10 μm size X-ray synchrotron beam. J. R. Soc. Interface 6, 587–597 (2009).
Weierstall, U. et al. Lipidic cubic phase injector facilitates membrane protein serial femtosecond crystallography. Nat. Commun. 5, 3309 (2014).
Hart, P. et al. The CSPAD megapixel x-ray camera at LCLS. In X-Ray Free-Electron Lasers: Beam Diagnostics, Beamline Instrumentation, and Applications, https://doi.org/10.1117/12.930924 (2012).
Barty, A. et al. Cheetah: software for high-throughput reduction and analysis of serial femtosecond X-ray diffraction data. J. Appl. Crystallogr. 47, 1118–1131 (2014).
Herrmann, S. et al. CSPAD-140k: A versatile detector for LCLS experiments. Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip. 718, 550–553 (2013).
Marin, E. et al. CysLT1R receptor complex with Zafirlukast (P21 space group) structure (SFX@LCLS). Coherent X-ray Imaging Data Bank, https://doi.org/10.11577/1660938 (2020).
Marin, E. et al. CysLT1R_6RZ4. Zenodo https://doi.org/10.5281/zenodo.4032826 (2019).
Marin, E. et al. CysLT2R_6RZ6. Zenodo https://doi.org/10.5281/zenodo.4032836 (2019).
Marin, E. et al. CysLT2R_6RZ7. Zenodo https://doi.org/10.5281/zenodo.4032837 (2019).
Marin, E. et al. CysLT2R_6RZ8. Zenodo https://doi.org/10.5281/zenodo.4032840 (2019).
Marin, E. et al. CysLT2R_6RZ9. Zenodo https://doi.org/10.5281/zenodo.4032841 (2019).
Pándy-Szekeres, G. et al. GPCRdb in 2018: adding GPCR structure models and ligands. Nucleic Acids Res. 46, D440–D446 (2018).
Marin, E. et al. CysLT1R receptor complex with Zafirlukast (P1 space group) structure (SFX@LCLS). Coherent X-ray Imaging Data Bank https://doi.org/10.11577/1660939 (2020).

Acknowledgements

SWSX data analysis and treatment were supported by the Russian Science Foundation (project no. 19-74-00088). XFEL sample preparation and SFX data analysis were supported by grant 19-29-12022 from the Russian Foundation for Basic Research (RFBR). A.L., A.R., V.G., A.M. and V.B. acknowledge support from the Ministry of Science and Higher Education of the Russian Federation (agreement # 075-00337-20-03, project FSMG-2020-0003). V.P. acknowledges support from the project Structural Dynamics of Biomolecular Systems (ELIBIO) (CZ.02.1.01/0.0/0.0/15_003/0000447), funded by the European Regional Development Fund and the Ministry of Education, Youth and Sports (MEYS) of the Czech Republic. Use of the Linac Coherent Light Source (LCLS), SLAC National Accelerator Laboratory, is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences under Contract No. DE-AC02-76SF00515. We acknowledge the European Synchrotron Radiation Facility for provision of beam time on ID23-1, ID30a3, ID29 and ID30b, and we thank the structural biology group for assistance. V.C. acknowledges that the University of Southern California is his primary affiliation.

Author information

Anastasiia Gusach
Present address: MRC Laboratory of Molecular Biology, Cambridge CB2 0QH, UK

Authors and Affiliations

Research Center for Molecular Mechanisms of Aging and Age-related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, 141701, Russia
Egor Marin, Aleksandra Luginina, Anastasiia Gusach, Kirill Kovalev, Sergey Bukhdruker, Polina Khorn, Elizaveta Lyapina, Andrey Rogachev, Valentin Gordeliy, Alexey Mishin, Vadim Cherezov & Valentin Borshchevskiy
Institut de Biologie Structurale (IBS), Université Grenoble Alpes, CEA, CNRS, Grenoble, France
Kirill Kovalev, Vitaly Polovinkin & Valentin Gordeliy
Institute of Biological Information Processing (IBI-7: Structural Biochemistry), Forschungszentrum Jülich GmbH, 52425, Jülich, Germany
Kirill Kovalev, Sergey Bukhdruker, Vitaly Polovinkin, Valentin Gordeliy & Valentin Borshchevskiy
Institute of Crystallography, University of Aachen (RWTH), Aachen, Germany
Kirill Kovalev & Valentin Gordeliy
Bridge Institute, Michelson Center for Convergent Bioscience, University of Southern California, Los Angeles, CA, 90089, USA
Vadim Cherezov
Department of Chemistry, University of Southern California, Los Angeles, CA, 90089, USA
Vadim Cherezov
JuStruct: Jülich Center for Structural Biology, Forschungszentrum Jülich GmbH, 52425, Jülich, Germany
Kirill Kovalev, Sergey Bukhdruker, Valentin Gordeliy & Valentin Borshchevskiy
Joint Institute for Nuclear Research, Dubna, 141980, Russia
Andrey Rogachev
ELI Beamlines, Institute of Physics, Czech Academy of Science, Na Slovance 2, 18221, Prague, Czech Republic
Vitaly Polovinkin
ESRF—The European Synchrotron, 38000, Grenoble, France
Sergey Bukhdruker

Authors

Egor Marin
Aleksandra Luginina
Anastasiia Gusach
Kirill Kovalev
Sergey Bukhdruker
Polina Khorn
Vitaly Polovinkin
Elizaveta Lyapina
Andrey Rogachev
Valentin Gordeliy
Alexey Mishin
Vadim Cherezov
Valentin Borshchevskiy

Contributions

E.M. performed SWSX data collection, processed SFX and SWSX data, organized raw data and wrote the manuscript. A.L. produced and crystallized protein, performed SWSX data collection and wrote the manuscript. A.G. produced and crystallized protein, performed SWSX data collection and wrote the manuscript. K.K. performed SWSX data collection and processed SWSX data. S.B. deposited the data and helped with manuscript preparation. P.K. produced protein. V.P. helped with crystal harvesting and SWSX data collection. E.L. produced protein. A.R. deposited the data. V.G. supervised the project. A.M. performed SWSX data collection, helped with manuscript preparation and supervised the project. V.C. performed SFX data collection, wrote the manuscript and supervised the project. V.B. performed SWSX data collection, wrote the manuscript and supervised the project.

Corresponding authors

Correspondence to Vadim Cherezov or Valentin Borshchevskiy.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Table 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.

About this article

Cite this article

Marin, E., Luginina, A., Gusach, A. et al. Small-wedge synchrotron and serial XFEL datasets for Cysteinyl leukotriene GPCRs. Sci Data 7, 388 (2020). https://doi.org/10.1038/s41597-020-00729-2

Received: 10 July 2020
Accepted: 07 October 2020
Published: 12 November 2020
DOI: https://doi.org/10.1038/s41597-020-00729-2

Construct engineering, expression, purification, and crystallization of CysLT₁R and CysLT₂R