Membrane protein megahertz crystallography at the European XFEL

The world’s first superconducting megahertz repetition rate hard X-ray free-electron laser (XFEL), the European XFEL, began operation in 2017, featuring a unique pulse train structure with 886 ns between pulses. With its rapid pulse rate, the European XFEL may alleviate some of the increasing demand for XFEL beamtime, particularly for membrane protein serial femtosecond crystallography (SFX), by enabling orders-of-magnitude faster data collection. Here, we report the first membrane protein megahertz SFX experiment, where we determined a 2.9 Å-resolution SFX structure of the large membrane protein complex, Photosystem I, a >1 MDa complex containing 36 protein subunits and 381 cofactors. We address challenges to megahertz SFX for membrane protein complexes, including growth of large quantities of crystals and the large molecular and unit cell size that influence data collection and analysis. The results imply that megahertz crystallography could have an important impact on structure determination of large protein complexes with XFELs.


Supplemental Notes:
Supplementary Note 1. Crystallization, data collection, data analysis, and structure solution for the cryogenic synchrotron structure of T. elongatus PSI presented here.
Crystallization. PSI was crystallized for synchrotron X-ray diffraction at cryogenic conditions as described previously 1, with further details described in Fromme 1998 2. Briefly, the crystals were grown at low ionic strength by dialysis with micro- and subsequent macro-seeding steps. The freezing procedure also matched the procedure used for the Jordan et al. PSI structure (PDB ID=1JB0) 1. Crystals were transferred in 10 steps from the crystal stabilization buffer (5 mM MES pH=6.4, 0.02% β-DDM) into the freezing buffer (2 M sucrose, 5 mM MES pH=6.4, 0.02% β-DDM) and incubated for 1 hour in the 2 M sucrose buffer before crystals were fished into cryo-loops and flash-frozen in liquid propane.
Data collection and analysis. Data were collected at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory at beamline 8.2.1 at 100 K with a 100 x 200 µm beam focus. The data were collected from a single crystal, which was rotated one degree per image for a total of 180 images. The data set was indexed and merged in XDS 3. Space group determination and scaling were performed with Aimless from the CCP4 software suite 4.

Supplementary Note 2.
Crystallization at low ionic strength for SFX and standard crystallography.
Crystallization at low ionic strength offers the potential to perform SFX studies of other difficult-to-crystallize proteins at MHz repetition rates. Most proteins are less soluble at low ionic strength than at medium ionic strength. At low ionic strength, protein-protein interactions are mediated by hydrophilic interactions. As the surface of the protein is depleted of the counter ions that effectively shield opposite charges on the surface of a protein, direct contacts between opposing surface charges induce the formation of crystal contacts. In contrast, crystallization at the other side of the phase diagram is induced when the solubility of the protein is decreased by addition of crystallization agents like PEG or by high ionic strength. Here, the protein and the ions or PEG compete for the water needed for solvation. Thus, at high ionic strength or in the presence of PEG, hydrophobic interactions are enhanced. While crystallization at low ionic strength is not a novel method of crystallization 5, it is rarely used for crystallization of membrane proteins. This is most likely a matter of convenience; vapor diffusion is the most common method for crystallization and is the basis of almost all robotic crystallization systems. While crystallization at low ionic strength is quite labor-intensive for standard crystallography, it is a very useful method for growth of nano- and microcrystals for MHz crystallography. In addition to the advantages for SFX listed in the main text, it is also easy to perform and is reproducible. Crystals form quickly by ultrafiltration, crystallization can be used as a final purification step, and the method is fully reversible (no "PEG skin" is formed that can prevent seeding when PEG is used for crystallization). Seeding is therefore easily implemented, and the protein can be crystallized and dissolved many times to obtain the desired crystal size and purity. The AGIPD 6,7 will be able to store 3520 diffraction patterns per second.
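As a rough consistency check on the timing figures, the following sketch converts the 886 ns pulse spacing quoted in the abstract into an intra-train repetition rate and divides the 3520 patterns per second among pulse trains. The train rate of 10 per second is an assumption that is not stated in this text, and the function and variable names are illustrative only.

```python
# Back-of-the-envelope timing for megahertz SFX data collection.
PULSE_SPACING_S = 886e-9      # spacing between pulses within a train (from the text)
PATTERNS_PER_SECOND = 3520    # AGIPD storage figure quoted in the text
TRAINS_PER_SECOND = 10        # ASSUMPTION: train rate is not given in this document


def intra_train_rate_hz(spacing_s: float) -> float:
    """Repetition rate inside a pulse train, from the pulse-to-pulse spacing."""
    return 1.0 / spacing_s


def patterns_per_train(patterns_per_s: int, trains_per_s: int) -> float:
    """Number of stored diffraction patterns attributable to each train."""
    return patterns_per_s / trains_per_s


rate_mhz = intra_train_rate_hz(PULSE_SPACING_S) / 1e6
per_train = patterns_per_train(PATTERNS_PER_SECOND, TRAINS_PER_SECOND)
# N pulses are separated by N - 1 intervals.
span_us = (per_train - 1) * PULSE_SPACING_S * 1e6

print(f"Intra-train repetition rate: {rate_mhz:.2f} MHz")
print(f"Patterns stored per train:   {per_train:.0f}")
print(f"Time spanned by those pulses: {span_us:.0f} µs")
```

Under the assumed train rate, the stored pulses occupy only a small fraction of each second, which is why the effective data rate is far below the intra-train megahertz rate.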
Supplementary Table 2 Comparison of PSI structures determined using crystallographic data collected at the EuXFEL and at the Advanced Light Source synchrotron. Various superpositions were performed using PyMOL 8 and their RMSD values are shown. The two structures were similar at 2.9 Å resolution, especially when comparing individual subunits. For reference, the RMSD of the poly-Ala superposition of two evolutionarily closely related polypeptides, PsaA and PsaB, is ~0.8 Å, and that of the poly-Ala chains of the two single-transmembrane helix (TMH) subunits PsaI and PsaJ is ~4.3 Å. M1 corresponds to the PSI monomer whose core polypeptides are chains A and B in the PDB file, M2 corresponds to the PSI monomer whose core polypeptides are chains G and H in the PDB file, and M3 corresponds to the PSI monomer whose core polypeptides are chains Y and Z in the PDB file.
In the structures solved in P21 without symmetry, one of the monomers of the trimer is farther from the other two monomers. This monomer is colored darker than the other two in the middle and right columns.
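The RMSD values in Supplementary Table 2 were obtained with PyMOL. For readers who want to see what a superposition RMSD measures, here is a minimal sketch of the standard Kabsch least-squares superposition in NumPy. It is not PyMOL's implementation (PyMOL's alignment commands can additionally reject outlier atom pairs), and the coordinates below are synthetic points rather than PSI atoms.

```python
import numpy as np


def kabsch_rmsd(P: np.ndarray, Q: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition."""
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove translation by centring both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # The SVD of the covariance matrix yields the optimal rotation.
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    # Flip one axis if needed so the result is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and measure the residual deviation.
    diff = (R @ Pc.T).T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


# Synthetic demonstration: rotate and translate a random point cloud.
rng = np.random.default_rng(0)
coords = rng.normal(size=(50, 3))          # hypothetical coordinates
theta = np.deg2rad(30.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
moved = coords @ Rz.T + np.array([5.0, -2.0, 1.0])

print(f"RMSD after superposition: {kabsch_rmsd(coords, moved):.2e} Å")
```

A pure rigid-body motion superposes back to an RMSD of essentially zero, so any nonzero value reported for two real structures reflects genuine coordinate differences over the atoms compared.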
Supplementary Fig. 8 B-factor comparison of the three T. elongatus structures discussed in the main text. The "putty" view of the polypeptides is shown, where larger and red means a higher B-factor and smaller and blue means a lower B-factor. The B-factor scale was 20-100 Å².
Supplementary Fig. 9 Packing of PSI in space group P21. Unit cell edges a, b, and c are labelled. The arrangement of PSI trimers in the P21 unit cell is shown in a view onto the a/c plane (top) and in parallel to the a/c plane (bottom). PyMOL 8 was used to generate symmetry mates within the unit cell. The membrane-normal (top) and membrane-parallel (bottom) views of the protein are shown within the unit cell (white lines). Only protein secondary structure is shown for clarity.
Supplementary Fig. 10 Packing of PSI in space group P21 from the structure of PSI from S. sp. PCC 6803 reported previously (PDB ID=5OY0) 11. Unit cell edges a, b, and c are labelled. The arrangement of PSI trimers in the P21 unit cell is shown in a view onto the a/c plane (top) and in parallel to the a/c plane (bottom). PyMOL was used to generate symmetry mates within the unit cell. The membrane-normal (top) and membrane-parallel (bottom) views of the protein are shown within the unit cell (white lines). Only protein secondary structure is shown for clarity.
Supplementary Fig. 11 Example differences between individual monomers of the trimer without symmetry-imposed refinement of the XFEL model. The two images on the top are from "M1", the monomer of the trimer where the core polypeptides are composed of chains A and B in the associated PDB file. The two images on the bottom are from "M2", the monomer of the trimer where the core polypeptides are composed of chains G and H in the associated PDB file. On the left is a lumenal view of two TMH (one from each core polypeptide) that sit below the primary electron donor, a pair of Chl molecules called P700. On the right is a region toward the lumenal edge of PsaA near the monomer-monomer interface. Differences in both density and structure are labelled; however, note that at 2.9 Å resolution, these differences are considered negligible. The 2Fo-Fc map is shown at 1.5σ for all panels.
Supplementary Fig. 12 PSI microcrystal unit cell distribution. Unit cell constants of the PSI microcrystals were determined by indexing EuXFEL SFX data with MOSFLM 12. The red lines show a Gaussian function fit to the unit cell constant distribution, and the corresponding peak value is listed in each sub-panel.
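The source does not say how the Gaussian fits in Supplementary Fig. 12 were computed. As one simple possibility, the sketch below estimates the Gaussian peak position and width directly from a list of indexed unit cell constants using Python's standard library, rather than by least-squares fitting to a histogram. The sample is synthetic, and its centre (100.0 Å) and spread (0.5 Å) are arbitrary placeholders, not measured PSI values.

```python
import random
from statistics import NormalDist


def fit_gaussian(samples):
    """Return (peak position, standard deviation) of a Gaussian estimated from samples.

    For a Gaussian the peak coincides with the mean, so the sample mean
    serves as the peak value reported per unit cell constant.
    """
    dist = NormalDist.from_samples(samples)
    return dist.mean, dist.stdev


# Synthetic stand-in for one unit cell constant from many indexed patterns.
random.seed(1)
synthetic_a = [random.gauss(100.0, 0.5) for _ in range(5000)]  # hypothetical values

peak, sigma = fit_gaussian(synthetic_a)
print(f"Gaussian peak: {peak:.2f} Å, width (sigma): {sigma:.2f} Å")
```

A moment-based estimate like this is sensitive to mis-indexed outliers, so a fit to the binned distribution, as shown in the figure, may be preferable for real data.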
Supplementary Fig. 13 Annealed composite omit map (1σ) of the diffraction data generated in the Phenix software suite 13. 0.5% of the atoms within the asymmetric unit were iteratively omitted and all other options were left at their default settings. In all images, protein is colored cyan, Chl molecules are colored green, β-carotenes are colored orange, and lipids are colored yellow. Nitrogen atoms are additionally colored blue, oxygen atoms are additionally colored red, and magnesium atoms are additionally colored bright green. a and e show the omit map of the "special pair" of chlorophylls, "P700"; b and f show the omit map of a β-carotene molecule; c and g show the omit map of the 4Fe-4S cluster, "FX"; and d and h show the omit map of a phosphatidylglycerol headgroup providing axial coordination to a Chl molecule.
Supplementary Fig. 14 Manual omission of selected ligands from the Fo-Fc electron density map at 1σ of the XFEL- and synchrotron-derived data. In each row, the electron density map is shown after refinement with (left column, white map) and without (right column, pink map) the ligand of interest. For the map where the ligand was omitted, the full model was placed back into the electron density for visual reference. The first row shows the Chl that is the initial electron acceptor in the electron transfer chain, "A0". The second row shows a β-carotene. The third row shows a 4Fe-4S cluster, FX. The fourth row shows a phosphatidylglycerol molecule whose headgroup provides the axial ligand to the central magnesium of a Chl a.