Subnanometer-resolution structure determination in situ by hybrid subtomogram averaging - single particle cryo-EM

Cryo-electron tomography combined with subtomogram averaging (StA) has yielded high-resolution structures of macromolecules in their native context. However, high-resolution StA is not commonplace due to beam-induced sample drift, images with poor signal-to-noise ratios (SNR), challenges in CTF correction, and limited particle number. Here we address these issues by collecting tilt series with a higher electron dose at the zero-degree tilt. Particles of interest are then located within reconstructed tomograms, processed by conventional StA, and then re-extracted from the high-dose images in 2D. Single particle analysis tools are then applied to refine the 2D particle alignment and generate a reconstruction. Use of our hybrid StA (hStA) workflow improved the resolution for tobacco mosaic virus from 7.2 to 4.4 Å and for the ion channel RyR1 in crowded native membranes from 12.9 to 9.1 Å. These resolution gains make hStA a promising approach for other StA projects aimed at achieving subnanometer resolution.

The script below can be used together with the built-in Tilt Series functionality of SerialEM. It modifies the exposure times for the high- and low-dose images and, importantly, overwrites the parameters specified in the SerialEM setup. In case of strong preferred orientation, TargetHighDoseAngle can be modified; in this case the dose-symmetric checkbox should probably be disabled in the Tilt Series Setup of SerialEM.

In this subsection we explain how to export a Dynamo StA project to Relion. For that purpose we will use the second, larger TMV dataset to recreate the results shown in Figures 3E and 3F. The original stacks, the results of the Dynamo project and the alignment files needed for use in Imod can be downloaded from the EMPIAR database using the following link: https://www.ebi.ac.uk/pdbe/emdb/empiar/entry/10393/ The files are the final Dynamo table file (.tbl) and the .st, .tlt, .xf and .defocus files for each tomogram. Below we explain how to use the dyn2rel package and then provide a description of it.

Preprocessing of input tomographic stacks:
Before aligning the tomographic stack, it must be normalized and weighted according to the applied electron dose. In our case, we want to maximize the dynamic range that can be represented in an unsigned 8-bit data type, that is, to have a mean value µ = 128 and a low-dose standard deviation σ_LD = 11. As mentioned in the text of the manuscript, we used doses of e⁻_HD = 15 e⁻/Å² and e⁻_LD = 2 e⁻/Å², so the standard deviation for the high-dose image is σ_HD = 0.37 × σ_LD ≈ 4. This normalization is applied to each tomographic stack using the "newstack" command provided by the Imod package. In our dataset, the high-dose image was the 21st projection in the stack:

newstack -in stack.st -ou stack_norm.st -meansd 128,11
newstack -in stack_norm.st -ou hd_frame.mrc -fr -se 21
newstack -in hd_frame.mrc -ou hd_frame_norm.mrc -meansd 128,4
newstack -in hd_frame_norm.mrc -ou stack_norm.st -fr -rep 21

Determining the defocus and assembling the Imod-style .defocus files:
A simple way to determine the defocus per micrograph is to use the ctfplotter functionality from the Etomo GUI (final aligned stack -> Correct Ctf). Ctfplotter can also be scripted (ref. 27).
Alternatively, it is possible to use Gctf, which is fast and robust and works particularly well for untilted images. We use a Matlab script; please create a folder slices_forGctf and either copy the individual files there or create symbolic links according to the pattern slices_forGctf/tomo_$N_proj_$M.mrc, where N is the running number of the tomogram and M is the number of the projection within that tomogram. Go to the folder and run Gctf from your installation.
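The symbolic-link naming pattern described above can be sketched as follows. This is a minimal illustration in Python, not the Matlab script we used; the function name link_slices_for_gctf and the input dictionary are hypothetical, and the sketch assumes that each projection has already been written to its own MRC file and that N and M are written without zero padding.

```python
from pathlib import Path


def link_slices_for_gctf(projections, out_dir="slices_forGctf"):
    """Create symbolic links named tomo_<N>_proj_<M>.mrc inside out_dir.

    projections: dict mapping (N, M) -> path of an existing single-projection
    MRC file, where N is the tomogram number and M the projection number.
    (Hypothetical helper; the input layout is an assumption.)
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    links = []
    for (n, m), src in sorted(projections.items()):
        link = out / f"tomo_{n}_proj_{m}.mrc"
        # Replace a stale link or file so the function can be re-run safely.
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to(Path(src).resolve())
        links.append(link)
    return links
```

Linking rather than copying avoids duplicating the projection data; Gctf is then run from inside the folder as described above.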
After that, tomographic reconstruction can be performed as usual. However, use only the rigid-body geometry for the reconstruction: do not use local alignments, and do not apply offsets for tomogram positioning. After the tomograms are reconstructed, pick particles and perform subtomogram averaging. The current workflow assumes Dynamo-style output.
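For the normalization step at the start of this section, the factor 0.37 between the high-dose and low-dose standard deviations can be reproduced with a short calculation. The sketch below assumes that, once all images are scaled to the same mean, the standard deviation scales with the inverse square root of the dose (shot-noise-limited images); this assumption and the function name are ours and are given for illustration only.

```python
import math


def sd_target_high_dose(sd_low, dose_low, dose_high):
    """Standard deviation target for the high-dose image.

    Assumption: at equal mean, the standard deviation is proportional to
    1/sqrt(dose), so the high-dose image gets a smaller target value.
    """
    return sd_low * math.sqrt(dose_low / dose_high)


# Values quoted in the text: 2 e-/A^2 (low dose), 15 e-/A^2 (high dose),
# and a low-dose standard deviation of 11 around a mean of 128.
ratio = math.sqrt(2.0 / 15.0)
sd_hd = sd_target_high_dose(11.0, 2.0, 15.0)

print(f"ratio = {ratio:.2f}, sd_hd = {sd_hd:.1f}")
```

Rounded to an integer, the result is the value passed to the second -meansd call (128,4).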

Postprocessing with the hybrid StA script:
You can download the results of our conventional StA processing of the TMV dataset from Figure 3 from EMPIAR. The dataset EMPIAR-10393 contains all the files needed to export a Dynamo project into a Relion one. To simplify file access, we arranged the files of each tomogram inside folders and named them following a simple naming convention. The downloaded folder should have the following scheme (showing the files for the first two tomograms only):

In our case, we used the same name for each .st, .tlt, .ali and .defocus file inside a folder designated for each tomogram. This is not required, but it helps to fill in the required file name information in the dyn2rel.Tomogram class. The three additional files, "sta_rslt.tbl", "mask_b1.mrc" and "create_stack.m", are the resulting table from the Dynamo project, the mask used in that project, and the main exporting script, respectively. The exporting procedure consists of three parts:

1. Setting up tomograms: The information of each tomogram is read into a dyn2rel.Tomogram object, and the objects are stored in an array. In our case we are using the original unbinned stack, with a pixel size of 1.1 Å (recorded in superresolution mode on a K2). We must set up the .st, .xf, .tlt, and .defocus file for each tomogram, along with the full unbinned tomogram size and the pixel size.

2. Setting up the exporter: A dyn2rel.Exporter is created and configured according to our Dynamo project. In our case, the table was the result of processing a once-binned tomogram, so the coordinates and shifts in our table must be multiplied by 2 (exporter.tbl_mul = 2). We also want our Relion project to work with a box size of 200 voxels, also binned once (bin = 1). To accomplish this, we set a cropping size of 400 and a binning of 1 (exporter.out_siz = 400; exporter.out_bin = 1). Finally, we used column 23 of our Dynamo table to define the half-set to which each particle belongs, and we provide this information to the exporter as well (exporter.split_h = 23).

3. Exporting the project: The final step crops the particles from the stacks according to the information in the dyn2rel.Tomogram array and the table 'sta_rslt.tbl', and creates the MRCS stack and the STAR file. In our case, we chose the prefix 'hyb_b1' for the project. This creates the 'hyb_b1.mrcs' and 'hyb_b1.star' files.

This last command was executed on a computing cluster as part of a submission script. In cases of poor alignment of tomograms or of subtomograms to the average, it is possible to try increasing --sigma_ang up to 0.2.
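The relation between the binning of the Dynamo table, the desired Relion box size and the exporter settings can be checked with a few lines of arithmetic. The sketch below is in Python and does not call dyn2rel; the function name export_geometry is hypothetical, and it assumes that a binning level b corresponds to a scale factor of 2^b, which matches the values quoted above (table binned once, multiplier 2; box 200 at bin 1, crop size 400). The binned pixel size it returns is our inference from the 1.1 Å unbinned pixel size.

```python
def export_geometry(table_binning, box_size, out_bin, pixel_size_unbinned):
    """Derive exporter settings from the project geometry.

    Assumption: binning level b means a scale factor of 2**b.
    Returns (tbl_mul, out_siz, binned_pixel_size).
    """
    tbl_mul = 2 ** table_binning                # table coordinates -> unbinned scale
    out_siz = box_size * 2 ** out_bin           # unbinned crop that bins down to box_size
    binned_pixel = pixel_size_unbinned * 2 ** out_bin
    return tbl_mul, out_siz, binned_pixel


# TMV example from the text: table from a once-binned tomogram,
# box of 200 at bin 1, unbinned pixel size of 1.1 A.
tbl_mul, out_siz, binned_pixel = export_geometry(1, 200, 1, 1.1)
print(tbl_mul, out_siz, binned_pixel)
```

For the TMV example this gives tbl_mul = 2 and out_siz = 400, the values set on the exporter in step 2.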
Finally, we exported the particles to cryoSPARC and performed per-particle CTF refinement (https://cryosparc.com/docs/tutorials/ctf-refinement). The resulting defocus values were used for another round of refinement in Relion, which improved the resolution to 4.4 Å.
Note: For the RyR1 dataset we started with a traditional autorefine Relion project; however, in order to achieve higher resolution, we disabled the autorefinement and enabled the "always_cc" flag. The Relion refinement command looks like this:

• dyn2rel.Exporter: Class that exports the Dynamo project into Relion for 2D refinement.
It reads the Dynamo table, projects the coordinates of each particle onto the high-dose projection of the respective stack, crops the corresponding patch, stores it in an MRCS file and writes out a STAR file. The cropping procedure is controlled by the "xcor_sel" property, which sets the fraction of the best particles to be cropped (according to the cross-correlation score, column 10 of a Dynamo table). The location of each particle is calculated by adding the shifts (columns 4, 5 and 6) to the position (columns 24, 25 and 26) and then multiplying the result by the "tbl_mul" property. The projection and defocus values are adjusted using the tomogram information (dyn2rel.Tomogram), and the cropped patch size is set by the "out_siz" property. After the patch is cropped, it can be inverted, according to the "invert" property, and binned, according to the "out_bin" property. Additionally, the "split_h" property can be used to define how the random half-sets are assigned. Here we give a short summary of the class properties, grouped by purpose:
• Particle selection:
  ▪ xcor_sel: fraction of particles to be cropped, according to the cross-correlation value (column 10 in a Dynamo table).
• Table scaling:
  ▪ tbl_mul: multiplication factor applied to the positions and shifts in the table to bring the table scaling to the unbinned tomogram scale.
• Particle patch cropping:
  ▪ out_siz: size of the unbinned cropped patch.
  ▪ out_bin: binning level of the patch. It is applied after the cropping stage, using Dynamo's dbin command.
  ▪ invert: if "true", the values in the patch will be inverted.
• Extra information:
  ▪ split_h: sets how the value of "RandomSubset" will be assigned:
    • split_h < 1: the field will not be set.
    • split_h > 1: the RandomSubset value will be taken from the "split_h"-th column of the Dynamo table. Column 23 of Dynamo tables is typically unassigned.
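The position calculation and the particle selection described above can be illustrated with a short sketch. It is written in Python for illustration and is not the dyn2rel implementation: the function names are hypothetical, a table row is represented as a plain list, and rounding the number of kept particles upwards is our assumption. Column numbers in the text are 1-based, so column 10 is index 9 in the code.

```python
import math


def unbinned_position(row, tbl_mul):
    """Particle position at the unbinned tomogram scale.

    row: one Dynamo table row as a sequence of numbers.
    Adds the shifts (columns 4-6) to the position (columns 24-26),
    then multiplies the result by tbl_mul.
    """
    shifts = row[3:6]    # columns 4, 5, 6
    coords = row[23:26]  # columns 24, 25, 26
    return tuple((c + s) * tbl_mul for c, s in zip(coords, shifts))


def select_best(rows, xcor_sel):
    """Keep the best-scoring fraction of rows by cross correlation (column 10).

    Assumption: the number of kept rows is rounded up.
    """
    n_keep = math.ceil(len(rows) * xcor_sel)
    ranked = sorted(rows, key=lambda r: r[9], reverse=True)
    return ranked[:n_keep]


# Toy row with 26 columns: shifts (1.5, -2.0, 0.5), position (100, 200, 50).
row = [0.0] * 26
row[3:6] = [1.5, -2.0, 0.5]
row[23:26] = [100.0, 200.0, 50.0]
print(unbinned_position(row, tbl_mul=2))
```

With tbl_mul = 2, the toy row maps to (203.0, 396.0, 101.0) at the unbinned scale.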
Finally, the dyn2rel.Exporter class has only one method: exec. This method performs the exporting procedure. It requires three input parameters and produces one output:
• dyn2rel.Exporter.exec inputs:
  • out_pfx: (string) base name of the resulting files for the Relion project. The method creates an out_pfx.star file and an out_pfx.mrcs file.
  • tomo_list: (dyn2rel.Tomogram array) contains the information of the tomograms used in the project. It must be created with the dyn2rel.create_tomos_list function.
  • table: (Dynamo table, or filename) table created as a result of a Dynamo project.