A minority of final stacks yields superior amplitude in single-particle cryo-EM

Zhu, Jianying; Zhang, Qi; Zhang, Hui; Shi, Zuoqiang; Hu, Mingxu; Bao, Chenglong

doi:10.1038/s41467-023-43555-x

Article
Open access
Published: 10 December 2023

Jianying Zhu, Qi Zhang, Hui Zhang, Zuoqiang Shi, Mingxu Hu (ORCID: orcid.org/0000-0003-3603-3966) & Chenglong Bao (ORCID: orcid.org/0000-0002-1201-1212)

Nature Communications volume 14, Article number: 7822 (2023)

Abstract

Cryogenic electron microscopy (cryo-EM) is widely used to determine near-atomic resolution structures of biological macromolecules. Due to the low signal-to-noise ratio, cryo-EM relies on averaging many images. However, a crucial question in the field of cryo-EM remains unanswered: how close can we get to the minimum number of particles required to reach a specific resolution in practice? The absence of an answer to this question has impeded progress in understanding sample behavior and the performance of sample preparation methods. To address this issue, we develop an iterative particle sorting and/or sieving method called CryoSieve. Extensive experiments demonstrate that CryoSieve outperforms other cryo-EM particle sorting algorithms, revealing that most particles are unnecessary in final stacks. The minority of particles remaining in the final stacks yield superior high-resolution amplitude in reconstructed density maps. For some datasets, the size of the finest subset approaches the theoretical limit.

Introduction

The transformative impact of cryo-EM single-particle analysis (SPA) on the field of structural biology has been widely recognized by the scientific community¹. Cryo-EM has advanced significantly due to a series of technological innovations^2,3,4,5,6,7, enabling the technique to provide macromolecular structures with up to atomic resolution at an unprecedented rate. This technological progress is commonly referred to as the resolution revolution⁸. Cryo-EM involves using electron microscopy images of biomolecules embedded in vitreous, glass-like ice⁹, which are then combined to generate three-dimensional density maps. These maps provide valuable insights into the function of macromolecules and their role in biological processes.

The stability and electron-optical performance of electron microscopes do not hinder the use of cryo-EM¹⁰. However, biological samples studied in cryo-EM are radiation-sensitive^11,12. Therefore, a trade-off must be made between improving the signal-to-noise ratio (SNR) and limiting radiation damage^13,14. It was concluded that statistically well-defined three-dimensional (3D) structures could not be obtained from individual biological macromolecules at atomic resolution^15,16. Instead, increasing the SNR by averaging image data from many identical macromolecules is the only way to progress^13,17,18. Over two decades ago, Henderson estimated that structures could be determined at a resolution of nearly 3 Å by merging data from approximately 12,000 particles, even for particles as small as approximately 40 kDa¹⁹. Later, Rosenthal and Henderson argued that the electron microscopy community should adopt the same threshold criterion for structure factor quality as the X-ray protein crystallography community, which was set at a figure-of-merit of 0.5 corresponding to a phase error of 60°¹⁶. The theoretical limit of the minimum number of particle images required to achieve a specific resolution can be calculated using the theory proposed by Henderson and Rosenthal^16,19, given the B-factor of the instrument (e.g., electron microscope and camera)^13,14,20. In practice, the final stacks of cryo-EM still fall far short of the theoretical limit, indicating a considerable gap between what can be accomplished and the physical limit of what cryo-EM can do²¹. The initial particle datasets obtained by particle picking from micrographs undergo multiple rounds of laborious 2D and 3D classification to generate the final stack for model determination. The final stacks, which yield atomic or sub-atomic resolution density maps, typically comprise several orders of magnitude fewer particles than the original datasets.
Therefore, the cryo-EM field faces the long-standing question of how close we can approach the theoretical limit in practice. The lack of an answer to this open question has hindered the quantification of the performance of various underdeveloped sample preparation methods and impeded the investigation of trends and the understanding of the underlying mechanisms of sample behavior. To answer the question of how close cryo-EM can approach its theoretical limit, it is crucial to determine the minimum number of particles required to achieve a high-resolution 3D reconstruction within a given dataset.

In this work, we introduce CryoSieve²², an iterative particle sorting and/or sieving algorithm that identifies the smallest subset of particles necessary to generate high-resolution density maps, which we call the finest subset. CryoSieve compares the high-frequency components of synthetic and observed particle images. A higher CryoSieve score indicates superior quality rather than typical cryo-EM damage or artifacts. Extensive experiments show that CryoSieve outperforms other particle sorting algorithms in various metrics and reveal that most particles in final stacks are futile. The finest subsets generate 3D density maps with better high-resolution amplitude, using far fewer particles than the final stacks. We propose that CryoSieve removes radiation-damaged particles within cryo-EM datasets, supported by experiments on a dataset consisting of particles exposed to various levels of electron dose. Finally, we compare the minimum number of particles required in theory with the size of the finest subsets obtained by CryoSieve, finding that some datasets come close to the theoretical limit after being sieved by CryoSieve. From our experiments, we suggest that advancements during the sample preparation process, aimed at increasing the proportion of the finest subset in the final stack, could potentially facilitate the development of cryo-EM.

Results

Design of CryoSieve

We have developed a particle sorting and/or sieving model called CryoSieve that iteratively performs 3D reconstruction and particle selection, eliminating futile particles during each iteration. A flow chart scheme is provided in Supplementary Fig. 3. In CryoSieve, the relion_reconstruct module of RELION is used to reconstruct a new density map with the retained particle images, which is then used in the subsequent iteration. The retained particle images in each iteration form a subset of those from the previous iteration, as shown in the following formula:

$$\left\{ i_{1}^{(k)},\, i_{2}^{(k)},\, \cdots,\, i_{n^{(k)}}^{(k)} \right\} \subset \left\{ i_{1}^{(k-1)},\, i_{2}^{(k-1)},\, \cdots,\, i_{n^{(k-1)}}^{(k-1)} \right\}, \tag{1}$$

where n^(k−1) represents the number of particles retained after iteration k−1. At each iteration, let b_j be the j-th particle image, A_j be its forward operator defined by the estimated parameters, and x^(k−1) be the density map reconstructed from the particle images retained in the previous iteration. Particles are sieved out based on their CryoSieve score, which is defined as follows:

$$g_{j} := \left\| H^{(k)} b_{j} \right\|_{2}^{2} - \left\| H^{(k)} \left( b_{j} - A_{j} x^{(k-1)} \right) \right\|_{2}^{2}, \quad j \in \left\{ i_{1}^{(k-1)},\, i_{2}^{(k-1)},\, \cdots,\, i_{n^{(k-1)}}^{(k-1)} \right\}. \tag{2}$$

Here, H^(k) is the highpass operator at the k-th iteration, and its threshold frequency increases linearly as the iteration progresses (Supplementary Table 4). Given that g_j relies on the accurate amplitude of the reconstructed density map x^(k−1), CryoSPARC is not the optimal choice for reconstruction in the particle selection step (Supplementary Fig. 2), because its reconstructions tend to deviate significantly from the true amplitude (Supplementary Fig. 2c). Furthermore, the amplitude information within the CryoSieve score proves vital, and the phase residual is ineffective as a metric for particle selection (Supplementary Fig. 4).

The CryoSieve score estimates the similarity between a particle and a reference projection above a given frequency. A higher CryoSieve score indicates that the particle and the reference projection share a higher proportion of signal energy, indicating better particle quality. As radiation damage mainly affects the high-frequency range, the CryoSieve score includes a highpass operator to extract the high-frequency part. We have demonstrated that the CryoSieve score can identify particles with incorrect pose parameters or components in the high-frequency range through theoretical analysis and simulation verification (Supplementary Material I and III). Specifically, assuming that noise in particles follows a Gaussian distribution, we have shown that, with high probability, the CryoSieve score is an ideal indicator of particle image quality, distinguishing it from typical cryo-EM damage or artifacts (Supplementary Material I). Furthermore, the CryoSieve score exhibits remarkable accuracy in removing particles with incorrect pose and CTF parameter estimations, achieving a high accuracy of over 90% (Supplementary Material III).
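The score in Eq. 2 can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the hard radial Fourier mask used as the highpass operator, and the use of a precomputed reference projection in place of A_j x^(k−1) are all assumptions made for illustration.

```python
import numpy as np

def highpass_mask(box, pixel_size, cutoff_angstrom):
    """Boolean Fourier mask keeping spatial frequencies at or above 1/cutoff.

    Assumption: the highpass operator H is a hard radial mask.
    """
    freq = np.fft.fftfreq(box, d=pixel_size)  # cycles per angstrom
    fx, fy = np.meshgrid(freq, freq, indexing="ij")
    return np.sqrt(fx**2 + fy**2) >= 1.0 / cutoff_angstrom

def cryosieve_score(particle, projection, mask):
    """Evaluate g = ||H b||^2 - ||H (b - A x)||^2 in Fourier space.

    `projection` stands in for A_j x^(k-1), i.e. the reference projection
    already computed with the particle's pose and CTF (hypothetical input).
    """
    # The orthonormal FFT preserves energy, so Fourier-space sums equal
    # the squared 2-norms of the filtered images.
    fb = np.fft.fft2(particle, norm="ortho")
    fp = np.fft.fft2(projection, norm="ortho")
    energy_particle = np.sum(np.abs(fb[mask]) ** 2)
    energy_residual = np.sum(np.abs((fb - fp)[mask]) ** 2)
    return float(energy_particle - energy_residual)

# Toy comparison: a particle that contains the reference signal versus one
# that contains only noise (synthetic data, for illustration only).
rng = np.random.default_rng(0)
mask = highpass_mask(box=32, pixel_size=1.0, cutoff_angstrom=8.0)
reference = rng.standard_normal((32, 32))
noise = rng.standard_normal((32, 32))
score_with_signal = cryosieve_score(reference + noise, reference, mask)
score_noise_only = cryosieve_score(noise, reference, mask)
print(score_with_signal > score_noise_only)
```

Expanding the two norms shows that the score equals 2⟨Hb, HAx⟩ − ‖HAx‖², so a particle sharing more high-frequency signal energy with its reference projection receives a higher score, in line with the interpretation given above.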

Majority of the particles are futile in final stacks

We demonstrate the versatility of our method by applying it to eight experimental datasets (Table 1). The first dataset is derived from the human TRPA1 ion channel (EMPIAR-10024)²³. The second dataset is from influenza hemagglutinin trimer (EMPIAR-10097)²⁴, whose preferred orientation necessitated 40° tilts during data acquisition. The third dataset involves LAT1-CD98hc bound to MEM-108 Fab (EMPIAR-10264)²⁵, while the fourth features membrane-bound pfCRT complexed with Fab (EMPIAR-10330)²⁶. Both of these datasets utilized signal subtraction during data processing. The fifth dataset is from CS-17 Fab-bound TSHR-Gs (EMPIAR-11120)²⁷. The sixth is from TRPM8 bound to calcium (EMPIAR-11233)²⁸. The seventh dataset is derived from human apoferritin (EMPIAR-10200)²⁹, which can be resolved to better than 2 Å. The eighth dataset originates from streptavidin (EMPIAR-10269)³⁰, with a molecular weight of only 52 kDa. All datasets were obtained using a voltage of 300 kV and an amplitude contrast of 0.07 or 0.1. The TEM systems and electron detectors used in the experiments are listed in Table 1, along with additional metadata such as the number of particles in the final stacks, spherical aberration, symmetry and molecular weight.

Table 1 Microscopic imaging parameters of eight experimental datasets along with their associated metadata

All of the datasets are deposited in the Electron Microscopy Public Image Archive (EMPIAR)³¹ as final stacks. These final stacks, which also contain the corresponding refined Euler angles, were used to generate the final published reconstructions. The final stacks are generated by manually selecting significantly smaller subsets through multiple rounds of 2D/3D classification, resulting in a substantially reduced number of particles compared to the original particle stacks.

We employed CryoSieve to process the eight experimental datasets. CryoSieve removed 20% of the particles in each iteration, resulting in retaining ratios of 80.0%, 64.0%, 51.2%, and so on over successive iterations. The highpass cutoff frequency of CryoSieve increases linearly across iterations. The retained particles in different iterations were then used for ab initio reconstruction to determine the finest subset of particles. The finest subset only contained 21.0% to 32.8% of the particles in the final stack. However, the quality of the reconstructed map from the finest subset was consistent with that obtained from all particles in the final stack, as demonstrated in Fig. 1. For some datasets, the density maps showed a certain degree of improvement, which was visualized by the restoration of some previously blurred or missing side chains in the density map (Supplementary Fig. 8). The results demonstrate that CryoSieve is proficient in discarding more than half of the particles using the CryoSieve score, a metric reflecting the discrepancy between the particle image and its reference projection. Crucially, this process does not compromise the quality of the final reconstruction. Moreover, the pose distribution of the removed particles was similar to that of all particles in the final stacks (Supplementary Fig. 6). Therefore, CryoSieve is highly effective in selecting the most informative particles.
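The retaining ratios quoted above follow from keeping 80% of the particles at every iteration. A short sketch of the arithmetic, assuming the nine iterations stated in the Methods (the function name is hypothetical):

```python
def retention_schedule(n_iterations, keep_fraction=0.8):
    """Cumulative fraction of the final stack retained after each iteration."""
    return [keep_fraction ** k for k in range(1, n_iterations + 1)]

# Nine iterations with 80% kept per iteration, as described in the Methods.
for k, ratio in enumerate(retention_schedule(9), start=1):
    print(f"iteration {k}: {100 * ratio:.1f}% retained")
```

Under this schedule the reported finest-subset sizes of 21.0% to 32.8% correspond to stopping at the fifth to seventh iteration.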

**Fig. 1: CryoSieve is capable of maintaining resolutions after removing the majority of particles in the final stacks.**

We performed a comparative analysis of CryoSieve with other cryo-EM particle sorting criteria or software currently used in the field, including the normalized cross-correlation (NCC) method³², the angular graph consistency (AGC) approach³³ and the non-alignment classification⁶. The parameter settings for CryoSieve and the other comparative algorithms are listed in Supplementary Material VI. In our experiments, we used final stacks composed of relatively high-quality particles. NCC retains the same number of particles as CryoSieve at each iteration, while AGC’s retaining ratio is self-determined. However, AGC’s retaining ratio was mainly over 90%, resulting in only a small fraction of particles being removed. Thus, the quality of the reconstructed map using the retained particles neither improved nor worsened (Supplementary Table 2), as these tested final stacks are composed of relatively high-quality particles. For the non-alignment classification applied to hemagglutinin, LAT1, and apoferritin, less than half of the particles were removed, resulting in some enhancement (Supplementary Material V). However, this enhancement still falls notably short of the results achieved by CryoSieve (Supplementary Material V). For the other five datasets, the retaining ratios using non-alignment classification exceeded 90%, and the quality of maps reconstructed from the retained particles either remained unchanged or deteriorated (Supplementary Material V). Additionally, we randomly selected the same number of particles from the tested final stacks at each iteration to observe the baseline effect of particle number reduction.

For all the aforementioned methods (CryoSieve, NCC, AGC, non-alignment classification, and random), we discarded the refined Euler angles published and deposited on EMPIAR to prevent the inadvertent transfer of information from the removed particles to the retained particles. Thus, the retained particles were used for ab initio reconstruction by CryoSPARC to obtain refreshed sets of Euler angles and density maps. Several metrics, including FSC-based resolution¹⁶, Q-score³⁴ and Rosenthal-Henderson B-factor¹⁶ were used to measure the quality of the refreshed density maps. Based on these metrics, our analyses reveal that CryoSieve effectively sieves out 67.2% to 79.0% (varying based on datasets) of particles from the final stacks without deteriorating the yielded density maps (Fig. 2). In contrast, subsets of equal size retained by the other methods failed to reconstruct density maps of the same quality as the original (Fig. 2). Therefore, CryoSieve significantly outperforms other particle sorting algorithms, demonstrating that the majority of particles are dispensable in the final stacks. A key factor in CryoSieve’s superiority over NCC, AGC and non-alignment classification is the integration of the highpass operator when computing the CryoSieve score. Without the highpass operator to suppress low frequencies, scores may be predominantly influenced by low-frequency components, making it challenging to differentiate non-contributory particles in cryo-EM.

**Fig. 2: CryoSieve outperformed other algorithms in terms of FSC-based resolutions, Q-scores and Rosenthal-Henderson B-factors.**

CisTEM⁵ can report a score for each single-particle image after 3D refinement. During the 3D refinement process of cisTEM, the pose parameters of particles are re-estimated or refined. Therefore, due to differences in alignment and other image processing workflows between cisTEM and CryoSPARC, cisTEM cannot be strictly compared with CryoSieve. We compared CryoSieve and cisTEM by sorting particles using the cisTEM score and retaining equal particle counts for ab initio reconstruction in CryoSPARC (details in Supplementary Material II). CryoSieve outperformed cisTEM in all eight experimental datasets (Fig. 2).

We analyzed the differences between the particle images retained and removed using CryoSieve by performing 2D classification of the particles into 50 classes using CryoSPARC. To ensure a comparable number of particles for both retained and removed groups, we ran CryoSieve and terminated at the third iteration, yielding a retention ratio of 51.2% and a removal ratio of 48.8%. CryoSPARC reported the 2D resolution of each class, along with the number of particle images belonging to it. The particles retained by CryoSieve (Fig. 3, steel blue) were distributed at a higher resolution compared to those removed by it (Fig. 3, crimson). In six out of the eight datasets, particle images with the highest resolution, i.e., 7.4–7.1 Å in TRPA1, 8.5–9.6 Å in hemagglutinin, 6.6–8.2 Å in LAT1, 7.2–11.6 Å in pfCRT, 7.2–8.5 Å in TSHR-Gs, and 11.6–7.5 Å in TRPM8, were entirely retained by CryoSieve. For apoferritin, the majority of particles within the highest resolution range (5.5–5.3 Å) were particles retained by CryoSieve. However, for streptavidin, possibly due to the adoption of a phase plate during data collection, unusually high resolutions were reported in the 2D classification step, rendering such a comparison between retained and removed particles ineffective. In conclusion, our analysis suggests that CryoSieve selectively retained the higher-quality particle images in the final stack while discarding lower-quality ones. It is noted that some information remains in these discarded particles, but it does not enhance the information present in the finest subset (Supplementary Material VII).

**Fig. 3: The two-dimensional resolution distribution between retained and removed particles was compared.**

Better high-resolution amplitude with much fewer particles

B-factors, also known as Debye-Waller factors or temperature factors, reflect the rate at which the amplitude of high-resolution information decreases¹⁶. Lower B-factors indicate that the high-resolution signal has been better preserved during sample preparation, imaging, and image processing, implying that the particle images are of higher quality. B-factors are widely used to measure image quality in cryo-EM quantitatively^35,36,37,38,39. In our eight experimental datasets, the finest subset, consisting of only 21.0% to 32.8% of particles in the final stack, generates 3D density maps with Rosenthal and Henderson’s B-factors reduced by 21.1 Å² to 169.0 Å², in comparison to those produced by the original final stacks (Table 2, columns D and E). The process of fitting and solving for Rosenthal and Henderson’s B-factors is visualized in Supplementary Fig. 5. Moreover, the B-factors determined by CryoSPARC are presented in Table 2, columns B and C. In other words, the density maps reconstructed from the finest subset have a better high-resolution amplitude, meaning they contain a greater high-resolution intensity, despite the fact that the finest subsets only contain a small fraction of particles in the final stack. This indicates that CryoSieve significantly reduced the temperature factor and alleviated the amplitude contrast decay, suggesting that high-quality particles contribute to the density map and can be effectively selected by CryoSieve.

Table 2 The finest subsets alleviate high-resolution amplitude decay, along with a comparison to their theoretical number of particle limit

CryoSieve can effectively detect radiation-damaged particles

We hypothesize that some particle images in the final stacks have been subject to some degree of radiation damage and cannot be screened out by conventional methods. These particles do not contribute positively to the reconstructed density map. To verify the possibility of this conjecture, we acquired micrograph movie stacks of the proteasome using a Titan Krios 300 keV cryo-EM equipped with a K3 direct electron detection camera. The defocus range was set between 0.5 μm and 1.5 μm. Each stack comprised 32 frames with a total electron dose of 50 e⁻Å⁻². The electron dose was uniformly distributed across all frames. Particles were picked from identical positions using averages from frames 5–14, 10–19, 15–24, and 20–29. Consequently, we constructed a dataset consisting of 183,464 particles that represented four different levels of absorbed electron doses (Fig. 4a).

**Fig. 4: CryoSieve prioritizes the removal of radiation-damaged particles.**

We assessed the retention behavior of CryoSieve, NCC, cisTEM, AGC, and non-alignment classification in particles subjected to varying radiation damage levels, using random retention as a comparative baseline. As the number of iterations increased, the retention rate diminished. Notably, CryoSieve demonstrated enhanced proficiency in identifying particles with elevated radiation damage levels relative to NCC and cisTEM (Fig. 4b). The retention ratio for cisTEM was equated to each iteration of CryoSieve (Supplementary Material II). For AGC and non-alignment classification, the retention ratio was autonomously determined. We simultaneously compared the distribution of particles across the four radiation damage levels, selecting the sixth iteration (with a retention ratio of 26.2%) of CryoSieve, NCC, and cisTEM for this analysis. The analysis also incorporated particles retained by the AGC and non-alignment classification methods, with retention ratios auto-determined for these methods (Fig. 4c). Model-to-map FSCs (Fig. 4d) and a thorough comparison of density maps (Fig. 4e) affirmed CryoSieve’s superiority over the other methods. Retained particles, utilizing the cisTEM score as a selection criterion, exhibited a preferred orientation, resulting in diminished quality (Supplementary Fig. 7).

Because the approach of grouping frames from micrograph movie stacks cannot exclude other potential complications that particles might endure, such as erroneous poses, CTF parameters, and denaturation, we sought additional validation. To this end, we employed InSilicoTEM to generate synthesized particles exhibiting varying simulated radiation damage. With these simulated radiation-damage datasets, CryoSieve consistently outperformed all other methods. Notably, in the final iterations, CryoSieve exclusively retained particles unaffected by radiation damage (Supplementary Material VIII).

It is worth noting that CryoSieve can efficiently remove particles with incorrect pose and CTF parameter estimations, achieving a high accuracy of over 90% (Supplementary Material III). However, these particles are also removed by the non-alignment classification approach (Supplementary Material III), making them unlikely to be present in the final stacks.

The finest subsets may be close to the theoretical number of particles limit

The theoretical number limit of particle images, given by Rosenthal and Henderson¹⁶, is

$$N_{\mathrm{particles}} = \frac{1}{N_{\mathrm{asym}}} \, \frac{\frac{S^{2}}{N^{2}} \, 30\pi}{N_{e} \sigma_{e} d} \exp\left( \frac{B}{2 d^{2}} \right), \tag{3}$$

where N_asym, $\frac{S}{N}$, N_e, σ_e, d, B stand for the number of asymmetric units, the signal-to-noise threshold criterion of the resolution, the electron dose, the elastic cross-section for carbon, the resolution, and the overall temperature factor, respectively. In the above formula, $\frac{S}{N}=\frac{1}{\sqrt{3}}$, which is equivalent to a phase error of 60° or the 0.143 threshold of the half-maps FSC¹⁶. Meanwhile, N_e = 5 e⁻Å⁻², which is believed to be the limiting dose due to radiation damage for features at near-atomic resolution^16,19,40,41. The electron dose used in practice is typically a fold higher than the limiting dose. Although the additional dose does not contribute to the structure factor amplitudes at near-atomic resolution, it may have increased the signal up to the resolution limit of the final map, thus making the determination of particle parameters easier¹⁶. This conjecture agrees with the observation in the study of micrograph movie stack dose weighting, which found that only the initial few frames, not the subsequent frames, contribute to near-atomic features^42,43,44. Finally, σ_e = 0.004 Å² is the elastic cross-section for carbon at 300 kV⁴⁵.
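As a worked illustration, Eq. 3 can be evaluated directly with the constants quoted above. The function and parameter names below are hypothetical, and the 3 Å target resolution and B = 50 Å² in the example call are illustrative inputs rather than values tied to a particular dataset.

```python
import math

def theoretical_particle_count(resolution, b_factor, n_asym=1,
                               snr=1.0 / math.sqrt(3.0),
                               dose=5.0, cross_section=0.004):
    """Evaluate Eq. 3.

    resolution    : d, in angstrom
    b_factor      : B, in square angstrom
    n_asym        : number of asymmetric units
    snr           : S/N threshold (1/sqrt(3) in the text)
    dose          : N_e, in electrons per square angstrom
    cross_section : sigma_e, in square angstrom (carbon at 300 kV)
    """
    prefactor = (snr ** 2) * 30.0 * math.pi / (dose * cross_section * resolution)
    return prefactor * math.exp(b_factor / (2.0 * resolution ** 2)) / n_asym

# Illustrative call: asymmetric particle, 3 angstrom target, B = 50 square angstrom.
print(round(theoretical_particle_count(resolution=3.0, b_factor=50.0)))
```

With these inputs the estimate is on the order of eight thousand particles. Because the B-factor enters through the exponential, small changes in B or in the target resolution change the required particle number sharply, and symmetry reduces the count by the factor N_asym.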

The overall temperature factor, or Rosenthal and Henderson’s B-factor, is the dominant factor in estimating the theoretical limit. Here, we adopt the simplifying assumption that limits arise only from the instruments (TEM and electron detector) and that no other resolution-limiting factors exist. In other words, we assume that all other procedures and techniques are ideal: for example, the vitrified ice is perfectly flat and of ideal thickness, there is no beam-induced motion, the orientations of particles follow a uniform distribution, and there is no electron-charging effect. Under this assumption, the B-factor summarizes all resolution-limiting factors of a given electron microscope and describes the overall quality of the instrumental setup. Holger Stark and his colleagues have summarized the current knowledge on existing state-of-the-art commercial EM hardware and their B-factors⁴⁶. For the standard Titan Krios, they concluded that its B-factor is 50 Å², which was determined by re-evaluating data from EMPIAR-10216 as described previously⁴⁷, with modifications to account for off-axial aberrations by splitting the micrographs into nine subsets⁴⁸. Therefore, we computed the theoretical number of particle limits at B = 50 Å² (Table 2, column D). The sizes of the finest subsets obtained by CryoSieve were compared with such theoretical limits (Table 2, column E).

Out of the eight datasets examined, three (pfCRT, TSHR-Gs and apoferritin) were found to be close to their theoretical limits (Table 2, column E, emphasized by bold font). However, the TRPA1 dataset fell short of the theoretical limit by approximately 22-fold. This could be due to the lower resolution capabilities of the TF30 Polara TEM used in the study compared to more advanced models like the Titan Krios. It is possible that the assumed B-factor of 50 Å² for the TF30 Polara is relatively low and does not accurately reflect the properties of the TEM. Moreover, the sample preparation techniques employed during the TRPA1 study in 2015 might not have been fully optimized to attain the highest possible resolution. Hemagglutinin also fell short of the theoretical limit by roughly a factor of 36 due to using a tilt-collection strategy to compensate for the preferred orientation, which resulted in a larger effective ice thickness and a degradation in the quality of particle images. Lastly, LAT1 and TRPM8 exceeded the theoretical limit by factors of 9.8 and 6.3, respectively, suggesting that improvements in sample preparation could be made for these datasets.

Discussion

In this study, we introduced the CryoSieve algorithm, which has the ability to estimate the minimum number of particles in a dataset, referred to as the finest subset. CryoSieve demonstrated that most particles in the final stacks are superfluous and do not contribute to reconstructing density maps. On the other hand, the minority of particles that remain in the final stacks yields superior high-resolution amplitude. We also discovered that for some datasets, the size of the finest subset comes close to the theoretical limit. Therefore, CryoSieve can, to some degree, provide insight into a long-standing question in the cryo-EM field: How close can we approach the theoretical limit in practice?

CryoSieve can potentially establish a metric for the quantitative evaluation of various sample preparation techniques by measuring image quality based on the gaps between the theoretical limits and the size of the finest subsets. One of the possible future directions is to address the variables encountered during sample and grid preparation and establish cause-and-effect relationships. With these issues, among others, resolved, cryo-EM could become a more versatile and influential technology in structural biology, potentially addressing research questions and aiding the growth of methodologies as the field advances⁴⁹.

Methods

Details of comparing the performance of particle sorting algorithms

Since cryo-EM single-particle image processing software has experienced rapid development in the past few years, some of the final stacks deposited in EMPIAR can be better processed by state-of-the-art algorithms. To eliminate effects from different refinement software and their versions, ensuring fair comparisons between various particle sorting algorithms, the final stacks deposited on EMPIAR were reprocessed under a standard workflow using CryoSPARC v4.1.0. For hemagglutinin, the initial model was generated by low-pass filtering its atomic model to 30 Å, while for the other proteins, initial models were generated by arbitrary random initialization using CryoSPARC. Then, uniform refinement was applied for TRPA1, TRPM8, hemagglutinin, LAT1, and apoferritin, while non-uniform refinement was applied for pfCRT and TSHR-Gs. For streptavidin, we employed local refinement because ab initio reconstruction failed to produce a density map, potentially due to the use of a phase plate in the streptavidin dataset.

To enable unbiased comparisons of density maps before and after particle sorting, the retained particles obtained from each particle sorting algorithm underwent identical refinement procedures, as previously described, using CryoSPARC v4.1.0 in the standard workflow. The reconstructed density maps were used for subsequent measurements. To ensure that there is no undue influence of information from the discarded particles via their contribution to pose estimation, the former Euler angles were discarded (except for streptavidin), and new sets of Euler angles were determined through the refinement of the retained particles. Moreover, in order to maintain independence between the two half sets and ensure that the Fourier Shell Correlation (FSC) served as the gold standard, half-set splits were preserved throughout the subsequent procedure by turning off the option “Force re-do GS split”.

The reconstructed density maps were evaluated by several metrics, including FSC-based resolution, Q-score and Rosenthal-Henderson B-factor. CryoSPARC produced two raw half maps and an auto-postprocessed density map (FSC-weighted, B-factor sharpened, two half sets averaged), accompanied by reporting half-maps FSC.

FSC-based metrics include the half-maps FSC (directly reported by CryoSPARC) and the model-to-map FSC. The model-to-map FSC resolution was calculated using the following procedure, with the auto-postprocessed density map as input. The corresponding atomic model of the dataset was converted to the ground-truth density map by the molmap function of Chimera at Nyquist resolution. The mask was generated from the ground-truth density map (after low-pass filtering to 8 Å, extending by 4 pixels and applying a cosine-edge of 4 pixels) using RELION. Model-to-map FSC curves were determined between the input density map (masked) and the ground-truth density map. The resolution threshold of the model-to-map FSC was set to 0.5.
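For readers unfamiliar with the metric, the following is a generic sketch of a Fourier shell correlation between two cubic volumes and of reading off a resolution at a fixed threshold. It is a textbook-style calculation, not the CryoSPARC, RELION or Chimera code used in this work; the function names and the integer shell binning are assumptions.

```python
import numpy as np

def fsc_curve(vol_a, vol_b):
    """Fourier shell correlation between two cubic volumes of equal size."""
    n = vol_a.shape[0]
    fa = np.fft.fftn(vol_a)
    fb = np.fft.fftn(vol_b)
    idx1d = np.fft.fftfreq(n) * n  # signed integer frequency indices
    kx, ky, kz = np.meshgrid(idx1d, idx1d, idx1d, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)
    n_shells = n // 2
    keep = shell < n_shells  # ignore the corners beyond Nyquist
    idx = shell[keep]
    cross = np.bincount(idx, weights=(fa * np.conj(fb)).real[keep], minlength=n_shells)
    power_a = np.bincount(idx, weights=(np.abs(fa) ** 2)[keep], minlength=n_shells)
    power_b = np.bincount(idx, weights=(np.abs(fb) ** 2)[keep], minlength=n_shells)
    denom = np.sqrt(power_a * power_b)
    return np.divide(cross, denom, out=np.zeros(n_shells), where=denom > 0)

def resolution_at_threshold(fsc, threshold, box, pixel_size):
    """Resolution (angstrom) of the first shell whose FSC falls below threshold."""
    below = np.nonzero(np.asarray(fsc) < threshold)[0]
    if below.size == 0:
        return 2.0 * pixel_size  # never drops below threshold: report Nyquist
    if below[0] == 0:
        return float("inf")
    return box * pixel_size / below[0]

# Toy usage on synthetic volumes (illustration only).
rng = np.random.default_rng(0)
vol = rng.standard_normal((16, 16, 16))
curve = fsc_curve(vol, vol + 0.5 * rng.standard_normal((16, 16, 16)))
print(resolution_at_threshold(curve, 0.5, box=16, pixel_size=1.0))
```

In the procedure described above, one volume would be the masked auto-postprocessed map and the other the model-derived ground-truth map, with the threshold set to 0.5.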

As Q-score is sensitive to B-factor sharpening, the Q-scores of both the raw maps and the auto-postprocessed maps were measured. The auto-postprocessed maps were directly provided by CryoSPARC, while the raw maps were obtained by first averaging the two raw half maps provided by CryoSPARC and then low-pass filtering the average to an appropriate resolution, in order to eliminate the impact of varying noise intensities on the density maps. The low-pass filtering threshold frequency ranged from 0.3 Å to 0.5 Å higher than the CryoSPARC reported half-maps FSC resolution, thus ensuring the retention of useful signals. Specifically, the threshold frequency was 3.5 Å for TRPA1, 2.7 Å for TRPM8 and TSHR-Gs, 3.4 Å for hemagglutinin, 3.0 Å for pfCRT, 1.6 Å for apoferritin, 2.8 Å for streptavidin, and 2.8 Å for LAT1. Q-score was calculated using the MAPQ plugin for UCSF Chimera, with all parameters set to their default values.

Rosenthal-Henderson B-factors were determined by fitting the formula that describes the relationship between resolution and the number of particles used for reconstruction. Five half-splitting repetitions were adopted for each dataset. After each repetition, the Euler angles were re-estimated by CryoSPARC, and the reported resolution was used for data fitting.
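The fit can be sketched as follows, assuming the commonly used linear form of the Rosenthal-Henderson relation, in which 1/d² is regressed on ln N and the slope equals 2/B. The function name `rosenthal_henderson_b` and the input values in the usage note are illustrative only, and the exact fitting procedure used for the reported values is not specified here.

```python
import numpy as np

def rosenthal_henderson_b(n_particles, resolutions):
    """Estimate a B-factor (Å^2) from particle counts and resolutions (Å).

    Assumes 1/d^2 = (2/B) * ln(N) + c, so B = 2 / slope of the linear fit.
    """
    x = np.log(np.asarray(n_particles, dtype=float))
    y = 1.0 / np.asarray(resolutions, dtype=float) ** 2
    slope, _intercept = np.polyfit(x, y, 1)
    return 2.0 / slope
```

For example, passing particle counts from successive half-splits (N, N/2, N/4, ...) together with the resolution reported after each re-refinement returns a single B-factor estimate for the dataset.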

All conversions between CryoSPARC and RELION were performed using the pyem script.

CryoSieve’s parameters

CryoSieve iteratively performs 3D reconstruction and particle sieving, while maintaining independence between the two half sets by sieving each set of particles independently. 3D reconstructions of each subset were performed using RELION v4.0-beta-2, with the option “--subset” to preserve the half-set splitting. A mask, generated from the atomic model using RELION (low-pass filtered to 8 Å), was applied to the reconstructed raw density map to obtain x^(k−1) in Eq. 2 of the CryoSieve score. The same mask was applied to other particle sorting algorithms, such as NCC and AGC, to ensure fair comparisons. Subsequently, particles were sieved out based on the ascending order of the CryoSieve score. In total, nine iterations were carried out, with each iteration retaining 80% of the particles from the previous iteration. The cutoff frequency of the highpass operator H^(k) increased linearly as the iterations progressed. For all datasets except LAT1 and apoferritin, the initial cutoff frequency was set at 40 Å, and the final cutoff frequency was 3 Å. For LAT1, the initial cutoff frequency was 50 Å, and the final cutoff frequency was also 3 Å. For apoferritin, the initial cutoff frequency was 40 Å, and the final cutoff frequency was 2 Å.
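To make the schedule concrete, the sketch below tabulates the cumulative fraction of particles retained and the highpass cutoff for each iteration. It assumes that the cutoff is interpolated linearly in spatial frequency (1/Å) and that the first and last iterations use the initial and final cutoffs, respectively; both are assumptions of this illustration, and `sieve_schedule` is a hypothetical helper rather than part of CryoSieve.

```python
def sieve_schedule(n_iters=9, retain=0.8, start_res=40.0, end_res=3.0):
    """Return (iteration, cutoff in Å, cumulative fraction retained) tuples.

    Assumption: the cutoff is linear in spatial frequency, running from
    1/start_res at the first iteration to 1/end_res at the last.
    """
    schedule = []
    fraction = 1.0
    for k in range(n_iters):
        t = k / (n_iters - 1) if n_iters > 1 else 0.0
        freq = (1.0 - t) / start_res + t / end_res   # cutoff in 1/Å
        fraction *= retain                           # 80% of the previous set
        schedule.append((k + 1, 1.0 / freq, fraction))
    return schedule

for iteration, cutoff, fraction in sieve_schedule():
    print(f"iter {iteration}: cutoff {cutoff:.2f} Å, retained {fraction:.1%}")
```

After nine iterations at 80% retention, the cumulative fraction is 0.8⁹, or about 13.4% of the original final stack.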

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The raw final stack datasets analyzed in this study were downloaded from the EMPIAR repository using accession codes EMPIAR-10024, EMPIAR-11233, EMPIAR-10097, EMPIAR-11120, EMPIAR-10264, EMPIAR-10330, EMPIAR-10269, EMPIAR-10200. Atomic coordinates from Protein Data Bank 6PCQ were used for the generation of simulated particles using InSilicoTEM v2.1.0. Source data are provided with this paper.

Code availability

CryoSieve²² is now open-sourced and available on GitHub [https://github.com/mxhulab/cryosieve]. A detailed tutorial can also be found on its homepage. Moreover, datasets used in this manuscript, along with the expected outputs after running CryoSieve, have been deposited on GitHub and can be accessed via CryoSieve’s homepage. Code has been uploaded to Zenodo and can be accessed via [https://doi.org/10.5281/zenodo.10040463].

References

Nogales, E. The development of cryo-EM into a mainstream structural biology technique. Nat. Methods 13, 24–27 (2016).
Bai, X., Fernandez, I. S., McMullan, G. & Scheres, S. H. Ribosome structures to near-atomic resolution from thirty thousand cryo-EM particles. eLife 2, e00461 (2013).
Campbell, M. G. et al. Movies of ice-embedded particles enhance resolution in electron cryo-microscopy. Structure 20, 1823–1828 (2012).
Li, X. et al. Electron counting and beam-induced motion correction enable near-atomic-resolution single-particle cryo-EM. Nat. Methods 10, 584–590 (2013).
Grant, T., Rohou, A. & Grigorieff, N. cisTEM, user-friendly software for single-particle image processing. eLife 7, e35383 (2018).
Scheres, S. H. W. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J. Struct. Biol. 180, 519–530 (2012).
Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296 (2017).
Kühlbrandt, W. The resolution revolution. Science 343, 1443–1444 (2014).
Dubochet, J. & McDowall, A. W. Vitrification of pure water for electron microscopy. J. Microsc. 124, RP3–RP4 (1981).
Glaeser, R. M. How good can single-particle cryo-EM become? What remains before it approaches its physical limits? Annu. Rev. Biophys. 48, 45–61 (2019).
Baker, L. A., Smith, E. A., Bueler, S. A. & Rubinstein, J. L. The resolution dependence of optimal exposures in liquid nitrogen temperature electron cryomicroscopy of catalase crystals. J. Struct. Biol. 169, 431–437 (2010).
Bammes, B. E., Jakana, J., Schmid, M. F. & Chiu, W. Radiation damage effects at four specimen temperatures from 4 to 100K. J. Struct. Biol. 169, 331–341 (2010).
Glaeser, R. M. Limitations to significant information in biological electron microscopy as a result of radiation damage. J. Ultrastruct. Res. 36, 466–482 (1971).
Glaeser, R. M. Retrospective: Radiation damage and its associated ‘Information Limitations’. J. Struct. Biol. 163, 271–276 (2008).
Breedlove, J. R. & Trammell, G. T. Molecular microscopy: fundamental limitations. Science 170, 1310–1313 (1970).
Rosenthal, P. B. & Henderson, R. Optimal determination of particle orientation, absolute hand, and contrast loss in single-particle electron cryomicroscopy. J. Mol. Biol. 333, 721–745 (2003).
Kuo, I. A. M. & Glaeser, R. M. Development of methodology for low exposure, high resolution electron microscopy of biological specimens. Ultramicroscopy 1, 53–66 (1975).
Unwin, P. N. T. & Henderson, R. Molecular structure determination by electron microscopy of unstained crystalline specimens. J. Mol. Biol. 94, 425–440 (1975).
Henderson, R. The potential and limitations of neutrons, electrons and X-rays for atomic resolution microscopy of unstained biological molecules. Q. Rev. Biophys. 28, 171–193 (1995).
Glaeser, R. M. Review: Electron crystallography: present excitement, a nod to the past, anticipating the future. J. Struct. Biol. 128, 3–14 (1999).
Glaeser, R. M. How good can cryo-EM become? Nat. Methods 13, 28–32 (2016).
Zhu, J. et al. A minority of final stacks yields superior amplitude in single-particle cryo-EM, CryoSieve 1.2.2. Zenodo https://doi.org/10.5281/zenodo.10040463 (2023).
Paulsen, C. E., Armache, J.-P., Gao, Y., Cheng, Y. & Julius, D. Structure of the TRPA1 ion channel suggests regulatory mechanisms. Nature 520, 511–517 (2015).
Tan, Y. Z. et al. Addressing preferred specimen orientation in single-particle cryo-EM through tilting. Nat. Methods 14, 793–796 (2017).
Lee, Y. et al. Cryo-EM structure of the human L-type amino acid transporter 1 in complex with glycoprotein CD98hc. Nat. Struct. Mol. Biol. 26, 510–517 (2019).
Kim, J. et al. Structure and drug resistance of the Plasmodium falciparum transporter PfCRT. Nature 576, 315–320 (2019).
Faust, B. et al. Autoantibody mimicry of hormone action at the thyrotropin receptor. Nature 609, 846–853 (2022).
Diver, M. M., Cheng, Y. & Julius, D. Structural insights into TRPM8 inhibition and desensitization. Science 365, 1434–1440 (2019).
Zivanov, J. et al. New tools for automated high-resolution cryo-EM structure determination in RELION-3. eLife 7, e42166 (2018).
Fan, X. et al. Single particle cryo-EM reconstruction of 52 kDa streptavidin at 3.2 Angstrom resolution. Nat. Commun. 10, 2386 (2019).
Iudin, A. et al. EMPIAR: the Electron Microscopy Public Image Archive. Nucleic Acids Res. 51, D1503–D1511 (2023).
Zhou, Y., Moscovich, A., Bendory, T. & Bartesaghi, A. Unsupervised particle sorting for high-resolution single-particle cryo-EM. Inverse Probl. 36, 044002 (2020).
Méndez, J., Garduño, E., Carazo, J. M. & Sorzano, C. O. S. Identification of incorrectly oriented particles in cryo-EM single particle analysis. J. Struct. Biol. 213, 107771 (2021).
Pintilie, G. et al. Measurement of atom resolvability in cryo-EM maps with Q-scores. Nat. Methods 17, 328–334 (2020).
Zheng, L. et al. Uniform thin ice on ultraflat graphene for high-resolution cryo-EM. Nat. Methods 20, 123–130 (2023).
Gyobu, N. et al. Improved specimen preparation for cryo-electron microscopy using a symmetric carbon sandwich technique. J. Struct. Biol. 146, 325–333 (2004).
Bock, L. V. & Grubmüller, H. Effects of cryo-EM cooling on structural ensembles. Nat. Commun. 13, 1709 (2022).
Ma, H., Jia, X., Zhang, K. & Su, Z. Cryo-EM advances in RNA structure determination. Signal Transduct. Target. Ther. 7, 1–6 (2022).
Zhang, K., Pintilie, G. D., Li, S., Schmid, M. F. & Chiu, W. Resolving individual atoms of protein complex by cryo-electron microscopy. Cell Res. 30, 1136–1139 (2020).
Henderson, R. & Glaeser, R. M. Quantitative analysis of image contrast in electron micrographs of beam-sensitive crystals. Ultramicroscopy 16, 139–150 (1985).
Henderson, R. Image contrast in high-resolution electron microscopy of biological macromolecules: TMV in ice. Ultramicroscopy 46, 1–18 (1992).
Zivanov, J., Nakane, T. & Scheres, S. H. W. A Bayesian approach to beam-induced motion correction in cryo-EM single-particle analysis. IUCrJ 6, 5–17 (2019).
Grant, T. & Grigorieff, N. Measuring the optimal exposure for single particle cryo-EM using a 2.6 Å reconstruction of rotavirus VP6. eLife 4, e06980 (2015).
Zheng, S. Q. et al. MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nat. Methods 14, 331–332 (2017).
Reimer, L. Transmission Electron Microscopy: Physics of Image Formation and Microanalysis (Springer, 2013).
Yip, K. M., Fischer, N., Paknia, E., Chari, A. & Stark, H. Breaking the next Cryo-EM resolution barrier—atomic resolution determination of proteins! Preprint at bioRxiv https://doi.org/10.1101/2020.05.21.106740 (2020).
Danev, R., Yanagisawa, H. & Kikkawa, M. Cryo-electron microscopy methodology: current aspects and future directions. Trends Biochem. Sci. 44, 837–848 (2019).
Zivanov, J., Nakane, T. & Scheres, S. H. W. Estimation of high-order aberrations and anisotropic magnification from cryo-EM data sets in RELION-3.1. IUCrJ 7, 253–267 (2020).
Callaway, E. The revolution will not be crystallized: a new method sweeps through structural biology. Nature 525, 172–174 (2015).


Acknowledgements

This work was supported by the National Key R&D Program of China (No. 2021YFA1001300) (to C.B.), the National Natural Science Foundation of China (No. 12271291) (to C.B.), the Advanced Innovation Center for Structural Biology (to M.H.), the Beijing Frontier Research Center for Biological Structure (to M.H.), the Shenzhen Academy of Research and Translation (to M.H.), and the Natural Science Foundation of China (No. 12071244) (to Z.S.). We would like to express our gratitude to Shouqing Li and Ranhao Zhang for generously sharing their expertise in particle selection and density map reconstruction in cryo-EM. Our thanks also go to Dr. Nan Liu for providing valuable suggestions on this work, and to Jie Xu for his assistance in constructing the real radiation damage dataset of the proteasome.

Author information

These authors contributed equally: Jianying Zhu, Qi Zhang.

Authors and Affiliations

Yau Mathematical Sciences Center, Tsinghua University, Beijing, China
Jianying Zhu, Zuoqiang Shi & Chenglong Bao
Key Laboratory of Protein Sciences (Tsinghua University), Ministry of Education, Beijing, China
Qi Zhang & Mingxu Hu
School of Life Science, Tsinghua University, Beijing, China
Qi Zhang & Mingxu Hu
Beijing Advanced Innovation Center for Structural Biology, Beijing, China
Qi Zhang & Mingxu Hu
Beijing Frontier Research Center for Biological Structure, Beijing, China
Qi Zhang & Mingxu Hu
Qiuzhen College, Tsinghua University, Beijing, China
Hui Zhang
Yanqi Lake Beijing Institute of Mathematical Sciences and Applications, Beijing, China
Zuoqiang Shi & Chenglong Bao
Shenzhen Academy of Research and Translation, Shenzhen, China
Mingxu Hu
State Key Laboratory of Membrane Biology, School of Life Sciences, Tsinghua University, Beijing, China
Chenglong Bao

Authors

Jianying Zhu
Qi Zhang
Hui Zhang
Zuoqiang Shi
Mingxu Hu
Chenglong Bao

Contributions

C.B., M.H., and Z.S. initiated the project. M.H., Q.Z., and J.Z. developed CryoSieve and carried out testing. H.Z. provided support in using InSilicoTEM. J.Z. and M.H. analyzed the data. M.H., J.Z., and C.B. wrote the manuscript.

Corresponding authors

Correspondence to Zuoqiang Shi, Mingxu Hu or Chenglong Bao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Qun Liu and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Zhu, J., Zhang, Q., Zhang, H. et al. A minority of final stacks yields superior amplitude in single-particle cryo-EM. Nat Commun 14, 7822 (2023). https://doi.org/10.1038/s41467-023-43555-x


Received: 19 May 2023
Accepted: 13 November 2023
Published: 10 December 2023
DOI: https://doi.org/10.1038/s41467-023-43555-x
