Abstract
The orientation distribution of a singleparticle electron cryomicroscopy specimen limits the resolution of the reconstructed density map. Here we define a statistical quantity, the efficiency, E _{od}, which characterises the orientation distribution via its corresponding point spread function. The efficiency measures the ability of the distribution to provide uniform information and resolution in all directions of the reconstruction, independent of other factors. This metric allows rapid and rigorous evaluation of specimen preparation methods, assisting structure determination to high resolution with minimal data.
Introduction
Recent technological advances have ushered in a new era of structure determination by singleparticle electron cryomicroscopy (cryoEM). Better electron detectors, new reconstruction algorithms and reduced movement of specimens during imaging, either through the use of software tracking algorithms or specimen supports, have all contributed to the rapid advance of the field^{1,2,3}. Still, the specimen and how it interacts with the surfaces of the support can limit and even prevent structure determination.
CryoEM requires a thin specimen that is stable in the vacuum of the electron microscope. Specimen supports are designed to create and preserve thin films of water, which contain the purified biological sample^{4}. During the cryoplunging process, the purified particles can diffuse into contact with the surface of the thin film of water anywhere from 10 to more than 1000 times before freezing using current techniques (Supplementary Note 1)^{5}. This gives ample opportunity for a particular region of the particle surface to adhere to the water surface, and thus induce a preferential orientation of the particle on the support. When imaged, micrographs are taken perpendicular to the surface of the thin film to provide highest contrast. So if the particles are not randomly oriented relative to the surface, this limits the available information and can prevent highresolution threedimensional (3D) structure determination.
Previous works addressing directional resolution in singleparticle cryoEM^{6, 7}, and electron cryotomography^{8} have characterised the final reconstructed density maps in terms of missing regions of Fourier space. Here, we define a statistical quantity, the “efficiency” (E _{od}), which characterises the specimen orientation distribution by its ability to provide uniform resolution in all directions.
Results
Analytical derivation
To determine how an orientation distribution affects 3D reconstructed density maps, we begin by assuming that each particle image is a twodimensional projection of the electron scattering potential in that direction. By the central slice theorem, each projection can be represented by a plane of density in Fourier space. We further assume that the angle assigned to the particle in the 3D reconstruction process is correct to within a defined accuracy, and that the falloff in information with resolution can be modelled as a Gaussian function characterised by a factor B, analogous to the DebyeWaller temperature factor used to describe the random thermal motion of atoms^{9}. Given a distribution of particle angles (Fig. 1a–c), we can calculate the normalised information transmission in Fourier space (Fig. 1d–f), and the corresponding point spread function (PSF), (Fig. 1g–i). We then determine the radius along a particular direction where the normalised PSF falls to 1/e ^{2}. The radii for all directions of several PSF’s calculated from the simulated distributions (Fig. 1a–c), are binned and plotted as histograms in Fig. 1g–i. Since the resolution in a particular direction is directly related to the radius of the PSF in that direction, we use this to define the efficiency as
where \(\bar r\) and σ are the overall mean and standard deviation of the resolution distribution, respectively. The E _{od}’s for the distributions in Fig. 1 are indicated. We note that in principle the PSF can be used to deconvolve some of the distortions caused by the orientation distribution as has been done in other optical microscopy methods^{10}. This is analogous to anisotropic sharpening used previously in electron microscopy^{6, 7}, but deconvolution cannot compensate for a lack of information caused by large gaps in the Fourier space density map.
We used the above theoretical framework to create an algorithm to calculate the Fourier space density map, the PSF, the directions of highest and lowest resolution, and the efficiency from a given orientation distribution. It is implemented in a freely available, open source program called cryoEF. The program takes as input the Euler angles assigned to a particle set from any of the popular 3D reconstruction programs, including Frealign, EMAN and RELION^{11,12,13}.
Properties of the metric
We defined the efficiency, somewhat arbitrarily, to have four properties useful for data assessment. First, the efficiency is linearly related to the amount of solid angle absent in the Fourier space density map (Fig. 1j, inset). Second, the efficiency can be determined accurately with only a small number of particles (~1000, Fig. 1j and Supplementary Fig. 1), as long as a threedimensional reconstruction can be computed with a reasonable angular accuracy. Third, a perfect orientation distribution would have an efficiency equal to 1, with a value of 0.5 being the approximate threshold between a distribution that will successfully reach highresolution (>1/(4 Å)) and one that will not, using current microscope technology and a typical amount of data (~10^{5} particles). Fourth, the efficiency is specifically a property of the orientation distribution, and only weakly depends on other factors that can affect resolution like image quality (Bfactor), specimen heterogeneity and data classification and processing algorithms. In this way it isolates one important factor in cryoEM structure determination: the orientation distribution.
To understand how efficiency relates to other factors that bear on the ultimate resolution of a reconstruction, we calculate how the mean resolution scales with number of particles, for different values of B (image quality), and E _{od} (efficiency), by extending the theory of image contrast and resolution vs. particle number developed previously^{9}. The loss of contrast at high resolution is modelled by a Gaussian falloff in the recorded Fourier amplitudes and the resolution limit is determined by the intersection of the average amplitude with the noise level, which scales inversely with the root of the number of particles. The result is shown in Fig. 1k. First, it is clear that the image quality, here taken as the Bfactor, is the most important determinant of the resolution achieved for the currently practical number of particle images in a data set (10^{4}–10^{6} asymmetric views). The new generation of electron detectors improve image quality and, therefore, reduce the number of particles required to reach a particular resolution. Still, if there is insufficient Fourier space coverage, even a large data set will fail to reach high resolution, as indicated in Fig. 1k. Further, given a data set, one can use this theory to estimate the number of particles required to reach a particular resolution. Thus, one can rationally decide if it would be better to collect more data to reach a desired resolution or instead try to improve the efficiency through the use of specimen tilt or other specimen supports or preparation methods.
Optimal tilt angles for data collection
We also include an algorithm for predicting tilt angles for data collection, which will improve the efficiency and the resolution of a data set. This is based on finding the minimum tilt angle that rotates the direction of highest resolution in the PSF to the direction of lowest resolution, and includes a model of contrast loss at tilt where the contrast is proportional to the cosine of the tilt angle. We demonstrate this improvement experimentally (Supplementary Fig. 2); while tilting the specimen can help in some instances, it cannot overcome the problems of reduced image quality at tilt, which is ultimately the more important factor in determining the resolution of a reconstruction (Fig. 1k). Tilting also cannot ameliorate the potentially destructive^{5} molecule–surface interactions indicated by an anisotropic orientation distribution. These can cause deformation or degradation of the specimen that cannot be compensated for in the imaging and reconstruction process.
Efficiency of previously published data sets
To evaluate the algorithm, we tested it on four previously published data sets, which have orientation distributions with a wide range of uniformity (Fig. 2). The first distribution, that of the symmetric enzyme beta galactosidase (βgal, D2 symmetry)^{14}, was close to random and provided almost uniform coverage of Fourier space and a near spherical PSF (E _{od} = 0.9). Second, the distribution of a 20S proteasome specimen^{15}, which has D7 symmetry, was assessed. Despite the fact that the vast majority of particles were oriented in one view (with the cylindrical symmetry axis parallel to the water surface), the efficiency was still high (E _{od} = 0.7) because of the symmetry of the complex. Third, an 80S ribosome data set on glowdischarged amorphous carbon^{16}, which yielded a 4.5 Å reconstruction with a moderate number of particles was assessed. Although there were clearly several preferred orientations in the distribution, they were sufficiently separated to yield an oblate PSF with an efficiency of 0.6, which is less than optimal but still sufficient to reach high resolution. Fourth, a data set^{17} for the same 80S ribosome specimen (this time without the carbon–water surface), but having strong preferential orientations and insufficient sampling to reach high resolution (19 Å), was tested and had a correspondingly low efficiency (E _{od} = 0.1). Here, the particle oriented itself strongly at the air–water surface, causing a large gap in Fourier space, which limited the resolution to ~40 Å along the preferred direction.
Discussion
The metric described here allows rapid and rigorous evaluation of specimen support materials and experimental conditions during sample preparation. This will enable one to reduce potentially detrimental surface interactions while optimising the orientation distribution for structure determination with minimal data. To demonstrate this, we collected a data set of approximately 2000 particle images (19 micrographs) of an 80S ribosome specimen, under identical conditions to those for the data in Supplementary Fig. 2, which allowed accurate determination of the efficiency of its orientation distribution. The entire process, from loading the grid in the microscope to reconstructing the density map and measuring the distribution and efficiency took 2 h. This shows that in one microscope day, one could potentially collect enough data to evaluate multiple different specimen support and preparation conditions, or quickly collect and process a test data set, which could then be used to guide data collection for the remainder of the session.
Surprisingly, the efficiency is as important as the amount of data in determining the resolution of a cryoEM structure, and can be used to predict the amount of data needed to reach a particular resolution. The efficiency can be used to characterise the orientation distribution of any singleparticle cryoEM data set, and could be included with other metrics of data quality like Fourier shell correlation coefficients and local resolution^{18, 19} when reporting cryoEM structures. Modern chromatography gels and resins comprise a myriad of specific surfaces that are carefully tuned to control molecule–surface interactions. An orientation distribution can be thought of as a map of the specimen’s surface interaction properties, and can thus be used to further our understanding of how biological molecules interact with specific surfaces. We thus envision that this statistical method will enable the further development of support surfaces like modified graphene^{17}, specifically tuned to match the specimen under study.
Methods
Efficiency calculation algorithm
To assess the effects of the orientation distribution on the resulting threedimensional reconstruction, we propose a definition of directional resolution. A unique point spread function corresponds to the geometry of any orientation distribution; we use the radius of this point spread function as a measure of directional resolution.
To obtain the point spread function for an arbitrary orientation distribution of projection images, the following steps are performed. We first calculate the transfer function, which corresponds to the coverage of Fourier space by the given distribution. If the molecule under consideration is known to be symmetric, we determine the orientation angles over all asymmetric units by applying the appropriate transformation matrices to the given distribution of views. By the central slice theorem, each twodimensional projection image from a known direction of view (ϕ, θ in spherical coordinates) corresponds to a plane in threedimensional Fourier space, perpendicular to this direction. Following Rosenthal and Henderson, we model the signal strength to decrease exponentially with Bfactor dependence in Fourier space^{9}:
$${C_i}({\bf{k}}) = \sqrt {{N_i}} {\rm{exp}}\left( {  \frac{{B{{\left {\bf{k}} \right}^2}}}{4}} \right)$$Here C _{ i } are the amplitude modulation coefficients, which describe how signal strength falls off at higher spatial frequencies for each different orientation (numbered i), N _{ i } is the number of particles in this orientation, and k is a position vector, varying over the projection plane in threedimensional Fourier space. The contributions of all projection views are then added up in quadrature, i.e., in power, rather than in amplitude, to give the threedimensional Fourier space transfer function τ, imposed by the orientation distribution:
$$\tau ({\bf{k}}) = \sqrt {{\sum} {C_i}^2({\bf{k}})}$$The transfer function is normalised to unit power. It represents the filter τ(k), with which the real object amplitude function O(k) is multiplied in Fourier space to yield the observed amplitudes of the image: I(k) = O(k) × τ(k). The corresponding real space PSF s(r) is obtained from the transfer function τ(k) by inverse Fourier transformation. By the convolution theorem, the threedimensional image is then given by the convolution of the real object function with the PSF: i(r) = o(r) * s(r). Therefore, the radius of the point spread function at a given threshold value can be a measure of the resolution imposed by the geometry of the orientation distribution. Here, we choose the threshold value to be at the \(\frac{1}{{{e^2}}}\) point relative to the central maximum, which corresponds to the point beyond which two Gaussian functions can no longer be separated.
The shape of the PSF can be used to quantitatively assess the underlying orientation distribution in terms of providing uniform Fourier space coverage, and hence isotropic resolution. By sampling the radius of the PSF in all directions, the spatial distribution of resolution is obtained. The relative spread of this distribution is directly related to the degree of anisotropy of the PSF. We define the efficiency of an orientation distribution, E _{od}, as
$${E_{{\rm{od}}}} = 1  \frac{{2\sigma }}{{\bar r}}$$where \(\bar r\) is the mean radius and σ is the standard deviation. For a perfectly uniform distribution, E _{od} = 1 since σ = 0, while for a single view E _{od} approaches zero or can even become slightly negative.
Optimal tilt angle calculation algorithm
From the PSF we identify the weakest resolved directions in the map and determine tilt axes and angles, which can improve the resolution in these directions. By applying rotation matrices, we calculate how collecting data at tilt changes the apparent particle orientation angles, and, therefore, the Fourier space coverage and the realspace PSF shape and orientation. In practice, attainable tilt angles are limited by the quality loss in the resulting micrographs. We hypothesise that micrograph quality loss depends on the cosine of the tilt angle, i.e., the effective transverse thickness of the specimen. Due to this signal degradation, a tilt angle that improves the final resolution is not guaranteed to exist in all cases. We find the optimal tilt angle and axis (if these exist), which are expected to yield the most significant resolution improvement in the weakest direction.
Software implementation and workflow
The algorithm described here is implemented in C++, using the publicly available library fftw3^{20} for Fourier transform computation on a threedimensional grid. For a visual presentation of the results, the Fourier space coverage and the corresponding realspace PSF are written to MRC format (.mrc) binary files^{21}. The point spread function can be conveniently displayed next to the corresponding density map to identify the weakest (most smeared) direction in the map.
For executing the program, a twocolumn list of the orientation angles of the particles in the format (ϕ, θ) is required. Software tools for denovo model generation are readily available^{22,23,24} and can produce sufficient angular assignment accuracy to make efficiency estimates. Still, for most accurate results a fully converged 3D reconstruction is required. The other required input is an estimate of the object size D (in Å), which determines the smallest meaningful sampling step in Fourier space \(\left( {\frac{1}{D}} \right)\). Other inputs are optional, and set to have appropriate default values. These include: FSC resolution, Bfactor, box size, symmetry group, etc. The symmetry conventions we use are identical to those employed in XMIPP^{25} and RELION^{13}. If the user inputs an FSC resolution, the algorithm scales the results accordingly. Typical computation time is of the order of 0.1–10 h, varying almost linearly with the size of the data set. This should not be a major limitation when using the method to analyse different experimental conditions, as small data sets (of ~1000 particles) are sufficient for assessment (Fig. 1j), and larger data sets can be fragmented into representative random subsets without introducing a significant error (Supplementary Fig. 2b).
Data collection and analysis
The 80S ribosome data for Supplementary Fig. 1 was collected with a specimen prepared on an all gold specimen support, as previously described^{26}. The use of gold supports reduces the radiationinduced specimen movement, the vast majority of which is perpendicular to the suspended foil, and whose transverse component severely compromises the quality of images collected at tilt. Briefly, 80S ribosome specimen was applied to a Ar:O_{2} (9:1) plasma treated all gold support without any additional layers, blotted, plunge frozen and imaged in a FEI Titan Krios at 300 keV under lowdose conditions. A small data set was collected (~ 3000 particles) and processed with RELION to obtain the orientation angles; the grid was stored during data processing. These were then used as input to the cryoEF program to asses the efficiency of the distribution and provide a predicted optimal tilt angle (29° for these data). A second small data set was then collected on another region of the same grid under the same conditions, at the prescribed tilt angle, and was analysed in the same way to generate the data for Supplementary Fig. 1.
Data availability
The source code for the presented method, accompanied by a user manual, precompiled binaries, and a test data set are freely available under the terms of an open source software license at the authors’ website (http://www.mrclmb.cam.ac.uk/crusso/cryoEF/). Previously published data sets used for testing are available from the Electron Microscopy Data Bank (https://www.ebi.ac.uk/pdbe/emdb/) under accession codes 2548, 6287 and 2275. Supporting data are available from the corresponding author upon reasonable request.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
 1.
Cheng, Y. Singleparticle cryoEM at crystallographic resolution. Cell 161, 450–457 (2015).
 2.
FernandezLeiro, R. & Scheres, S. H. W. Unravelling biological macromolecules with cryoelectron microscopy. Nature 537, 339–346 (2016).
 3.
Frank, J. Advances in the field of singleparticle cryoelectron microscopy over the last decade. Nat. Protoc. 12, 209–212 (2017).
 4.
Russo, C. J. & Passmore, L. A. Progress towards an optimal specimen support for electron cryomicroscopy. Curr. Opin. Struct. Biol. 37, 81–89 (2016).
 5.
Glaeser, R. M. & Han, B.G. Opinion: hazards faced by macromolecules when confined to thin aqueous films. Biophys. Rep. 3, 1–7 (2016).
 6.
Radermacher, M. Threedimensional reconstruction of single particles from random and nonrandom tilt series. J. Electron. Microsc. Tech. 9, 359–394 (1988).
 7.
Henderson, R., Baldwin, J. M. & Ceska, T. Model for the structure of bacteriorhodopsin based on highresolution electron cryomicroscopy. J. Mol. Biol. 213, 899–929 (1990).
 8.
Diebolder, C. A., Faas, F. G., Koster, A. J. & Koning, R. I. Conical Fourier shell correlation applied to electron tomograms. J. Struct. Biol. 190, 215–223 (2015).
 9.
Rosenthal, P. B. & Henderson, R. Optimal determination of particle orientation, absolute hand, and contrast loss in singleparticle electron cryomicroscopy. J. Mol. Biol. 333, 721–745 (2003).
 10.
Agard, D. A., Hiraoka, Y. & Sedat, J. W. Threedimensional microscopy: image processing for highresolution subcellular imaging. SPIE New Methods Microsc. Low Light Imaging 1161, 24–30 (1989).
 11.
Grigorieff, N. Frealign: an exploratory tool for singleparticle cryoEM. Methods Enzymol. 579, 191–226 (2016).
 12.
Ludtke, S. J. Singleparticle refinement and variability analysis in EMAN2.1. Methods Enzymol. 579, 159–189 (2016).
 13.
Scheres, S. H. W. Processing of structurally heterogeneous cryoEM data in RELION. Methods Enzymol. 579, 125–157 (2016).
 14.
Vinothkumar, K. R., McMullan, G. & Henderson, R. Molecular mechanism of antibodymediated activation of βgalactosidase. Structure 22, 621–627 (2014).
 15.
Campbell, M. G., Veesler, D., Cheng, A., Potter, C. S. & Carragher, B. 2.8 Å resolution reconstruction of the Thermoplasma acidophilum 20S proteasome using cryoelectron microscopy. eLife 4, e06380 (2015).
 16.
Bai, X.C., Fernandez, I. S., McMullan, G. & Scheres, S. H. W. Ribosome structures to nearatomic resolution from thirty thousand cryoEM particles. eLife 2, e00461 (2013).
 17.
Russo, C. J. & Passmore, L. A. Controlling protein adsorption on graphene for cryoEM using lowenergy hydrogen plasmas. Nat. Methods 11, 773–773 (2014).
 18.
Kucukelbir, A., Sigworth, F. J. & Tagare, H. D. Quantifying the local resolution of cryoEM density maps. Nat. Methods 11, 63–65 (2014).
 19.
Rosenthal, P. B. Testing the validity of singleparticle maps at low and high resolution. Methods Enzymol. 579, 227–253 (2016).
 20.
Frigo, M. & Johnson, S. G. The design and implementation of FFTW3. Proc. IEEE 93, 216–231 (2005).
 21.
Cheng, A. et al. MRC2014: extensions to the MRC format header for electron cryomicroscopy and tomography. J. Struct. Biol. 192, 146–150 (2015).
 22.
Elmlund, D. & Elmlund, H. SIMPLE: software for ab initio reconstruction of heterogeneous singleparticles. J. Struct. Biol. 180, 420–427 (2012).
 23.
Elmlund, H., Elmlund, D. & Bengio, S. PRIME: probabilistic initial 3D model generation for singleparticle cryoelectron microscopy. Structure 21, 1299–1306 (2013).
 24.
Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. J. cryoSPARC: algorithms for rapid unsupervised cryoEM structure determination. Nat. Methods 14, 290–296 (2017).
 25.
Sorzano, C. et al. XMIPP: a new generation of an opensource image processing package for electron microscopy. J. Struct. Biol. 148, 194–204 (2004).
 26.
Russo, C. J. & Passmore, L. A. Ultrastable gold substrates for electron cryomicroscopy. Science 346, 1377–1380 (2014).
Acknowledgements
We thank R.A. Crowther for helpful discussions and suggestions throughout this project, G. McMullan for technical assistance, I.S. Fernandez and the V. Ramakrishnan lab for the gift of ribosome specimens, our colleagues at the LMB for beta testing the software and S.H.W. Scheres and R. Henderson for a critical reading of the manuscript. This work was supported by an Early Career Fellowship to C.J.R. from the Leverhulme Trust and Medical Research Council grant U105192715.
Author information
Affiliations
MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
 Katerina Naydenova
 & Christopher J. Russo
Trinity College, University of Cambridge, Cambridge, CB2 1TQ, UK
 Katerina Naydenova
Authors
Search for Katerina Naydenova in:
Search for Christopher J. Russo in:
Contributions
K.N. and C.J.R. devised the theory, developed and implemented the algorithm, performed the experiments and wrote the manuscript.
Competing interests
The authors declare no competing financial interests.
Corresponding author
Correspondence to Christopher J. Russo.
Electronic supplementary material
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Further reading

Reducing effects of particle adsorption to the air–water interface in cryoEM
Nature Methods (2018)

Improving the efficiency of cryoEM
Nature Methods (2017)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.