An encryption–decryption framework to validating single-particle imaging

Shen, Zhou; Teo, Colin Zhi Wei; Ayyer, Kartik; Loh, N. Duane

doi:10.1038/s41598-020-79589-0


Published: 13 January 2021


Scientific Reports volume 11, Article number: 971 (2021)


Abstract

We propose an encryption–decryption framework for validating diffraction intensity volumes reconstructed using single-particle imaging (SPI) with X-ray free-electron lasers (XFELs) when the ground truth volume is absent. This conceptual framework exploits each reconstructed volume’s ability to decipher latent variables (e.g. orientations) of unseen sentinel diffraction patterns. Using this framework, we quantify novel measures of orientation disconcurrence, inconsistency, and disagreement between the decryptions by two independently reconstructed volumes. We also study how these measures can be used to define data sufficiency and its relation to spatial resolution, and the practical consequences of focusing XFEL pulses to smaller foci. This conceptual framework overcomes critical ambiguities in using Fourier Shell Correlation (FSC) as a validation measure for SPI. Finally, we show how this encryption–decryption framework naturally leads to an information-theoretic reformulation of the resolving power of XFEL-SPI, which we hope will lead to principled frameworks for experiment and instrument design.


Introduction

X-ray free-electron lasers (XFELs) are a promising tool for studying the three-dimensional (3D) structures of macromolecular assemblies^1,2. The short and intense XFEL pulses make it possible to collect diffraction patterns of a macromolecule before the XFEL-damaged atomic nuclear motions become substantial^3,4,5,6,7.

XFEL pulses are sufficiently intense and coherent for single-particle imaging (SPI), where a single macromolecule can scatter enough photons for us to infer its 3D orientation, hence structure^8,9,10,11. XFEL-SPI makes the difficult task of growing large, well-diffracting macromolecular crystals (even micrometer size ones¹²) unnecessary.

Instead, desiccated samples are randomly injected at unknown orientations into a regular train of XFEL pulses. To understand how orientations are defined in SPI, consider what happens when a scatterer, whose 3D diffraction volume is denoted W, is presented to the SPI laboratory reference frame (Fig. 1).

Collected diffraction patterns are identified and analyzed in various ways including: determining the 3D structures that most likely produced the ensemble of SPI patterns¹³, or studying the range of 3D morphologies spanned by the XFEL scatterers^14,15,16.

Reconstructing a 3D structure from many SPI patterns comprises three sequential stages, each of which can be considered for validation⁶. These stages are: recovering a set of 3D diffraction intensities W from many two-dimensional (2D) SPI patterns; using phase retrieval to reconstruct the 3D realspace scattering density from W; and fitting atomic coordinates to the scattering density. Separate validation routines between these stages can help diagnose where resolution loss might have occurred.

This work focuses on validating the first stage, where we reconstruct W by inferring the latent 3D orientations of SPI diffraction patterns. This inference is challenging for small macromolecules that produce weak diffraction patterns. In these cases, the Fourier Shell Correlation (FSC)¹⁷, which is typically used to validate 3D structures recovered using cryo-electron microscopy, has become increasingly popular for estimating spatial resolution^{13,16,18,19,20,21,22,23,24,25,26,27,28,29}.

However, the use of FSC, as well as other proposed measures of reconstruction errors^6,30, to characterize XFEL-SPI resolution suffers from three main issues. First, and most importantly, Fig. 2 illustrates how the resolution reported using the popular half-bit FSC criterion actually improves with increased orientation blurring. This occurs because XFEL-SPI reconstructions approach the same virtual powder average as their input patterns become more misoriented. Consequently, the ‘noise terms’ between two independently reconstructed volumes (see Eq. (3) in³¹) become correlated. Hence the FSC measure, which is invariant to isotropic filtering, can paradoxically report better resolutions when the orientation uncertainty of patterns increases. Second, the threshold criterion for determining resolution is controversial even in the cryo-electron microscopy community^31,32. This criterion is demonstrably dependent on the speckle sampling ratio (i.e. size of realspace support) and the symmetry of the particle, and it assumes additive noise³¹. Unfortunately, there are still prominent violations of these criteria³³. Third, to compute the FSC between two 3D volumes, their relative orientations must be accurately determined.

To circumvent some of these issues with FSC, we propose examining the source of correlations between two independently reconstructed volumes: the ‘disconcurrence’, inconsistency, and disagreement between how these volumes orient individual patterns. A similar orientation-based approach to validation was explored by Tegze and Bortel³⁴, who proposed using the fraction of patterns that are well-oriented to validate intensity reconstructions. However, the so-called C-factor that they proposed for validation considered only orientation precision, not accuracy or reproducibility.

It can be useful to recast the XFEL-SPI validation problem in information theoretic terms. Indeed, information theory has been insightful for SPI³⁵ as well as coherent diffraction imaging^36,37. In fact, the half-bit criterion for FSC in cryo-electron microscopy³¹ established a connection between spatial resolution and information theory. There, however, the half-bit criterion merely referred to when the signal-to-noise ratio of an idealized noisy channel attained a value of $\sqrt{2}-1$. What this signal-to-noise ratio means for resolving spatial features within an object remains unclear.

Looking farther back, Shannon’s original proof of the noisy channel theorem was based on a straightforward encoding–decoding scheme³⁸. Below we show how Shannon’s scheme can be explicitly constructed for the orientation determination problem in SPI. Doing so allows us to validate reconstructions using an orientation resolution that can be directly related to the mutual information of the SPI experiment.

An SPI reconstruction is similar to probabilistic symmetric-key cryptography, where plaintext messages are encrypted into ciphertexts using a correct key plus a randomness scheme. Because of this randomness, the same plaintext message can produce different ciphertexts.

The analogous messages in an XFEL-SPI experiment are the hidden orientations of illuminated single particles³⁹. The experimental setup itself can be viewed as a cipher algorithm that encrypts these messages as noisy two-dimensional (2D) diffraction patterns. When these orientations (messages) are properly decrypted, the full three-dimensional (3D) diffraction volume of the target particle can be recovered.

The conundrum for SPI, however, is that these orientations are best decrypted using the ground truth 3D diffraction volume. Hence, reconstructing this diffraction volume can be viewed as ‘cracking’ (i.e. guessing) the correct symmetric key in probabilistic cryptography. Figure 3 shows the similarities between SPI-validation and key-cracking in cryptography, which has the following correspondence:

correct key $\leftrightarrow$ ground truth 3D diffraction intensities;
encryption cipher $\leftrightarrow$ SPI experiment;
decryption cipher $\leftrightarrow$ orientation inference scheme;
ciphertexts $\leftrightarrow$ photon patterns collected in experiment;
messages $\leftrightarrow$ orientations of individual photon patterns.

Algorithms that discover the orientations of SPI patterns^8,10,40,41, analogously, try to recover the unknown key (i.e. 3D diffraction intensities) given many ciphertexts (i.e. photon patterns).

Now let us consider how one can check/validate the accuracy/correctness of a recovered key, absent the ground truth. An obvious method is to determine whether the recovered key is consistent with known prior constraints or independent measurements. Such external validations, however, are not always possible in SPI especially when resolving novel structural forms.

We know that a correct key must decipher each ciphertext into a unique message. However, this uniqueness alone is insufficient to determine correctness, since wrong keys given to a deterministic cipher can yield unique but wrong decipherments. An example of this occurs when a recovered key overfits to a set of ciphertexts. Nevertheless, we can exploit this uniqueness requirement to design a scheme that detects if at least one of two candidate keys is incorrect.

Suppose we are given two disjoint sets of ciphertexts ($\{K_A\}, \{K_B\}$) that are encrypted by the same solution key $W_T$. We can independently recover two keys ($W_A, W_B$), one from each set of ciphertexts. Disagreements between how these two keys decipher a third hidden set of ciphertexts $\{K_\text {S}\}$ betray the incorrectness of at least one of these two keys. If the first two sets of ciphertexts are sufficiently large and randomly chosen, then such disagreements indicate that both candidate keys are likely incorrect.

Owing to the randomness in probabilistic encryption, it is practically impossible to guarantee a perfectly accurate key given only a finite number of noisy ciphertexts. Analogously, we cannot perfectly recover the ground truth SPI diffraction volume only from a finite number of noisy, incomplete photon patterns. Consequently, any pair of recovered keys must differ measurably from each other. This difference quantifies the decryption precision of these keys, which is the lower bound of their decryption accuracies.

Back to the SPI data analysis, we wish to find the difference in how two independently reconstructed volumes $W_A$ and $W_B$ decrypt the orientations of a third disjoint set of sentinel photon patterns, $\{K_\text {S}\}$. This difference in decryption increases if the disagreement between $W_A$ and $W_B$ increases. More importantly, it also increases as either volume departs farther from the hidden ground truth volume $W_T$. We refer to this difference as the orientation disconcurrence between these two volumes.

Defining the framework in Fig. 3 requires well-defined encryption and decryption procedures. In an XFEL-SPI experiment, the encryption is described by how an illuminated scatterer at a certain orientation generates a noisy photon pattern (Fig. 1). In a Bayesian framework, the probability that a scatterer’s specific orientation (Q) is encrypted as a particular photon pattern (K) is termed the data likelihood. Inversely, the probability that a pattern K will be decrypted as a particular orientation Q is its equivalent orientation posterior distribution (OPD).

This encryption of orientation information into a photon pattern is governed by the physics of photon–particle interaction, wavefront propagation, and photon measurement on the detector. Under ideal XFEL-SPI experimental conditions the photon pattern $K_t$ is a Poisson sample from an Ewald tomogram, $W_{Qt}$, of a particle at orientation Q (Fig. 1). This idealization allows an explicit formulation of the likelihood (see Eq. (10)), and hence OPD. Additionally, one might consider factors such as extraneous photon scattering sources, non-linear detector artefacts, and the local fluence of the XFEL pulses each particle randomly encounters. Such non-Poissonian OPDs were shown to be effective in different XFEL-SPI experiments^13,19,39. More generally, there is an infinite number of alternatives to the Poissonian OPD that could be used to decrypt particle orientation from photon patterns. Exploring the efficacy of these myriad alternatives is clearly beyond the scope of this paper.
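As a minimal sketch of the idealized Poissonian decryption described above: Eq. (10) is not reproduced in this section, so the code below assumes the textbook Poisson log-likelihood and a uniform prior over a discrete set of sampled orientations. The function name `poisson_opd` and the four-pixel toy tomograms are hypothetical and serve only as illustration.

```python
import math

def poisson_opd(pattern, tomograms):
    """Orientation posterior for one photon pattern under a Poisson model.

    pattern   : list of photon counts, one per detector pixel
    tomograms : one list of strictly positive expected intensities per
                sampled orientation (a stand-in for the Ewald tomograms)
    Assumes a uniform prior over the sampled orientations.
    """
    log_like = []
    for tomogram in tomograms:
        # Poisson log-likelihood, dropping the -log(k!) term that does not
        # depend on orientation
        ll = sum(k * math.log(w) - w for k, w in zip(pattern, tomogram))
        log_like.append(ll)
    # Subtract the maximum before exponentiating for numerical stability
    peak = max(log_like)
    weights = [math.exp(ll - peak) for ll in log_like]
    total = sum(weights)
    return [x / total for x in weights]

# Toy example: three candidate orientations on a 4-pixel "detector"
tomograms = [
    [5.0, 1.0, 1.0, 1.0],
    [1.0, 5.0, 1.0, 1.0],
    [1.0, 1.0, 5.0, 1.0],
]
pattern = [6, 0, 1, 1]
opd = poisson_opd(pattern, tomograms)
best_orientation = opd.index(max(opd))
```

In this toy case most photons land in the first pixel, so the posterior concentrates on the first candidate orientation. A non-Poissonian OPD would replace only the log-likelihood line.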

The encryption–decryption framework that validates two intensity reconstructions ($W_A, W_B$) in Fig. 3 is indifferent to the algorithms that were used to reconstruct $W_A$ and $W_B$. And while the Poissonian OPD chosen in this paper was also used in the original EMC algorithm to infer the orientations of photon patterns⁸, here this OPD is used to decrypt orientations for validating 3D intensity volumes $W_A, W_B$, which could be reconstructed with algorithms other than EMC. Since our validation occurs after $W_A$ and $W_B$ are separately reconstructed, it does not add any computational overhead during their reconstructions.

The OPD that most accurately describes the experiment should be used both to reconstruct and to validate reconstructions. Hence it is unsurprising that the OPDs used in both situations are identical.

Finally, since the validation framework in Fig. 3 compares the ability of two volumes $W_A$ and $W_B$ to decrypt orientations, we are essentially comparing their OPDs from decrypting the orientations of a set of sentinel patterns. To compare these OPDs, we evaluate their convolutions in orientation space to produce what we call angular displacement distributions (ADD). The orientation disconcurrence between $W_A$ and $W_B$ is then extracted from this ADD. The procedure to compute the orientation disconcurrence given $W_A$ and $W_B$ is outlined below.

1. Partition the XFEL-SPI photon patterns $\{K\}$ into three disjoint sets: two larger and equally sized sets, $\{K_A\}$ and $\{K_B\}$, for reconstructions; and a third, smaller set of unseen sentinel patterns $\{K_\text {S}\}$ to measure orientation disconcurrence.
2. Using any algorithm you desire, reconstruct two 3D intensities from the two larger sets of patterns: $\{K_A\} \rightarrow W_A$, and $\{K_B\} \rightarrow W_B$.
3. For each sentinel pattern $K_\text {S}$, compute the OPD of the reconstructed volumes $W_A$ and $W_B$. This is the probability that $K_\text {S}$ corresponds to the Ewald sphere section of orientation $\Omega$ in each reconstructed volume (i.e. $P(\Omega _A|K_\text {S}, W_A)$ and $P(\Omega _B|K_\text {S}, W_B)$). This step creates $2\,|\{K_\text {S}\}|$ distributions, two for each sentinel pattern, where $|\{K_\text {S}\}|$ is the number of sentinel patterns used.
4. Next, we compute the angular displacement distribution (ADD, defined in Eq. (13)) of the sentinel patterns from the OPD of $W_A$ and $W_B$. The ADD for each sentinel pattern $K_\text {S}$ (the red or blue distribution in Fig. 4) is essentially a convolution of OPD$_A$ and OPD$_B$ over the space of relative orientations between $W_A$ and $W_B$. If OPD$_A$ and OPD$_B$ were delta functions, then this convolution peaks at the relative orientation between $W_A$ and $W_B$. The ADD$_{AB}$ (the grey distribution in Fig. 4), which is the normalized sum of these convolutions for all sentinel patterns (Eq. (14)), is the distribution of relative orientations between $W_A$ and $W_B$ as ‘measured by’ $\{K_\text {S}\}$.
5. Finally, from the ADD of all the sentinel patterns between the volumes $W_A$ and $W_B$, estimate their orientation disconcurrence.
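Steps 3 to 5 can be illustrated with a toy model. Eqs. (13) and (14) are not reproduced in this section, so the sketch below assumes a one-dimensional orientation space sampled on a periodic grid, where the convolution of two OPDs over relative orientations reduces to a circular cross-correlation. The helper names (`add_1d`, `summed_add`, `delta`) are hypothetical.

```python
def add_1d(opd_a, opd_b):
    """Circular cross-correlation of two OPDs on the same 1D angular grid.

    Entry d is the probability that the orientation assigned by volume B
    is displaced by d grid steps from the one assigned by volume A.
    """
    n = len(opd_a)
    return [
        sum(opd_a[i] * opd_b[(i + d) % n] for i in range(n))
        for d in range(n)
    ]

def summed_add(opds_a, opds_b):
    """Normalized sum of the per-pattern ADDs over all sentinel patterns."""
    n = len(opds_a[0])
    total = [0.0] * n
    for opd_a, opd_b in zip(opds_a, opds_b):
        for d, value in enumerate(add_1d(opd_a, opd_b)):
            total[d] += value
    norm = sum(total)
    return [t / norm for t in total]

def delta(n, k):
    """Delta-function OPD on an n-point grid, peaked at grid index k."""
    return [1.0 if i == k else 0.0 for i in range(n)]

n = 8
# Two sentinel patterns; volume B is rotated by +2 grid steps relative to A
opds_a = [delta(n, 1), delta(n, 5)]
opds_b = [delta(n, 3), delta(n, 7)]
add_ab = summed_add(opds_a, opds_b)
peak_shift = add_ab.index(max(add_ab))
```

With delta-function OPDs the summed ADD collapses onto the relative rotation between the two toy volumes, matching the limiting case described in step 4. Broader OPDs would widen this peak, and that width is what the disconcurrence measures.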

Results

Measures of orientation uncertainties

The orientation disconcurrence between two independently reconstructed volumes comprises two aspects: inconsistency and disagreement. By the cryptographic analogy, the first aspect characterizes how consistently each volume separately decrypts the orientations of sentinel patterns; the second aspect describes how often the decryptions of two (or more) volumes mutually agree. These concepts are illustrated in Fig. 5, and defined below.

In the following numerical simulations, we use the disconcurrence between independent reconstructions from the same scatterer to estimate the lower bound of their correctness. Recall that this procedure requires partitioning a set of photon patterns into three disjoint sets ($\{K_A\}, \{K_B\}, \{K_\text {S}\}$). We reconstruct two 3D intensities from the first two sets ($W_A$ and $W_B$ respectively), while the last sentinel set is reserved for validation. Unlike an actual experiment, the true solution intensities $W_T$ that generated these patterns are known in these simulations, and will provide useful insights. Given these definitions, let us consider the different orientation measures obtained at the end of the procedure outlined at the end of the Introduction.

1. Measure of orientation disconcurrence: $\Delta \theta _\text {c}(W_A, W_B)$ (Eq. (17)) is computed from the width of the angular displacement distribution (ADD) between intensities $W_A$ and $W_B$ that are independently reconstructed from two disjoint sets of patterns. $\Delta \theta _\text {c}$ measures the difference between the orientations of specific sentinel patterns within $W_A$ and $W_B$, despite having aligned the centroids of these two distributions (i.e. overall orientations of $W_A$ and $W_B$).
2. Measure of average orientation inconsistency:
$$\begin{aligned} \Delta \theta _\text {i}(W_A, W_B) = \sqrt{\frac{1}{2} \sum _{i\in \{A,B\}} \Delta \theta ^2_\text {c}(W_i, W_i)}\;. \end{aligned}$$
(1)
This is the root-mean-squared (RMS) angular width of the autocorrelation of $W_A$’s and $W_B$’s orientation posterior distributions (OPDs), which is equivalent to repeating the intensity model labels in Eq. (18). In Fig. 4, the angular widths of the blue and red points show the orientation inconsistency for decrypting the orientations of two sentinel patterns ($K_1$ and $K_2$). The RMS of $\Delta \theta _\text {c}(W_A, W_A)$ and $\Delta \theta _\text {c}(W_B, W_B)$ is used to approximate the angular width (red or blue distribution) in Fig. 4 because it is expensive to calculate the inconsistency between $W_A$ and $W_B$ for each sentinel pattern, and because it is a good approximation when the OPD is assumed to be Gaussian (see “A one-dimensional (1D) model” section). Thus $\Delta \theta _\text {i}$ simply averages this width over all sentinel patterns and both reconstructions $W_A$ and $W_B$.
3. Measure of orientation disagreement:
$$\Delta \theta _\text {a} (W_A, W_B) = \sqrt{\left( \Delta \theta _\text {c} (W_A, W_B)\right) ^2 - \left( \Delta \theta _\text {i}(W_A, W_B)\right) ^2}\;,$$
(2)
which is the angular displacement between reconstructions $W_A$ and $W_B$ that is due neither to an overall rotation between the two volumes nor to the angular width $\Delta \theta _\text {i}$ of the OPD. This relation is illustrated in more detail with a 1D model in “A one-dimensional (1D) model” section.
4. Measure of orientation inconsistency given the ground truth:
$$\begin{aligned} \Delta \theta _\text {i}^{*}=\Delta \theta _\text {c}(W_T, W_T)\; , \end{aligned}$$
(3)
which measures the angular width of the OPD in determining the patterns’ orientations given the ground truth $W_T$. With enough patterns in $\{K_A\}$ and $\{K_B\}$, such that $W_A$ and $W_B$ do not over-fit to their respective photon patterns, we expect $\Delta \theta _\text {i} \ge \Delta \theta _\text {i}^*$.
5. Measure of orientation disconcurrence with ground truth:
$$\begin{aligned} \Delta \theta ^{*}_\text {c}(W_A)=\Delta \theta _\text {c}(W_A, W_T) \; , \end{aligned}$$
(4)
which is the angular width of the ADD between the reconstructed and ground truth intensity volumes ($W_A$ vs $W_T$ respectively). Notice that $\Delta \theta _\text {c}$ becomes identical to $\Delta \theta ^{*}_\text {c}$ if we replace $W_B \rightarrow W_T$. Hence, $\Delta \theta ^{*}_\text {c}$ is essentially the orientation disconcurrence between $W_A$ and the ground truth.
6. Measure of average orientation disconcurrence with ground truth:
$$\begin{aligned} \langle \Delta \theta ^{*}_\text {c} \rangle =\sqrt{\frac{1}{2} \sum _{i\in \{A,B\}} \bigl(\Delta \theta ^{*}_\text {c} (W_i)\bigr)^2 } \; , \end{aligned}$$
(5)
which is the average angular width of the ADDs between the reconstructed versus the ground truth intensity volumes ($W_A, W_B$ vs $W_T$ respectively). If only two volumes were reconstructed, $W_A$ and $W_B$, then $\langle \Delta \theta ^{*}_\text {c} \rangle$ represents the average orientation disconcurrence against the ground truth.
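The algebraic relations among these measures (Eqs. (1), (2) and (5)) can be written out as a small sketch. It assumes the disconcurrence values $\Delta \theta _\text {c}$ have already been extracted from the ADDs; the function names and the numerical inputs (in radians) are hypothetical.

```python
import math

def inconsistency(dc_aa, dc_bb):
    """Eq. (1): RMS of the self-disconcurrences of W_A and W_B."""
    return math.sqrt(0.5 * (dc_aa ** 2 + dc_bb ** 2))

def disagreement(dc_ab, di_ab):
    """Eq. (2): disconcurrence left over once inconsistency is removed."""
    diff = dc_ab ** 2 - di_ab ** 2
    if diff < 0.0:
        # Design choice for this sketch: flag the case rather than clamp it
        raise ValueError("disconcurrence is smaller than inconsistency")
    return math.sqrt(diff)

def mean_disconcurrence_with_truth(dc_at, dc_bt):
    """Eq. (5): RMS disconcurrence of W_A and W_B against W_T."""
    return math.sqrt(0.5 * (dc_at ** 2 + dc_bt ** 2))

# Hypothetical disconcurrence values in radians
dc_aa, dc_bb, dc_ab = 0.03, 0.04, 0.05
di = inconsistency(dc_aa, dc_bb)
da = disagreement(dc_ab, di)
```

By construction the three quantities add in quadrature, $\Delta \theta _\text {c}^2 = \Delta \theta _\text {i}^2 + \Delta \theta _\text {a}^2$, which is a convenient sanity check on any implementation.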

Factors that influence disconcurrence

Many experimental factors influence the orientation disconcurrence of an SPI intensity reconstruction, including: incident photon fluence, number of photon patterns from single particles, resolution and sampling of each pattern, amount of missing detector data (i.e. beamstop, gaps in compound detectors, inactive pixels), extent of photon background (i.e. from particles’ incoherent scattering or stray light sources), and degree of structural heterogeneity between particles in the ensemble. The choice of algorithms and their parameters used to reconstruct the intensities also plays an important role. Furthermore, the symmetries of the scatterer itself can affect how the ADD is interpreted (see Fig. 9 and “Methods” section).

In this section, we focus on three of these factors: the average number of photons per pattern N, the fineness of orientation space sampling by reconstruction algorithms, and the number of patterns $M_{\text {data}}$. In each scenario studied below, we simulated diffraction patterns with a small 105 kDa protein (PDB code, 4ZW6⁴²) under experimental conditions that were modeled after those at the Tender X-ray endstation at the Linac Coherent Light Source (see Table 1). We then used the EMC algorithm to reconstruct two independent 3D volumes from the disjoint sets $\{K_A\}, \{K_B\}$, each with $M_{\text {data}}$ patterns. For each test condition, a single set of 1000 sentinel patterns $\{K_\text {S}\}$ was reserved to evaluate the six types of $\Delta \theta$ listed above. The user should choose the number of sentinel patterns such that the uncertainty of the orientation disconcurrence is acceptably small. Another consideration is whether the range of SO(3) orientations is adequately covered by randomly oriented sentinel patterns (see “Sentinel pattern coverage in the SO(3) orientation space” section).

Table 1 Range of parameters used to simulate XFEL-SPI photon patterns of a 105 kDa protein (PDB code, 4ZW6) in this paper. We assume an incident beam energy of 3 mJ and a transmission efficiency of 20%; a binned detector is used for computational efficiency.


The average number of photons per diffraction pattern (N) is directly related to the mutual information for inferring latent parameters (e.g. orientations) as well as the particle’s structure⁸. N depends on the brightness of the X-ray beam, the size of the X-ray focus (i.e. beam intensity), as well as the relative alignment between particle and X-ray beams. In Fig. 6, all six types of $\Delta \theta$ generally fall as N increases. Simply put, more photons per pattern reduce orientation disagreement and inconsistency, hence disconcurrence. Additionally, the orientation disconcurrence between $W_A$ and $W_B$ falls with their respective disconcurrences with the ground truth $W_T$. This correspondence is consistent with the fact that uniqueness is a necessary condition for correctness (i.e. ‘precision $\le$ accuracy’).

How finely orientations are sampled in XFEL-SPI reconstruction algorithms impacts the quality of reconstructed results⁸. Recall that this sampling fineness is different from the adaptive refinement scheme for the OPD and ADD (Eq. (12)): the former pertains to the reconstruction algorithm, while the latter evaluates the reconstructed results. Fig. 6 shows that a higher sampling level in the EMC reconstruction algorithm generally reduces all alignment uncertainties $\Delta \theta$. While the various forms of $\Delta \theta$ have a noticeable spread at $n=8$ orientation sampling, this spread shrinks significantly when the sampling fineness is increased to $n=13$. Numerically, we found the average angular separation between the quasi-uniform unit-quaternion samples to be 0.161 and 0.099 radians respectively. This figure complements the information-theoretic heuristic for deciding sampling sufficiency in⁸. With sufficient sampling, Fig. 6 shows that the orientation disconcurrence is dominated by the orientation inconsistency rather than orientation disagreement: $\Delta \theta _\text {c} (W_A, W_B) \approx \Delta \theta _\text {i} (W_A, W_B) > \Delta \theta _\text {a} (W_A, W_B)$.

In an SPI experiment the number of SPI patterns, $M_{\text {data}}$, is a product of the fraction of particles that are illuminated by X-ray pulses (i.e. hit-rate), the pulse repetition rate, and the total experiment time. One intuitively expects that reconstructions improve with larger $M_{\text {data}}$, which Fig. 7 confirms. The intrinsic orientation inconsistency of each reconstruction, $\Delta \theta _\text {i}$, falls with more patterns (blue curve). The orientation disconcurrence $\Delta \theta _\text {c}$ likewise falls with more patterns.

We found in Fig. 7 that $\Delta \theta _\text {c}$ and $\Delta \theta _\text {i}$ both decrease numerically with the number of patterns as $\alpha \, M_{\text {data}}^{-\beta } + \Delta \theta _\text {i}^{*}$, where $\alpha$ is a multiplicative constant, $\beta$ is a real positive number, and $\Delta \theta _\text {i}^{*}$ is the angular width of the OPD given the patterns $\{K_\text {S}\}$ and ground truth model. Although $\Delta \theta _\text {c} \rightarrow \Delta \theta _\text {i}^{*}$ as $M_{\text {data}}\rightarrow \infty$, we can only assert that the reconstructed pairs of models ($W_A$ and $W_B$) are closer to each other, but not whether either is close to the ground truth $W_T$. The former is evident from the ratio of orientation disagreement against disconcurrence, $\Delta \theta _\text {a}^2 / \Delta \theta _\text {c}^2$ (gray dots in Fig. 7): increasing $M_{\text {data}}$ eliminates orientation disagreements ($\Delta \theta _\text {a}$) between two independent reconstructions faster than intrinsic inconsistency ($\Delta \theta _\text {i}$). Using Eq. (2) and the fitted forms in Fig. 7, this vanishing of the orientation disagreement becomes clear:

$$\begin{aligned} \Delta \theta _\text {a}&= \sqrt{\Delta \theta _\text {c}^2 - \Delta \theta _\text {i}^2} \nonumber \\&=\sqrt{\big (\alpha _\text {c} M_{\text {data}}^{-\beta _\text {c}} + \gamma _\text {c}\big )^2- \big (\alpha _\text {i}M_{\text {data}}^{-\beta _\text {i}} + \gamma _\text {i}\big )^2 }\; \nonumber \\&\approx M_{\text {data}}^{-\beta _\text {c}/2} \sqrt{\left( \alpha _\text {c} + 2 \gamma \right) \alpha _\text {c}}\; , \end{aligned}$$

(6)

where we assumed $\beta _\text {c} < \beta _\text {i}$, and $\gamma _\text {c} \approx \gamma _\text {i}=\gamma$. Hence $\Delta \theta _\text {a} \rightarrow 0$ as $M_{\text {data}} \rightarrow \infty$. Simply put, as $M_{\text {data}}$ increases, independently reconstructed volumes become more unique but not necessarily more correct.
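The trend can be checked numerically by evaluating the first two lines of Eq. (6) directly. The fitted coefficients from Fig. 7 are not given in this section, so the values of $\alpha$, $\beta$ and $\gamma$ below are hypothetical and chosen only to satisfy $\beta _\text {c} < \beta _\text {i}$ and $\gamma _\text {c} = \gamma _\text {i}$.

```python
import math

def lifted_power_law(m, alpha, beta, gamma):
    """alpha * m**(-beta) + gamma, the fitted form used for both curves."""
    return alpha * m ** (-beta) + gamma

# Hypothetical fit parameters (not taken from Fig. 7)
ALPHA_C, BETA_C = 1.0, 0.5   # disconcurrence curve
ALPHA_I, BETA_I = 0.8, 0.7   # inconsistency curve
GAMMA = 0.05                 # shared asymptote, in radians

def disagreement_from_fits(m):
    """Eq. (2) evaluated on the two fitted curves at m patterns."""
    dc = lifted_power_law(m, ALPHA_C, BETA_C, GAMMA)
    di = lifted_power_law(m, ALPHA_I, BETA_I, GAMMA)
    return math.sqrt(dc ** 2 - di ** 2)

pattern_counts = [10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6]
disagreements = [disagreement_from_fits(m) for m in pattern_counts]
# Ratio of squared disagreement to squared disconcurrence
ratios = [
    da ** 2 / lifted_power_law(m, ALPHA_C, BETA_C, GAMMA) ** 2
    for da, m in zip(disagreements, pattern_counts)
]
```

For these assumed parameters both the disagreement and its squared ratio to the disconcurrence shrink monotonically as the number of patterns grows, while the disconcurrence itself levels off at the shared asymptote.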

Relating $\Delta \theta$ to spatial resolution

The 3D speckles in the reconstructed diffraction volume whose angular widths are smaller than or comparable to $\Delta \theta _\text {c}$ will lose contrast, hence spatial resolution. Let us denote the full angular width of these 3D speckles as $2\Delta \theta _{\text {sp}} (\mathbf{q} )$ at spatial frequency $\mathbf{q}$. Naturally, the resolutions of reconstructions become orientation-limited at the frequencies where $\Delta \theta _{\text {sp}} (\mathbf{q} )$ approaches the width of the OPD, which is about $\Delta \theta _\text {c}/\sqrt{2}$ (“A one-dimensional (1D) model” section).

We caution that the previous paragraph suggests an inequality rather than a strict equality between spatial resolution and orientation disconcurrence. To understand why, consider how Fig. 8 shows that reconstructions can have an orientation disconcurrence smaller than the angular width of a single pixel at the edge of the detector, $\Delta \theta _{\text {pix}}$. This situation occurs with a very high average number of photons per pattern ($N \gg 1$), abundant patterns ($M_{\text {data}}\gg 1$), and sufficiently fine sampling of the rotation group during reconstructions (Fig. 6). Thus, the dynamic range and contrast of the reconstructed 3D diffraction speckles are high up to the detector’s maximum captured resolution ($\mathbf{q} _\text {max}$), which allows us to distinguish arbitrarily small angular variations between actual diffraction patterns.

We must remember that the reconstructed diffraction volume W does not explicitly contain spatial information beyond the maximum spatial resolution $\mathbf{q} _\text {max}$. So even if $\Delta \theta _\text {c} \ll \Delta \theta _\text {pix}$, we can only say that spatial resolution is not orientation limited. Perhaps with additional priors about the structure of the particle (e.g. known sequence, similar structure known, atomicity, etc.) it might be possible to extend the resolution beyond $\mathbf{q} _{\text {max}}$. But such extensions are beyond the scope of this discussion.

It should now be clear that orientation disconcurrence relates to how effectively one can resolve the orientation of an average SPI photon pattern. From this section, it should also be clear that spatial resolution can be limited by large orientation disconcurrences. More concretely, consider Fig. 8, which simulates an XFEL-SPI experiment of a 105 kDa protein at the Tender X-ray endstation at LCLS (Table 1). To resolve this protein to 10 nm resolution without significant orientation blurring requires more than 5000 patterns, each with more than 600 photons. However, it is premature to define spatial resolution only in terms of orientation disconcurrence, especially since a decryption scheme for the spatial resolution (similar to Fig. 3) is absent. Such detailed discussions are deferred to future studies.

Data sufficiency and mutual information

The question ‘how many patterns are sufficient?’ frequently arises in XFEL-SPI experiments. The answer determines whether a proposed experiment is ‘feasible’, as well as how many different samples to inject during the precious dozens of hours of XFEL beamtime allocated to each user group. Orientation disconcurrence can be used to define data sufficiency: the data are sufficient when the number of patterns gives a disconcurrence smaller than the angular width of speckles at a target resolution $\mathbf{q}_\text {target}$:

$$\begin{aligned} 2\cdot \frac{\Delta \theta _\text {c}}{\sqrt{2}} \le \theta _\text {sp} (\mathbf{q} _\text {target}) \; . \end{aligned}$$

(7)

If the ADD peak in Fig. 4 were compact and locally Gaussian (“A one-dimensional (1D) model” section), this last condition means that approximately $74\%$ ($2\sigma$ criterion) of the oriented sentinel patterns should intersect their target 3D speckle at resolution $\mathbf{q} _\text {target}$.

With the disconcurrence target defined, we can extrapolate data sufficiency with bootstrapping. Given $M_{\text {data}}$ total patterns, one can compute $\Delta \theta _\text {c} (M_{\text {data}})$ for pairs of models reconstructed from random, non-overlapping, equal subsets of the full $M_{\text {data}}$ dataset, similar to the data points in Fig. 7. Repeating this procedure via a simple bootstrapping scheme gives the orientation disconcurrence curves in Fig. 7. These curves fit reasonably well to a lifted power law, $\Delta \theta _\text {c} = \alpha _\text {c} M_{\text {data}}^{- \beta _\text {c}} + \gamma _\text {c}$. The shrinking error bars on $\Delta \theta _\text {c}$ from bootstrapping with increasing $M_{\text {data}}$ in Fig. 7 suggest that this fit requires sufficiently many patterns to be robust.
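The subset-drawing part of this bootstrapping scheme can be sketched as follows. The sketch assumes the patterns are addressed by integer index and that the sentinel set has already been set aside; the function name `draw_disjoint_pair` and the dataset sizes are hypothetical. The reconstruction and disconcurrence steps are omitted.

```python
import random

def draw_disjoint_pair(num_patterns, subset_size, rng):
    """Draw two random, non-overlapping, equally sized index subsets."""
    if 2 * subset_size > num_patterns:
        raise ValueError("two disjoint subsets of this size do not fit")
    # Sampling without replacement guarantees the two halves are disjoint
    chosen = rng.sample(range(num_patterns), 2 * subset_size)
    return chosen[:subset_size], chosen[subset_size:]

rng = random.Random(0)  # fixed seed so the draws are reproducible
# Five bootstrap repetitions at one subset size (hypothetical sizes)
pairs = [draw_disjoint_pair(10_000, 2_000, rng) for _ in range(5)]
```

Each pair would feed two independent reconstructions, and repeating the draw at several subset sizes gives the points and error bars to which the lifted power law is fitted.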

Owing to various constraints, only a finite number of XFEL-SPI patterns are collected each time (say $M_{\text {exp}}$). To maximize signal-averaging, a reconstruction should logically use all collected patterns. Yet the two independent reconstructions in this framework (Fig. 3) each see only a little less than half of the full dataset ($< M_{\text {exp}}/2$). Fortunately, the lifted power law fit in Fig. 7 allows us to extrapolate the orientation disconcurrence between a pair of hypothetical independent 3D reconstructions that each used all patterns in an XFEL-SPI dataset. Specifically, if $\Delta \theta _\text {c}(M_{\text {data}}\le M_{\text {exp}}/2)$ were computed between pairs of reconstructed volumes each using up to $M_{\text {exp}}/2$ bootstrapped photon patterns, then the angular uncertainty of a single volume with all $M_{\text {exp}}$ patterns can be extrapolated using the fit: $\Delta \theta _\text {c}(M_{\text {data}}=M_{\text {exp}}) = \alpha _\text {c} M_{\text {exp}}^{-\beta _\text {c}} + \gamma _\text {c}$. A similar extrapolation from bootstrapped reconstructions was proposed to define spatial resolution in cryo-electron microscopy⁴³.

This lifted power law also helps us extrapolate to a second scenario. Should the target orientation disconcurrence be the angular width of a single pixel at the edge of the detector, $\Delta \theta _\text {c}=\Delta \theta _{\text {pix}}(\mathbf{q} _\text {max})$, then $\gamma _\text {c} < \Delta \theta _\text {pix}(\mathbf{q} _\text {max})$ is required. If this requirement is satisfied, then inverting the fit shows that $M_{\text {data}} = \left[ \alpha _\text {c}/(\Delta \theta _{\text {pix}}(\mathbf{q} _\text {max}) - \gamma _\text {c}) \right] ^{1/\beta _\text {c}}$ patterns are needed to reach this target, or equivalently $\log M_{\text {data}} = \frac{1}{\beta _\text {c}}\log {\left[ \alpha _\text {c}/(\Delta \theta _{\text {pix}}(\mathbf{q} _\text {max}) - \gamma _\text {c}) \right] }$.
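Both extrapolations follow from evaluating or inverting $\Delta \theta _\text {c} = \alpha _\text {c} M_{\text {data}}^{-\beta _\text {c}} + \gamma _\text {c}$. The sketch below shows this; the function names and parameter values are hypothetical and are not fitted values from Fig. 7.

```python
import math

def disconcurrence(m_data, alpha, beta, gamma):
    """Lifted power law: alpha * M**(-beta) + gamma."""
    return alpha * m_data ** (-beta) + gamma

def patterns_needed(target, alpha, beta, gamma):
    """Number of patterns at which the fitted disconcurrence reaches `target`."""
    if target <= gamma:
        return math.inf  # the fit plateaus at gamma, so the target is unreachable
    return (alpha / (target - gamma)) ** (1.0 / beta)

alpha, beta, gamma = 20.0, 0.6, 0.25   # illustrative values only
m_exp = 10000                          # hypothetical size of the full dataset
dtheta_full = disconcurrence(m_exp, alpha, beta, gamma)  # extrapolation to M_exp
m_needed = patterns_needed(0.30, alpha, beta, gamma)     # patterns for a 0.30 target
```

A target at or below $\gamma _\text {c}$ returns infinity, which mirrors the requirement $\gamma _\text {c} < \Delta \theta _\text {pix}(\mathbf{q} _\text {max})$.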

The lifted power law form of $\Delta \theta _\text {c} = \alpha _\text {c} M_{\text {data}}^{- \beta _\text {c}} + \gamma _\text {c}$ in Fig. 7 allows us to parametrize data sufficiency in an information-theoretic sense. Essentially, the mutual information here can be defined as the reduction in the entropy of orienting an average sentinel pattern given a set of $M_{\text {data}}$ photon patterns $\{K\}$. Ignoring factors of order unity, this mutual information is approximately

$$\begin{aligned} I(\Omega _{\text {S}}, \{K\})&\approx \log \left( \frac{2 \pi ^2}{\Delta \theta _\text {c}^3} \right) \nonumber \\&\approx \log \left( \frac{2 \pi ^2}{ \Delta \theta _\text {i}^{*3}} \right) - \frac{3\alpha _\text {c}}{\Delta \theta _\text {i}^*} M_{\text {data}}^{-\beta _\text {c}} \; , \end{aligned}$$

(8)

assuming $M_{\text {data}}\gg 1$.

Equation (8) contains two intuitive results. First, this mutual information is bounded from above by its value when the solution intensities are known: $\log \left( 2 \pi ^2 / (\Delta \theta _\text {i}^*)^3 \right)$. This upper bound can be viewed as the SPI channel capacity for decrypting orientations, and is computed in the same manner as the mutual information $I(K,\Omega )|_W$ in⁸. Second, the mutual information for decrypting orientations increases with the number of patterns. This assumes that $\alpha _\text {c}/\Delta \theta _\text {i}^*> 0$ and $\beta _\text {c} > 0$, which are manifest in Fig. 7. Furthermore, $\beta _\text {c} > 0.5$ in Fig. 7, which is better than one would expect if patterns were mutually independent (i.e. $\beta _\text {c} = 0.5$). This ‘co-dependence’ arises because additional patterns can improve the reconstructed volumes, which in turn help earlier patterns distribute their photons more precisely into orientation classes.
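The second line of Eq. (8) is a first-order expansion of the first line in the small quantity $\alpha _\text {c} M_{\text {data}}^{-\beta _\text {c}}/\Delta \theta _\text {i}^*$, provided the plateau $\gamma _\text {c}$ of the lifted power law is identified with $\Delta \theta _\text {i}^*$ (our reading of the equation). The sketch below compares the two lines numerically, in nats, with hypothetical parameter values.

```python
import math

def mutual_information_exact(m_data, alpha, beta, dtheta_i_star):
    """First line of Eq. (8), with gamma_c taken to be dtheta_i_star."""
    dtheta_c = alpha * m_data ** (-beta) + dtheta_i_star
    return math.log(2.0 * math.pi ** 2 / dtheta_c ** 3)

def mutual_information_approx(m_data, alpha, beta, dtheta_i_star):
    """Second line of Eq. (8): capacity minus the leading finite-data penalty."""
    capacity = math.log(2.0 * math.pi ** 2 / dtheta_i_star ** 3)
    return capacity - 3.0 * alpha / dtheta_i_star * m_data ** (-beta)

alpha, beta, dtheta_i_star = 20.0, 0.6, 0.25   # illustrative values only
capacity = math.log(2.0 * math.pi ** 2 / dtheta_i_star ** 3)
exact = mutual_information_exact(1e6, alpha, beta, dtheta_i_star)
approx = mutual_information_approx(1e6, alpha, beta, dtheta_i_star)
```

Both values stay below the capacity and approach it as $M_{\text {data}}$ grows.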

Focal spot size affects hit rate and orientation disconcurrence

The linear size of the XFEL focus $L_\text {focus}$ is a critical parameter in an SPI experiment (see Table 1). This choice of focus size can be paraphrased simply: given a fixed total number of photons per XFEL pulse, would it be better to ‘distribute’ them into more patterns with fewer photons each, or fewer patterns with more photons each? Whereas a larger focus can dramatically increase the odds of illuminating randomly injected particles, it also drastically decreases the number of scattered photons should a particle be illuminated (N). These odds, also known as the ‘hit-rate’, are effectively $M_{\text {data}}$ per unit time. In fact, $N\propto L_\text {focus}^{-2}$ while $M_{\text {data}}/\text {time} \propto L_\text {focus}^{2}$. In this hypothetical scenario, the total number of photons measured per unit time ($N M_{\text {data}}/\text {time}$) remains constant regardless of $L_\text {focus}$. Suppose that in either case, you had enough patterns to adequately sample different views of the scatterer, and were perfectly able to detect particle hits against background scatter/noise. This same ambivalence to the focus size appears again in the simple signal-to-noise ratio (SNR) described in⁸:

$$\begin{aligned} \text {SNR} = \left( \frac{N M_{\text {data}}}{M_{\text {rot}}}\right) ^{1/2} \;, \end{aligned}$$

(9)

where $M_{\text {rot}}$ is the number of rotation samples used to reconstruct the intensity volumes $W_A$ and $W_B$. This SNR is motivated by a simple distribution of photons across a limited number of Ewald tomograms, and has been used to indicate data sufficiency in the orientation space⁹.
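The indifference of Eq. (9) to the focus size can be made explicit with a few lines of arithmetic. In the sketch below, the scaling helper and the value of $M_{\text {rot}}$ are hypothetical.

```python
import math

def snr(n_photons, m_data, m_rot):
    """Eq. (9): sqrt(N * M_data / M_rot)."""
    return math.sqrt(n_photons * m_data / m_rot)

def rescale_focus(n_photons, m_data, scale):
    """Scale the focus width by `scale`, using N proportional to L**-2 and
    M_data (per unit time) proportional to L**2."""
    return n_photons / scale ** 2, m_data * scale ** 2

n0, m0 = 622.0, 5000.0
m_rot = 10000.0                          # hypothetical number of rotation samples
n1, m1 = rescale_focus(n0, m0, 2.0)      # doubling the focus width
snr_small_focus = snr(n0, m0, m_rot)
snr_large_focus = snr(n1, m1, m_rot)
```

The two SNR values coincide because the product $N M_{\text {data}}$ is unchanged, which is why Eq. (9) alone cannot express a preference for either focus.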

The discussion above may lead one to believe that there is no ideal focus size. However, if we again used a smaller orientation disconcurrence $\Delta \theta _\text {c}$ to quantify when things are ‘better’, the preference is to reduce $L_\text {focus}$. Notice that nearly doubling the average number of photons per pattern ($N =355$ to $N=622$ given $M_{\text {data}}=5000$) in Fig. 6 reduces both $\Delta \theta _\text {c}$ and $\Delta \theta _\text {i}$ more than if we doubled the number of patterns ($M_{\text {data}}=5000$ to $M_{\text {data}}=10000$ given $N=355$) in Fig. 7. The total number of photons in all patterns is approximately equal in both cases. Yet doubling the average number of photons per pattern substantially improves the asymptotic orientation inconsistency (i.e. $\Delta \theta _\text {i}^*$ falls).

Discussion

In summary, we propose an encryption–decryption approach to validate 3D intensity volumes reconstructed in XFEL-SPI. This validation is based on the volumes’ ability to decrypt the orientations of sentinel patterns unused in these reconstructions. While these volumes can be reconstructed by any algorithmic means, they must strictly adhere to the data independence scheme laid out in Fig. 3. This scheme can be generalized to validate other latent information inferred within the full dataset (e.g. unmeasured local photon fluence, structural class, etc).

Realistic simulations of SPI experiments show that this approach can validate reconstructions in a principled information-theoretic manner. Our approach relates the challenging question of data sufficiency intuitively to key experimental variables such as the number of measured photon patterns and the nominal incident photon intensity. Furthermore, the various forms of decrypting (orientation) uncertainties shown here can be interpreted as disconcurrence, disagreement, and inconsistencies in how confidently the latent variables are inferred. These interpretations give a more informative and comprehensive view of the validation exercise.

Whereas there were studies about the expected scattered photon signals from biomolecules in idealized XFEL-SPI scenarios^44,45, systematic studies of how well these signals can be integrated into a 3D diffraction volume despite missing information are still sorely lacking. Our results show that the complex considerations that contribute to data sufficiency in XFEL-SPI can be captured by a few simple fit parameters (e.g. $\alpha , \beta , \gamma$). Relating these parameters to basic properties of the target scatterer (e.g. mass, radius of gyration, etc), experimental conditions (e.g. beam intensity, photon wavelength, background scattering, etc), and choice of reconstruction algorithms will be useful for experiment design and planning.

An extension of our encryption–decryption approach can be used to define and validate the spatial resolution of XFEL-SPI and cryo-electron microscopy reconstructions. In principle, the resolving power of an imaging instrument should be the reduction in uncertainty of locating spatial features within the sample. Re-framing this uncertainty reduction in the encryption–decryption framework of Fig. 3 may give rise to more interpretable notions of spatial resolution. This information theoretic formulation of this conceptual framework, similar to Eq. (8), also naturally accounts for external priors for localizing spatial features.

Ultimately, our encryption–decryption approach demonstrably overcomes the difficulties of using FSC as a validation measure for XFEL-SPI, in spite of FSC’s popularity^{13,16,18,19,20,21,22,23,24,25,26,27,28,29}. The data throughput from XFELs will rapidly increase because of higher pulse repetition rates⁴⁶ and more efficient sample injection techniques. This trend inevitably creates a larger data load, which in turn increases our reliance on statistical techniques to assign confidence to de novo structural reconstructions. Such confidence is especially important when imaging structural ensembles with considerable flexibilities, or other structural variations. Despite the specificity of our validation routine to orientations, the encryption–decryption framework proposed in Fig. 3 can be readily generalized to test the reproducibility of claims of novel reconstructed structures. Such tests, we believe, are central to illuminating our path towards novel structural insights as we navigate through the photon-limited world of XFEL-SPI.

Methods

Sampling orientations

A scatterer can take on an infinite number of possible 3D orientations. In practice these orientations Q are discretely sampled with angular divisions smaller than the intrinsic angular precision of the patterns (see “Relating $\Delta \theta$ to spatial resolution” section). We adopt a quasi-uniform sampling scheme based on⁸, which adaptively refines the 600-cell polytope with refinement parameter n. In this scheme the number of orientation samples scales like $n^3$, while their angular spacing shrinks like 1/n.

Orientation posterior distribution (OPD) of sentinel patterns

The orientation posterior distribution (OPD) of a particular sentinel pattern $K_\text {S}$ defines the probability of orienting it within a specific 3D diffraction volume W. This OPD, written here as $P(Q\,\vert \,K_\text {S},W)$, can be inferred from the likelihood $P(K_\text {S}\,\vert \,Q, W)$ using Bayes’ theorem,

$$\begin{aligned} P(Q \,\vert \,K_\text {S},W) \propto P(K_\text {S}\,\vert \,Q,W ) \, P(Q), \end{aligned}$$

(10)

where the prior distribution of orientations, P(Q), is uniformly distributed unless the specimens have a known orientation bias. Because the space of orientations is only quasi-uniformly sampled by unit quaternions in our discretization scheme, we replace P(Q) with the numerically computed non-uniform weights w(Q)⁹. Note that this OPD can be computed even if $K_{\text {S}}$ did not in fact originate from W: such a computation will naturally yield highly uncertain orientations of $K_{\text {S}}$.

We presume the likelihood of detecting a sentinel pattern $K_{\text {S}}$ (comprising pixels indexed by t) from the Ewald tomogram at orientation Q of volume W (see Fig. 1) assuming perfect detection absent background photon sources is

$$\begin{aligned} P(K_{\text {S}}\,\vert \,Q,W ) = \prod _{t \in \text {detector}} \frac{ \text {e}^{-W_{Q t}} \,W_{Q t}^{K_{\text {S}t}} }{K_{\text {S}t}!}. \end{aligned}$$

(11)

This likelihood can be replaced if the true detection statistics depart from this Poissonian form.

Often the posterior and likelihood in Eqs. (10) and (11) of a converged intensity volume are significant only for a relatively small set of orientations. For a given pattern $K_{\text {S}}$, we represent this set of important orientations by their corresponding important unit quaternions $\{{\varvec{Q}}\,\vert \,K_{\text {S}}\}$ (written in boldface). For computational efficiency, only the probability at $\{{\varvec{Q}}\,\vert \,K_{\text {S}}\}$ is recorded; those at other quaternions are safely set to zero.
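Equations (10) and (11) can be combined into a short numerical recipe: evaluate the Poisson log-likelihood of the pattern against each Ewald tomogram, add the log of the orientation weight, and normalize. The sketch below is a toy version with three orientations and three pixels. The function names and all numbers are made up, and it assumes every tomogram pixel has a strictly positive expected intensity.

```python
import math

def log_poisson_likelihood(pattern, tomogram):
    """log P(K_S | Q, W): sum over pixels t of K_t*log(W_t) - W_t - log(K_t!)."""
    total = 0.0
    for k, w in zip(pattern, tomogram):
        total += k * math.log(w) - w - math.lgamma(k + 1)
    return total

def orientation_posterior(pattern, tomograms, weights):
    """Posterior over orientation samples, Eq. (10), with P(Q) replaced by w(Q)."""
    logs = [log_poisson_likelihood(pattern, t) + math.log(wt)
            for t, wt in zip(tomograms, weights)]
    peak = max(logs)  # subtract the maximum before exponentiating to avoid underflow
    unnormalized = [math.exp(v - peak) for v in logs]
    norm = sum(unnormalized)
    return [u / norm for u in unnormalized]

# Toy inputs: expected photon counts per pixel for three orientation samples.
tomograms = [[0.1, 2.0, 0.5], [2.0, 0.1, 0.5], [0.5, 0.5, 2.0]]
weights = [1.0, 1.0, 1.0]   # uniform orientation weights w(Q)
pattern = [0, 3, 1]         # photon counts K_S per pixel
opd = orientation_posterior(pattern, tomograms, weights)
```

Working in log space matters in practice because the product in Eq. (11) runs over many pixels and quickly underflows.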

For sufficient orientation coverage, we require these important quaternions to capture at least 99% of the total posterior distribution. To implement this, all patterns’ posterior distributions are first sampled by a unit quaternion set $\{Q \,\vert \,n\}$ using the 600-cell quaternion sampling strategy⁸, where n is the sampling refinement level. Then we increase n until the smallest set of important quaternions $\{{\varvec{Q}}\,\vert \,K_{\text {S}},n\}_{\text {min}} \subset \{Q \,\vert \,n\}$ that captures this fraction of the posterior distribution comprises at least 100 important quaternions:

$$\begin{aligned} \Big \langle \sum _{Q \in \{{\varvec{Q}}\,\vert \,K_{\text {S}}, n\}_{\text {min}}} P(Q \,\vert \,K_{\text {S}}, W)\Big \rangle _{K_{\text {S}}} \ge 0.99 \; , \end{aligned}$$

(12)

and, for every $K_{\text {S}}$, the size of this set satisfies $|\{{\varvec{Q}}\,\vert \,K_{\text {S}}, n\}_{\text {min}}| \ge 100$. To be concise, we omit the subscript $\cdot _\text {min}$ in subsequent formulae.
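For a single pattern, the smallest set of orientation samples that captures a given fraction of the posterior can be found by sorting the samples by probability and accumulating. The sketch below shows only this per-pattern step. The function name is ours, and the outer loop that raises n until each set holds at least 100 quaternions is omitted.

```python
def important_subset(posterior, mass=0.99):
    """Indices of the smallest set of samples whose posterior sums to >= mass,
    together with the captured probability."""
    order = sorted(range(len(posterior)), key=lambda i: posterior[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += posterior[i]
        if total >= mass:
            break
    return kept, total

# Toy posterior over five orientation samples (illustrative values only).
posterior = [0.004, 0.60, 0.30, 0.001, 0.095]
kept, captured = important_subset(posterior)
```

Averaging `captured` over all sentinel patterns gives the left-hand side of Eq. (12).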

Angular displacement distribution (ADD) between two reconstructed volumes

Returning to our cryptography analogy, our next step is to compare how two diffraction volumes decrypt the orientations of a set of sentinel patterns. Three key considerations stand out here. First, the orientation of a noisy sentinel pattern is described by a probability distribution (i.e. OPD) rather than a point estimate. Second, $W_A$ and $W_B$ would almost always differ by an overall mutual 3D rotation $Q_{BA}$ because each volume is typically randomly initialized to avoid reconstruction biases. Hence, the sentinel OPDs for $W_A$ and $W_B$ would also be displaced by $Q_{BA}$. Third, we must average the OPDs for different sentinel patterns to obtain a robust estimate of the orientation disconcurrence between $W_A$ and $W_B$. These considerations are captured in the angular displacement distribution (ADD) between $W_A$ and $W_B$. The ADD allows us to compare the OPD of a single sentinel pattern ($K_{\text {S}}$) given $W_A$ and $W_B$ without having to pre-align them in the space of possible orientations.

Mathematically, the ADD for a single sentinel pattern $K_{\text {S}}$ is the outer product (or convolution) of its two OPDs given $W_A$ and $W_B$ on their respective important quaternions,

$$\begin{aligned} P({\varvec{Q}}_{BA} | K_\text {S}, W_A, W_B)&= \sum _{{\varvec{Q}}_{A}} P({\varvec{Q}}_{A}|K_\text {S}, W_A) P({\varvec{Q}}_{B} |K_\text {S}, W_B) \nonumber \\&=\sum _{{\varvec{Q}}_{A}} P({\varvec{Q}}_{A}|K_\text {S}, W_A) P({\varvec{Q}}_{BA} {\varvec{Q}}_{A} |K_\text {S}, W_B) \; , \end{aligned}$$

(13)

which is computed over the set of important unit quaternions. Here ${\varvec{Q}}_{BA} = {\varvec{Q}}_B {\varvec{Q}}_A^{-1}$ represents the possible relative orientations between the reconstructed volumes $W_A$ and $W_B$ over the two sets of important quaternions $\{{\varvec{Q}}_A | K_{\text {S}}\}$ and $\{{\varvec{Q}}_B | K_{\text {S}}\}$ as defined in Eq. (12). Since ${\varvec{Q}}_{BA}$ depends on the sentinel pattern $K_\text {S}$, the ADD in Eq. (13) may be different for different $K_{\text {S}}$. Averaging the ADD over the entire set of sentinel patterns $\{ K_{\text {S}}\}$ we get

$$\begin{aligned} P({\varvec{Q}}_{BA} |\{ K_\text {S}\}, W_A, W_B) \equiv \Big \langle P({\varvec{Q}}_{BA} | {K_\text {S}}, W_A, W_B) \Big \rangle _{\{K_\text {S}\}}\; . \end{aligned}$$

(14)

Given the noise in the diffraction patterns, we expect variations in the decrypted orientations of sentinel patterns. To compute this variation, an average of an ADD must be established. When the reconstructed volumes $W_A$ and $W_B$ are similar, the ADDs of their many sentinel patterns tend to cluster around the average unit quaternion ${\overline{Q}}_{BA}$ in orientation space. This overall rotation ${\overline{Q}}_{BA}$ is not a mere linear average of the unit quaternions that sample the ADD, since such an average may not have unit length and hence may not correspond to a 3D spatial rotation. To define ${\overline{Q}}_{BA}$, let us first consider the relative rotation between ${\varvec{Q}}_{BA}$ and a presumptive average overall rotation ${\widetilde{Q}}$. This relative rotation can be written as a quaternion multiplication

$$\begin{aligned} {\varvec{Q}}_{BA}^{-1} \, {\widetilde{Q}}&= \Big \{ \cos \left( \frac{\theta }{2} \right) , \, \sin \left( \frac{\theta }{2} \right) \hat{\varvec{n}} \Big \} \, , \end{aligned}$$

(15)

which is written here as a four-component vector; $\hat{\varvec{n}}$ and $\theta$ are respectively the axis and magnitude of this relative rotation. The magnitude of this relative rotation, $\theta ({\varvec{Q}}_{BA}, {\widetilde{Q}})$, vanishes as ${\widetilde{Q}}$ approaches ${\varvec{Q}}_{BA}$.
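Equation (15) translates directly into code. The sketch below stores quaternions as (w, x, y, z) tuples and returns the rotation magnitude $\theta$. Taking the absolute value of the scalar part is our choice, so that Q and −Q (which describe the same rotation) give the same angle in $[0, \pi ]$.

```python
import math

def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def q_inv(q):
    """Inverse of a unit quaternion (its conjugate)."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotation_angle(q_ba, q_tilde):
    """Magnitude theta of the relative rotation q_ba^{-1} * q_tilde, Eq. (15)."""
    w = q_mul(q_inv(q_ba), q_tilde)[0]
    return 2.0 * math.acos(min(1.0, abs(w)))

identity = (1.0, 0.0, 0.0, 0.0)
q_small = (math.cos(0.15), 0.0, 0.0, math.sin(0.15))  # 0.3 rad about the z-axis
theta = rotation_angle(identity, q_small)
```

The `min(1.0, ...)` guard protects `acos` against scalar parts that exceed one through rounding.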

We define the average overall rotation ${\overline{Q}}_{BA}$ of an ADD between $W_A$ and $W_B$ as that which minimizes the average $\theta$ against all the rotation samples of the ADDs for the set of sentinel patterns. Specifically, the average overall rotation is defined as the unit quaternion that minimizes the angular variance $\Theta ^2$:

$$\begin{aligned} {\overline{Q}}_{BA}&\equiv \mathop {\hbox {arg min}}\limits _{{\widetilde{Q}}} \Theta ^2\bigl ( {\widetilde{Q}} \,\big \vert \,\{K_\text {S}\}, W_A, W_B \bigr ) \, , \end{aligned}$$

(16)

and the orientation disconcurrence is the minimum value of $\sqrt{\Theta ^2}$:

$$\begin{aligned} \Delta \theta _\text {c}(W_A, W_B)&\equiv \min _{{\widetilde{Q}}} \sqrt{\Theta ^2\bigl ( {\widetilde{Q}} \,\big \vert \,\{K_\text {S}\}, W_A, W_B \bigr )}\nonumber \\&=\sqrt{\Theta ^2({\overline{Q}}_{BA} \,\vert \,\{K_\text {S}\}, W_A, W_B)}\;, \end{aligned}$$

(17)

where the angular variance is defined as

$$\begin{aligned}&\Theta ^2\bigl ({\widetilde{Q}} \,\big \vert \,\{K_\text {S}\}, W_A, W_B \bigr ) =\nonumber \\&\left\langle \sum _{\{{\varvec{Q}}_{BA} \,\vert \,K_\text {S}\}} P({\varvec{Q}}_{BA} \,\vert \,K_\text {S}, W_A, W_B) \, \theta ^2({\varvec{Q}}_{BA}, {\widetilde{Q}}) \right\rangle _{\{K_\text {S}\}}\;. \end{aligned}$$

(18)

A special case here is when $W_A$ and $W_B$ are identical. In this case, ${\overline{Q}}_{BA}=(1,0,0,0)$ which is the identity quaternion.
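Equations (16) to (18) can be sketched for a sampled ADD as follows. Each sentinel pattern contributes a list of (probability, quaternion) samples, and $\Theta ^2$ is the pattern-averaged, probability-weighted squared angle to a candidate ${\widetilde{Q}}$. The minimization here is deliberately crude: it only tries the ADD samples themselves as candidates. That is an assumption of this sketch rather than the optimizer used in this work, and the toy data are made up.

```python
import math

def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def rotation_angle(q_ba, q_tilde):
    """Magnitude of the relative rotation q_ba^{-1} * q_tilde."""
    w, x, y, z = q_ba
    rel_w = q_mul((w, -x, -y, -z), q_tilde)[0]
    return 2.0 * math.acos(min(1.0, abs(rel_w)))

def rot_z(angle):
    """Unit quaternion for a rotation by `angle` about the z-axis."""
    return (math.cos(angle / 2.0), 0.0, 0.0, math.sin(angle / 2.0))

def angular_variance(add_per_pattern, q_tilde):
    """Eq. (18): pattern average of sum_Q P(Q) * theta(Q, q_tilde)**2."""
    total = 0.0
    for samples in add_per_pattern:
        total += sum(p * rotation_angle(q, q_tilde) ** 2 for p, q in samples)
    return total / len(add_per_pattern)

def crude_disconcurrence(add_per_pattern):
    """Approximate Eqs. (16)-(17) by testing each ADD sample as a candidate."""
    candidates = [q for samples in add_per_pattern for _, q in samples]
    best = min(candidates, key=lambda q: angular_variance(add_per_pattern, q))
    return best, math.sqrt(angular_variance(add_per_pattern, best))

# Toy ADD for two sentinel patterns (illustrative values only).
add_per_pattern = [
    [(0.5, rot_z(-0.1)), (0.5, rot_z(0.1))],
    [(1.0, rot_z(0.0))],
]
q_bar, dtheta_c = crude_disconcurrence(add_per_pattern)
```

A continuous optimizer started from `q_bar` would refine this estimate when the true minimum lies between samples.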

Resolving ambiguities from centro-symmetric diffraction volumes

To obtain the most compact ADD (Eq. (14)), we must eliminate trivial symmetries in the diffraction patterns that broaden the ADD. One such example is the centro-symmetry of 3D diffraction intensities from optically thin samples, whose scattering density distribution is effectively real-valued. Consequently, at sufficiently low resolutions any two-dimensional diffraction pattern is similar to itself after a 180° in-plane rotation about the scattering experiment’s optical axis (${\hat{z}}$). Each such photon pattern K should have similar posterior probabilities to occur at either rotation Q or $Q Q_z$:

$$\begin{aligned} P(Q\,\vert \,K, W) \approx P(QQ_z\,\vert \,K,W) \; , \end{aligned}$$

(19)

where the in-plane rotation about the z-axis is $Q_z = (0,0,0,1)$. This two-fold ambiguity plus the fact that $Q_z$ is its own inverse, means that in ADD, the relative rotation $Q_{BA}$ or $Q_{BA}^{\prime } = Q_B \,Q_z \,(Q_A)^{-1}$ could occur in Eq. (14). Hence, for each ADD sample we check the angular closeness of both $Q_{BA}$ and $Q_{BA}^{\prime }$ to the ADD’s average unit quaternion ${\overline{Q}}_{BA}$, and keep the one that is closer. This essentially replaces the $\theta$ expression in Eq. (18):

$$\begin{aligned} \theta ^2({\varvec{Q}}_{BA}, {\widetilde{Q}}) \rightarrow \text {min}\{\theta ^2({\varvec{Q}}_B{\varvec{Q}}_A^{-1}, {\widetilde{Q}}), \theta ^2({\varvec{Q}}_B Q_z {\varvec{Q}}_A^{-1}, {\widetilde{Q}})\} \; . \end{aligned}$$

(20)
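The replacement in Eq. (20) amounts to evaluating the squared angle twice, with and without the in-plane flip $Q_z$, and keeping the smaller value. A minimal sketch, with function names of our own choosing:

```python
import math

def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def q_inv(q):
    """Inverse of a unit quaternion (its conjugate)."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotation_angle(q_ba, q_tilde):
    """Magnitude of the relative rotation q_ba^{-1} * q_tilde."""
    w = q_mul(q_inv(q_ba), q_tilde)[0]
    return 2.0 * math.acos(min(1.0, abs(w)))

Q_Z = (0.0, 0.0, 0.0, 1.0)  # 180-degree in-plane rotation about the z-axis

def min_angle_sq(q_a, q_b, q_tilde):
    """Eq. (20): smaller squared angle of Q_B Q_A^{-1} and Q_B Q_z Q_A^{-1}."""
    direct = q_mul(q_b, q_inv(q_a))
    flipped = q_mul(q_mul(q_b, Q_Z), q_inv(q_a))
    return min(rotation_angle(direct, q_tilde) ** 2,
               rotation_angle(flipped, q_tilde) ** 2)

identity = (1.0, 0.0, 0.0, 0.0)
half = (math.pi + 0.2) / 2.0
q_b = (math.cos(half), 0.0, 0.0, math.sin(half))  # flipped by 180 degrees, plus 0.2 rad
theta_sq = min_angle_sq(identity, q_b, identity)
```

In this toy case the unflipped angle is $\pi - 0.2$, whereas the flipped one is 0.2, so the flip is selected.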

Discrete symmetries in the diffraction volume

Discrete symmetries in the diffraction volume can create multiple clusters in the ADD (Fig. 9). Examples of such symmetries include icosahedral viral capsids¹³ and octahedral nanoparticles¹⁸. The multiplicity of these clusters arises because each pattern could be oriented at different and/or multiple locations of the symmetry orbit within the diffraction volume. As Fig. 9 shows, should this symmetry be known we can compute a single orientation disconcurrence by first folding these multiple symmetry-related peaks in the ADD into the fundamental domain. We emphasize that this folding can be done even if this symmetry were not imposed during the reconstructions of $W_A$ and $W_B$.

Figure 9 illustrates ADD folding for a particle with chiral octahedral symmetry (O). The reconstructed diffraction intensities of this particle ($W_A$ and $W_B$) have 24 rotational symmetries (the group is of order 24). Once $W_A$’s body axes are canonically aligned, each of these symmetry rotations can be represented by a canonical set of unit quaternions $\{ Q_\mathbf{O} \,\vert \,\left[ Q_\mathbf{O}\right] \in \mathbf{O}\}$ ($\left[ Q_\mathbf{O}\right]$ is the equivalence class $Q_\mathbf{O} \sim -Q_\mathbf{O}$ owing to unit quaternions double covering SO(3)).
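One way to build such a canonical set is to enumerate the 48 unit quaternions of the binary octahedral group and keep one member of each $\pm Q$ pair. The sketch below assumes the symmetry axes lie along the coordinate axes and body diagonals of a cube. This enumeration is a standard construction and not necessarily the one used here.

```python
import itertools
import math

def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def octahedral_quaternions():
    """One unit quaternion per rotation of the chiral octahedral group O."""
    quats = set()
    # Identity and 180-degree rotations about the coordinate axes.
    for i in range(4):
        for s in (1.0, -1.0):
            q = [0.0] * 4
            q[i] = s
            quats.add(tuple(q))
    # 120-degree rotations about the body diagonals.
    for signs in itertools.product((0.5, -0.5), repeat=4):
        quats.add(signs)
    # 90-degree rotations about coordinate axes and 180-degree rotations
    # about axes such as (1, 1, 0).
    r = math.sqrt(0.5)
    for i, j in itertools.combinations(range(4), 2):
        for si, sj in itertools.product((r, -r), repeat=2):
            q = [0.0] * 4
            q[i], q[j] = si, sj
            quats.add(tuple(q))
    # Keep one representative of each {Q, -Q} pair.
    reps = []
    for q in sorted(quats):
        if tuple(-c for c in q) not in reps:
            reps.append(q)
    return reps

O_GROUP = octahedral_quaternions()
```

The resulting list has 24 entries, and the product of any two entries equals another entry up to an overall sign.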

To see how this symmetry manifests in an ADD, consider orienting a particular sentinel pattern $K_\text {S}$ within $W_A$ and $W_B$. Note that even though $W_A$ and $W_B$ have $\mathbf{O}$ symmetry, they are not canonically aligned by default. First, we focus on a tomogram of $W_B$ at ${\varvec{Q}}_B$, $T({\varvec{Q}}_B, W_B)$. Here, the symbol for the tomogram is changed from the $W_Q$ in the main text to avoid multiple levels of subscripts. When we align $W_B$ canonically by actively rotating it to ${\widetilde{Q}}_{{\mathbf {O}}B}[W_B]$, the tomogram must be rotated along with it to remain unchanged; here ${\widetilde{Q}}_{\mathbf{O}B}$ is the rotation that brings $W_B$ into the canonical axes for the symmetry operations in $\{Q_\mathbf{O}\}$. In other words, we have

$$\begin{aligned} T({\varvec{Q}}_B, W_B)&= T\bigl ({\widetilde{Q}}_{{\mathbf {O}}B}{\varvec{Q}}_B, {\widetilde{Q}}_{{\mathbf {O}}B}[W_B]\bigr ) \end{aligned}$$

(21)

$$\begin{aligned}&=T\bigl ({\widetilde{Q}}_{{\mathbf {O}}B}{\varvec{Q}}_B, (Q_{\mathbf {O}}{\widetilde{Q}}_{{\mathbf {O}}B})[W_B]\bigr ) \end{aligned}$$

(22)

$$\begin{aligned}&=T\bigl ({\widetilde{Q}}_{{\mathbf {O}}B}^{-1}Q_{\mathbf {O}}^{-1}{\widetilde{Q}}_{{\mathbf {O}}B}{\varvec{Q}}_B, W_B\bigr )\text {.} \end{aligned}$$

(23)

The 24 elements in $\{Q_\mathbf{O}\}$ give 24 identical tomograms at ${\widetilde{Q}}_{{\mathbf {O}}B}^{-1}Q_{\mathbf {O}}{\widetilde{Q}}_{{\mathbf {O}}B}{\varvec{Q}}_B$ (since every $Q_{\mathbf {O}}^{-1}$ is also in $\{Q_{\mathbf {O}}\}$), hence the same orientation posterior probability at these orientations. Recall that the ADD comprises the joint product of the OPDs for $K_\text {S}$ to be oriented at ${\varvec{Q}}_A$ and ${\varvec{Q}}_B$ within $W_A$ and $W_B$ respectively. We therefore see this multiplicity in the ADD of Fig. 9b (main text), which contains 48 clusters owing to the unit quaternions double covering $\text {SO}(3)$. The number of clusters does not increase even if we include the symmetry operations of $W_A$, assuming $W_A$ and $W_B$ are similar, for the same reason that randomly oriented sentinel patterns in an asymmetric volume still produce a 2-clustered ADD (only one branch is plotted in Fig. 4).

For each sentinel pattern $K_\text {S}$, we can fold each important unit quaternion ${\varvec{Q}}_{BA}$ in its ADD into the fundamental domain by exhaustively searching the symmetry operation in $\bigr \{{\widetilde{Q}}_{{\mathbf {O}}B}^{-1}Q_{\mathbf {O}}{\widetilde{Q}}_{{\mathbf {O}}B}{\varvec{Q}}_B\,\big \vert \,Q_{{\mathbf {O}}}\in \{Q_{\mathbf {O}}\}\bigr \}$ and in-plane inversion $Q_z$ (either $\{1,0,0,0\}$ or $\{0,0,0,1\}$) that minimizes the angular variance

$$\begin{aligned}&\theta ^2_\text {min}\left( {\widetilde{Q}}_{\mathbf{O}B}, {\widetilde{Q}} \,\vert \,K_\text {S}, {\varvec{Q}}_{BA}\right) = \nonumber \\&\min _{\{Q_\mathbf{O}\} \times \{Q_z\}} \theta ^2\left( {\widetilde{Q}}_{\mathbf{O}B}^{-1} Q_\mathbf{O}\, {\widetilde{Q}}_{\mathbf{O}B} {\varvec{Q}}_B Q_z {\varvec{Q}}_A^{-1}, {\widetilde{Q}} \,\vert \,K_\text {S} \right) \; . \end{aligned}$$

(24)
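The inner minimization of Eq. (24) is an exhaustive search over symmetry operations and the two in-plane choices. The sketch below takes the symmetry operations as an argument. To keep the example short it uses the four-fold rotations about the z-axis as a stand-in for the full set $\{Q_\mathbf{O}\}$, and an identity canonical alignment; both are simplifications made only for illustration.

```python
import math

def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def q_inv(q):
    """Inverse of a unit quaternion (its conjugate)."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotation_angle(q_ba, q_tilde):
    """Magnitude of the relative rotation q_ba^{-1} * q_tilde."""
    w = q_mul(q_inv(q_ba), q_tilde)[0]
    return 2.0 * math.acos(min(1.0, abs(w)))

def rot_z(angle):
    """Unit quaternion for a rotation by `angle` about the z-axis."""
    return (math.cos(angle / 2.0), 0.0, 0.0, math.sin(angle / 2.0))

IN_PLANE = ((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 1.0))  # identity and Q_z

def folded_angle_sq(q_a, q_b, q_tilde, q_align, symmetry_ops):
    """Eq. (24): minimum squared angle over symmetry operations and Q_z."""
    best = math.inf
    for q_sym in symmetry_ops:
        # Symmetry operation expressed in the frame of the unaligned volume.
        conj = q_mul(q_mul(q_inv(q_align), q_sym), q_align)
        for q_z in IN_PLANE:
            q_ba = q_mul(q_mul(q_mul(conj, q_b), q_z), q_inv(q_a))
            best = min(best, rotation_angle(q_ba, q_tilde) ** 2)
    return best

identity = (1.0, 0.0, 0.0, 0.0)
c4_ops = [rot_z(k * math.pi / 2.0) for k in range(4)]  # stand-in symmetry set
q_b = rot_z(math.pi / 2.0 + 0.1)   # a symmetry-equivalent orientation, offset by 0.1 rad
theta_sq = folded_angle_sq(identity, q_b, identity, identity, c4_ops)
```

Folding recovers the 0.1 rad offset, whereas the unfolded angle would be $\pi /2 + 0.1$.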

Here, ${\widetilde{Q}}$ is the presumptive average relative rotation between $W_A$ and $W_B$ similar to that in Eq. (16). Like Eq. (20), we also minimize over each pattern’s in-plane inversion. Therefore, the optimal relative rotation (${\overline{Q}}_{BA}$) and canonical realignment (${\overline{Q}}_{\mathbf{O}B}$) are found by minimizing the total angular variance weighted over all important unit quaternions for all sentinel patterns in the ADD:

$$({\overline{Q}}_{\mathbf{O}B}, \; {\overline{Q}}_{BA}) = \mathop {\hbox {arg min}}\limits _{({\widetilde{Q}}_{\mathbf{O}B}, \; {\widetilde{Q}})} \Theta ^2\left( {\widetilde{Q}}_{\mathbf{O}B}, {\widetilde{Q}} \,\vert \,\{K_\text {S}\}, W_A, W_B \right)$$

where

$$\Theta ^2\left( {\widetilde{Q}}_{\mathbf{O}B}, {\widetilde{Q}} \,\vert \,\{K_\text {S}\}, W_A, W_B \right) = \left\langle \sum _{\{{\varvec{Q}}_{BA} \,\vert \,K_\text {S}\}} P({\varvec{Q}}_{BA} | K_\text {S}, W_A, W_B)\, \theta ^2_\text {min}\left( {\widetilde{Q}}_{\mathbf{O}B}, {\widetilde{Q}} \,\vert \,K_\text {S}, {\varvec{Q}}_{BA}\right) \right\rangle _{\{K_\text {S}\}}.$$

(25)

To recapitulate, the orientation disconcurrence between two symmetric volumes $W_A$ and $W_B$ is defined by Eq. (25) as

$$\begin{aligned} \Delta \theta _c^2 = \Theta ^2\left( {\overline{Q}}_{\mathbf{O}B}, {\overline{Q}}_{BA} \,\vert \,\{K_\text {S}\}, W_A, W_B \right) \; . \end{aligned}$$

(26)

This computation involves separate optimizations: we iteratively refine ${\widetilde{Q}}_{BA} \rightarrow {\overline{Q}}_{BA}$ and ${\widetilde{Q}}_{\mathbf{O}B} \rightarrow {\overline{Q}}_{\mathbf{O}B}$ by minimizing Eq. (25); for each presumptive ${\widetilde{Q}}_{BA}$ and ${\widetilde{Q}}_{\mathbf{O}B}$, find the symmetry operation in $\{Q_\mathbf{O}\}$ for each sentinel pattern that minimizes the quantity in Eq. (24) as well as the most compatible in-plane rotations for each sentinel pattern (“Resolving ambiguities from centro-symmetric diffraction volumes” section). The results of these completed optimizations are used to fold the ADD into the fundamental domain in Fig. 9.

We note that one can discover the symmetry of $W_A$ using a special case of the ADD with itself (i.e. $W_A = W_B$). This ‘self-ADD’ will be similar to Fig. 9c (main text) since there is no relative rotation between $W_A$ and itself. Because the first component of every unit quaternion in a symmetry group is independent of the choice of canonical axes, we may deduce $W_A$’s symmetry group from the number and positions of the clusters in the $Q_0$ histogram of its ‘self-ADD’ (panel above Fig. 9c (main text)).

A one-dimensional (1D) model

Here, we show the relation between the orientation disconcurrence, the disagreement (misalignment of the centers of the ADDs) and the inconsistency (the size of each ADD) with a one-dimensional (1D) rotation analogy as opposed to the full 3D rotation version in Fig. 4.

The unit quaternion ${\varvec{Q}}$ that describes rotation about a 1D ring is a real number $\theta \in [0, 2\pi )$. Suppose that the two OPDs (of reconstructed models $W_A$ and $W_B$) that comprise the ADDs for a set of sentinel patterns $\{K_{\text {S}}\}$ are mostly constrained within a small segment of this 1D ring. Let us further suppose that their ADD over $\{K_{\text {S}}\}$ can be approximated by a local Gaussian distribution within this angular segment. We denote the 1D ADD averaged over all sentinel patterns $\{K_{\text {S}}\}$ as $P(\varvec{Q}\,\vert \,\{K_{\text {S}}\})\equiv P(\varvec{Q}\,\vert \,\{K_{\text {S}}\}, W_A, W_B)$. For the ADD of a single sentinel pattern $K_\text {S}$, $P(\varvec{Q}\,\vert \,K_\text {S})$ (blue or red distribution in Fig. 4), we denote its mean as ${\overline{Q}}(K_\text {S})$ and its variance as $\Delta \theta ^2(K_\text {S})$. Hence the mean and variance of this ADD for the entire set of sentinel patterns $\{K_{\text {S}}\}$ are equivalent to the overall orientation, ${\overline{Q}}(\{K_{\text {S}}\})$, and the square of the orientation disconcurrence, $\Delta \theta _\text {c}^2(\{K_{\text {S}}\})$, defined in Eqs. (16) and (17) respectively. The difference between the squares of the disconcurrence, $\Delta \theta _\text {c}(\{K_\text {S}\})$, and the inconsistency, $\sqrt{\mathinner {\langle {\Delta \theta ^2 (K)}\rangle }_{K\in \{K_\text {S}\}}}$, equals the mean squared distance between ${\overline{Q}}(K_{\text {S}}), K_{\text {S}}\in \{K_{\text {S}}\}$ and ${\overline{Q}}(\{K_{\text {S}}\})$. The square root of this quantity can be thought of as the disagreement, $\Delta \theta _\text {a}(W_A,W_B)$, between reconstructions $W_A$ and $W_B$. This relation can be shown by

$$\begin{aligned}&|\{K_{\text {S}}\}|\,\Delta \theta _\text {c}^2(\{K_{\text {S}}\}) - \sum _{K_{\text {S}}}\Delta \theta ^2(K_{\text {S}}) \\&\quad = \sum _{K_{\text {S}}}\sum _{\varvec{Q}} P(\varvec{Q}\,\vert \,K_{\text {S}})\big (\varvec{Q} - {\overline{Q}}(\{K_{\text {S}}\})\big )^2 - \sum _{K_{\text {S}}}\sum _{\varvec{Q}} P(\varvec{Q}\,\vert \,K_{\text {S}})\big (\varvec{Q} - {\overline{Q}}(K_{\text {S}})\big )^2 \\&\quad = \sum _{K_{\text {S}}}\sum _{\varvec{Q}} P(\varvec{Q}\,\vert \,K_{\text {S}})\big (\varvec{Q}^2 - 2\varvec{Q}\, {\overline{Q}}(\{K_{\text {S}}\}) + {\overline{Q}}^2(\{K_{\text {S}}\}) - \varvec{Q}^2 + 2\varvec{Q}\,{\overline{Q}}(K_{\text {S}}) - {\overline{Q}}^2(K_{\text {S}})\big ) \\&\quad = \sum _{K_{\text {S}}}\sum _{\varvec{Q}} P(\varvec{Q}\,\vert \,K_{\text {S}})\big (- 2{\overline{Q}}(K_{\text {S}})\, {\overline{Q}}(\{K_{\text {S}}\}) + {\overline{Q}}^2(\{K_{\text {S}}\}) + 2{\overline{Q}}(K_{\text {S}})\,{\overline{Q}}(K_{\text {S}}) - {\overline{Q}}^2(K_{\text {S}})\big ) \\&\quad = \sum _{K_{\text {S}}}\sum _{\varvec{Q}} P(\varvec{Q}\,\vert \,K_{\text {S}})\big ({\overline{Q}}(K_{\text {S}}) - {\overline{Q}}(\{K_{\text {S}}\})\big )^2 \\&\quad = \sum _{K_{\text {S}}}\big ({\overline{Q}}(K_{\text {S}}) - {\overline{Q}}(\{K_{\text {S}}\})\big )^2 \\&\quad \equiv |\{K_{\text {S}}\}|\,\Delta \theta _\text {a}^2(W_A,W_B)\text {.} \end{aligned}$$

(27)
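The decomposition in Eq. (27) can be verified numerically for any set of discrete 1D distributions. The sketch below uses two made-up sentinel-pattern ADDs, each given as (angle, probability) pairs concentrated in a small angular segment so that wrap-around can be ignored. It computes the squared disconcurrence, the mean within-pattern variance (inconsistency) and the mean squared offset of the per-pattern means (disagreement).

```python
def moments(dist):
    """Mean and variance of a discrete 1D distribution [(angle, prob), ...]."""
    mean = sum(p * a for a, p in dist)
    var = sum(p * (a - mean) ** 2 for a, p in dist)
    return mean, var

def decompose(add_per_pattern):
    """Return squared disconcurrence, inconsistency and disagreement."""
    n = len(add_per_pattern)
    stats = [moments(d) for d in add_per_pattern]
    overall_mean = sum(m for m, _ in stats) / n
    disconcurrence_sq = sum(p * (a - overall_mean) ** 2
                            for d in add_per_pattern for a, p in d) / n
    inconsistency_sq = sum(v for _, v in stats) / n
    disagreement_sq = sum((m - overall_mean) ** 2 for m, _ in stats) / n
    return disconcurrence_sq, inconsistency_sq, disagreement_sq

# Two toy 1D ADDs (angles in radians, illustrative values only).
add_per_pattern = [
    [(-0.1, 0.25), (0.0, 0.5), (0.1, 0.25)],
    [(0.1, 0.5), (0.3, 0.5)],
]
c_sq, i_sq, a_sq = decompose(add_per_pattern)
```

For this toy input the squared disconcurrence equals the sum of the other two terms, as Eq. (27) requires.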

Above we use $\sqrt{\mathinner {\langle {\Delta \theta ^2 (K)}\rangle }_{K\in \{K_\text {S}\}}}$ as the inconsistency in Eq. (27) instead of the definition in Eq. (1), because these two definitions are approximately the same if Gaussian distributions are assumed for the OPDs, $P({\varvec{Q}}_i\,\vert \,K_\text {S}, W_i)$, $i=A, B$. As $P(\varvec{Q} \,\vert \,K_\text {S})$ is a convolution of these two Gaussian OPDs, its variance is $\Delta \theta ^2(K_\text {S})=\delta _A^2 + \delta _B^2$, where $\delta _A^2$ and $\delta _B^2$ are the variances of $\text {OPD}_A$ and $\text {OPD}_B$. Meanwhile, the variance of the auto-convolution of each OPD is $\Theta ^2({\overline{Q}}_{ii}=0 \,\vert \,K_\text {S}, W_i)=2\delta _i^2$, $i=A, B$, which gives us

$$\begin{aligned} \Delta \theta ^2(K_\text {S}\,\vert \,W_A, W_B) \approx \frac{1}{2} \Theta ^2(0\,\vert \,K_\text {S}, W_A) + \frac{1}{2} \Theta ^2(0 \,\vert \,K_\text {S}, W_B)=\Delta \theta _\text {i}^2(W_A, W_B)\text {.} \end{aligned}$$

(28)

The average of the right-hand side (RHS) of Eq. (28) over $\{K_\text {S}\}$ is consistent with the RHS of Eq. (1).

The width of the OPD, $\delta$, quantifies how well we can identify the orientation for a given pattern. For a pixel at $\varvec{q}$ in this pattern, we cannot decide whether this pixel belongs to a diffraction speckle near its most likely orientation if the speckle’s radius $\theta _\text {sp}(\varvec{q})$ is smaller than $\delta$. Strictly, if we want a $74\%$ confidence interval, then we should have $\theta _\text {sp}(\varvec{q}) \ge 2 \delta$, as in Eq. (7). It should be noted that the confidence interval for $2\sigma$ is $74\%$ instead of $95\%$ since the OPD is a 3D Gaussian distribution, even though we simplified the derivation above with a 1D Gaussian distribution. Computing $\delta$ directly is computationally expensive, but it can be easily inferred from $\Delta \theta _\text {i}$ by $\delta \approx \Delta \theta _\text {i} / \sqrt{2}$ if the Gaussian assumption discussed above is used. Moreover, to be more cautious about the conclusion, we use $\Delta \theta _\text {c}$ instead of $\Delta \theta _\text {i}$ in Eq. (7).

Sentinel pattern coverage in the SO(3) orientation space

Comparing a sentinel pattern to a diffraction intensity volume yields the pattern’s OPD. This OPD covers a certain region in the SO(3) orientation space. The volume of this region is set by the width of the OPD, which can be estimated as $\Delta \theta _\text {i} / \sqrt{2}$ following Eq. (28). If we crudely partition the orientation space into boxes whose average edge length is twice the average OPD width, then the average volume covered by an OPD is $(2\Delta \theta _\text {i} / \sqrt{2})^3$. Given that $\Delta \theta _\text {i}=0.24$ where the number of patterns diverges (the yellow asymptote in Fig. 7), we need at least

$$\begin{aligned} \frac{\pi ^2}{(2 \times 0.24 / \sqrt{2})^3} \approx 250 \end{aligned}$$

(29)

OPDs to cover the whole SO(3) space, where $\pi ^2$ is the total volume of SO(3).
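The arithmetic of Eq. (29) can be reproduced as follows. This is a minimal sketch of the box-counting estimate described above, taking the total SO(3) volume as $\pi^2$ and the box edge as twice the OPD width, as stated in the text. The function name is hypothetical.

```python
import math

SO3_VOLUME = math.pi ** 2  # total volume of SO(3) as used in Eq. (29)


def min_opds_to_cover(dtheta_i):
    """Crude number of OPDs needed to tile SO(3), following Eq. (29)."""
    opd_width = dtheta_i / math.sqrt(2)   # delta ~ dtheta_i / sqrt(2)
    box_volume = (2 * opd_width) ** 3     # box edge is twice the OPD width
    return SO3_VOLUME / box_volume


if __name__ == "__main__":
    print(f"OPDs needed for dtheta_i = 0.24: {min_opds_to_cover(0.24):.0f}")
```

Because the estimate scales as $\Delta \theta _\text {i}^{-3}$, halving the inconsistency raises the required number of OPDs by a factor of eight.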

Acknowledgements

N.D.L. and Z.S. acknowledge the support of the National University of Singapore startup grant; C.Z.W.T. acknowledges the support of the Singapore National Research Foundation (NRF-CRP16-2015-05). The authors are grateful to Benedikt Daurer, Andrew Martin, Filipe Maia, and Tomas Ekeberg for stimulating discussions.

Author information

Authors and Affiliations

Centre for Bio-imaging Sciences, National University of Singapore, 14 Science Drive 4, 117557, Singapore, Singapore
Zhou Shen, Colin Zhi Wei Teo & N. Duane Loh
Department of Physics, National University of Singapore, 2 Science Drive 3, 117551, Singapore, Singapore
Zhou Shen, Colin Zhi Wei Teo & N. Duane Loh
Max Planck Institute for the Structure and Dynamics of Matter, Luruper Chaussee 149, 22761, Hamburg, Germany
Kartik Ayyer
Center for Free-Electron Laser Science, Luruper Chaussee 149, 22761, Hamburg, Germany
Kartik Ayyer
Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, 117557, Singapore, Singapore
N. Duane Loh

Contributions

N.D.L. and Z.S. conceived the project. Z.S. performed all the calculations in this manuscript, with technical help from C.Z.W.T., K.A. and N.D.L. The manuscript was written by N.D.L. and Z.S. with input from K.A.

Corresponding author

Correspondence to N. Duane Loh.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Shen, Z., Teo, C.Z.W., Ayyer, K. et al. An encryption–decryption framework to validating single-particle imaging. Sci Rep 11, 971 (2021). https://doi.org/10.1038/s41598-020-79589-0

Received: 28 August 2020
Accepted: 17 November 2020
Published: 13 January 2021
DOI: https://doi.org/10.1038/s41598-020-79589-0
