Sparsity-based Ankylography for Recovering 3D molecular structures from single-shot 2D scattered light intensity

Mutzafi, Maor; Shechtman, Yoav; Eldar, Yonina C.; Cohen, Oren; Segev, Mordechai

doi:10.1038/ncomms8950

Article
Open access
Published: 20 August 2015

Nature Communications volume 6, Article number: 7950 (2015)

Abstract

Deciphering the three-dimensional (3D) structure of complex molecules is of major importance, typically accomplished with X-ray crystallography. Unfortunately, many important molecules cannot be crystallized, hence their 3D structure is unknown. Ankylography presents an alternative, relying on scattering an ultrashort X-ray pulse off a single molecule before it disintegrates, measuring the far-field intensity on a two-dimensional surface, followed by computation. However, significant information is absent due to lower dimensionality of the measurements and the inability to measure the phase. Recent Ankylography experiments attracted much interest, but it was counter-argued that Ankylography is valid only for objects containing a small number of volume pixels. Here, we propose a sparsity-based approach to reconstruct the 3D structure of molecules. Sparsity is natural for Ankylography, because molecules can be represented compactly in stoichiometric basis. Utilizing sparsity, we surpass current limits on recoverable information by orders of magnitude, paving the way for deciphering the 3D structure of macromolecules.

Introduction

Recovering the three-dimensional (3D) structure of biological molecules is of paramount importance. For example, protein characterization plays a key role in the field of structural proteomics^1,2,3. Knowing the protein structure may provide further understanding of the function and mechanism even for proteins whose biochemical function is known⁴. The main methodology used today to recover the 3D structure of molecules is X-ray crystallography, which requires crystallization of the probed molecules. This method relies on X-ray diffraction from a periodic structure, which averages over many molecules making up the ‘molecular crystal’. However, the molecules in such a structure are not situated in the same exact position and alignment in all unit cells, hence this method fundamentally cannot provide sufficient resolution in the recovered 3D structure of the molecule. Moreover, there is an additional, even greater problem: while small molecules (having few degrees of conformational freedom) may be crystallized by various methods, such as chemical vapour deposition⁵ and re-crystallization⁶, for macromolecules, especially membrane proteins, crystallization is much more problematic⁷. In fact, thus far, crystallization attempts have been unsuccessful for most of the membrane proteins; as such, the 3D structure of many bio-molecules is still unknown⁷. Clearly, developing a method that could decipher the 3D structure of a single protein molecule is nothing less than a dream. In fact, no current method can do that even in theory.

In the past few years, it has been proposed to study such molecules using imaging with X-ray laser pulses^6,8,9,10, whose wavelength has the desired resolution. However, since X-ray light ionizes all biological molecules and changes their molecular structure, X-ray experiments on organic molecules cannot be carried out with continuous wave (CW) radiation. Rather, this has to be performed with ultrashort laser pulses. Moreover, biological molecules disintegrate after the first pulse, and therefore the information (scattered light) necessary for recovering the structure must be collected either in a single shot (the basis of Ankylography), or in multiple shots—each probing a new molecule of the same kind, followed by a calibration procedure (registration) since the molecule in each shot is inevitably rotated in 3D space. Such ideas have indeed been suggested^11,12,13. Experimentally, there were successful attempts using single-shot X-ray pulses scattered off aerosol particles, demonstrating the ability to determine the orientation of two large polystyrene spheres¹⁴ and finding the two-dimensional (2D) projection of several particles¹⁵. Going back to a single biological molecule, when a single ultrashort X-ray pulse is launched at such a molecule, and when the pulse is short enough—the flux of photons scattered off the molecule before it disintegrates carries the information about the structure^16,17. The 3D structure of the molecule can then be recovered algorithmically from this single measurement, in a process called Ankylography¹⁰. This approach for deciphering the molecular structure relies on the ability to use ultrashort (femtosecond) laser pulses in the X-ray regime. Indeed, recent developments have enabled the construction of a new X-ray free electron laser (XFEL) facility, which emits a beam with high coherence and facilitates access to atomic scale imaging^18,19. 
In fact, the wavelength of X-ray laser flashes is so short (λ∼0.5 Å–6 nm) that even atomic details may one day become discernible. Another source of ultrashort X-ray pulses is based on the high-harmonic generation process, which already enables coherent experiments in the X-ray regime²⁰.

Desirably, the coherent scattering measurements should be taken at the surface corresponding to the Ewald sphere²¹ (a sphere in the Fourier domain; see explanation in the Supplementary Information). But even in this case, such a single-shot 2D measurement is still missing a very large part of the information necessary to recover the 3D structure. Namely, the phase information is missing, and only 2D data is obtained. Therefore, Ankylography describes an algorithmic procedure whose goal is to recover 3D information from a single-shot magnitude-only measurement taken on a 2D surface corresponding to the Ewald sphere of the sought information. The algorithmic methodology of Ankylography relies on phase-retrieval algorithms, known for several decades^22,23, which have recently found their way into applications with coherent X-ray radiation^24,25. Still, achieving phase-retrieval for 3D structures from 2D measurements, as Ankylography is attempting to do, is a formidable challenge.

In spite of these problems in trying to recover the 3D structure from highly incomplete measurements, a visionary proof-of-concept Ankylography experiment has recently been demonstrated, attracting much interest¹⁰. However, the excitement has not been unanimous among researchers. For example, in a recent exchange in Nature, researchers compared the idea to pulling a 3D rabbit out of a 2D hat²⁶. The original Ankylographic method was believed to work only for objects containing <15³ voxels (volumetric picture elements)^26,27, but actually the original paper has demonstrated the recovery of larger objects, with the current state of the art being 32 × 32 × 20 voxels. While researchers have not yet reached a consensus on the exact limits of Ankylography¹⁰, serious doubts were cast on its feasibility^28,29,30, uniqueness and stability^31,32. Moreover, it was claimed that Ankylography will not work in the absence of additional constraints²⁸. Notwithstanding these important arguments, recent experiments have demonstrated good progress in Ankylography, but all under stringent assumptions on the symmetry of the recovered structures³³ or multiple measurements³⁴.

Here, we propose and numerically demonstrate a new algorithmic paradigm for reconstructing 3D objects from their scattered 2D intensity. Our approach is based on sparsity: prior knowledge that the information is sparse in a known basis. In our context, sparsity is manifested in the fact that the molecule effectively occupies a small number of degrees of freedom (d.f.) (because molecules are made of atoms), and that the chemical composition (stoichiometry) of the molecule is known. As such, the prior knowledge of sparsity can be utilized to recover the ‘signal’ from highly incomplete measurements. Using recently developed algorithmic tools for sparsity-based phase retrieval³⁵, we demonstrate numerically the ability to determine the atomic structures of various complex organic molecules, such as peptides. This illustrates that sparsity and optimization techniques enable surpassing current limits on the recoverable information in Ankylography by orders of magnitude. We test the performance of our methodology with respect to sparsity (number of atoms) and noise, and conclude that sparsity can pave the way to algorithmic reconstruction of the 3D structure of molecules from a single measurement of the photon flux in the optical far field.

Results

The sparsity-based concept

Before going into the mathematical details of sparsity-based Ankylography, let us explain the logic of our approach and its background. In Ankylography, a major part of the information is lost due to physical limitations, which leads to dimension deficiency and to lack of phase information in the measured data. In the most general case, it is possible to recover 3D information by taking multiple projection measurements and applying appropriate signal processing. Common methods to do that include computed tomography³⁶ (CT), equally sloped tomography³⁷ and more. However, here, traditional methods to recover 3D information from 2D measurements cannot be employed, because they require multiple measurements from different projections, while in the current physical problem, multiple projections are extremely hard (if not impossible) to realize in experiments. This is the motivation for Ankylography: attempting to rely strictly on data acquired in a single-shot experiment, in spite of the fact that a large part of the information is missing in the measurements. This is where sparsity comes into play. As we show below, the underlying problem typically features only a small number of d.f. As such, our use of sparsity is natural, relying on the logic of our earlier work on sparsity-based subwavelength imaging^38,39 and super-resolution^40,41,42, and on sparsity-based phase retrieval^35,39,41,42. The theoretical framework underlying the recovery procedure is borrowed from the emerging field of compressed sensing^43,44,45. The main theme of compressed sensing is to reduce the number of acquired measurements of a signal, while still being able to accurately recover it by relying on the fact that the signal is described by a small number of d.f. Here, our goal is to recover the complete information from an inherently incomplete set of (quadratic) measurements. To this end, we adapt the recently proposed sparsity-based phase-retrieval technique (called GESPAR³⁵) to our setting.

Physical setting and sparse representation of the physical signal

The general physical setting for Ankylography is illustrated in Fig. 1. A coherent ultrashort laser pulse of central wavelength λ and relatively narrow bandwidth δλ (such that δλ ≪ λ) is incident upon a molecule with a 3D structure we wish to recover. The light is scattered from the molecule within the ultrashort duration of the pulse (a few femtoseconds), but immediately thereafter the molecule disintegrates, such that the only measurements available are those taken from the scattered light in this single-shot experiment. The detectors can be positioned on the Ewald sphere or, more practically, a planar camera can be used with a correction for the curvature. The 3D effective potential (core electron charge density) of the molecule, which is the source for X-ray scattering, can be described as a sum over known basis functions. For simplicity, we describe each atom as a sphere with its covalent radius⁴⁶, although a more mathematically accurate description would be provided by a set of known, spherically symmetric functions⁴⁷. It is important to note that, in spite of the fact that we described the molecule with simple basis functions, our methodology is general, and is not limited to a particular basis (see example in Fig. 5). In this scheme, the molecule resembles a set of hovering spheres. The 3D scattering potential of the molecule is given by the scalar function

$$f(\mathbf{r}) = \sum_{j=1}^{T} \sum_{n=1}^{S_j} a_n^j \, U_j(\mathbf{r} - \mathbf{r}_n^j), \qquad (1)$$

where the first summation is over the T kinds of elements comprising the molecule, and $U_j(\mathbf{r})$ is the potential related to the j-th element. The second summation is over $S_j$, the number of atoms of the j-th element, with $\mathbf{r}_n^j$ and $a_n^j$ being the 3D position and amplitude of the n-th atomic wavefunction of the j-th element. Physically, $a_n^j$ manifests the charge density in the core electrons of that atom, which is what scatters X-ray radiation⁴⁶. For example, $U_1$ reflects the potential of the 1st element in the molecule (say, carbon), while $\mathbf{r}_3^1$ and $a_3^1$ are the 3D position and the amplitude of the third carbon atom. Importantly, the chemical composition of the molecule is known (number of atoms of each element) from stoichiometry, as well as the covalent radius associated with each element. Hence, the only unknowns are the relative positions of the atoms, as described by the centre positions of the spheres, $\mathbf{r}_n^j$, and the amplitudes $a_n^j$. Altogether, the number of unknowns in the problem is relatively small, which is why sparsity-based methods can be very effective.

**Figure 1: The physical setting for Ankylography.**

The scattered light intensity, which corresponds to the measured data, resides on a spherical surface of a large radius centred on the molecule, in the far field of the 3D image. Theoretically, to first order in perturbation theory²¹, the scattered field intensity is proportional to the squared absolute value of the 3D Fourier transform of the scattering potential, measured on the surface of a sphere called ‘the Ewald sphere’²¹:

$$I(\theta, \phi) \propto \left| \int_V f(\mathbf{r}) \, e^{-2\pi i \, \boldsymbol{\nu}(\theta, \phi) \cdot \mathbf{r}} \, d^3 r \right|^2. \qquad (2)$$

Here, I(θ, ϕ) is proportional to the intensity of the electromagnetic waves as a function of the angles in spherical coordinates, measured relative to the incident wave direction (z), $f(\mathbf{r})$ is the scattering potential, $\boldsymbol{\nu}(\theta, \phi)$ is the spatial-frequency vector on the Ewald sphere that corresponds to the scattering direction (θ, ϕ), and $\mathbf{r}$ is the coordinate vector. The proportionality coefficient and further details are provided in the Supplementary Information section (equation (1) there). The integration is taken over the volume defined by the spatial extent of the object (V).
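As a rough numerical illustration of the far-field relation, the sketch below samples spatial frequencies on the Ewald sphere and evaluates the intensity scattered by a few atoms. The function names, the $e^{-2\pi i \nu \cdot r}$ Fourier convention, the parametrization ν = (sin θ cos ϕ, sin θ sin ϕ, cos θ − 1)/λ for a beam incident along z, and the point-like atoms are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def ewald_frequencies(wavelength, theta, phi):
    """Spatial frequencies (k_out - k_in) / (2*pi) for a beam incident along +z.

    Assumed parametrization for this sketch; angles in radians.
    """
    theta, phi = np.broadcast_arrays(np.asarray(theta, dtype=float),
                                     np.asarray(phi, dtype=float))
    direction = np.stack([np.sin(theta) * np.cos(phi),
                          np.sin(theta) * np.sin(phi),
                          np.cos(theta) - 1.0], axis=-1)
    return direction / wavelength

def far_field_intensity(nu, positions, amplitudes):
    """|sum_n a_n exp(-2*pi*i nu . r_n)|^2 for point-like atoms (simplification)."""
    phases = np.exp(-2j * np.pi * nu @ positions.T)   # shape (L, n_atoms)
    return np.abs(phases @ amplitudes) ** 2

# Hypothetical toy data: three atoms (positions in angstrom), wavelength 0.35 angstrom.
theta, phi = np.meshgrid(np.linspace(0.1, 1.0, 8),
                         np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False),
                         indexing="ij")
nu = ewald_frequencies(0.35, theta.ravel(), phi.ravel())
positions = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.2, 0.8]])
amplitudes = np.array([1.0, 1.0, 0.5])
intensity = far_field_intensity(nu, positions, amplitudes)
```

In this parametrization every sampled frequency lies on a sphere of radius 1/λ that passes through the origin, and shifting all atoms by the same vector leaves the intensity unchanged, which is one facet of the missing phase information.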

Problem formulation

To set up the problem as a sparse recovery problem, we define a 3D grid (of M sites) for the possible positions of each atom, repeating the grid for the T different elements separately. We arrange the unknowns in a vector $x$ of size M·T (where $x$ represents a ‘numeric column vector’: a series of values), whose entries are $x = [x_1^H, x_2^H, \ldots, x_T^H]^H$ (where the superscript H represents conjugate transpose). Here $x_j$ is the vector of unknowns, of size M, associated with element j described on the M grid sites. The m-th entry of $x_j$ is $x_j[m]$, where $x_j[m] = 0$ if no atom of element j resides at site m, while $x_j[m] \neq 0$ means that such an atom resides at this site. The measurement vector is denoted by $C$, where the value of the l-th entry, $C_l$, is proportional to the intensity at angles θ_l and ϕ_l ($C_l = I(\theta_l, \phi_l)$), with L being the total number of measurements, namely, of the intensity readings in the detectors. The measurement of the l-th detector is $C_l = |F_l x|^2$, where the vector $F_l$ represents one (vector) term in the transfer function of the system, $F$, which is simply the 3D Fourier transform operator measured on the Ewald sphere.

With this notation, our mathematical problem can be described as follows:

$$\min_{x} \sum_{l=1}^{L} \left( |F_l x|^2 - C_l \right)^2 \quad \text{subject to} \quad \|x_j\|_0 = S_j, \quad j = 1, \ldots, T, \qquad (3)$$

where $\|\cdot\|_0$ is the L₀ norm, which counts the number of non-zero entries in the vector. Here, $S_j$ is the number of atoms of the j-th element, which is assumed to be known from stoichiometry. Note that equation (3) is a difficult problem to solve: there is no guarantee for a unique solution, and furthermore, no assured method to find a global minimum. This is where the power of the sparsity assumption comes in: the fact that the solution is known to be sparse (that is, that the $S_j$ are small) allows us to utilize recently developed methods that solve sparse quadratic problems such as this one. Specifically, to find a sparse solution to equation (3), we use the GESPAR³⁵ method with the set of matrices $A_l = F_l^H F_l$ relevant to our problem (so that $C_l = x^H A_l x$). Sparsity-based Ankylography requires some modification to the formulation in ref. 35. Our algorithm is described in detail in the Methods section.
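To make the quadratic structure of the measurements concrete, here is a minimal sketch that builds a grid-based sensing matrix and checks that each intensity reading can be written both as a squared inner product and as a quadratic form. The names `sensing_matrix`, `grid`, `nu` and `form_factors`, the column ordering (one block of M grid sites per element), the random sampling frequencies and the constant form factors are assumptions chosen for illustration only.

```python
import numpy as np

def sensing_matrix(nu, grid, form_factors):
    """Stack one block of M columns per element.

    Column j*M + m (0-based) holds U_j(nu_l) * exp(-2*pi*i nu_l . q_m).
    """
    phases = np.exp(-2j * np.pi * nu @ grid.T)                       # (L, M)
    return np.hstack([u[:, None] * phases for u in form_factors])   # (L, M*T)

rng = np.random.default_rng(0)
L, M, T = 30, 27, 2
axis = np.linspace(-1.0, 1.0, 3)
grid = np.array(np.meshgrid(axis, axis, axis, indexing="ij")).reshape(3, -1).T  # (27, 3)
nu = rng.normal(size=(L, 3))          # placeholder sampling frequencies
form_factors = np.ones((T, L))        # placeholder: point-like atoms

F = sensing_matrix(nu, grid, form_factors)

# Sparse unknown vector: two atoms of element 0 and one atom of element 1.
x = np.zeros(M * T)
x[[3, 20]] = 1.0
x[M + 11] = 0.8

C = np.abs(F @ x) ** 2                # intensity readings, one per detector
A0 = np.outer(F[0].conj(), F[0])      # A_0 = F_0^H F_0, so that C[0] = x^H A_0 x
```

Only three of the 54 entries of `x` are non-zero here, which is the kind of block-wise sparsity that the constraint in the optimization problem expresses.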

Comparing our sparsity-based technique with the HIO algorithm

A typical example is shown in Figs 2a and 3a, where we simulate the recovery of the 3D structure of the amino acid threonine. This molecule, sketched in Fig. 1 and displayed more clearly in Fig. 3a, has 17 atoms: four carbons (red spheres), one nitrogen (orange sphere), three oxygens (light green spheres) and nine hydrogens (dark blue). For clarity, we plot the 3D structure streamlined sequentially, and assign to it a one-dimensional grid index as shown in Fig. 2a. In this example, the position of each of the atoms is marked by a circle of its associated colour on the 9³ grid, where the grid is considerably denser than the radius of the smallest atom (see Supplementary Information). The vertical axis provides the amplitude, which reflects the effective charge density associated with each atom.

**Figure 2: Ankylographic reconstruction of the threonine molecule.**

**Figure 3: Ankylographic reconstruction of the 3D structure of the threonine molecule.**

First, we test the ability of the current Ankylography algorithm (used in ref. 10) to recover the 3D structure of threonine. To do that, we use a slightly modified version of the algorithm used in ref. 10 and available at http://www.physics.ucla.edu/research/imaging/Ankylography/index.htm. Essentially, this is the standard hybrid input–output (HIO) method^22,23, which is commonly used for phase retrieval. As a model for the sought information, we insert the set of hovering spheres (defining threonine) into the algorithm. The HIO algorithm basically iterates Fourier transforms back and forth between the object and the Fourier domains, using the measured data (absolute value of the 3D Fourier transform), and applying prior knowledge on the ‘support’ of the object (the known region within which the molecule resides). When we attempt to use the HIO algorithm for Ankylography as in ref. 10, we have to represent the 3D information with 55³ voxels (volume pixels), for the sake of sufficient resolution. This attempt to reconstruct the 3D structure of threonine has completely failed: as argued in the Comment²⁶ and Reply²⁷, this method is not expected to work because the number of voxels greatly exceeds 32 × 32 × 20, which is the current state of the art in Ankylography. Following this unsuccessful attempt, it is natural to try using the HIO algorithm with the additional prior information that the object (the molecule) can be represented as a set of spheres. The result is shown in Fig. 2b: this attempt also fails, in spite of the additional prior information. The HIO algorithm converges to a solution occupying all the possible d.f. (that is, the grid in Fig. 2b is fully populated), which is clearly an erroneous solution.

This is where sparsity makes the big difference. In sharp contrast to the other attempts, our sparsity-based GESPAR algorithm provides excellent reconstruction, as shown by the reconstruction on the streamline grid in Fig. 2c, and by the visual 3D plot of Fig. 3b. See further details on the algorithm in the Methods section. Clearly, sparsity-based Ankylography can recover the 3D structure of molecules of much greater complexity and details than ever anticipated from Ankylography.

It is essential at this point to elucidate the general role of sparsity (rather than the specific algorithm), in our successful reconstruction, where the standard HIO algorithm fails. To do that, we add sparsity to the HIO algorithm, as prior information. More specifically, in every iteration of the HIO algorithm, we enforce the 3D image to be sparse under the underlying basis functions (the set of spheres) by thresholding the coefficients⁴⁸. The result is shown in Figs 2d and 3c. Examining the result, it is clear that adding sparsity constraints to the HIO algorithm results in a huge improvement, but the reconstruction is still poor. Clearly, GESPAR outperforms the sparse HIO approach, consistent with ref. 35.
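One simple way to enforce sparsity inside an iterative scheme is hard thresholding of the basis coefficients. The sketch below keeps, for each element, only as many coefficients as there are atoms of that element. This is an illustrative assumption about how such a step could look; the exact thresholding rule of ref. 48 is not specified in the text, and the function name and the block layout of the coefficient vector are hypothetical.

```python
import numpy as np

def keep_largest_per_element(x, counts):
    """Zero all but the counts[j] largest-magnitude coefficients in block j.

    x is assumed to be a concatenation of len(counts) equal-length blocks,
    one block of grid-site coefficients per chemical element.
    """
    x = np.asarray(x)
    M = x.size // len(counts)
    out = np.zeros_like(x)
    for j, s in enumerate(counts):
        block = x[j * M:(j + 1) * M]
        keep = np.argsort(np.abs(block))[M - s:]   # indices of the s largest magnitudes
        out[j * M + keep] = block[keep]
    return out

# Two elements on a 4-site grid: keep 2 coefficients of the first, 1 of the second.
coefficients = np.array([0.1, -3.0, 0.5, 2.0, 4.0, 0.2, -0.3, 1.0])
sparse = keep_largest_per_element(coefficients, [2, 1])
```

In a sparsity-constrained HIO loop, a projection of this kind would be applied to the coefficients after each object-domain update.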

Following this example, and many other examples we have simulated, several conclusions can be drawn. First, Ankylography features a small number of d.f., hence it is amenable to algorithmic methods relying on sparsity. Second, our current sparsity-based phase-retrieval algorithmic methodology enables the recovery of the 3D structure of molecules occupying two orders of magnitude more voxels than what Ankylography can handle without sparsity¹⁰. In fact, our sparsity-based method has no upper limit on the size of the molecules. Last but not least, we emphasize that it is indeed the sparsity concept making this recovery possible, as we have shown that adding sparsity to standard methods considerably improves their performance. Altogether, it is clear that sparsity significantly improves Ankylography, making it a highly promising method in the next generation of structural biology experiments.

Performance of our sparsity-based Ankylography algorithm

With noise robustness being a major concern regarding the performance limits of Ankylography, it is essential to study the performance of our technique in a statistical fashion, in terms of the level of sparsity and permissible noise levels. To do that, we test our algorithm on 600 examples, under different conditions of signal-to-noise ratio (SNR) and optical wavelength. Importantly, we examine the algorithm in a realistic scenario, where the molecule is not restricted to any particular grid, while the recovery is made on a fine 3D grid (four 121³ basis functions), such that the radius of the smallest atom (hydrogen) is three times larger than the grid unit. Further details on these simulations are provided in the Methods section. The results are shown in Fig. 4, which displays the reconstruction error as a function of sparsity (total number of atoms). Here, the normalized reconstruction error (a number between 0 and 1) is defined as

**Figure 4: Performance of our sparsity-based algorithm.**

where ε is the error, f_source is the original 3D image (defined above), f_recovery is the image recovered from the 2D intensity pattern given by equation (2), and ⟨·,·⟩ is the inner product operator. In these simulations, we use white noise (added to I(θ,ϕ)) distributed uniformly on the sphere defining the measured data (assuming the noise originates from isotropic volume scattering). The noise level, N, is defined as the fraction of noise to the total power of the scattered light (the measurements). The values we use in the simulations yield an SNR that is much smaller than the SNR taken in ref. 10. Figure 4a shows the reconstruction error as a function of sparsity for three wavelengths. As expected, the performance is better at shorter wavelengths, which yield higher resolution. Figure 4b shows the reconstruction error as a function of sparsity for various noise levels at a wavelength of 0.35 Å. This wavelength is chosen such that it corresponds to 1/3 of the finest resolution of our information (the smallest distance between centres of spheres). The conclusion drawn from these figures is that our method works well under realistic conditions. For example, for a noise level of N=0.001 and λ=0.35 Å, the algorithm performs well as long as the total number of atoms is smaller than ∼20.
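The text describes the error as a normalized quantity between 0 and 1 built from inner products of the source and recovered images. One form that fits this description is one minus the normalized overlap of the two images; the sketch below uses that form purely as an assumption, and the actual definition used for Fig. 4 may differ. The function name is hypothetical.

```python
import numpy as np

def normalized_error(f_source, f_recovery):
    """Assumed form: 1 - |<a, b>| / (||a|| ||b||), which lies between 0 and 1."""
    a = np.ravel(f_source)
    b = np.ravel(f_recovery)
    overlap = np.abs(np.vdot(a, b))
    return 1.0 - overlap / (np.linalg.norm(a) * np.linalg.norm(b))

# A perfect recovery (up to overall scale) gives ~0; non-overlapping images give 1.
source = np.zeros((4, 4, 4))
source[1, 2, 3] = 1.0
source[0, 0, 1] = 0.5
shifted = np.roll(source, 1, axis=0)
err_same = normalized_error(source, 2.0 * source)
err_disjoint = normalized_error(source, shifted)
```

A metric of this kind is insensitive to a global scale factor, which is convenient when the proportionality coefficient between intensity and potential is not tracked.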

Discussion

The simulations indicate that increasing the SNR is of major importance. The challenge in doing that is the photon flux at short X-ray wavelengths. The current XFEL emits ∼10¹² photons at every pulse; however, numerical simulations have indicated that the combination of self-seeding and undulator tapering techniques can increase the pulse intensity by two orders of magnitude⁴⁹. Furthermore, since photons are bosons, there is no fundamental limit on the pulse intensity, and it is expected that the intensity of XFEL will continue to increase as new techniques are being developed. As such, the SNR within which our sparsity-based method can recover structures of single molecules is within reach in the near future. Importantly, proteins have large scattering cross-sections, scattering more photons than a single amino acid, which makes our method viable especially for proteins (which contain multiple amino acids) with the present technology.

Finally, we note that our sparsity-based method is demonstrated here for the case where the signal sparsity corresponds to a small number of atoms. However, the method is applicable to much more general scenarios, for example, when the sparsity is in the number of amino acids, as is the case for many proteins of interest. In this case, the different ‘building blocks’ to be localized and oriented are the known amino acids from which the protein is composed, optimally along with their possible conformations. Figure 5 shows the reconstruction of the 3D structure of a peptide molecule, which is a combination of amino acids joined by peptide bonds. The molecule is a tripeptide composed of two glycine and one alanine amino acids. To reconstruct this structure, we use our sparsity-based procedure implemented on the basis of amino acids that spans all positions and rotations. Importantly, using additional prior knowledge on the amino acid bonds (protein conformation such as the Ramachandran plot⁵⁰) and assigning a binary value for every basis element (instead of determining their amplitude) further reduces the number of d.f. dramatically, and can allow the reconstruction of significantly larger structures than what we show in Fig. 5. Our method is actually expected to perform much better for large proteins because these have larger cross-sections, and therefore scatter more photons and increase the SNR. Moreover, the recovery of structures made of large basis elements is possible with a longer wavelength; hence, amino acids with a typical size of several angstroms will require the wavelength to be of the same order, up to 10 times larger than for Figs 2, 3, 4, where the X-ray laser technology is more mature.

**Figure 5: Sparsity-based reconstruction of a glycine–glycine–alanine tripeptide.**

In conclusion, we suggest a new approach to recover the 3D structure of molecules using Ankylography. Our sparsity-based methodology enables deciphering 3D structures of bio-molecules from a single-shot X-ray laser pulse, and exceeds the current limit of recovered information by orders of magnitude. We have demonstrated the reconstruction of a single amino acid and of a tripeptide, with the recovery methodology implemented on the basis of amino acids. These examples highlight the strength of the sparsity-based Ankylography concept and also demonstrate that it is actually easier to apply it to larger objects. The last example demonstrates the generality of sparsity-based Ankylography and provides an avenue for the future of structural biology. With that, sparsity-based Ankylography can reach the level at which it can overcome the current bottleneck of structural biology.

Methods

Mathematical formulation

Our mathematical problem amounts to the reconstruction of a 3D sparse signal from its Fourier magnitude on the Ewald sphere.

Of course, when the majority of the information is lost, precise reconstruction is not possible, unless we have, or may assume, some additional information about the sought signal. In fact, the problem is even more difficult as the measurements contain noise. We assume that the scattered electromagnetic field can be approximated adequately (hereafter, this relation is denoted by ≅) by means of known generating functions describing spheres U_j(r) of radius R^j. In other words, we want to reconstruct a 3D optical image assuming that it is comprised of a small known number of (different) spheres of known radii, as described in the Results section. Every kind of sphere U_j(r) (identified by its radius R^j) corresponds to a different atomic element (j, in this case), where the elements are known, as is the number of atoms of each element. We emphasize that the reconstruction is done on a 3D grid, while the spheres themselves do not need to reside on any grid at all: they reflect the actual structure of the molecule, which does not necessarily reside on a known grid.

As we already mentioned, the input to the algorithm is the 3D Fourier magnitude, sampled on the Ewald sphere. The output of the algorithm should be the positions of all of the atoms, and their corresponding radii.

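Mathematically, the molecule is defined in equation (1), which in the spatial frequency domain yields

$$\hat{f}(\boldsymbol{\nu}) = \sum_{j=1}^{T} \sum_{n=1}^{S_j} a_n^j \, \hat{U}_j(\boldsymbol{\nu}) \, e^{-2\pi i \, \boldsymbol{\nu} \cdot \mathbf{r}_n^j}, \qquad \hat{U}_j(\boldsymbol{\nu}) = \left( \frac{R^j}{|\boldsymbol{\nu}|} \right)^{3/2} J_{3/2}\!\left( 2\pi R^j |\boldsymbol{\nu}| \right),$$

where $a_n^j$ and $\mathbf{r}_n^j$ are the amplitude and 3D position of the n-th atom of the j-th element, and $J_n$ is the Bessel function of order n. We define $\hat{U}_j(\boldsymbol{\nu})$ as generating functions in the frequency domain.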
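For a uniform sphere, the frequency-domain generating function has a closed form in elementary functions, because the half-integer Bessel function satisfies $J_{3/2}(z) = \sqrt{2/(\pi z)}\,(\sin z / z - \cos z)$. The sketch below evaluates the Fourier transform of a unit-density sphere under the assumed $e^{-2\pi i \nu \cdot r}$ convention, for which $(R/\nu)^{3/2} J_{3/2}(2\pi R \nu) = 4\pi R^3 (\sin z - z \cos z)/z^3$ with $z = 2\pi R \nu$. The function name, the unit density and the normalization are assumptions for illustration.

```python
import numpy as np

def sphere_form_factor(nu, radius):
    """Fourier transform of a unit-density ball at spatial-frequency magnitude nu.

    Uses 4*pi*R^3 * (sin z - z cos z) / z^3 with z = 2*pi*R*nu, and the
    limiting value 1/3 of the bracket near z = 0.
    """
    nu = np.atleast_1d(np.asarray(nu, dtype=float))
    z = 2.0 * np.pi * radius * nu
    shape_term = np.full(z.shape, 1.0 / 3.0)      # limit of (sin z - z cos z)/z^3
    big = np.abs(z) > 1e-4
    zb = z[big]
    shape_term[big] = (np.sin(zb) - zb * np.cos(zb)) / zb ** 3
    return 4.0 * np.pi * radius ** 3 * shape_term

# Example: a sphere of radius 0.7 (arbitrary units) sampled at a few frequencies.
values = sphere_form_factor([0.0, 0.4, 0.8], 0.7)
```

At zero frequency the value reduces to the sphere volume, and the function decays and oscillates at higher frequencies, so larger spheres concentrate their signal at smaller scattering angles.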

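As described in the Results section, we define a 3D grid (of M sites) by the set $\{q_m\}_{m=1}^{M}$ for the possible positions of each atom for T different elements, and a set of sampling points (spatial frequencies) $\{\nu_l\}_{l=1}^{L}$ (related to the angles on the Ewald sphere), defined as $\nu_l = \nu(\theta_l, \phi_l)$. We arrange the unknowns in a vector $x$ (of size M·T), whose entries are $x = [x_1^H, x_2^H, \ldots, x_T^H]^H$ (where the superscript H represents conjugate transpose), where $x_j$ is the vector of unknowns associated with element j described on the M grid sites. Here, the m-th entry in $x_j$ represents the amplitude of the j-th element at the m-th site. Note that the atomic positions $\mathbf{r}_n^j$ do not necessarily reside on the grid $\{q_m\}$. But if they are on the grid, then we can represent $\hat{f}(\nu_l)$ as an inner product. To do that we define the vectors $F_l^j = \hat{U}_j(\nu_l) \, [e^{-2\pi i \nu_l \cdot q_1}, \ldots, e^{-2\pi i \nu_l \cdot q_M}]$ and $F_l = [F_l^1, \ldots, F_l^T]$, where $\hat{U}_j$ is the generating function of element j in the frequency domain. The measured signal is therefore

$$C_l = |\hat{f}(\nu_l)|^2 = |F_l x|^2,$$

while the sensing matrix is

$$F = \begin{pmatrix} F_1 \\ F_2 \\ \vdots \\ F_L \end{pmatrix},$$

where $F_{l,(j-1)M+m} = \hat{U}_j(\nu_l) \, e^{-2\pi i \nu_l \cdot q_m}$. The rows of $F$ relate to sampling frequencies (total of L rows) and the columns correspond to the different elements and grid sites (total of M·T columns). For example, the (l, M+11) entry, $F_{l,M+11} = \hat{U}_2(\nu_l) \, e^{-2\pi i \nu_l \cdot q_{11}}$, is related to a sphere of element #2 located at q₁₁ and sampled at the spatial frequency ν_l. The measurements vector is denoted by $C$, where the value of the l-th entry is $C_l = |F_l x|^2$.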

Now that we have a mathematical representation of the measurements, we consider additional (spatially independent) white noise.

The noise level, N, is the fraction of the noise power (defined through an expectation value) to the total power of the scattered light on the measurement surface. The signal power taken here also includes the scattered field at small angles θ (low spatial frequencies on the Ewald sphere), which cannot be measured (because the detectors at those angles are saturated by the incident light beam) but carries most of the energy. The values we use for the noise in the simulations yield an SNR that is much smaller than the SNR taken in ref. 10, yet, as shown in the Results section, our sparsity-based approach is able to recover the 3D structures much better and with information capacity larger by orders of magnitude.

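Technically, we seek the vector $x$ that conforms to the measurements (equation (9)), and at the same time has a known number of units of each element, for example, five atoms of the element carbon. We define the objective as

$$g(x) = \sum_{l=1}^{L} \left( |F_l x|^2 - C_l \right)^2,$$

where $F_l$ is the l-th row of the sensing matrix and $C_l$ is the l-th measurement, and solve the following optimization problem:

$$\min_{x} \; g(x) \quad \text{subject to} \quad \|x_j\|_0 = S_j, \quad j = 1, \ldots, T.$$

For the sake of further use, the derivative of the objective is calculated below.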
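For a least-squares objective of the form $\sum_l (|F_l x|^2 - C_l)^2$, and assuming real-valued amplitudes $x$, the chain rule gives the gradient $4 \sum_l (|F_l x|^2 - C_l)\, \mathrm{Re}(F_l^H F_l)\, x$. The sketch below implements this expression and compares it against finite differences. The restriction to real $x$, the function names and the random test data are assumptions made for the example.

```python
import numpy as np

def objective(F, C, x):
    """Sum over detectors of (|F_l x|^2 - C_l)^2."""
    residual = np.abs(F @ x) ** 2 - C
    return float(np.sum(residual ** 2))

def gradient(F, C, x):
    """Gradient of the objective with respect to a real-valued x.

    d|y_l|^2 / dx_k = 2 Re(conj(y_l) F_lk) with y = F x, hence the factor 4.
    """
    y = F @ x
    residual = np.abs(y) ** 2 - C
    return 4.0 * np.real(F.conj().T @ (residual * y))

# Random complex sensing matrix and noiseless measurements of a reference vector.
rng = np.random.default_rng(0)
F = rng.normal(size=(20, 6)) + 1j * rng.normal(size=(20, 6))
x_true = rng.normal(size=6)
C = np.abs(F @ x_true) ** 2
x = rng.normal(size=6)

# Central finite differences along each coordinate for comparison.
eps = 1e-6
fd = np.array([(objective(F, C, x + eps * e) - objective(F, C, x - eps * e)) / (2 * eps)
               for e in np.eye(6)])
```

The gradient vanishes at any vector that reproduces the measurements exactly, including the reference vector and its negative, which reflects the sign ambiguity of intensity-only data.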

Description of the algorithm

In order to solve this problem, we use a modified version of a new efficient (greedy) technique for sparsity-based phase retrieval, called GESPAR. The recovery of the unknown vector from the set of equations in equation (9) is an ill-posed problem. However, we have the prior information that our input signal is sparse. Relying on recent work^35,39,41,42 dealing with the similar problem of finding sparse solutions to the phase-retrieval problem (which constitutes a quadratic compressed sensing problem)—we employ the GESPAR algorithm presented in³⁵. GESPAR was originally intended to solve the sparse phase-retrieval problem of recovering a sparse signal from measurements of its Fourier magnitude, but it can also be used to solve the more general sparse quadratic problem³⁵.

To find a sparse solution to equation (1), we use GESPAR with the set of sampling matrices of our system. The algorithm requires modifications to the formulation in³⁵ (in addition to defining these matrices to correspond to our system). The stages of sparsity-based Ankylography are summarized below (for a more detailed description of the GESPAR algorithm, see³⁵):

Algorithm: Ankylography GESPAR

Input: Measurements and sampling matrices.

Initialize: Set an empty support and an initial guess.

Loop: while the cost function improves or the support requirement is not yet satisfied, do

Support update:

Given the support s, minimizing the objective reduces to a nonlinear least-squares problem, which we solve with the damped-Gauss–Newton algorithm⁵¹ commonly used for problems of this type. The damped-Gauss–Newton procedure produces an estimate on that support.

For each element j, perform a local search for an index k_j of element j with a high absolute gradient value. Add k_j to the support, perform a damped-Gauss–Newton procedure on the new support and calculate the cost function.

Add one atom of the element that reduces the objective the most and that at the same time satisfies the constraint on the number of atoms of that element.

Index swapping:

Calculate the gradient of the cost function at the current estimate.

Perform a local search by index swapping: swap an index i_j of element j in the support, whose entry has a small absolute value, with an index k_j of element j that has a high absolute gradient value, where the gradient is calculated after zeroing the entry at index i_j. This step differs from GESPAR because the entries in sparsity-based Ankylography are correlated, whereas the basis functions in the original GESPAR are orthogonal. Perform a damped-Gauss–Newton procedure on the new support and calculate the cost function.

Go over all the elements, j=1, 2, 3…T, find the support that reduces the objective the most, and adopt it, together with its estimate, as the new support s. If the index-swapping step succeeded, repeat it.

Output: The estimated locations and amplitudes.
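The fit on a fixed support can be sketched as follows. This is a minimal illustration of a damped Gauss–Newton iteration, assuming a real-valued model y_i = xᵀA_i x and a simple step-halving rule for the damping. The function name `damped_gauss_newton`, its parameters and the test data are hypothetical and not taken from the original work.

```python
import numpy as np


def damped_gauss_newton(A, y, support, x0, iters=50, tol=1e-20):
    """Minimize sum_i (x^T A_i x - y_i)^2 over the entries of x in `support`.

    Entries outside the support are fixed to zero. Returns (x, cost).
    """
    S = np.asarray(support)
    As = A[:, S][:, :, S]                  # each A_i restricted to the support
    Asym = As + As.transpose(0, 2, 1)

    def residual(v):
        return np.einsum("j,ijk,k->i", v, As, v) - y

    z = np.asarray(x0, dtype=float)[S]
    r = residual(z)
    cost = r @ r
    for _ in range(iters):
        if cost <= tol:
            break
        J = Asym @ z                       # Jacobian of the residuals, (m, |S|)
        step = np.linalg.lstsq(J, -r, rcond=None)[0]   # Gauss-Newton direction
        t = 1.0
        while t > 1e-8:                    # damping: halve until cost decreases
            r_new = residual(z + t * step)
            if r_new @ r_new < cost:
                break
            t *= 0.5
        else:
            break                          # no decreasing step found: stop
        z, r, cost = z + t * step, r_new, r_new @ r_new
    x = np.zeros(A.shape[1])
    x[S] = z
    return x, cost


rng = np.random.default_rng(2)
m, n = 40, 8
A = rng.normal(size=(m, n, n))
support = [1, 4, 6]
x_true = np.zeros(n)
x_true[support] = [1.0, -0.7, 0.5]
y = np.einsum("j,ijk,k->i", x_true, A, x_true)

x0 = np.zeros(n)
x0[support] = x_true[support] + 0.1 * rng.normal(size=3)
_, initial_cost = damped_gauss_newton(A, y, support, x0, iters=0)
x_hat, final_cost = damped_gauss_newton(A, y, support, x0)
print(f"cost: {initial_cost:.3e} -> {final_cost:.3e}")
```

Because a step is accepted only when it lowers the cost, the returned cost never exceeds the starting cost. The global sign of x is not determined by quadratic measurements, so x and −x give the same cost.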

The difference between GESPAR and our sparsity-based Ankylography algorithm is that our problem contains constraints on the sub-vector of each element, which GESPAR does not have. Consequently, we apply GESPAR to every sub-vector separately and select the best choice. Another difference is that we calculate the gradient of the cost function after zeroing the corresponding entry for every index i_j.
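The per-element swap selection can be sketched as below. The sketch assumes that the unknown vector is a concatenation of one sub-vector per element and reuses the assumed least-squares objective over quadratic measurements. It shows only how a candidate swap is chosen. The refit on the new support and the test of whether the cost decreased are omitted. All names (`propose_swap`, `blocks`, the test data) are hypothetical.

```python
import numpy as np


def gradient(A, y, x):
    """Gradient of sum_i (x^T A_i x - y_i)^2 (assumed objective)."""
    r = np.einsum("j,ijk,k->i", x, A, x) - y
    J = (A + A.transpose(0, 2, 1)) @ x
    return 2.0 * J.T @ r


def propose_swap(A, y, x, support, block):
    """Propose one swap (i_out, k_in) inside one element's sub-vector.

    `block` holds the indices of that element's sub-vector. The swap stays
    inside the block, so the number of atoms per element is unchanged.
    """
    inside = [i for i in block if i in support]
    outside = [k for k in block if k not in support]
    if not inside or not outside:
        return None
    # Support index with the smallest absolute amplitude leaves the support.
    i_out = min(inside, key=lambda i: abs(x[i]))
    # The gradient is evaluated after zeroing that entry.
    x_zeroed = x.copy()
    x_zeroed[i_out] = 0.0
    g = gradient(A, y, x_zeroed)
    # Off-support index with the largest absolute gradient enters.
    k_in = max(outside, key=lambda k: abs(g[k]))
    return i_out, k_in


rng = np.random.default_rng(3)
blocks = [range(0, 4), range(4, 8)]     # two elements, four voxels each
A = rng.normal(size=(30, 8, 8))
x_true = np.zeros(8)
x_true[[0, 3, 5]] = [1.0, 0.6, -0.8]
y = np.einsum("j,ijk,k->i", x_true, A, x_true)

support = {0, 2, 5}                     # current (partly wrong) support
x = np.zeros(8)
x[[0, 2, 5]] = [1.0, 0.2, -0.8]

swaps = [propose_swap(A, y, x, support, b) for b in blocks]
i_out, k_in = swaps[0]
new_support = (support - {i_out}) | {k_in}
print("proposed swaps per element:", swaps)
```

In a full iteration each proposed support would be refit and compared by cost, and the best one over all elements would be kept.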

Additional information

How to cite this article: Mutzafi, M. et al. Sparsity-based Ankylography for Recovering 3D molecular structures from single-shot 2D scattered light intensity. Nat. Commun. 6:7950 doi: 10.1038/ncomms8950 (2015).

References

1. Wild, D. & Saqi, M. Structural proteomics: inferring function from protein structure. Curr. Proteomics 1, 59–65 (2004).
2. Watson, J. D., Laskowski, R. A. & Thornton, J. M. Predicting protein function from sequence and structural data. Curr. Opin. Struct. Biol. 15, 275–284 (2005).
3. Zhang, C. & Kim, S.-H. Overview of structural genomics: from structure to function. Curr. Opin. Chem. Biol. 7, 28–32 (2003).
4. Geerlof, A. et al. The impact of protein characterization in structural proteomics. Acta Crystallogr. D Biol. Crystallogr. 62, 1125–1136 (2006).
5. Powell, C. F., Oxley, J. H., Blocher, J. M. & Klerer, J. Vapor deposition. J. Electrochem. Soc. 113, 266C (1966).
6. Harwood, L. M. Experimental Organic Chemistry: Principles and Practice Blackwell Scientific Publications (1989).
7. Carpenter, E. P., Beis, K., Cameron, A. D. & Iwata, S. Overcoming the challenges of membrane protein crystallography. Curr. Opin. Struct. Biol. 18, 581–586 (2008).
8. Sayre, D. Imaging Processes and Coherence in Physics 229–235 Springer (1980).
9. Miao, J., Charalambous, P., Kirz, J. & Sayre, D. Extending the methodology of X-ray crystallography to allow imaging of micrometre-sized non-crystalline specimens. Nature 400, 342–344 (1999).
10. Raines, K. S. et al. Three-dimensional structure determination from a single view. Nature 463, 214–217 (2010).
11. Fung, R., Shneerson, V., Saldin, D. K. & Ourmazd, A. Structure from fleeting illumination of faint spinning objects in flight with application to single molecules. Nat. Phys. 5, 11 (2008).
12. Loh, N. T. D. & Elser, V. Reconstruction algorithm for single-particle diffraction imaging experiments. Phys. Rev. E 80, 1–20 (2009).
13. Geilhufe, J. et al. Extracting depth information of 3-dimensional structures from a single-view X-ray Fourier-transform hologram. Opt. Express 22, 24959–24969 (2014).
14. Starodub, D. et al. Single-particle structure determination by correlations of snapshot X-ray diffraction patterns. Nat. Commun. 3, 1276 (2012).
15. Loh, N. D. et al. Fractal morphology, imaging and mass spectrometry of single aerosol particles in flight. Nature 486, 513–517 (2012).
16. Wabnitz, H. et al. Multiple ionization of atom clusters by intense soft X-rays from a free-electron laser. Nature 420, 482–485 (2002).
17. Neutze, R., Wouts, R., van der Spoel, D., Weckert, E. & Hajdu, J. Potential for biomolecular imaging with femtosecond X-ray pulses. Nature 406, 752–757 (2000).
18. Geloni, G. et al. Coherence properties of the European XFEL. New J. Phys. 12, 035021 (2010).
19. Vartanyants, I. A. et al. Coherence properties of individual femtosecond pulses of an X-ray free-electron laser. Phys. Rev. Lett. 107, 144801 (2011).
20. Popmintchev, T. et al. Bright coherent ultrahigh harmonics in the keV X-ray regime from mid-infrared femtosecond lasers. Science 336, 1287–1291 (2012).
21. Born, M. & Wolf, E. Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light Cambridge University Press (1980).
22. Fienup, J. R. Reconstruction of an object from the modulus of its Fourier transform. Opt. Lett. 3, 27 (1978).
23. Fienup, J. R. Phase retrieval algorithms: a comparison. Appl. Opt. 21, 2758–2769 (1982).
24. Miao, J., Ishikawa, T., Robinson, I. K. & Murnane, M. M. Beyond crystallography: diffractive imaging using coherent X-ray light sources. Science 348, 530–535 (2015).
25. Shechtman, Y. et al. Phase retrieval with application to optical imaging: a contemporary overview. IEEE Signal Process. Mag. 32, 87–109 (2014).
26. Reich, E. Three-dimensional technique on trial. Nature 480, 303 (2011).
27. Miao, J., Chen, C. C.-C., Mao, Y., Martin, L. S. & Kapteyn, H. C. Potential and Challenge of Ankylography. Preprint at <http://arxiv.org/abs/1112.4459> (2011).
28. Thibault, P. Feasibility of 3D reconstructions from a single 2D diffraction measurement. Preprint at <http://arxiv.org/abs/0909.1643v1> (2009).
29. Miao, J. Response to ‘Feasibility of 3D reconstruction from a single 2D diffraction measurement’. Preprint at <http://arxiv.org/abs/0909.3500> (2009).
30. Miao, J. & Chen, C. 2nd Response to ‘Feasibility of 3D reconstruction from a single 2D diffraction measurement’. Preprint at <http://arxiv.org/abs/0910.0272> (2009).
31. Wang, G., Yu, H., Cong, W. & Katsevich, A. Non-uniqueness and instability of ‘Ankylography’. Nature 480, E2–E3 (2011).
32. Wei, H. Fundamental limits of ‘Ankylography’ due to dimensional deficiency. Nature 480, E1 (2011).
33. Xu, R. et al. Single-shot three-dimensional structure determination of nanocrystals with femtosecond X-ray free-electron laser pulses. Nat. Commun. 5, 4061 (2014).
34. Martin, L. S., Chen, C.-C. & Miao, J. Multi-Shell Ankylography. Preprint at <http://arxiv.org/abs/1311.4517> (2013).
35. Shechtman, Y., Beck, A. & Eldar, Y. C. GESPAR: efficient phase retrieval of sparse signals. IEEE Trans. Signal Process. 62, 928–938 (2014).
36. Herman, G. T. & Gabor, H. T. Fundamentals of Computerized Tomography: Image Reconstruction from Projections Springer (2009).
37. Miao, J., Förster, F. & Levi, O. Equally sloped tomography with oversampling reconstruction. Phys. Rev. B 72, 052103 (2005).
38. Gazit, S., Szameit, A., Eldar, Y. C. & Segev, M. Super-resolution and reconstruction of sparse sub-wavelength images. Opt. Express 17, 23920–23946 (2009).
39. Szameit, A. et al. Sparsity-based single-shot subwavelength coherent diffractive imaging. Nat. Mater. 11, 455–459 (2012).
40. Shechtman, Y., Gazit, S., Szameit, A., Eldar, Y. C. & Segev, M. Super-resolution and reconstruction of sparse images carried by incoherent light. Opt. Lett. 35, 1148–1150 (2010).
41. Shechtman, Y., Eldar, Y. C., Szameit, A. & Segev, M. Sparsity based sub-wavelength imaging with partially incoherent light via quadratic compressed sensing. Opt. Express 19, 14807–14822 (2011).
42. Shechtman, Y. et al. Sparsity-based super-resolution and phase-retrieval in waveguide arrays. Opt. Express 21, 24015–24024 (2013).
43. Candès, E., Romberg, J. & Tao, T. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52, 1–41 (2006).
44. Eldar, Y. C. & Kutyniok, G. Compressed Sensing: Theory and Applications Cambridge University Press (2012).
45. Donoho, D. L. For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 59, 797–829 (2006).
46. Cordero, B. et al. Covalent radii revisited. Dalton Trans. 2832–2838 (2008).
47. Zolotoyabko, E. Basic Concepts of X-Ray Diffraction John Wiley & Sons (2014).
48. Mukherjee, S. & Seelamantula, C. S. in Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 553–556 (Kyoto, 2012).
49. Serkez, S. et al. Proposal for a scheme to generate 10 TW-level femtosecond X-ray pulses for imaging single protein molecules at the European XFEL. Preprint at <http://arxiv.org/abs/1306.0804> (2013).
50. Hollingsworth, S. A. & Karplus, P. A. A fresh look at the Ramachandran plot and the occurrence of standard structures in proteins. Biomol. Concepts 1, 271–283 (2010).
51. Bertsekas, D. Nonlinear Programming Athena Scientific (1999).

Acknowledgements

This work was supported by the Focused Technology Area project on Nano Photonics for Detection and Sensing, by the ICORE Excellence Center ‘Circle of Light’, and by the Israel Science Foundation.

Author information

Authors and Affiliations

Physics Department and Solid State Institute, Technion, Haifa, 32000, Israel
Maor Mutzafi, Yoav Shechtman, Oren Cohen & Mordechai Segev
Department of Chemistry, Stanford University, 375 North-South Mall, Stanford, 94305, California, USA
Yoav Shechtman
Electrical Engineering Department, Technion, Haifa, 32000, Israel
Yonina C. Eldar

Contributions

All authors contributed considerably to the research.

Corresponding author

Correspondence to Mordechai Segev.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

Supplementary Methods and Supplementary References (PDF 411 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Mutzafi, M., Shechtman, Y., Eldar, Y. et al. Sparsity-based Ankylography for Recovering 3D molecular structures from single-shot 2D scattered light intensity. Nat Commun 6, 7950 (2015). https://doi.org/10.1038/ncomms8950

Received: 29 October 2014
Accepted: 30 June 2015
Published: 20 August 2015
DOI: https://doi.org/10.1038/ncomms8950
