Abstract
Deciphering the three-dimensional (3D) structure of complex molecules is of major importance, and is typically accomplished with X-ray crystallography. Unfortunately, many important molecules cannot be crystallized, hence their 3D structure is unknown. Ankylography presents an alternative, relying on scattering an ultrashort X-ray pulse off a single molecule before it disintegrates, measuring the far-field intensity on a two-dimensional surface, followed by computation. However, significant information is absent due to the lower dimensionality of the measurements and the inability to measure the phase. Recent Ankylography experiments attracted much interest, but it was counter-argued that Ankylography is valid only for objects containing a small number of volume pixels. Here, we propose a sparsity-based approach to reconstruct the 3D structure of molecules. Sparsity is natural for Ankylography, because molecules can be represented compactly in a stoichiometric basis. Utilizing sparsity, we surpass current limits on recoverable information by orders of magnitude, paving the way for deciphering the 3D structure of macromolecules.
Introduction
Recovering the three-dimensional (3D) structure of biological molecules is of paramount importance. For example, protein characterization plays a key role in the field of structural proteomics^{1,2,3}. Knowing the protein structure may provide further understanding of the function and mechanism even for proteins whose biochemical function is known^{4}. The main methodology used today to recover the 3D structure of molecules is X-ray crystallography, which requires crystallization of the probed molecules. This method relies on X-ray diffraction from a periodic structure, which averages over many molecules making up the ‘molecular crystal’. However, the molecules in such a structure are not situated in exactly the same position and alignment in all unit cells, hence this method fundamentally cannot provide sufficient resolution in the recovered 3D structure of the molecule. Moreover, there is an additional, even greater problem: while small molecules (having few degrees of conformational freedom) may be crystallized by various methods, such as chemical vapour deposition^{5} and recrystallization^{6}, for macromolecules, especially membrane proteins, crystallization is much more problematic^{7}. In fact, thus far, crystallization attempts have been unsuccessful for most of the membrane proteins; as such, the 3D structure of many biomolecules is still unknown^{7}. Clearly, developing a method that could decipher the 3D structure of a single protein molecule is nothing less than a dream. In fact, no current method can do that even in theory.
In the past few years, it has been proposed to study such molecules using imaging with X-ray laser pulses^{6,8,9,10}, whose wavelength provides the desired resolution. However, since X-ray light ionizes all biological molecules and changes their molecular structure, X-ray experiments on organic molecules cannot be carried out with continuous wave (CW) radiation. Rather, they have to be performed with ultrashort laser pulses. Moreover, biological molecules disintegrate after the first pulse, and therefore the information (scattered light) necessary for recovering the structure must be collected either in a single shot (the basis of Ankylography), or in multiple shots, each probing a new molecule of the same kind, followed by a calibration procedure (registration), since the molecule in each shot is inevitably rotated in 3D space. Such ideas have indeed been suggested^{11,12,13}. Experimentally, there were successful attempts using single-shot X-ray pulses scattered off aerosol particles, demonstrating the ability to determine the orientation of two large polystyrene spheres^{14} and finding the two-dimensional (2D) projection of several particles^{15}. Going back to a single biological molecule: when a single ultrashort X-ray pulse is launched at such a molecule, and the pulse is short enough, the flux of photons scattered off the molecule before it disintegrates carries the information about the structure^{16,17}. The 3D structure of the molecule can then be recovered algorithmically from this single measurement, in a process called Ankylography^{10}. This approach for deciphering the molecular structure relies on the ability to use ultrashort (femtosecond) laser pulses in the X-ray regime. Indeed, recent developments have enabled the construction of a new X-ray free electron laser (XFEL) facility, which emits a beam with high coherence and facilitates access to atomic scale imaging^{18,19}. In fact, the wavelength of X-ray laser flashes is so short that even atomic details may one day become discernible (λ∼0.5 Å–6 nm). Another source of ultrashort X-ray pulses is based on the high-harmonic generation process, which already enables coherent experiments in the X-ray regime^{20}.
Desirably, the coherent scattering measurements should be taken at the surface corresponding to the Ewald sphere^{21} (a sphere in the Fourier domain; see explanation in the Supplementary Information). But even in this case, such a single-shot 2D measurement is still missing a very large part of the information necessary to recover the 3D structure. Namely, the phase information is missing, and only 2D data are obtained. Therefore, Ankylography describes an algorithmic procedure whose goal is to recover 3D information from a single-shot magnitude-only measurement taken on a 2D surface corresponding to the Ewald sphere of the sought information. The algorithmic methodology of Ankylography relies on phase-retrieval algorithms, known for several decades^{22,23}, which have recently found their way into applications with coherent X-ray radiation^{24,25}. Still, achieving phase retrieval for 3D structures from 2D measurements, as Ankylography is attempting to do, is a formidable challenge.
In spite of these problems in trying to recover the 3D structure from highly incomplete measurements, a visionary proof-of-concept Ankylography experiment has recently been demonstrated, attracting much interest^{10}. However, the excitement has not been unanimous among researchers. For example, in a recent exchange in Nature, researchers compared the idea to pulling a 3D rabbit out of a 2D hat^{26}. The original Ankylographic method was believed to work only for objects containing <15^{3} voxels (volumetric picture elements)^{26,27}, but the original paper actually demonstrated the recovery of larger objects, with the current state of the art being 32 × 32 × 20 voxels. While researchers have not yet reached a consensus on the exact limits of Ankylography^{10}, serious doubts were cast on its feasibility^{28,29,30}, uniqueness and stability^{31,32}. Moreover, it was claimed that Ankylography will not work in the absence of additional constraints^{28}. Notwithstanding these important arguments, recent experiments have demonstrated good progress in Ankylography, but all under stringent assumptions on the symmetry of the recovered structures^{33} or with multiple measurements^{34}.
Here, we propose and numerically demonstrate a new algorithmic paradigm for reconstructing 3D objects from their scattered 2D intensity. Our approach is based on sparsity: prior knowledge that the information is sparse in a known basis. In our context, sparsity is manifested in the fact that the molecule effectively occupies a small number of degrees of freedom (d.f.) (because molecules are made of atoms), and that the chemical composition (stoichiometry) of the molecule is known. As such, the prior knowledge of sparsity can be utilized to recover the ‘signal’ from highly incomplete measurements. Using recently developed algorithmic tools for sparsity-based phase retrieval^{35}, we demonstrate numerically the ability to determine the atomic structures of various complex organic molecules, such as peptides. This illustrates that sparsity and optimization techniques enable surpassing current limits on the recoverable information in Ankylography by orders of magnitude. We test the performance of our methodology with respect to sparsity (number of atoms) and noise, and conclude that sparsity can pave the way to algorithmic reconstruction of the 3D structure of molecules from a single measurement of the photon flux in the optical far field.
Results
The sparsity-based concept
Before going into the mathematical details of sparsity-based Ankylography, let us explain the logic of our approach and its background. In Ankylography, a major part of the information is lost due to physical limitations, which leads to dimension deficiency and to a lack of phase information in the measured data. In the most general case, it is possible to recover 3D information by taking multiple projection measurements and applying appropriate signal processing. Common methods to do that include computed tomography^{36} (CT), equally sloped tomography^{37} and more. However, here, traditional methods to recover 3D information from 2D measurements cannot be employed, because they require multiple measurements from different projections, while in the current physical problem, multiple projections are extremely hard (if not impossible) to realize in experiments. This is the motivation for Ankylography: attempting to rely strictly on data acquired in a single-shot experiment, in spite of the fact that a large part of the information is missing in the measurements. This is where sparsity comes into play. As we show below, the underlying problem typically features only a small number of d.f. As such, our use of sparsity is natural, relying on the logic of our earlier work on sparsity-based sub-wavelength imaging^{38,39} and super-resolution^{40,41,42}, and on sparsity-based phase retrieval^{35,39,41,42}. The theoretical framework underlying the recovery procedure is borrowed from the emerging field of compressed sensing^{43,44,45}. The main theme of compressed sensing is to reduce the number of acquired measurements of a signal, while still being able to accurately recover it by relying on the fact that the signal is described by a small number of d.f. Here, our goal is to recover the complete information from an inherently incomplete set of (quadratic) measurements. To this end, we adapt the recently proposed sparsity-based phase-retrieval technique (called GESPAR^{35}) to our setting.
Physical setting and sparse representation of the physical signal
The general physical setting for Ankylography is illustrated in Fig. 1. A coherent ultrashort laser pulse of central wavelength λ and relatively narrow bandwidth δλ (such that δλ ≪ λ) is incident upon a molecule with a 3D structure we wish to recover. The light is scattered from the molecule within the ultrashort duration of the pulse (a few femtoseconds), but immediately thereafter the molecule disintegrates, such that the only measurements available are those taken from the scattered light in this single-shot experiment. The detectors can be positioned on the Ewald sphere or, more practically, a planar camera can be used and the curvature corrected for. The 3D effective potential (core electron charge density) of the molecule, which is the source for X-ray scattering, can be described as a sum over known basis functions. For simplicity, we describe each atom as a sphere with its covalent radius^{46}, although a more mathematically accurate description would be provided by a set of known, spherically symmetric functions^{47}. It is important to note that, in spite of the fact that we describe the molecule with simple basis functions, our methodology is general, and is not limited to a particular basis (see example in Fig. 5). In this scheme, the molecule resembles a set of hovering spheres. The 3D scattering potential of the molecule is given by the scalar function

$$f(r) = \sum_{j=1}^{T} \sum_{n=1}^{S_{j}} a_{n}^{j} \, U_{j}(r - r_{n}^{j}) \qquad (1)$$

where the first summation is over the T kinds of elements comprising the molecule, and U_{j}(r) is the potential related to the jth element. The second summation is over S_{j}, the number of atoms of the jth element, with r_{n}^{j} and a_{n}^{j} being the 3D position and amplitude of the nth atomic wavefunction of the jth element. Physically, a_{n}^{j} manifests the charge density in the core electrons of that atom, which is what scatters X-ray radiation^{46}. For example, U_{1} reflects the potential of the 1st element in the molecule (say, carbon), while r_{3}^{1} and a_{3}^{1} are the 3D position and the amplitude of the third carbon atom. Importantly, the chemical composition of the molecule is known (number of atoms of each element) from stoichiometry, as well as the covalent radius associated with each element. Hence, the only unknowns are the relative positions of the atoms, as described by the centre positions of the spheres, r_{n}^{j}, and the amplitudes a_{n}^{j}. Altogether, the number of unknowns in the problem is relatively small, which is why sparsity-based methods can be very effective.
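The hovering-spheres picture can be made concrete with a small numerical sketch. The Python snippet below voxelizes a sum of uniform spheres and counts the unknowns that actually describe it. The function name `sphere_potential`, the three-atom toy 'molecule', and its radii and amplitudes are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def sphere_potential(grid_pts, centers, radii, amplitudes):
    """Sum of uniform spheres: amplitude a_n inside radius R_n of centre r_n, else 0."""
    f = np.zeros(len(grid_pts))
    for c, R, a in zip(centers, radii, amplitudes):
        inside = np.linalg.norm(grid_pts - c, axis=1) <= R
        f[inside] += a
    return f

# Hypothetical toy molecule: two carbon-like spheres and one hydrogen-like sphere.
# Radii (angstrom) and amplitudes are illustrative only.
axis = np.linspace(-2.0, 2.0, 41)                      # 0.1 angstrom voxel spacing
X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
grid_pts = np.stack([X.ravel(), Y.ravel(), Z.ravel()], axis=1)

centers = np.array([[-0.8, 0.0, 0.0], [0.8, 0.0, 0.0], [0.0, 1.2, 0.0]])
radii = np.array([0.7, 0.7, 0.3])
amplitudes = np.array([1.0, 1.0, 0.5])

f = sphere_potential(grid_pts, centers, radii, amplitudes).reshape(X.shape)

# Each atom contributes three coordinates and one amplitude.
n_unknowns = centers.size + amplitudes.size
print(f.size, n_unknowns)   # 68921 voxels, but only 12 unknowns
```

The contrast between the voxel count and the number of unknowns is the sense in which the signal is sparse in the basis of element-specific spheres.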
The scattered light intensity, which corresponds to the measured data, resides on a spherical surface of a large radius centred on the molecule, in the far field of the 3D image. Theoretically, to first order in perturbation theory^{21}, the scattered field intensity is proportional to the absolute value squared of the 3D Fourier transform of the scattering potential, measured on the surface of a sphere called ‘the Ewald sphere’^{21}:

$$I(\theta, \phi) = \left| \int_{V} f(r) \, \exp\left[ -i \, q(\theta, \phi) \cdot r \right] \, d^{3}r \right|^{2}, \qquad q(\theta, \phi) = \frac{2\pi}{\lambda} \left( \sin\theta \cos\phi, \; \sin\theta \sin\phi, \; \cos\theta - 1 \right) \qquad (2)$$

Here, I(θ, ϕ) is proportional to the intensity of the electromagnetic waves as a function of the angles in spherical coordinates, measured relative to the incident wave direction (z), q(θ, ϕ) is the difference between the scattered and incident wave vectors, and r is the coordinate vector. The proportionality coefficient and further details are provided in the Supplementary Information section (equation (1) there). The integration is taken over the volume defined by the spatial extent of the object (V).
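As a hedged illustration of this measurement model, the sketch below evaluates the squared magnitude of the 3D Fourier transform of a sum of uniform spheres at points on the Ewald sphere, using the standard closed-form transform of a uniform sphere. The function names, the toy geometry and the convention q = (2π/λ)(ŝ − ẑ) for the momentum transfer are assumptions made for this example. It also shows one face of the missing-phase problem: a global translation of the molecule leaves the intensity unchanged.

```python
import numpy as np

def sphere_form_factor(q_abs, radius):
    """Fourier transform of a uniform, unit-amplitude sphere at |q| = q_abs."""
    q_abs = np.asarray(q_abs, dtype=float)
    x = q_abs * radius
    out = np.full(q_abs.shape, 4.0 / 3.0 * np.pi * radius**3)  # q -> 0 limit: sphere volume
    big = x > 1e-3
    out[big] = 4.0 * np.pi * (np.sin(x[big]) - x[big] * np.cos(x[big])) / q_abs[big] ** 3
    return out

def ewald_intensity(theta, phi, wavelength, centers, radii, amplitudes):
    """|3D Fourier transform|^2 of a sum of spheres, sampled on the Ewald sphere."""
    k = 2.0 * np.pi / wavelength
    s_hat = np.stack([np.sin(theta) * np.cos(phi),
                      np.sin(theta) * np.sin(phi),
                      np.cos(theta)], axis=-1)
    q_vec = k * (s_hat - np.array([0.0, 0.0, 1.0]))   # incident beam along z
    q_abs = np.linalg.norm(q_vec, axis=-1)
    field = np.zeros(q_abs.shape, dtype=complex)
    for c, R, a in zip(centers, radii, amplitudes):
        field += a * sphere_form_factor(q_abs, R) * np.exp(-1j * (q_vec @ c))
    return np.abs(field) ** 2

# Hypothetical toy geometry (angstrom); not taken from the paper.
centers = np.array([[-0.8, 0.0, 0.0], [0.8, 0.0, 0.0], [0.0, 1.2, 0.0]])
radii = np.array([0.7, 0.7, 0.3])
amplitudes = np.array([1.0, 1.0, 0.5])
theta = np.linspace(0.0, np.pi / 2, 6)
phi = np.linspace(0.0, np.pi, 6)

I = ewald_intensity(theta, phi, 0.35, centers, radii, amplitudes)
shift = np.array([0.3, -0.2, 0.1])
I_shifted = ewald_intensity(theta, phi, 0.35, centers + shift, radii, amplitudes)
print(np.allclose(I, I_shifted))   # True: intensity is blind to a global translation
```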
Problem formulation
To set up the problem as a sparse recovery problem, we define a 3D grid (of M sites) for the possible positions of each atom, repeating the grid for the T different elements separately. We arrange the unknowns in a vector x of size M·T (a ‘numeric column vector’: a series of values), whose entries are x = [x_{1}^{H}, x_{2}^{H}, …, x_{T}^{H}]^{H} (where the superscript H represents conjugate transpose). Here x_{j} is the vector of unknowns, of size M, associated with element j described on the M grid sites. The mth entry of x_{j} is x_{j}[m], where x_{j}[m] = 0 if no atom of element j resides at site m, while x_{j}[m] ≠ 0 means that such an atom resides at this site. The measurement vector is denoted by C, where the value of the lth entry, C_{l}, is proportional to the intensity at angles θ_{l} and ϕ_{l} (C_{l}=I(θ_{l}, ϕ_{l})), with L being the total number of measurements, namely, of the intensity readings in the detectors. The measurement of the lth detector is C_{l} = |F_{l}^{H} x|^{2}, where the vector F_{l} represents one (vector) term in the transfer function of the system, F, which is simply the 3D Fourier transform operator measured on the Ewald sphere.
With this notation, our mathematical problem can be described as follows:

$$\min_{x} \sum_{l=1}^{L} \left( |F_{l}^{H} x|^{2} - C_{l} \right)^{2} \quad \text{subject to} \quad \|x_{j}\|_{0} = S_{j}, \quad j = 1, \ldots, T \qquad (3)$$

where ‖·‖_{0} is the L_{0} norm, which counts the number of nonzero entries in the vector. Here, S_{j} is the number of atoms of the jth element, which is assumed to be known from stoichiometry. Note that equation (3) is a difficult problem to solve: there is no guarantee of a unique solution and, furthermore, no assured method to find a global minimum. This is where the power of the sparsity assumption comes in: the fact that the solution is known to be sparse (that is, that the S_{j} are small) allows us to utilize recently developed methods that solve sparse quadratic problems such as this one. Specifically, to find a sparse solution to equation (3), we use the GESPAR^{35} method with the set of matrices relevant to our problem. Sparsity-based Ankylography requires some modification to the formulation in ref. 35. Our algorithm is described in detail in the Methods section.
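To make the structure of this quadratic sparse-recovery problem concrete, the following sketch builds a small sensing matrix whose columns are indexed by (element, grid site), and evaluates a quadratic-measurement cost together with a per-element nonzero count. It is a minimal sketch under stated assumptions: the names `sensing_matrix`, `objective` and `matches_stoichiometry`, the random stand-in sampling vectors and the flat form factors are all hypothetical, and the cost is written in the generic sparse phase-retrieval form rather than copied from the paper.

```python
import numpy as np

def sensing_matrix(q_vecs, sites, form_factors):
    """Rows: detector samples l. Columns: element j, grid site m (column j*M + m)."""
    phases = np.exp(-1j * (q_vecs @ sites.T))                  # shape (L, M)
    return np.hstack([F_j[:, None] * phases for F_j in form_factors])

def objective(A, x, C):
    """Sum over detectors of (|A_l x|^2 - C_l)^2."""
    return float(np.sum((np.abs(A @ x) ** 2 - C) ** 2))

def matches_stoichiometry(x, M, counts):
    """True if block j of x has exactly counts[j] nonzero entries."""
    blocks = np.reshape(x, (len(counts), M))
    return all(np.count_nonzero(b) == S_j for b, S_j in zip(blocks, counts))

rng = np.random.default_rng(0)
g = np.arange(3.0)
sites = np.stack(np.meshgrid(g, g, g, indexing="ij"), axis=-1).reshape(-1, 3)  # M = 27
M, L = len(sites), 60
q_vecs = rng.normal(size=(L, 3))                  # stand-in for Ewald-sphere samples
form_factors = [np.ones(L), 0.5 * np.ones(L)]     # hypothetical flat form factors, T = 2
A = sensing_matrix(q_vecs, sites, form_factors)   # shape (L, M*T)

x_true = np.zeros(2 * M)
x_true[[3, 17]] = 1.0        # two atoms of element 0
x_true[M + 8] = 0.6          # one atom of element 1
C = np.abs(A @ x_true) ** 2  # noiseless intensity readings

print(objective(A, x_true, C), objective(A, -x_true, C))
print(matches_stoichiometry(x_true, M, counts=[2, 1]))
```

Both x_true and its negative fit the data equally well, a reminder that intensity-only measurements leave ambiguities that the cost alone cannot resolve.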
Comparing our sparsitybased technique with the HIO algorithm
A typical example is shown in Figs 2a and 3a, where we simulate the recovery of the 3D structure of the amino acid threonine. This molecule, sketched in Fig. 1 and displayed more clearly in Fig. 3a, has 17 atoms: four carbons (red spheres), one nitrogen (orange sphere), three oxygens (light green spheres) and nine hydrogens (dark blue). For clarity, we plot the 3D structure streamlined sequentially, and assign to it a one-dimensional grid index as shown in Fig. 2a. In this example, the position of each of the atoms is marked by a circle of its associated colour on the 9^{3} grid, where the grid is considerably denser than the radius of the smallest atom (see Supplementary Information). The vertical axis provides the amplitude, which reflects the effective charge density associated with each atom.
First, we test the ability of the current Ankylography algorithm (used in ref. 10) to recover the 3D structure of threonine. To do that, we use a slightly modified version of the algorithm used in ref. 10 and available at http://www.physics.ucla.edu/research/imaging/Ankylography/index.htm. Essentially, this is the standard hybrid input–output (HIO) method^{22,23}, which is commonly used for phase retrieval. As a model for the sought information, we insert the set of hovering spheres (defining threonine) into the algorithm. The HIO algorithm basically iterates Fourier transforms back and forth between the object and the Fourier domains, using the measured data (absolute value of the 3D Fourier transform), and applying prior knowledge on the ‘support’ of the object (the known region within which the molecule resides). When we attempt to use the HIO algorithm for Ankylography as in ref. 10, we have to represent the 3D information with 55^{3} voxels (volume pixels), for the sake of sufficient resolution. This attempt to reconstruct the 3D structure of threonine fails completely: as argued in the Comment^{26} and Reply^{27}, this method is not expected to work because the number of voxels greatly exceeds 32 × 32 × 20, which is the current state of the art in Ankylography. Following this unsuccessful attempt, it is natural to try using the HIO algorithm with the additional prior information that the object (the molecule) can be represented as a set of spheres. The result is shown in Fig. 2b: this attempt also fails, in spite of the additional prior information. The HIO algorithm converges to a solution occupying all the possible d.f. (that is, the grid in Fig. 2b is fully populated), which is clearly an erroneous solution.
This is where sparsity makes the big difference. In sharp contrast to the other attempts, our sparsity-based GESPAR algorithm provides excellent reconstruction, as shown by the reconstruction on the streamline grid in Fig. 2c, and by the visual 3D plot of Fig. 3b. See further details on the algorithm in the Methods section. Clearly, sparsity-based Ankylography can recover the 3D structure of molecules of much greater complexity and detail than ever anticipated from Ankylography.
It is essential at this point to elucidate the general role of sparsity (rather than the specific algorithm) in our successful reconstruction, where the standard HIO algorithm fails. To do that, we add sparsity to the HIO algorithm, as prior information. More specifically, in every iteration of the HIO algorithm, we enforce the 3D image to be sparse under the underlying basis functions (the set of spheres) by thresholding the coefficients^{48}. The result is shown in Figs 2d and 3c. Examining the result, it is clear that adding sparsity constraints to the HIO algorithm results in a huge improvement, but the reconstruction is still poor. Clearly, GESPAR outperforms the sparse HIO approach, consistent with ref. 35.
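For readers unfamiliar with HIO, the sketch below shows the generic iteration (Fourier-magnitude projection, then object-domain feedback) with an added hard-thresholding step that keeps only the k largest coefficients. It is a toy illustration under several assumptions: it works on a 2D image with the full Fourier magnitude rather than on Ewald-sphere data, it thresholds in the pixel basis rather than in a basis of spheres, and the function names and parameter values are hypothetical. It is not the modified code of ref. 10.

```python
import numpy as np

def project_magnitude(g, magnitude):
    """Keep the current Fourier phase, impose the measured Fourier magnitude."""
    G = np.fft.fft2(g)
    return np.fft.ifft2(magnitude * np.exp(1j * np.angle(G))).real

def keep_largest(g, k):
    """Hard threshold: zero all but the k largest-magnitude entries."""
    out = np.zeros_like(g)
    idx = np.argsort(np.abs(g), axis=None)[-k:]
    out.flat[idx] = g.flat[idx]
    return out

def sparse_hio(magnitude, support, k, n_iter=200, beta=0.9, seed=0):
    """HIO with a sparsity threshold applied after each Fourier-domain projection."""
    rng = np.random.default_rng(seed)
    g = rng.random(magnitude.shape) * support
    for _ in range(n_iter):
        g_proj = keep_largest(project_magnitude(g, magnitude), k)
        ok = support & (g_proj >= 0)                 # object-domain constraints hold
        g = np.where(ok, g_proj, g - beta * g_proj)  # HIO feedback where they fail
    return keep_largest(project_magnitude(g, magnitude) * support, k)

# Toy object: three point-like scatterers inside a known square support.
truth = np.zeros((16, 16))
truth[5, 6], truth[8, 9], truth[10, 4] = 1.0, 0.7, 0.4
magnitude = np.abs(np.fft.fft2(truth))
support = np.zeros((16, 16), dtype=bool)
support[3:13, 3:13] = True

rec = sparse_hio(magnitude, support, k=3)
print(np.count_nonzero(rec))
```

The output is guaranteed to be sparse and confined to the support, but nothing in this sketch guarantees that it coincides with the true object; that gap is the point the comparison with GESPAR addresses.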
Following this example, and many other examples we have simulated, several conclusions can be drawn. First, Ankylography features a small number of d.f., hence it is amenable to algorithmic methods relying on sparsity. Second, our current sparsity-based phase-retrieval algorithmic methodology enables the recovery of the 3D structure of molecules occupying two orders of magnitude more voxels than what Ankylography can handle without sparsity^{10}. In fact, our sparsity-based method has no upper limit on the size of the molecules. Last but not least, we emphasize that it is indeed the sparsity concept that makes this recovery possible, as we have shown that adding sparsity to standard methods considerably improves their performance. Altogether, it is clear that sparsity significantly improves Ankylography, making it a highly promising method in the next generation of structural biology experiments.
Performance of our sparsity-based Ankylography algorithm
With noise robustness being a major concern regarding the performance limits of Ankylography, it is essential to study the performance of our technique in a statistical fashion, in terms of the level of sparsity and permissible noise levels. To do that, we test our algorithm on 600 examples, under different conditions of signal-to-noise ratio (SNR) and optical wavelength. Importantly, we examine the algorithm in a realistic scenario, where the molecule is not restricted to any particular grid, while the recovery is made on a fine 3D grid (4 × 121^{3} basis functions), such that the radius of the smallest atom (hydrogen) is three times larger than the grid unit. Further details on these simulations are provided in the Methods section. The results are shown in Fig. 4, which displays the reconstruction error as a function of sparsity (total number of atoms). Here, the normalized reconstruction error (a number between 0 and 1) is defined as
where ε is the error, f_{source} is the original 3D image (defined above), f_{recovery} is the image recovered from the 2D intensity pattern given by equation (2), and ⟨·,·⟩ is the inner product operator. In these simulations, we use white noise (added to I(θ,ϕ)) distributed uniformly on the sphere defining the measured data (assuming the noise originates from isotropic volume scattering). The noise level, N, is defined as the fraction of noise to the total power of the scattered light (the measurements). The values we use in the simulations yield an SNR that is much smaller than the SNR taken in ref. 10. Figure 4a shows the reconstruction error as a function of sparsity for three wavelengths. As expected, the performance is better at shorter wavelengths, which yield higher resolution. Figure 4b shows the reconstruction error as a function of sparsity for various noise levels at a wavelength of 0.35 Å. This wavelength is chosen such that it corresponds to 1/3 of the finest resolution of our information (the smallest distance between centres of spheres). The conclusion drawn from these figures is that our method works well under realistic conditions. For example, for a noise level of N=0.001 and λ=0.35 Å, the algorithm performs well as long as the total number of atoms is smaller than ∼20.
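A sketch of how such a statistical test might be scored is given below. Both definitions in it are assumptions made for illustration: the error is taken as one minus the normalized inner product of the two images, which is one standard quantity bounded between 0 and 1, and the noise level is taken as the ratio of summed squared noise to summed squared intensity. The paper's exact definitions may differ, and the names `normalized_error` and `add_white_noise` are hypothetical.

```python
import numpy as np

def normalized_error(f_source, f_recovery):
    """Assumed form: 1 - |<f_source, f_recovery>| / (||f_source|| ||f_recovery||)."""
    a = np.ravel(f_source).astype(float)
    b = np.ravel(f_recovery).astype(float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 1.0                       # an empty image matches nothing
    return float(1.0 - min(1.0, abs(a @ b) / denom))

def add_white_noise(intensity, noise_level, rng):
    """Add zero-mean white noise scaled so that (noise power) / (signal power) = noise_level.

    'Power' is assumed here to mean the sum of squared values over all detectors.
    """
    noise = rng.standard_normal(intensity.shape)
    noise *= np.sqrt(noise_level * np.sum(intensity**2) / np.sum(noise**2))
    return intensity + noise

rng = np.random.default_rng(0)
clean = rng.random(500)                  # stand-in intensity readings
noisy = add_white_noise(clean, 0.001, rng)

f_src = np.array([1.0, 2.0, 0.0, 3.0])
print(normalized_error(f_src, f_src), normalized_error(f_src, f_src[::-1]))
```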
Discussion
The simulations indicate that increasing the SNR is of major importance. The challenge in doing that is the photon flux at short X-ray wavelengths. The current XFEL emits ∼10^{12} photons at every pulse; however, numerical simulations have indicated that the combination of self-seeding and undulator tapering techniques can increase the pulse intensity by two orders of magnitude^{49}. Furthermore, since photons are bosons, there is no fundamental limit on the pulse intensity, and it is expected that the intensity of XFEL will continue to increase as new techniques are being developed. As such, the SNR within which our sparsity-based method can recover structures of single molecules is within reach in the near future. Importantly, proteins have large scattering cross-sections, scattering more photons than a single amino acid, which makes our method viable especially for proteins (which contain multiple amino acids) with the present technology.
Finally, we note that our sparsity-based method is demonstrated here for the case where the signal sparsity corresponds to a small number of atoms. However, the method is applicable to much more general scenarios, for example, when the sparsity is in the number of amino acids, as is the case for many proteins of interest. In this case, the different ‘building blocks’ to be localized and oriented are the known amino acids from which the protein is composed, optimally along with their possible conformations. Figure 5 shows the reconstruction of the 3D structure of a peptide molecule, which is a combination of amino acids with peptide bonds. The molecule is a tripeptide composed of two glycine amino acids and one alanine amino acid. To reconstruct this structure, we use our sparsity-based procedure implemented on the basis of amino acids that spans all positions and rotations. Importantly, using additional prior knowledge on the amino acid bonds (protein conformation such as the Ramachandran plot^{50}) and assigning a binary value to every basis element (instead of determining their amplitude) further reduces the number of d.f. dramatically, and can allow the reconstruction of significantly larger structures than what we show in Fig. 5. Our method is actually expected to perform much better for large proteins because these have larger cross-sections, and therefore scatter more photons and increase the SNR. Moreover, the recovery of structures made of large basis elements is possible with a longer wavelength; hence, amino acids with a typical size of several angstroms will require the wavelength to be of the same order, up to 10 times larger than for Figs 2, 3 and 4, where the X-ray laser technology is more mature.
In conclusion, we suggest a new approach to recover the 3D structure of molecules using Ankylography. Our sparsity-based methodology enables deciphering 3D structures of biomolecules in a single-shot X-ray laser pulse, and exceeds the current limit of recovered information by orders of magnitude. We have demonstrated the reconstruction of a single amino acid and of a tripeptide, with the recovery methodology implemented on the basis of amino acids. These examples highlight the strength of the sparsity-based Ankylography concept and also suggest that it is actually easier to apply it to larger objects. The last example demonstrates the generality of sparsity-based Ankylography and provides an avenue for the future of structural biology. With that, sparsity-based Ankylography can reach the level at which it can overcome the current bottleneck of structural biology.
Methods
Mathematical formulation
Our mathematical problem amounts to the reconstruction of a 3D sparse signal from the Fourier magnitude on the Ewald sphere.
Of course, when the majority of the information is lost, precise reconstruction is not possible, unless we have, or may assume, some additional information about the sought signal. In fact, the problem is even more difficult as the measurements contain noise. We assume that the scattered electromagnetic field can be approximated adequately (hereafter, this relation is denoted by ≅) by means of known generating functions describing spheres U_{j}(r) of radius R^{j}. In other words, we want to reconstruct a 3D optical image assuming that it is comprised of a small known number of (different) spheres of known radii, as described in the Results section. Every kind of sphere U_{j}(r) (identified by its radius R^{j}) corresponds to a different atomic element (j, in this case), where the elements are known, as is the number of atoms of each element. We emphasize that the reconstruction is done on a 3D grid, while the spheres themselves do not need to reside on any grid at all: they reflect the actual structure of the molecule, which does not necessarily reside on a known grid.
As we already mentioned, the input to the algorithm is the 3D Fourier magnitude, sampled on the Ewald sphere. The output of the algorithm should be the positions of all of the atoms, and their corresponding radii.
Mathematically, the molecule is defined in equation (1), which in the spatial frequency domain yields
where, is Bessel function of order n. We define as generating functions in the frequency domain.
As described in the Results section, we define a 3D grid (of M sites) by the set for the possible positions of each atom for T different elements, and a set of sampling points (spatial frequencies) (related to the angles on the Ewald sphere), defined as . We arrange the unknowns in a vector (of size M·T), whose entries are (where the superscript H represents conjugate transpose), where is the vector of unknowns associated with element j described on the M grid sites. Here, the mth entry in represents the amplitude of jth element at the mth site. Note that not necessarily reside on the grid . But if it is on the grid, then we can represent as an inner product. To do that we define the vector and . The measured signal is therefore
While the sensing matrix is
where, . The rows of relate to sampling frequencies (total of L rows) and the columns correspond to different elements (total of M·T columns). For example, the (l, M+11) entry, , is related to a sphere of element #2 located at q_{11} and sampled at the spatial frequency ν_{l}. The measurements vector is denoted by , where the value of the lth entry, .
Now that we have mathematical representation of the measurements, we consider additional (spatially independent) white noise.
The noise level, , is the fraction of the noise power to the total power of the scattered light in the measurements surface, where (where is the expectation value) and . The signal power taken here also includes the scattered field at small angles θ (low spatial frequencies on the Ewald sphere), which cannot be measured (because the detectors at those angles are saturated by the incident light beam) but carry most of the energy. The values we use for the noise in the simulations yield SNR that is much smaller than the SNR taken in^{10}, yet, as shown in the Results section, our sparsitybased approach is able to recover the 3D structures much better and with information capacity larger by orders of magnitude.
Technically, we seek the vector that conforms to the measurements (equation (9)), and at the same time has a known number of units of each element, for example, five atoms of the element carbon. We define the objective as
and solve the following optimization problem:
For the sake of further use, the derivative of the objective is calculated below.
Description of the algorithm
In order to solve this problem, we use a modified version of a new efficient (greedy) technique for sparsity-based phase retrieval, called GESPAR. The recovery of the unknown vector from the set of equations in equation (9) is an ill-posed problem. However, we have the prior information that our input signal is sparse. Relying on recent work^{35,39,41,42} dealing with the similar problem of finding sparse solutions to the phase-retrieval problem (which constitutes a quadratic compressed sensing problem), we employ the GESPAR algorithm presented in ref. 35. GESPAR was originally intended to solve the sparse phase-retrieval problem of recovering a sparse signal from measurements of its Fourier magnitude, but it can also be used to solve the more general sparse quadratic problem^{35}.
In order to find a sparse solution to equation (1), we use GESPAR with the set of matrices . The algorithm requires modification to the formulation in^{35} (in addition to defining to correspond to our system). The stages in sparsitybased Ankylography are summarized below (for a more detailed description of the GESPAR algorithm see^{35}):
Algorithm: Ankylography GESPAR
Input: Measurements and sampling matrices .
Initialize: Set empty support and initial guess .
Loop: while, the cost function is improved (that is, ) or support requirement is not satisfied yet (that is, ) do
Support update:
Given the support s, minimizing the objective reduces to a nonlinear least-squares problem, which we solve by the damped Gauss–Newton algorithm^{51} commonly used for this type of problem. The damped Gauss–Newton procedure produces an estimate of the unknown vector on that support.
Perform a local search for an index k_{j} of element j containing a high absolute gradient value. Add k_{j} to the support, perform a damped Gauss–Newton procedure given the new support, and calculate the cost function.
Add one atom of the element that minimizes the objective the most, and that at the same time satisfies the constraint on the number of atoms of that element.
Index swapping:
Calculate the gradient of the cost function around the current estimate.
Perform a local search by index swapping: for each element j, swap an index i_{j} from the support, at which the estimate has a small absolute value, with an index k_{j} of the same element at which the gradient has a large absolute value, where the gradient is calculated after zeroing the entry at index i_{j}. This step differs from GESPAR because the different entries in sparsity-based Ankylography are correlated, whereas the basis functions in the original GESPAR are orthogonal. Perform a damped Gauss–Newton procedure on the new support and calculate the cost function.
Go over all the elements, j = 1, 2, 3, …, T, find the support that reduces the objective the most, and adopt it, together with its estimate, as the new support s. If the index-swapping step succeeds, repeat it.
Output: the estimated locations and amplitudes.
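The damped Gauss–Newton refinement on a fixed support, used in both the support-update and index-swapping steps, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the quadratic measurement model y_i = x^T A_i x with symmetric A_i, and it uses simple step halving as the damping rule. The names `cost` and `damped_gauss_newton` and all parameter values are hypothetical.

```python
import numpy as np

def cost(z, mats, y):
    """Sum of squared residuals of the quadratic measurements z^T A_i z."""
    r = np.array([z @ A @ z for A in mats]) - y
    return float(r @ r)

def damped_gauss_newton(mats, y, support, z0, iters=50, tol=1e-14):
    """Refine the amplitudes on a fixed support (assumed symmetric A_i)."""
    # Restrict every sampling matrix to the rows and columns in the support.
    sub = [A[np.ix_(support, support)] for A in mats]
    z = np.asarray(z0, dtype=float).copy()
    for _ in range(iters):
        r = np.array([z @ A @ z for A in sub]) - y      # residuals
        J = np.array([2.0 * (A @ z) for A in sub])      # Jacobian of residuals
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)   # Gauss-Newton direction
        f0, t = float(r @ r), 1.0
        # Damping: halve the step until the cost strictly decreases.
        while t > 1e-8 and cost(z + t * step, sub, y) >= f0:
            t *= 0.5
        if t <= 1e-8:
            break  # no descent step found; keep the current estimate
        z = z + t * step
        if f0 - cost(z, sub, y) < tol:
            break  # improvement is negligible
    return z
```

The routine returns only the amplitudes on the given support; entries outside the support are implicitly zero. Because the objective is non-convex, the result depends on the starting point `z0`.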
Our sparsity-based Ankylography algorithm differs from GESPAR in two ways. First, our problem contains constraints on the subvector of each element, which GESPAR does not have; consequently, we apply GESPAR to every subvector separately and select the best choice. Second, we calculate the gradient of the cost function after zeroing the entry at every index i_{j}.
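The two differences described above can be illustrated with a short sketch of how swap candidates might be proposed per element. It assumes that the unknown vector is a concatenation of per-element subvectors, described by a label array `element_of`, and that the objective is the sum of squared quadratic residuals with symmetric matrices. The function names and the data layout are assumptions for illustration, not the authors' code.

```python
import numpy as np

def objective(x, mats, y):
    """Assumed objective: sum_i (x^T A_i x - y_i)^2."""
    return float(sum((x @ A @ x - yi) ** 2 for A, yi in zip(mats, y)))

def gradient(x, mats, y):
    """Gradient of the assumed objective for symmetric A_i."""
    g = np.zeros_like(x)
    for A, yi in zip(mats, y):
        Ax = A @ x
        g += 4.0 * (x @ Ax - yi) * Ax
    return g

def swap_candidates(x, support, element_of, mats, y):
    """For each element j, propose swapping index i_j (out) with k_j (in)."""
    in_support = set(support)
    proposals = []
    for j in np.unique(element_of):
        inside = [i for i in sorted(in_support) if element_of[i] == j]
        outside = [k for k in range(len(x))
                   if element_of[k] == j and k not in in_support]
        if not inside or not outside:
            continue
        # i_j: support index of element j with the smallest amplitude.
        i_j = min(inside, key=lambda i: abs(x[i]))
        # The gradient is evaluated after zeroing the entry at i_j.
        x_zeroed = x.copy()
        x_zeroed[i_j] = 0.0
        g = gradient(x_zeroed, mats, y)
        # k_j: off-support index of the same element with the largest gradient.
        k_j = max(outside, key=lambda k: abs(g[k]))
        proposals.append((int(j), i_j, k_j))
    return proposals
```

Because i_j and k_j always belong to the same element, every proposed swap preserves the number of atoms per element. In a full algorithm, each proposal would be refined on its new support and the one with the lowest cost would be kept.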
Additional information
How to cite this article: Mutzafi, M. et al. Sparsity-based Ankylography for Recovering 3D molecular structures from single-shot 2D scattered light intensity. Nat. Commun. 6:7950 doi: 10.1038/ncomms8950 (2015).
References
1. Wild, D. & Saqi, M. Structural proteomics: inferring function from protein structure. Curr. Proteomics 1, 59–65 (2004).
2. Watson, J. D., Laskowski, R. A. & Thornton, J. M. Predicting protein function from sequence and structural data. Curr. Opin. Struct. Biol. 15, 275–284 (2005).
3. Zhang, C. & Kim, S.-H. Overview of structural genomics: from structure to function. Curr. Opin. Chem. Biol. 7, 28–32 (2003).
4. Geerlof, A. et al. The impact of protein characterization in structural proteomics. Acta Crystallogr. D Biol. Crystallogr. 62, 1125–1136 (2006).
5. Powell, C. F., Oxley, J. H., Blocher, J. M. & Klerer, J. Vapor deposition. J. Electrochem. Soc. 113, 266C (1966).
6. Harwood, L. M. Experimental Organic Chemistry: Principles and Practice Blackwell Scientific Publications (1989).
7. Carpenter, E. P., Beis, K., Cameron, A. D. & Iwata, S. Overcoming the challenges of membrane protein crystallography. Curr. Opin. Struct. Biol. 18, 581–586 (2008).
8. Sayre, D. Imaging Processes and Coherence in Physics 229–235 Springer (1980).
9. Miao, J., Charalambous, P., Kirz, J. & Sayre, D. Extending the methodology of X-ray crystallography to allow imaging of micrometre-sized non-crystalline specimens. Nature 400, 342–344 (1999).
10. Raines, K. S. et al. Three-dimensional structure determination from a single view. Nature 463, 214–217 (2010).
11. Fung, R., Shneerson, V., Saldin, D. K. & Ourmazd, A. Structure from fleeting illumination of faint spinning objects in flight with application to single molecules. Nat. Phys. 5, 11 (2008).
12. Loh, N. T. D. & Elser, V. Reconstruction algorithm for single-particle diffraction imaging experiments. Phys. Rev. E 80, 1–20 (2009).
13. Geilhufe, J. et al. Extracting depth information of 3-dimensional structures from a single-view X-ray Fourier-transform hologram. Opt. Express 22, 24959–24969 (2014).
14. Starodub, D. et al. Single-particle structure determination by correlations of snapshot X-ray diffraction patterns. Nat. Commun. 3, 1276 (2012).
15. Loh, N. D. et al. Fractal morphology, imaging and mass spectrometry of single aerosol particles in flight. Nature 486, 513–517 (2012).
16. Wabnitz, H. et al. Multiple ionization of atom clusters by intense soft X-rays from a free-electron laser. Nature 420, 482–485 (2002).
17. Neutze, R., Wouts, R., van der Spoel, D., Weckert, E. & Hajdu, J. Potential for biomolecular imaging with femtosecond X-ray pulses. Nature 406, 752–757 (2000).
18. Geloni, G. et al. Coherence properties of the European XFEL. New J. Phys. 12, 035021 (2010).
19. Vartanyants, I. A. et al. Coherence properties of individual femtosecond pulses of an X-ray free-electron laser. Phys. Rev. Lett. 107, 144801 (2011).
20. Popmintchev, T. et al. Bright coherent ultrahigh harmonics in the keV X-ray regime from mid-infrared femtosecond lasers. Science 336, 1287–1291 (2012).
21. Born, M. & Wolf, E. Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light Cambridge University Press (1980).
22. Fienup, J. R. Reconstruction of an object from the modulus of its Fourier transform. Opt. Lett. 3, 27 (1978).
23. Fienup, J. R. Phase retrieval algorithms: a comparison. Appl. Opt. 21, 2758–2769 (1982).
24. Miao, J., Ishikawa, T., Robinson, I. K. & Murnane, M. M. Beyond crystallography: diffractive imaging using coherent X-ray light sources. Science 348, 530–535 (2015).
25. Shechtman, Y. et al. Phase retrieval with application to optical imaging: a contemporary overview. IEEE Signal Process. Mag. 32, 87–109 (2014).
26. Reich, E. Three-dimensional technique on trial. Nature 480, 303 (2011).
27. Miao, J., Chen, C.-C., Mao, Y., Martin, L. S. & Kapteyn, H. C. Potential and Challenge of Ankylography. Preprint at <http://arxiv.org/abs/1112.4459> (2011).
28. Thibault, P. Feasibility of 3D reconstructions from a single 2D diffraction measurement. Preprint at <http://arxiv.org/abs/0909.1643v1> (2009).
29. Miao, J. Response to ‘Feasibility of 3D reconstruction from a single 2D diffraction measurement’. Preprint at <http://arxiv.org/abs/0909.3500> (2009).
30. Miao, J. & Chen, C. 2nd Response to ‘Feasibility of 3D reconstruction from a single 2D diffraction measurement’. Preprint at <http://arxiv.org/abs/0910.0272> (2009).
31. Wang, G., Yu, H., Cong, W. & Katsevich, A. Non-uniqueness and instability of ‘Ankylography’. Nature 480, E2–E3 (2011).
32. Wei, H. Fundamental limits of ‘Ankylography’ due to dimensional deficiency. Nature 480, E1 (2011).
33. Xu, R. et al. Single-shot three-dimensional structure determination of nanocrystals with femtosecond X-ray free-electron laser pulses. Nat. Commun. 5, 4061 (2014).
34. Martin, L. S., Chen, C.-C. & Miao, J. Multi-Shell Ankylography. Preprint at <http://arxiv.org/abs/1311.4517> (2013).
35. Shechtman, Y., Beck, A. & Eldar, Y. C. GESPAR: efficient phase retrieval of sparse signals. IEEE Trans. Signal Process. 62, 928–938 (2014).
36. Herman, G. T. & Gabor, H. T. Fundamentals of Computerized Tomography: Image Reconstruction from Projections Springer (2009).
37. Miao, J., Förster, F. & Levi, O. Equally sloped tomography with oversampling reconstruction. Phys. Rev. B 72, 052103 (2005).
38. Gazit, S., Szameit, A., Eldar, Y. C. & Segev, M. Super-resolution and reconstruction of sparse subwavelength images. Opt. Express 17, 23920–23946 (2009).
39. Szameit, A. et al. Sparsity-based single-shot subwavelength coherent diffractive imaging. Nat. Mater. 11, 455–459 (2012).
40. Shechtman, Y., Gazit, S., Szameit, A., Eldar, Y. C. & Segev, M. Super-resolution and reconstruction of sparse images carried by incoherent light. Opt. Lett. 35, 1148–1150 (2010).
41. Shechtman, Y., Eldar, Y. C., Szameit, A. & Segev, M. Sparsity based subwavelength imaging with partially incoherent light via quadratic compressed sensing. Opt. Express 19, 14807–14822 (2011).
42. Shechtman, Y. et al. Sparsity-based super-resolution and phase-retrieval in waveguide arrays. Opt. Express 21, 24015–24024 (2013).
43. Candès, E., Romberg, J. & Tao, T. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52, 1–41 (2006).
44. Eldar, Y. C. & Kutyniok, G. Compressed Sensing: Theory and Applications Cambridge University Press (2012).
45. Donoho, D. L. For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 59, 797–829 (2006).
46. Cordero, B. et al. Covalent radii revisited. Dalton Trans. 2832–2838 (2008).
47. Zolotoyabko, E. Basic Concepts of X-Ray Diffraction John Wiley & Sons (2014).
48. Mukherjee, S. & Seelamantula, C. S. in Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 553–556 (Kyoto, 2012).
49. Serkez, S. et al. Proposal for a scheme to generate 10 TW-level femtosecond X-ray pulses for imaging single protein molecules at the European XFEL. Preprint at <http://arxiv.org/abs/1306.0804> (2013).
50. Hollingsworth, S. A. & Karplus, P. A. A fresh look at the Ramachandran plot and the occurrence of standard structures in proteins. Biomol. Concepts 1, 271–283 (2010).
51. Bertsekas, D. Nonlinear Programming Athena Scientific (1999).
Acknowledgements
This work was supported by the Focused Technology Area project on Nano Photonics for Detection and Sensing, by the I-CORE Excellence Center ‘Circle of Light’, and by the Israel Science Foundation.
Author information
Contributions
All authors contributed considerably to the research.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Information
Supplementary Methods and Supplementary References (PDF 411 kb)
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
About this article
Cite this article
Mutzafi, M., Shechtman, Y., Eldar, Y. et al. Sparsity-based Ankylography for Recovering 3D molecular structures from single-shot 2D scattered light intensity. Nat Commun 6, 7950 (2015). https://doi.org/10.1038/ncomms8950
DOI: https://doi.org/10.1038/ncomms8950