Introduction

Ultrafast molecular gas-phase diffraction, from either x-rays1,2 or electrons3,4,5,6, is a vital tool for retrieving time-dependent molecular structures. In elastic molecular gas-phase diffraction experiments, x-rays or electrons scatter off of electrons and nuclei, with differing proportionality. Each pairwise atomic distance creates a pattern of scattered x-rays or electrons as a function of their transverse momentum q. The measured diffraction pattern is the sum of all such contributions, orientationally averaged over the lab frame ensemble distribution. This averaging discards pairwise directional information and, with it, the ability to explicitly distinguish individual atomic distances. Consequently, directly inverting diffraction patterns for the molecular structure is generally intractable: it is a so-called inverse problem. Typically, we avoid this inverse problem and retrieve both the molecular structures and the molecular frame orientations by simulating the forward excited state process. These are generally time-dependent ab initio electronic and vibrational excited state simulations that explore a large parameter space (rovibration, structure, and electronic state) with trajectory bifurcations due to effects like conical intersections7,8,9,10. We refer to such simulations as complex simulations; they are typically validated through comparisons with measured diffraction patterns or pair-distribution functions (PDFs: weighted histograms of pairwise distances). Consequently, ultrafast gas-phase diffraction is generally limited by the ability to perform these complex simulations. We aim to expand diffraction measurements to high-resolution reconstructions of the molecular structure probability distribution \(|\Psi ({\bf{R}},t){|}^{2}\) without relying on complex molecular dynamics simulations, by effectively solving this inverse problem with a statistical interpretation.

A variety of studies have sought to reduce reliance on complex simulations, but they are either limited in the systems they address or quickly run into the curse of dimensionality. Fourier transforming the time dependence exposes dissociative and vibronic signals11,12,13, but it is insensitive to classes of isomerizations. Methods employing ensemble anisotropy have garnered much interest14,15,16,17,18,19,20,21,22,23, yet they struggle to reach sub-Angstrom resolution and the full 3d structure for generic molecular structures. Optimization methods, while capable of exposing large-scale motion, are susceptible to local minima21. Pattern matching measured data against sampled isomers24,25,26 becomes intractable for moderately large molecules due to the curse of dimensionality. For example, a molecule with \({N}_{{\rm{atoms}}}\) atoms has \(3{N}_{{\rm{atoms}}}-6\) degrees of freedom. Independently sampling each degree of freedom 10 times would require \(1{0}^{3{N}_{{\rm{atoms}}}-6}\) structures, which becomes intractable for molecules with 7 or more atoms. Simulations reduce the structure-space of isomers to select from, but this trade-off requires previous knowledge24 that potentially imparts biases.
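The grid-sampling estimate above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name `n_grid_structures` is hypothetical and simply evaluates the count quoted in the text for 10 samples per degree of freedom.

```python
def n_grid_structures(n_atoms: int, samples_per_dof: int = 10) -> int:
    """Number of structures needed to sample every internal degree of freedom
    independently, using the 3N - 6 count quoted in the text."""
    if n_atoms < 3:
        raise ValueError("the 3N - 6 count needs at least 3 atoms")
    n_dof = 3 * n_atoms - 6
    return samples_per_dof ** n_dof


# Growth of the structure count with molecule size.
for n_atoms in (3, 5, 7, 10):
    print(n_atoms, n_grid_structures(n_atoms))
```

For a triatomic the grid has only 1000 structures, whereas 7 atoms already require 10^15, which illustrates why triatomics remain tractable for conventional pattern matching.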

We employ insights from molecular ensemble anisotropy methods, applied statistics, and machine learning principles to address the inverse problem and the curse of dimensionality, and to approximate the molecular structure probability density \(|\Psi ({\bf{R}},t){|}^{2}\). It is important to note that instead of sampling individual molecular structures and comparing single structures to the measured data, we sample entire \(|\Psi ({\bf{R}},t){|}^{2}\) probability distributions. We access the molecular frame by decomposing measured data onto anisotropic components. Then, we iteratively approximate \(|\Psi ({\bf{R}},t){|}^{2}\) with a statistical approach uniquely suited for high repetition-rate diffraction facilities. We observe that resolution improves much faster with increasing signal-to-noise ratio (SNR) than with extending the q range beyond moderate values. Unlike the PDF approach, this method retrieves the molecular distances and angles required to define a unique molecular structure.

In our method, we recover the molecular frame through time-dependent ensemble anisotropy20,27,28,29,30,31,32. One rotates into the molecular frame with the lab frame Euler angles \({\theta }_{{\rm{I}}}^{({\rm{lf}})}\) (polar), \({\phi }_{{\rm{I}}}^{({\rm{lf}})}\) (azimuthal), and \({\chi }_{{\rm{I}}}^{({\rm{lf}})}\) (Fig. 1a). An induced rotational wavepacket creates ensemble anisotropy given by \(|\Psi ({\theta }^{({\rm{lf}})},{\phi }^{({\rm{lf}})},t){|}^{2}\). Axis distribution moments (ADMs)29,33,34 are the coefficients in the Wigner D matrix expansion of \(|\Psi ({\theta }^{({\rm{lf}})},{\phi }^{({\rm{lf}})},t){|}^{2}\):

$${{{\mathcal{A}}}}_{mk}^{l}(t)=\frac{2l+1}{8{\pi }^{2}} \big\langle \Psi (t)\vert {D}_{mk}^{l}\big({\phi }_{{{\rm{I}}}}^{({{{\rm{lf}}}})},{\theta }_{{{{\rm{I}}}}}^{({{{\rm{lf}}}})},{\chi }_{{{{\rm{I}}}}}^{({{{\rm{lf}}}})}\big)\vert \Psi (t)\big\rangle.$$
(1)

These ADMs (Eq. (1)) describe the ensemble of molecular frame orientations with respect to the lab frame. When calculating the ADMs, the indices l, m, and k are differences and sums of quantum numbers between rotational eigenstates for, respectively, the total angular momentum, its projection onto the lab frame z-axis, and its projection onto the molecular frame z-axis. These ADMs transform the lab frame into the molecular frame by decomposing the measured lab frame anisotropy into \({C}_{lmk}(q)\) coefficients, which depend on the molecular frame pairwise distances and angles \(\Delta {{\bf{R}}}_{\mu \nu }={{\bf{R}}}_{\mu }-{{\bf{R}}}_{\nu }=[\Delta {R}_{\mu \nu },{\theta }_{\mu \nu }^{({\rm{mf}})},{\phi }_{\mu \nu }^{({\rm{mf}})}]\) (Fig. 1b). The PDF is not directly sensitive to these angles. For an impulsively aligned molecular ensemble, Fig. 2 illustrates how transient anisotropy (panels b and c) provides constraints on these Euler angles and consequently the molecular frame (panels d–g). For example, at 39.25 ps the anisotropy provides simultaneous constraints on \({\theta }_{{\rm{I}}}^{({\rm{lf}})}\) and \({\chi }_{{\rm{I}}}^{({\rm{lf}})}\). At 39.68 ps, \({\chi }_{{\rm{I}}}^{({\rm{lf}})}\) (the molecular frame azimuthal plane) is highly constrained. At 39.85 ps the ensemble is well localized in \({\theta }_{{\rm{I}}}^{({\rm{lf}})}\), resolving measurements along the molecular frame z-axis. Here, \(P({\phi }_{{\rm{I}}}^{({\rm{lf}})})\) is uniform due to the cylindrical symmetry imparted by a linearly polarized pulse.

Fig. 1: Correspondence between the lab and molecular frame.

Our analysis considers each pairwise distance independently and we define the origin of both the lab and molecular frames by one of the pairwise vectors. For the highlighted NO bond, the nitrogen atom (blue) defines the origin. a The lab frame is defined by the laser polarization \((\hat{{{{{{{{\bf{z}}}}}}}}})\) and propagation direction \((\hat{{{{{{{{\bf{y}}}}}}}}})\). b The molecular frame is defined by the molecule’s rovibronic ground state principal moments of inertia, where the molecular A, B, and C axes define \({\hat{{{{{{{{\bf{z}}}}}}}}}}^{({{{{{{{\rm{mf}}}}}}}})}\), \({\hat{{{{{{{{\bf{y}}}}}}}}}}^{({{{{{{{\rm{mf}}}}}}}})}\), and \({\hat{{{{{{{{\bf{x}}}}}}}}}}^{({{{{{{{\rm{mf}}}}}}}})}\). Here the NO is described by ΔRμν, \({\theta }_{\mu \nu }^{({{\mbox{mf}}})}\), and \({\phi }_{\mu \nu }^{({{\mbox{mf}}})}\) which correspond to its distance, polar angle, and azimuthal angle respectively. One accesses the molecular frame by rotating the lab frame by the Euler angles \({\theta }_{{{{{{{{\rm{I}}}}}}}}}^{({{{{{{{\rm{lf}}}}}}}})}\), \({\phi }_{{{{{{{{\rm{I}}}}}}}}}^{({{{{{{{\rm{lf}}}}}}}})}\), and \({\chi }_{{{{{{{{\rm{I}}}}}}}}}^{({{{{{{{\rm{lf}}}}}}}})}\).

Fig. 2: Axis distribution moments and ensemble anisotropy.

The Axis distribution moments (ADMs) encapsulate the ensemble anisotropy which provides various constraints on the molecular frame as a function of time. The ADMs are parameterized by the three angular momentum quantum numbers l, m, and k which correspond to the total angular momentum, the projection along the lab frame \((\hat{{{{{{{{\bf{z}}}}}}}}})\) axis, and projection along the molecular frame \((\hat{{{{{{{{\bf{z}}}}}}}}})\) axis respectively. a We show the square norms of the ADMs and (b, c) highlight the time dependence of these normalized ADMs. d, e We show the time-dependent ensemble anisotropy probability distributions for \({\theta }_{{{{{{{{\rm{I}}}}}}}}}^{({{{{{{{\rm{lf}}}}}}}})}\) and \({\chi }_{{{{{{{{\rm{I}}}}}}}}}^{({{{{{{{\rm{lf}}}}}}}})}\), respectively. f, g We show illustrative line-outs of these Euler angle distributions for \({\theta }_{{{{{{{{\rm{I}}}}}}}}}^{({{{{{{{\rm{lf}}}}}}}})}\) and \({\chi }_{{{{{{{{\rm{I}}}}}}}}}^{({{{{{{{\rm{lf}}}}}}}})}\), respectively, with isotropy indicated by the dashed lines.

To effectively invert the molecular diffraction pattern and approximate \(|\Psi ({\bf{R}},t){|}^{2}\), we use Bayesian inference. Bayesian inference describes a class of statistical inference techniques that use Bayes' theorem to update one’s model based on observed data35. We first approximate \(|\Psi ({\bf{R}},t){|}^{2}\) as the probability distribution \(P({\bf{R}},t| {\bf{\Theta }},C)\), which is parameterized by the molecular structure degrees of freedom Θ. Using Bayesian inference, we then relate \(P({\bf{\Theta }}| C)\) to the measured molecular diffraction pattern. With this framework, we use Markov-chain Monte Carlo (MCMC) techniques to build \(P({\bf{\Theta }}| C)\) and tackle the curse of dimensionality by efficiently sampling the structures most consistent with the measured \({C}_{lmk}(q)\). This method is unbiased and naturally avoids regions in our sampling space that are inconsistent with the \({C}_{lmk}(q)\). We retrieve \(P({\bf{R}},t| {\bf{\Theta }},C)\) with neither the PDF nor complex molecular dynamics simulations because we analytically relate the molecular frame pairwise distances and angles to the \({C}_{lmk}(q)\). Further intuition is provided in Supplementary Note 4 and Ref. 36.

This method has fewer simulation requirements than approaches built on complex molecular dynamics simulations. In its simplest form, when probing structural dynamics, it only requires the much more tractable simulation of the rovibronic ground state structure to define the molecular frame. When measuring the equilibrium vibronic ground state, one does not require a priori knowledge of the structure one wishes to find, because each sampled structure defines a new molecular frame. When using anisotropy components, we require time-dependent rotational simulations for the ADMs. These require rotational constants and the molecular polarizability, all of which can be measured or calculated from the rovibronic ground state structure. When applying this method to excited states, we require the transition dipole, which can also be measured or calculated from the rovibronic ground state structure. As discussed later, depending on the desired accuracy, one must select a functional form for \(P({\bf{R}},t| {\bf{\Theta }},C)\) based on a priori knowledge of the excitation or use normal distributions as a “first-order” approximation.

In this manuscript, we validate these principles by retrieving \(|\Psi ({\bf{R}}){|}^{2}\) for the vibronic ground states of both simulated NO2 and measured N2O rotational wavepackets. Here NO2, an asymmetric top, serves as a test case to show our method’s broad capabilities and behavior under various experimental conditions. Furthermore, we validate these capabilities with measured N2O data from the ultrafast MeV electron diffraction facility at SLAC (UED). We specifically chose these molecules to be amenable to conventional methods, since triatomics do not suffer significantly from the curse of dimensionality. In this lower dimensional realm, we benchmark and validate our method against conventional methods, with the intention of later expanding to larger molecules. In the following, all simulations and equations correspond to ultrafast electron diffraction experiments but are easily extended to x-ray diffraction.

In this work, we rigorously and qualitatively describe this method in addition to quantitatively benchmarking both its advantages and shortcomings. We provide intuition and mathematically describe how induced anisotropy accesses the molecular frame structural angles (\({\theta }_{\mu \nu }^{({\rm{mf}})}\) and \({\phi }_{\mu \nu }^{({\rm{mf}})}\)) and how to retrieve this molecular frame structure using Bayesian inference. We evaluate this method on simulated and measured data, showing how \(P({\bf{R}}| {\bf{\Theta }},C)\) significantly improves upon the traditional Fourier limited PDF. Firstly, \(P({\bf{R}}| {\bf{\Theta }},C)\) unambiguously defines a unique molecular mean structure without complex molecular dynamics simulations. This is generally not possible from the PDF alone. Secondly, we report pairwise distance resolutions of order 10 mÅ and down to 0.1 mÅ from measured and simulated data, respectively. These resolutions are respectively a factor of 50 and 1000 times smaller than their corresponding PDF resolutions. Thirdly, we investigate this method’s behaviors and systematic errors as a function of experimental factors and analysis choices. We find that this procedure depends more strongly on signal-to-noise than on the extent of the measured momentum transfer range. Fourthly, we demonstrate how this method expands ultrafast gas-phase diffraction experiments to quantitatively measure additional parameters, such as the width of \(|\Psi ({\bf{R}},t){|}^{2}\). Lastly, we describe how one can apply this method to excited state dynamics. With these advancements, this method has the potential to expand ultrafast gas-phase diffraction into a more discovery-oriented technique, one that is free of complex excited state simulation limitations and is applicable to currently inaccessible molecular systems.

Results

Both the simulated NO2 and measured N2O diffraction patterns are from the SLAC UED facility6. Elastic electron diffraction is primarily sensitive to the nuclei. Diffraction from electronic transients occurs within the low q regions, which are removed because of experimental conditions. We use the independent atom approximation because we are primarily concerned with the nuclear structure. Given an anisotropic distribution of molecules, the measured diffraction pattern \({\langle I({\bf{q}},t)\rangle }_{{\rm{rigid}}}\) (derived in Supplementary Note 2) is given by

$$\langle I({{{{{\mathbf{q}}}}}},t) \rangle_{{{{{{\rm{rigid}}}}}}} = \, {{{{{\mathcal{I}}}}}} \Bigg(\mathop{\sum}\limits_\mu |f_\mu(q)|^2 + \mathop{\sum}\limits_{\mu,\nu : \mu \neq \nu} {{{{{\rm{Re}}}}}} \Bigg\{ f_\mu(q) f^*_\nu(q) \mathop{\sum}\limits_l \frac{32 \pi^3 i^l}{2l+1} \\ \times \mathop{\sum}\limits_{m, k} (-1)^{k} \underbrace{Y^{m}_l \left(\theta^{({{{{{\rm{lf}}}}}})}_q, \phi^{({{{{{\rm{lf}}}}}})}_q \right)}_{{{{{{\rm{Lab}}}}}}\,{{{{{\rm{Frame}}}}}}} \langle{{{\Psi}}(0)|\underbrace{j_l(q{{\Delta}} R_{\mu\nu}) Y_l^{-k} \left(\theta_{\mu\nu}^{({{{{{\rm{mf}}}}}})}, \phi_{\mu\nu}^{({{{{{\rm{mf}}}}}})}\right)}_{{{{{{\rm{Molecular}}}}}}\,{{{{{\rm{Frame}}}}}}\,{{{{{\rm{Structure}}}}}}}|{{\Psi}}(0)}\rangle \underbrace{\left. {{{{{{\mathcal{A}}}}}}^{l}_{{m}{k}}(t)}\right|_{{{{{{\rm{rigid}}}}}}}}_{{{{{{\rm{Anisotropy}}}}}}} \Bigg \} \!\!\Bigg)$$
(2)

where \({f}_{\mu }(q)\) is the scattering amplitude of the μth atom, \({j}_{l}(q\Delta {R}_{\mu \nu })\) are the spherical Bessel functions of the first kind, \({\mathcal{I}}\) is the diffraction beam intensity, and the momentum transfer vector is given by \({\bf{q}}=[q,{\theta }_{q}^{({\rm{lf}})},{\phi }_{q}^{({\rm{lf}})}]\). Here we applied the rigid rotor approximation. The simulated and measured diffraction patterns that Eq. (2) describes are shown in Fig. 3 (row a).

Fig. 3: Analysis to access the molecular frame signal.

To access the molecular structure term, in the molecular frame, one must remove the lab frame anisotropy dependence and fit onto the ADMs. For the NO2 simulation (left) and N2O data (right), we illustrate the analysis steps. a One first measures the difference diffraction pattern (\(\Delta \langle I({\bf{q}},t)\rangle\)). b Removing the detector angular dependence, one retrieves time-dependent lab frame anisotropy components \({B}_{l}^{m}(q,t)\). c Removing the time-dependent ensemble anisotropy (ADMs) yields the molecular frame \({M}_{lmk}(q)\) coefficients. All steps are described in the text. We note that in the N2O data (right) the visible data range is restricted by experimental limitations, as indicated by the hatched area.

The Clmk(q) coefficients (Eq. (3)) distill the molecular frame information from Eq. (2)

$$C_{lmk}(q) = \,{{{{{\mathcal{I}}}}}} \, \mathop{\sum}\limits_{\mu,\nu : \mu \neq \nu} {{{{{\rm{Re}}}}}} \Bigg\{ f_\mu(q) f^*_\nu(q) (-1)^{k} \frac{32 \pi^3i^l}{2l+1} \\ \times \langle{ {{{\Psi}}(0)}|\underbrace{j_l(q{{\Delta}} R_{\mu\nu}) Y_l^{-k} \left(\theta_{\mu\nu}^{({{{{{\rm{mf}}}}}})}, \phi_{\mu\nu}^{({{{{{\rm{mf}}}}}})}\right)}_{{{{{{\rm{Molecular}}}}}}\, {{{{{\rm{Frame}}}}}}\, {{{{{\rm{Structure}}}}}}}}|{{{\Psi}}(0)}\rangle \Bigg \}$$
(3)
$$C_{lmk}^{({{{{\rm{calc}}}}})} (q) = \,{{{{{\mathcal{I}}}}}} {\int} \mathop{\sum}\limits_{\mu,\nu : \mu \neq \nu} {{{{{\rm{Re}}}}}} \Bigg\{ f_\mu(q) f^*_\nu(q) (-1)^{k} \frac{32 \pi^3i^l}{2l+1} \\ \times \underbrace{j_{l}\left(q{{\Delta}} R_{\mu\nu}\right) Y_{l}^{-k} \left(\theta_{\mu\nu}^{({{{{{\rm{mf}}}}}})}, \phi_{\mu\nu}^{({{{{{\rm{mf}}}}}})}\right)}_{{{{{{\rm{Molecular}}}}}}\, {{{{{\rm{Frame}}}}}}\, {{{{{\rm{Structure}}}}}}} \Bigg \} \,P({{{{{\mathbf{R}}}}}} | {{{\bf{\Theta}}}},C) \,d{{{{{\mathbf{R}}}}}}$$
(4)
$${M}_{lmk}(q)=\frac{{C}_{lmk}(q)}{{\sum }_{\mu }| {f}_{\mu }(q){| }^{2}}$$
(5)

where Eq. (3) is our measurement and Eq. (4) describes our measurement in terms of the desired Θ parameters. The modified \({M}_{lmk}(q)\) in Eq. (5), which remove the \({q}^{-4}\) dependence, are shown in Fig. 4b for NO2 and in Fig. 3 (row c) for N2O. To obtain these \({C}_{lmk}(q)\) we simulated the ADMs, accounting for centrifugal distortion in the N2O case.
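To make the structure of Eqs. (3)–(5) concrete, the following minimal sketch evaluates the molecular frame term for the delta-function limit of \(P({\bf{R}}| {\bf{\Theta }},C)\), restricted to k = 0 and l = 0 or 2 so that closed forms suffice. Everything here is illustrative: the function names are hypothetical, the scattering amplitudes are replaced by real placeholder constants rather than tabulated values, and the geometry passed in is not taken from the tables in this work.

```python
import math


def spherical_jn(l, x):
    """Spherical Bessel function of the first kind, closed forms for l = 0, 2."""
    if l not in (0, 2):
        raise NotImplementedError("sketch covers l = 0 and l = 2 only")
    if abs(x) < 1e-2:  # series expansion avoids cancellation near x = 0
        return 1.0 - x * x / 6.0 if l == 0 else x * x / 15.0 * (1.0 - x * x / 14.0)
    if l == 0:
        return math.sin(x) / x
    return (3.0 / x**3 - 1.0 / x) * math.sin(x) - 3.0 * math.cos(x) / x**2


def sph_harm_k0(l, theta):
    """Spherical harmonic Y_l^0(theta); real-valued and independent of phi."""
    if l == 0:
        return math.sqrt(1.0 / (4.0 * math.pi))
    if l == 2:
        return math.sqrt(5.0 / (16.0 * math.pi)) * (3.0 * math.cos(theta) ** 2 - 1.0)
    raise NotImplementedError("sketch covers l = 0 and l = 2 only")


def c_l_k0_delta(l, q, ordered_pairs, intensity=1.0):
    """Eq. (3) with k = 0 for a single rigid structure.

    ordered_pairs: iterable of (f_mu * f_nu, delta_R, theta_mf), one entry per
    ordered pair (mu, nu) with mu != nu; the amplitude product is assumed real.
    """
    phase = (-1) ** (l // 2)  # Re{i^l} for even l; (-1)^k = 1 for k = 0
    prefactor = intensity * phase * 32.0 * math.pi**3 / (2 * l + 1)
    return prefactor * sum(
        ff * spherical_jn(l, q * d_r) * sph_harm_k0(l, theta)
        for ff, d_r, theta in ordered_pairs
    )


def m_from_c(c_value, atomic_amplitudes):
    """Eq. (5): divide by the atomic scattering sum_mu |f_mu(q)|^2."""
    return c_value / sum(abs(f) ** 2 for f in atomic_amplitudes)


# Placeholder example: one bond of length 1.2 (angstrom) at 30 degrees to z_mf,
# counted once for each ordering of the pair, with unit amplitudes.
pairs = [(1.0, 1.2, math.radians(30.0)), (1.0, 1.2, math.radians(150.0))]
for q in (2.0, 5.0, 8.0):
    c20 = c_l_k0_delta(2, q, pairs)
    print(q, m_from_c(c20, [1.0, 1.0]))
```

Replacing the single structure with an integral over \(P({\bf{R}}| {\bf{\Theta }},C)\), as in Eq. (4), amounts to averaging `c_l_k0_delta` over sampled structures.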

Fig. 4: Simulated NO2 data at various experimental conditions.

For simulated NO2 we defined a \(|\Psi ({\bf{R}}){|}^{2}\) distribution, from which we calculated the \({C}_{lmk}(q)\) under various experimental conditions. a The simulated NO2 distribution is a multivariate normal distribution that we use to calculate the simulated NO2 responses (\({C}_{lmk}(q)\) and \({M}_{lmk}(q)\)). b The \({M}_{lmk}(q)\), shown for various signal-to-noise ratios (SNR) for the case of an ensemble temperature of 100 K and kick fluence of 1 J/cm2, isolate the molecular frame structure. c The ADMs have two dependencies: pump strength (constant ensemble temperature of 25 K) on the left and temperature (constant pump fluence of 1 J/cm2) on the right.

To determine the functional form of \(P({\bf{R}}| {\bf{\Theta }},C)\), we define our simulated stretched NO2 molecule in the ground vibrational state, and we observe that 99.99% of the N2O molecules occupy the vibrational ground state (Supplementary Note 9). The normal distribution, \({P}^{({\mathcal{N}})}({\bf{R}}| {\bf{\Theta }},C)\) (Eq. (9)), is a good description of both our NO2 and N2O vibronic ground state \(|\Psi ({\bf{R}}){|}^{2}\) distributions, because the ground state eigenfunction of the harmonic oscillator is a normal distribution.

$$P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\approx {\left\vert \Psi \left({{{{{{{\bf{R}}}}}}}}\right)\right\vert }^{2}$$
(6)
$${P}^{(\delta )}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)=\delta \left({{{{{{{{\bf{\Theta }}}}}}}}}^{(\delta )}-{{{{{{{\bf{R}}}}}}}}\right)$$
(7)
$${{{{{{{{\bf{\Theta }}}}}}}}}^{(\delta )}=\left[\left\langle {{{{{{{{\rm{NO}}}}}}}}}^{(1)}\right\rangle ,\left\langle {{{{{{{{\rm{NO}}}}}}}}}^{(2)}\right\rangle ,\left\langle \angle {{{{{{{\rm{ONO}}}}}}}}\right\rangle \right]$$
(8)
$${P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)= \frac{1}{{\sqrt{2\pi }}^{{N}_{{{{{{{{\rm{dof}}}}}}}}}}\mathop{\prod }\nolimits_{i = 0}^{i < {N}_{{{{{{{{\rm{dof}}}}}}}}}}{{{{{{{{\bf{\Theta }}}}}}}}}_{2i+1}^{({{{{{{{\mathcal{N}}}}}}}})}} \\ \times\exp \left\{\frac{-1}{2}\mathop{\sum }\limits_{i = 0}^{i < {N}_{{{{{{{{\rm{dof}}}}}}}}}}{\left(\frac{{{{{{{{{\bf{\Theta }}}}}}}}}_{2i}^{({{{{{{{\mathcal{N}}}}}}}})}-{{{{{{{{\bf{R}}}}}}}}}_{i}}{{{{{{{{{\bf{\Theta }}}}}}}}}_{2i+1}^{({{{{{{{\mathcal{N}}}}}}}})}}\right)}^{2}\right\}$$
(9)
$${{{{{{{{\bf{\Theta }}}}}}}}}^{({{{{{{{\mathcal{N}}}}}}}})}=\left[\left\langle {{{{{{{{\rm{NO}}}}}}}}}^{(1)}\right\rangle ,\sigma \left({{{{{{{{\rm{NO}}}}}}}}}^{(1)}\right),\left\langle {{{{{{{{\rm{NO}}}}}}}}}^{(2)}\right\rangle ,\sigma \left({{{{{{{{\rm{NO}}}}}}}}}^{(2)}\right), \left\langle \angle {{{{{{{\rm{ONO}}}}}}}}\right\rangle ,\sigma \left(\angle {{{{{{{\rm{ONO}}}}}}}}\right)\right]$$
(10)

The Θ parameters are the means and standard deviations of pairwise distances and angles that parameterize this multivariate normal distribution (Eq. (10)).
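As an illustration of Eqs. (9) and (10), the sketch below evaluates the normal-distribution form with the interleaved parameter layout Θ = [mean, σ, mean, σ, ...]. The function name `normal_structure_pdf` and the numerical values are hypothetical placeholders, not the values reported in Tables 1 and 2.

```python
import math


def normal_structure_pdf(structure, theta):
    """Eq. (9): product of independent normal distributions.

    structure: the N_dof coordinates R_i (e.g. two bond lengths and an angle).
    theta: interleaved parameters; theta[2i] is the mean and theta[2i + 1]
           the standard deviation of coordinate i, as in Eq. (10).
    """
    n_dof = len(structure)
    if len(theta) != 2 * n_dof:
        raise ValueError("theta must hold one (mean, sigma) pair per coordinate")
    means, sigmas = theta[0::2], theta[1::2]
    if any(s <= 0 for s in sigmas):
        raise ValueError("standard deviations must be positive")
    norm = math.sqrt(2.0 * math.pi) ** n_dof * math.prod(sigmas)
    chi2 = sum(((mu - r) / s) ** 2 for mu, r, s in zip(means, structure, sigmas))
    return math.exp(-0.5 * chi2) / norm


# Placeholder parameters: two bond lengths (angstrom) and one angle (radian).
theta_example = [1.20, 0.04, 1.20, 0.04, 2.34, 0.10]
print(normal_structure_pdf([1.20, 1.20, 2.34], theta_example))  # density at the mean
print(normal_structure_pdf([1.28, 1.20, 2.34], theta_example))  # two sigma away in one bond
```

Because the covariance is diagonal in this form, any correlations that appear in the retrieved posterior over Θ are not built into the model itself.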

Using the statistical nature of our experiment, we effectively invert Eq. (4) for the probability distribution of Θ parameters given the measured Clmk(q) (\(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\)). Here, \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) is the posterior distribution that is proportional to the likelihood function

$$P\left(C| {{{{{{{\bf{\Theta }}}}}}}}\right)= \left[\mathop{\prod}\limits_{lmk,q}\frac{1}{{\sigma }_{lmk}(q)\sqrt{2\pi }}\right]\\ \times \left\{\exp \frac{-1}{2}\mathop{\sum}\limits_{lmk,q}{\left(\frac{{C}_{lmk}(q)-{C}_{lmk}^{({{{{{{{\rm{calc}}}}}}}})}(q,{{{{{{{\bf{\Theta }}}}}}}})}{{\sigma }_{lmk}(q)}\right)}^{2}\right\}$$
(11)

which is the probability of observing the measured Clmk(q) coefficients assuming the Θ parameters are true. In Eq. (11) the σlmk(q) are the standard error of the mean from the distribution of measured Clmk(q) coefficients. Applying an MCMC technique to Eq. (11), we build \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) and find the optimal Θ parameters (Θ*) for \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\) as described in the Methods section.
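The likelihood in Eq. (11) and its use inside an MCMC loop can be sketched as follows. This is a toy illustration under stated assumptions, not the sampler described in the Methods: it uses a plain random-walk Metropolis algorithm, a flat prior with arbitrary bounds, a noise-free synthetic measurement, and a hypothetical one-parameter forward model (`toy_forward`, a single pair distance entering through the l = 0 term sin(qR)/(qR)).

```python
import math
import random


def log_likelihood(c_meas, c_calc, sigma):
    """Logarithm of Eq. (11), with all (lmk, q) entries flattened into lists."""
    total = 0.0
    for cm, cc, s in zip(c_meas, c_calc, sigma):
        total += -0.5 * ((cm - cc) / s) ** 2 - math.log(s * math.sqrt(2.0 * math.pi))
    return total


def toy_forward(theta, q_values):
    """Hypothetical forward model: one pair distance, l = 0 Bessel term."""
    (distance,) = theta
    return [math.sin(q * distance) / (q * distance) for q in q_values]


def metropolis(log_post, theta0, step_sizes, n_steps, rng):
    """Random-walk Metropolis sampler; returns the visited parameter vectors."""
    theta = list(theta0)
    lp = log_post(theta)
    chain = []
    for _ in range(n_steps):
        proposal = [t + rng.gauss(0.0, s) for t, s in zip(theta, step_sizes)]
        lp_new = log_post(proposal)
        # Accept with probability min(1, exp(lp_new - lp)).
        if lp_new >= lp or rng.random() < math.exp(lp_new - lp):
            theta, lp = proposal, lp_new
        chain.append(list(theta))
    return chain


def run_demo(seed=0):
    rng = random.Random(seed)
    q_values = [0.5 + 0.25 * i for i in range(39)]  # 0.5 to 10 inverse angstrom
    sigma = [0.02] * len(q_values)                  # placeholder uncertainties
    c_meas = toy_forward([1.2], q_values)           # noise-free synthetic "data"

    def log_post(theta):
        if not 0.5 < theta[0] < 3.0:                # flat prior bounds (assumed)
            return -math.inf
        return log_likelihood(c_meas, toy_forward(theta, q_values), sigma)

    chain = metropolis(log_post, [1.25], [0.005], 5000, rng)
    samples = [t[0] for t in chain[1000:]]          # discard burn-in
    mean = sum(samples) / len(samples)
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / len(samples))
    return mean, std


mean, std = run_demo()
print(f"posterior mean = {mean:.4f}, posterior width = {std:.4f}")
```

The histogram of the retained samples plays the role of \(P({\bf{\Theta }}| C)\); with more parameters the same loop applies, although a more efficient sampler would normally be preferred.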

In our main result, we illustrate our method’s efficacy by retrieving Θ*and consequently \({P}^{({{{{{{{\mathcal{N}}}}}}}})}({{{{{{{\bf{R}}}}}}}}| {{{{{{{{\bf{\Theta }}}}}}}}}^{* },C)\) from both simulated NO2 and measured N2O Clmk(q) coefficients. We further investigate our method’s behavior and sensitivity to varying experimental conditions for the simulated NO2 system. Finally, we show how our Bayesian inference method significantly improves real-space resolution.

Molecular structure distribution retrieval

To retrieve \({P}^{({\mathcal{N}})}({\bf{R}}| {{\bf{\Theta }}}^{* },C)\) for simulated NO2 and measured N2O data we first built the posteriors \({P}^{({\mathcal{N}})}({\bf{\Theta }}| C)\) shown in Fig. 5a and c, respectively. Figure 5b and d show \({P}^{({\mathcal{N}})}({\bf{R}}| {{\bf{\Theta }}}^{* },C)\) for NO2 and the simulated PDF for N2O, respectively. Tables 1 and 2 give the extracted Θ* (the most probable Θ parameters) and σΘ for NO2 and N2O, respectively. For the NO2 simulation, the SNR is 400. For NO2, the resolution (σΘ) of \({P}^{({\mathcal{N}})}({\bf{\Theta }}| C)\) for the nuclear distances and angles is ~0.5 mÅ, and the distribution fully encompasses the ground truth values. Despite the largely flat \(\langle \angle {\rm{ONO}}\rangle\) distribution, Θ* still converges on the ground truth values. For N2O data, the retrieved \({P}^{({\mathcal{N}})}({\bf{\Theta }}| C)\) encompasses the previously measured results of the vibronic ground state37,38. The resolution of this distribution is of order 10 mÅ even with our limited q range of [3.5, 7.25] Å−1 and the very poor SNR. Moreover, the retrieved \(\langle \angle {\rm{NNO}}\rangle\) is π and we resolve the ~50 mÅ difference between the \({{\rm{N}}}^{{\rm{T}}}{{\rm{N}}}^{{\rm{C}}}\) and \({{\rm{N}}}^{{\rm{C}}}{\rm{O}}\) bond distances (Table 2). The retrieved widths \(\sigma ({{\rm{N}}}^{{\rm{T}}}{{\rm{N}}}^{{\rm{C}}})\) and \(\sigma (\angle {\rm{NNO}})\) are unphysical due to the limited q range, as discussed later. Compared to the PDF (Fig. 5d), with a ~2 Å Fourier resolution, this method improves resolution by a factor of 50. In the PDF, the missing low and high q components produce ringing artifacts in the inverse Fourier transform because of the incomplete Fourier space. This confuses the PDF results, as they are not positive definite and falsely indicate population at large distances.
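The ringing described above can be reproduced with a deliberately simplified sketch. It assumes a single rigid pair at a placeholder distance of 1.2 Å, unit scattering amplitudes, no damping window, and a PDF defined as the sine transform of q·sin(qR)/(qR) over the truncated q range; this simplified definition and the function name `truncated_pdf` are assumptions made for illustration, not the PDF procedure used for Fig. 5d.

```python
import math


def truncated_pdf(r_values, pair_distance, q_min, q_max, n_q=400):
    """Sine transform of a single-pair signal over [q_min, q_max] (trapezoid rule)."""
    dq = (q_max - q_min) / n_q
    pdf = []
    for r in r_values:
        total = 0.0
        for i in range(n_q + 1):
            q = q_min + i * dq
            weight = 0.5 if i in (0, n_q) else 1.0  # trapezoid end-point weights
            signal = math.sin(q * pair_distance) / (q * pair_distance)
            total += weight * q * signal * math.sin(q * r)
        pdf.append(total * dq)
    return pdf


r_values = [0.5 + 0.01 * i for i in range(351)]  # 0.5 to 4.0 angstrom
pdf = truncated_pdf(r_values, pair_distance=1.2, q_min=3.5, q_max=7.25)
peak_r = r_values[pdf.index(max(pdf))]
print(f"largest peak near r = {peak_r:.2f} angstrom")
print(f"most negative PDF value = {min(pdf):.2f}")
```

Even for this single distance the truncated transform oscillates around zero away from the main peak, so negative values and spurious side lobes appear at distances where no atom pair exists.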

Fig. 5: Retrieving \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\), \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\), and the molecular structure parameters.

We successfully retrieve the multivariate posterior \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) for NO2 and N2O from which we find Θ*. From \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\), we observe the distribution of Θ parameters: the mean and standard deviations of the pairwise distances and angles that define \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\). a We show the 1d and 2d projections of \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) distributions for the simulated NO2 response. b The recovered \({P}^{({{{{{{{\mathcal{N}}}}}}}})}({{{{{{{\bf{R}}}}}}}}| {{{{{{{{\bf{\Theta }}}}}}}}}^{* },C)\) is what we compare to the simulated Ψ(R)2 in Fig. 4a. The red dashed lines indicate the retrieved mode (Θ*), while the black “X" and solid black lines indicate the ground truth, respectively. c We show the 1d and 2d projections \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) distributions for N2O data, though only using the C200(q) contribution. The black “X" and solid black lines indicate previously measured values for N2O37,38. d For comparison, we provide the simulated Pairwise Distribution Function (PDF) from the same q range.

Table 1 Retrieved molecular frame structure parameters for simulated NO2.
Table 2 Retrieved molecular frame structure parameters for measured N2O.

We observe (Fig. 5a and c) that Θ* does not correspond to the mean or mode of most 1-dimensional projections of \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\). This is due to the nonlinearity and correlations of \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) in Θ space. This illustrates the importance of finding Θ* in this correlated space since the structure parameters are indeed correlated.

Exploring experimental effects and systematics

The measured q range is a critical component of gas-phase ultrafast diffraction, determining the information content and the PDF’s resolution. When expanding this range (Figs. 6a and 7), we observe resolution (σΘ) improvements only until ~8 Å−1, after which the resolution plateaus. This indicates that beyond a modest q range our method is not very sensitive to further increases. The false correlations between Θ parameters (Fig. 7e), however, continue to decline as we increase this range. The plotted correlation in Fig. 7e is between all 6 Θ parameters. The correlations seen in Fig. 7a and c are termed false correlations since the simulated \(|\Psi ({\bf{R}}){|}^{2}\) is a multivariate normal distribution with a diagonal covariance matrix. Increasing the measured reciprocal range q provides more information about the system and reduces these correlations, as seen in Fig. 7a, c, and e. Regardless of the sampled q ranges, we still retrieve the ground truth Θ parameters and consequently \({P}^{({\mathcal{N}})}({\bf{R}}| {\bf{\Theta }},C)\) (Fig. 7b and d).

Fig. 6: The effects of various experimental parameters on \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\).

Varying experimental parameters affect the resolution (width) of \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\), but our method is most sensitive to the measured signal-to-noise ratio (SNR). The SNR is the geometric mean of C000(q)/σ000(q) between 0.5 < q < 4 Å−1. The uncorrelated widths of \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\), denoted by σΘ, change with increasing the (a) q range and (b) the SNR. The axis distribution moments, and consequently σΘ, are dependent on (c) pump fluence (ensemble temperature of 25 K) and (d) the molecular ensemble temperature (pump fluence of 1 J/cm2).

Fig. 7: Effects of varying the measured q range on \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\).

Varying the measured q range affects false correlations in \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) for NO2; a larger reciprocal space provides more information and dampens false correlations. a We show the 1d and 2d projections of \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) and (b) the corresponding \({P}^{({{{{{{{\mathcal{N}}}}}}}})}({{{{{{{\bf{R}}}}}}}}| {{{{{{{{\bf{\Theta }}}}}}}}}^{* },C)\) for a limited q range of [0.5, 5] Å−1. The red dashed lines illustrate Θ*, while the black “X" and solid lines indicate the ground truth values. Similarly, we show (c) the 1d and 2d projections of \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) (d) the corresponding \({P}^{({{{{{{{\mathcal{N}}}}}}}})}({{{{{{{\bf{R}}}}}}}}| {{{{{{{{\bf{\Theta }}}}}}}}}^{* },C)\) for the broader q range of [0.5, 20] Å−1. e We highlight the correlation between all Θ parameters as a function of q-range. We note the decrease in correlations with larger q, which is further illustrated by the decreasing widths and correlations in the 1d and 2d \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) projections with higher q.

When varying the SNR, Fig. 6b, σΘ rapidly decreases with increasing SNR. Increasing SNR by an order of magnitude decreases σΘ by an order of magnitude for pairwise distances and angles. This strong and continuous dependence indicates that our method is sensitive to SNR due to our statistical interpretation. Although \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) becomes more peaked, the general shape from the correlations does not change since higher SNR improves resolution but does not add more information, in terms of the q range.
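The SNR used in Fig. 6 is defined in its caption as the geometric mean of C000(q)/σ000(q) for 0.5 < q < 4 Å−1. A minimal sketch of that definition follows; the function name `snr_geometric_mean` is hypothetical, and taking the absolute value of C000(q) so that the logarithm is defined is an assumption of this sketch.

```python
import math


def snr_geometric_mean(q_values, c000, sigma000, q_lo=0.5, q_hi=4.0):
    """Geometric mean of |C000(q)| / sigma000(q) over q_lo < q < q_hi."""
    ratios = [
        abs(c) / s
        for q, c, s in zip(q_values, c000, sigma000)
        if q_lo < q < q_hi
    ]
    if not ratios:
        raise ValueError("no q points fall inside the requested window")
    if min(ratios) <= 0.0:
        raise ValueError("geometric mean requires nonzero signal at every q point")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Placeholder values: only the three points with 0.5 < q < 4 contribute.
q_values = [0.3, 1.0, 2.0, 3.0, 5.0]
c000 = [9.0, 10.0, 100.0, 1000.0, 7.0]
sigma000 = [1.0, 1.0, 1.0, 1.0, 1.0]
print(snr_geometric_mean(q_values, c000, sigma000))
```

A geometric mean weights every q point equally on a logarithmic scale, so a few very strong low-q points cannot dominate the quoted SNR.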

Increasing the induced rotational coherence and lowering the ensemble temperature rapidly improve resolution (Fig. 6c and d), similar to increasing SNR. In Fig. 6c, the gas was at 25 K while varying the rotational coherence. In Fig. 6d, the pump fluence was 1 J/cm2 while varying the ensemble temperature. Increasing the rotational coherence and decreasing the temperature increase the magnitude and complexity of the ADMs (Fig. 4c). This is because higher average pump fluences induce larger rotational coherence, and lowering the ensemble temperature diminishes the spread of initial rotational states that incoherently interfere. The result is an increase in signal, a larger SNR, and consequently continuous behavior similar to that in Fig. 6b.

Generally, when varying the q range, SNR levels, pump fluence, and ensemble temperature we find the pairwise distances’ σΘ to be of order 1 mÅ; for the width parameters, σΘ is of order 10 mÅ. Our retrieved Θ* values are generally within a relative error of ~10−7 and ~10−3 from the ground truth values for structural and width parameters, respectively. This resolution is often ~100 times better than PDF-based methods because our statistical treatment is highly sensitive to SNR.

Aside from experimental parameters, we investigate systematics induced by incorrectly selecting the functional form of \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\). We assert the simulated NO2 vibronic ground state Ψ(R)2 distribution is a multivariate normal distribution (Fig. 4a). We evaluate both \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) (Eq. (9)) and \({P}^{(\delta )}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) (Eq. (7)) on this simulation, and in Fig. 8 we compare their 1d projections as a function of q range. The \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) distribution consistently encompasses the correct values, but the \({P}^{(\delta )}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) distribution fails to do so for q ranges of [0.5,7.5], [0.5,10], and [0.5,12.5] Å−1. These q ranges are shown in Fig. 8c. This is because \({P}^{(\delta )}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\) assumes a single molecule response can describe a signal averaged over an ensemble of structures. With increasing q ranges, \({P}^{(\delta )}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) converges in an unstable fashion on the ground truth (Fig. 8b), unlike the smooth convergence in \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) (Fig. 8a). We note that for NO2, retrieving \({P}^{(\delta )}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) is ~100 times faster than \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\), which respectively take of order 10 s to 1 min and 1 h to 1 day on 10 CPUs. This is because \({P}^{(\delta )}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\) does not have to sum over structures in Eq. (4). Supplementary Note 7 and Ref. 36 provide plots and further discussion of these results.

Fig. 8: Systematic errors from selecting incorrect Ψ(R)2 distributions.

The \({P}^{(\delta )}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) distribution suffers from a q-dependent systematic error stemming from the false assumption that a single structure describes the results measured from an ensemble. Here we show the 1d projections of (a, b) \({P}^{(\delta )}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) as a function of (c) the measured q range and a signal-to-noise ratio (SNR) of 400. Each column indicates a different q range starting at 0.5 Å−1 with the end of said q range indicated by the rightmost border of that column. The dashed lines are the ground truth values. c We used the same simulated C200(q) coefficient for both posteriors. Its intersections with the black lines indicate the upper q range of each column.

Effects of Bayesian inference

Our method retrieves the labeled pairwise distances with ~100 times better resolution than the PDF. This is due to our statistical treatment using Bayesian inference where each lmk and q contribution is itself an independent probability distribution; each is an experiment of its own. The Metropolis-Hastings algorithm’s (MHA) discrimination power grows exponentially with more Clmk(q), which increases the magnitude of the negative exponent in the relative ratio of likelihood functions P(C|Θ) (Eq. (11)). Our method therefore heavily relies on σlmk(q) and Clmk(q) (seen in Fig. 6b). Statistical noise increases σlmk(q), making \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) wider (Fig. 6b), while systematic errors in Clmk(q) shift the centroid of \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) (Fig. 5c). Supplementary Note 5 describes our method for consistently accounting for both statistical and systematic errors. The PDF error adds in quadrature in σlmk(q); its scale is set by the largest error bar and disproportionately suffers from poorly measured data points. Conversely, MHA amplifies the contribution of high precision measurements while reducing contributions from poorly measured data points by weighting each term in the likelihood by 1/σlmk(q) (Eq. (11)).
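
The 1/σ weighting and the acceptance step can be sketched on a toy problem. The example below is an illustration only and not the code from this work. Eq. (11) is not reproduced here, so the sketch assumes a Gaussian likelihood with independent q points, a flat prior, and a hypothetical one-parameter model C(q) = sin(qR)/(qR). All function and variable names are placeholders.

```python
import math
import random

def j0(x):
    """Zeroth-order spherical Bessel function, sin(x)/x."""
    return math.sin(x) / x if x != 0.0 else 1.0

def log_likelihood(R, q_values, data, sigma):
    """Assumed Gaussian log-likelihood; each residual is weighted by 1/sigma."""
    return -0.5 * sum(((d - j0(q * R)) / s) ** 2
                      for q, d, s in zip(q_values, data, sigma))

def metropolis_hastings(q_values, data, sigma, R_start, step, n_steps, rng):
    """Random-walk Metropolis sampler over a single distance R (flat prior)."""
    samples = []
    R = R_start
    logp = log_likelihood(R, q_values, data, sigma)
    for _ in range(n_steps):
        R_new = R + rng.gauss(0.0, step)
        if R_new > 0.0:  # proposals with non-physical distances are rejected
            logp_new = log_likelihood(R_new, q_values, data, sigma)
            # Accept with probability min(1, L_new / L_old).
            if rng.random() < math.exp(min(0.0, logp_new - logp)):
                R, logp = R_new, logp_new
        samples.append(R)
    return samples

rng = random.Random(0)
q_values = [0.5 + 0.1 * i for i in range(96)]  # 0.5 to 10 inverse Angstrom
R_true = 1.2                                   # hypothetical distance in Angstrom
sigma = [0.01] * len(q_values)
data = [j0(q * R_true) + rng.gauss(0.0, s) for q, s in zip(q_values, sigma)]

samples = metropolis_hastings(q_values, data, sigma, R_start=1.25,
                              step=0.002, n_steps=5000, rng=rng)
kept = samples[1000:]  # discard burn-in
mean_R = sum(kept) / len(kept)
std_R = math.sqrt(sum((r - mean_R) ** 2 for r in kept) / len(kept))
print(mean_R, std_R)
```

Because the log-likelihood ratio is a sum over q points, every additional well-measured point adds to the exponent that suppresses poorly fitting proposals.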

Our Bayesian inference approach expands the utility of gas-phase ultrafast diffraction to measure previously inaccessible variables. Given \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\) is a generic function parameterized by Θ, one can introduce variables through Θ by selecting a \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\) that depends on them. Here, we expanded the measurable parameters of gas-phase ultrafast diffraction to include the width of Ψ(R)2 in \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\), shown in Fig. 5 and given in Table 1. Depending on one’s system and desired accuracy, a priori knowledge is needed to select the form of \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\), e.g. harmonic oscillator eigenstates for vibrational excited states. Outside of the vibronic ground state, \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\) is a “first-order" measurement of the Ψ(R)2 width. It also reduces the systematic effects of assuming a single structure (\({P}^{(\delta )}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\)) as illustrated in Fig. 8. This was the case for our measured N2O data where our q range of [3.5, 7.25] Å−1 is insufficient to resolve the width of \(| {\Psi }^{({{{{{{{{\rm{N}}}}}}}}}_{2}{{{{{{{\rm{O}}}}}}}})}({{{{{{{\bf{R}}}}}}}}){| }^{2}\). Therefore, the widths become nuisance parameters used to avoid these systematic errors. Still, \({P}^{(\delta )}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) is accurate on the 10 mÅ scale and runs ~100 times faster than \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\). 
Therefore, \({P}^{(\delta )}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\) serves as an intermediate test analysis before switching to the normal or any other distribution. For very large molecules with many degrees of freedom, \({P}^{(\delta )}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\) may be the only tractable method.

The MHA performs an unbiased search through Θ space guided by the Clmk(q) coefficients and correlates each Θ parameter. Our method is model independent and does not suffer from model bias as might be a concern for conventional methods. Limited q range artificially introduces correlations between Θ parameters. Since Θ is the minimal set of parameters to define \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\), we expect the parameters to be uncorrelated. Figure 7 shows how adding information by extending the q range decreases false correlations. For the N2O data, we observe these false correlations between \(\left\langle {{{{{{{{\rm{N}}}}}}}}}^{{{{{{{{\rm{T}}}}}}}}}{{{{{{{{\rm{N}}}}}}}}}^{{{{{{{{\rm{C}}}}}}}}}\right\rangle\) and \(\left\langle {{{{{{{{\rm{N}}}}}}}}}^{{{{{{{{\rm{C}}}}}}}}}{{{{{{{\rm{O}}}}}}}}\right\rangle\) (Fig. 5c). Simultaneously evaluating all Θ parameters leverages well-resolved parameters to constrain poorly resolved parameters. For example, the long OO bond (or ONO) in our asymmetric NO2 is the best constrained parameter as it produces the most q oscillations. The MHA removes structures where the two NO distances are inconsistent with the well-resolved OO distance. These correlations similarly help find Θ*, as observed with N2O, where the \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) uncorrelated widths do not distinguish the NTNC and NCO bonds but Θ* does.

The width of \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) (σΘ) depends more strongly on SNR than on the q range (Fig. 6b), which is ideal since it is generally prohibitively difficult to change the q range at ultrafast diffraction facilities and easier to increase the SNR by taking more measurements39. This is because smaller σlmk(q) makes it less probable for the MHA to visit Θ parameters with larger residuals. For the PDF, the resolution is 2π/q, or 1.26, 0.63, and 0.31 Å for q ranges of 5, 10, and 20 Å−1 respectively, which is roughly 100 to 1000 times larger than our observed resolution for simulated NO2 at typical to high SNR, respectively. For the measured N2O data with a very poor SNR and 0.04 Å resolution, we observe a 50X improvement over the 1.7 Å Fourier resolution. This agrees with our simulated results that have more than a factor of 2 better SNR and indicates we may observe these 100–1000X improvements in future measurements. Our method, therefore, lends itself well to high repetition-rate machines, such as the upcoming LCLS II. We note that increasing the q range above 8 Å−1 has a larger effect on the width parameters (Fig. 6a).
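
The Fourier-limited numbers quoted above follow from simple arithmetic. The short sketch below assumes the resolution is 2π divided by the quoted q extent and, for the N2O case, takes that extent to be the width of the measured window [3.5, 7.25] Å−1. The function name `fourier_resolution` is illustrative.

```python
import math

def fourier_resolution(q_extent):
    """Real-space resolution 2*pi/q_extent in Angstrom (q_extent in 1/Angstrom)."""
    return 2.0 * math.pi / q_extent

# Simulated NO2 cases quoted in the text.
for q_extent in (5.0, 10.0, 20.0):
    print(f"q extent {q_extent:4.1f} 1/A: {fourier_resolution(q_extent):.2f} A")

# Measured N2O window, assuming the extent is its width.
n2o_resolution = fourier_resolution(7.25 - 3.5)
print(f"N2O window: {n2o_resolution:.1f} A")
```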

Discussion

In the following, we describe how this method approximates Ψ(R)2 while significantly improving upon real-space resolution. We first provide intuition for how induced anisotropy accesses the molecular frame structural angles \({\theta }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})}\) and \({\phi }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})}\). We then provide a brief intuitive discussion, which complements the Methods section, of how our Bayesian inference approach inverts 〈I(q, t)〉 for Θ while improving upon resolution. Finally, we introduce methods to evaluate excited electronic state dynamics.

To provide intuition for the distinct angular terms, we condense and label the reference frames from Eq. (2)

$$\langle{I(q)}\rangle_{{{{{{\rm{mol}}}}}}} = \cdots \mathop{\sum}\limits_{m, k} (-1)^{k} \underbrace{Y^{m}_l \left(\theta^{({{{{{\rm{lf}}}}}})}_q, \phi^{({{{{{\rm{lf}}}}}})}_q \right)}_{\substack{{{{{{\rm{Lab}}}}}}\,{{{{{\rm{Frame}}}}}}}} \underbrace{{{{{{\mathcal{A}}}}}}^{l}_{{m}{k}}(t)|_{{{{{\rm{rigid}}}}}}}_{\substack{{{{{{\rm{Anisotropy}}}}}}}} \\ \times \Big \langle{{\Psi}(t) \Big| \underbrace{j_l(q {{\Delta}} R_{\mu\nu}) Y_l^{-k} \left(\theta_{\mu\nu}^{({{{{{\rm{mf}}}}}})}, \phi_{\mu\nu}^{({{{{{\rm{mf}}}}}})}\right)}_{\substack{{{{{{\rm{Molecular}}}}}}\,{{{{{\rm{Frame}}}}}}\,{{{{{\rm{Structure}}}}}}}}}\Big|{{{\Psi}}(t)}\Big \rangle.$$
(12)

Equation (12) highlights the anisotropic contributions at each level of this method. The molecular frame structure component separates into pairwise distance (jl(qΔRμν)) and angular \(({Y}_{l}^{-k}({\theta }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})},{\phi }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})}))\) terms. The former governs the q dependence and the latter is the angular decomposition of the molecular structure which acts as a scaling parameter. The ensemble anisotropy \({{{{{{{{\mathcal{A}}}}}}}}}_{mk}^{l}(t)\) acts as a key from the measured lab frame anisotropy (\({Y}_{l}^{m}({\theta }_{q}^{({{{{{{{\rm{lf}}}}}}}})},{\phi }_{q}^{({{{{{{{\rm{lf}}}}}}}})})\)) to the molecular frame structure by coupling these two reference frames. Similar derivations40,41,42 exist but do not stress the dependence on the 3d molecular frame coordinates; Ref. 40 is not treated fully quantum mechanically as done here in Supplementary Note 2. Anisotropy is required for our method to have an explicit dependence on the pairwise angles. Without anisotropy, C000(q) has no explicit angular dependence (Eq. (3)), just like the PDF.
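
The separation into a distance term and an angular scaling term can be made concrete with a small numerical sketch. It evaluates only the molecular frame factor j_l(qΔR)Y_l^{-k}(θ, φ) for a single fixed atom pair, restricted to k = 0 and l = 0 or 2, before any expectation value is taken. The closed forms for j_0, j_2, Y_0^0, and Y_2^0 are standard; the function names and the example values of q and ΔR are placeholders.

```python
import math

def j0(x):
    """Spherical Bessel function of order 0."""
    return math.sin(x) / x

def j2(x):
    """Spherical Bessel function of order 2."""
    return (3.0 / x**3 - 1.0 / x) * math.sin(x) - 3.0 * math.cos(x) / x**2

def Y00():
    """Spherical harmonic Y_0^0, a constant."""
    return 1.0 / (2.0 * math.sqrt(math.pi))

def Y20(theta):
    """Spherical harmonic Y_2^0 as a function of the polar angle theta."""
    return math.sqrt(5.0 / (16.0 * math.pi)) * (3.0 * math.cos(theta) ** 2 - 1.0)

def structure_term(l, q, dR, theta):
    """j_l(q dR) * Y_l^0(theta) for l = 0 or l = 2 (k = 0 only)."""
    if l == 0:
        return j0(q * dR) * Y00()
    if l == 2:
        return j2(q * dR) * Y20(theta)
    raise ValueError("only l = 0 and l = 2 are implemented in this sketch")

q, dR = 3.0, 1.2  # hypothetical values in 1/Angstrom and Angstrom
for theta in (0.0, math.pi / 2):
    print(theta, structure_term(0, q, dR, theta), structure_term(2, q, dR, theta))
```

The l = 0 term is identical for both orientations, while the l = 2 term changes sign and magnitude with θ, which is the explicit angular dependence that anisotropy provides.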

Stronger impulsive alignment produces a broader coherent rotational wavepacket which exhibits higher amplitude signals with more variations (Fig. 4c). Larger amplitude ADMs improve Clmk(q) SNR by lifting higher order coefficients up out of the noise, resulting in similar resolution improvements to only increasing SNR, shown in Fig. 6c. Increasing the number of Clmk(q) coefficients improves the \({\theta }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})}\) and \({\phi }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})}\) resolution since each Clmk(q) provides a new angular constraint via \({Y}_{l}^{-k}({\theta }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})},{\phi }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})})\) (Eq. (3)).

One can produce fast signal variations with an initially broad hot thermal ensemble. Writing coherence onto hotter molecular ensembles produces weak but fast varying ADMs, shown in Fig. 4c. Figure 6d shows how quickly the resolution worsens at higher temperatures. When fitting the ADMs to \({B}_{l}^{m}(q,t)\), one ideally measures time points that cover two separate regions where the ADMs have high variation, as well as times sufficiently before and after the prominent anisotropy signal, where their magnitude dampens. One need not strictly measure the entire transient rotational signal.

To simulate the ADMs one will need to measure the rotational constants or calculate them from the vibronic ground state structure. Measured constants remove structural biases potentially induced by calculating these coefficients from a simulated or presumed structure and decouple the rotational signal from the MHA sampling. When simulating or inducing molecular tumbling is prohibitively difficult, one may use the induced anisotropy from the dipole alignment of the initial photo-excitation. This method can be made more general as our Bayesian inference approach does not require anisotropy and is applicable to the traditionally used isotropic component.

With Bayesian inference, we use data to effectively invert 〈I(q, t)〉 for Θ. We use the Clmk(q) coefficients to independently constrain \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\), from which we find Θ* to parameterize P(R|Θ*, C). The P(R|Θ*, C) distribution, which approximates Ψ(R)2, provides the most probable (and unique) molecular structure. Traditionally, the PDF, being the inverse Fourier transform of qM000(q), is at best a weighted histogram of unlabeled pairwise distances from which one generally cannot obtain a unique structure. Since our measurements necessarily exclude q all the way to 0, and the strong signal drop-off limits high q measurements, our q range is always limited. These limitations obfuscate the PDF interpretations by introducing sinusoidal systematics that result in negative probabilities, e.g. in Fig. 5d where we do not expect any distance above 2.3 Å. Therefore, we typically simulate Ψ(R)2 with a priori knowledge and validate simulation against the measured PDF. Our method instead uncovers the globally optimal parameters (Θ*) from the data for a given \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\). This requires only the initial vibronic ground state structure, simulations of the coherent rotational wavepacket when using Clmk(q) for l > 0, and for excited state dynamics one additionally needs relevant transition dipole moments. As made clear by comparing Fig. 5b and d, the P(R|Θ*, C) distribution is significantly more information-rich than the PDF, e.g. it provides the 3d molecular structure and width of the Ψ(R)2. This method thus has the potential to shift ultrafast diffraction to a discovery method applicable even to systems that extend beyond the scope of theory.

We find that building \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) to later find its mode (Θ*) and its resolution (σΘ) is more informative and robust than using a gradient-based optimization routine to find Θ* and its precision. In either case, an optimization routine is used to find Θ*, but given \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) our method starts near the global minimum and is more robust to local minima. If either routine finds a local minimum, one can avoid reporting misleading results by citing the resolution of \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) (σΘ) as its error. Since σΘ is the standard deviation of all Θ parameters consistent with the data, it is a conservative estimate that very likely encompasses the global minimum. The precision, used by an optimization routine, is determined by the loss landscape around Θ* and is unaware of the entire Θ distribution. The \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) distribution can also inform the experimentalist which values are best measured, which ones are correlated, and potentially how to improve the experimental apparatus through the false correlations and widths in Figs. 5a, c and 7a, c. One does so by varying experimental parameters, in simulation, to determine how isolated and resolved Θ parameters become.

Our method is broadly applicable to diffraction experiments with laser excitation, including dynamics from excited electronic states. Laser excitation imparts one or more units of angular momentum providing at least C20k(q). From low SNR N2O data we see the C200(q) alone recovers ~40 mÅ resolution. The primary difficulty with extending our method to excited state dynamics lies in isolating the ADMs in rovibronically coupled systems at sufficiently long timescales. Since the principal moments of inertia change with the structure, one must reorient the altered excited state structure by adding three molecular frame Euler angles to the Θ parameters (Supplementary Note 2). The generally much wider excited state Ψ(R, t)2 dampens Clmk(q) coefficients and reduces the need for extended q. We discuss two variants to isolate the ADMs, a time-separable method and an isotropic method.

The time-separable method introduces a separation of time scales by assuming the ADMs are relatively stationary during the vibronic motion. This approximation is analogous to the Born-Oppenheimer approximation. For a single excitation pulse, the dipole selection rule introduces ensemble anisotropy independent of the difficulty to create a rotational wavepacket:

$${\langle I({{{{{{{\bf{q}}}}}}}})\rangle }_{{{{{{{{\rm{sep}}}}}}}}}^{(1)}(t)\approx \, {{{{{{{\mathcal{I}}}}}}}}\left(\mathop{\sum}\limits_{\mu }| {f}_{\mu }(q){| }^{2}+\mathop{\sum}\limits_{\mu ,\nu :\mu \ne \nu }{{{{{{{\rm{Re}}}}}}}}\left\{{f}_{\mu }(q){f}_{\nu }^{* }(q)\mathop{\sum}\limits_{l}\frac{32{\pi }^{3}{i}^{l}}{2l+1}\mathop{\sum}\limits_{{m}_{1},{m}_{2}}{(-1)}^{{m}_{1}}{Y}_{l}^{{m}_{2}}\left({\theta }_{q}^{({{{{{{{\rm{lf}}}}}}}})},{\phi }_{q}^{({{{{{{{\rm{lf}}}}}}}})}\right)\right.\right.\\ \left.\left.\times \mathop{\sum}\limits_{n,{n}^{{\prime} }}{\tilde{{{{{{{{\mathcal{A}}}}}}}}}}_{{m}_{2}{m}_{1}}^{(1)l}(n,{n}^{{\prime} })\left\langle {\psi }_{{{{{{{{\rm{el-vib}}}}}}}}}^{{n}^{{\prime} }}(t) \right| {j}_{l}(q{{\Delta }}{R}_{\mu \nu }){Y}_{l}^{-{m}_{1}}\left({\theta }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})},{\phi }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})}\right) \left| {\psi }_{{{{{{{{\rm{el-vib}}}}}}}}}^{n}(t) \right\rangle\right\}\right).$$
(13)

Here, \({\tilde{{{{{{{{\mathcal{A}}}}}}}}}}_{{m}_{2}{m}_{1}}^{(1)l}(n,{n}^{{\prime} })\) are the ADMs calculated with the rovibronic ground state structure, the ground rovibronic transition dipole, and evaluated immediately after laser excitation. This requires knowledge of either the transition dipole moment or the Franck-Condon factor and the electronic transition dipole moment.

To further constrain \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\), one can couple to more Clmk(q) coefficients by introducing a precursor pulse that excites a rotational wavepacket. This precursor pulse, assumed to be a rotational Raman impulse, is chosen to have a negligible effect on the vibronic system thus maintaining consistency with our separation of timescale approximation. The Raman impulse first induces rotational coherence. Following the Raman impulse, the system evolves for a rotational time τ, at which point the vibronic excitation pulse arrives. One would measure the vibronic dynamics over a small window (t ≪ τ). This is repeated for different orientations by scanning the delay τ over an appreciable portion of the rotational evolution. This window, measured by t, is typically of order picosecond or less such that the ADMs do not appreciably change. The measured diffraction images are given by

$${\langle I({{{{{{{\bf{q}}}}}}}})\rangle }_{{{{{{{{\rm{sep}}}}}}}}}^{(2)}(t,\tau )\approx \, {{{{{{{\mathcal{I}}}}}}}}\left(\mathop{\sum}\limits_{\mu }| {f}_{\mu }(q){| }^{2}+\mathop{\sum}\limits_{\mu ,\nu :\mu \ne \nu }{{{{{{{\rm{Re}}}}}}}}\left\{{f}_{\mu }(q){f}_{\nu }^{* }(q)\mathop{\sum }\limits_{l}\frac{32{\pi }^{3}{i}^{l}}{2l+1}\mathop{\sum}\limits_{{m}_{1},{m}_{2}}{(-1)}^{{m}_{1}}{Y}_{l}^{{m}_{2}}\left({\theta }_{q}^{({{{{{{{\rm{lf}}}}}}}})},{\phi }_{q}^{({{{{{{{\rm{lf}}}}}}}})}\right)\right.\right.\\ \left.\left.\times \mathop{\sum}\limits_{n,{n}^{{\prime} }}{\widetilde{{{{{\mathcal{A}}}}}}}_{{m}_{2}{m}_{1}}^{(2)l}(n,{n}^{{\prime} };\tau )\left\langle {\psi }_{{{{{{{{\rm{el-vib}}}}}}}}}^{{n}^{{\prime} }}(t) \left |{j}_{l}(q{{\Delta }}{R}_{\mu \nu }){Y}_{l}^{-{m}_{1}}\left({\theta }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})},{\phi }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})}\right) \right|{\psi }_{{{{{{{{\rm{el-vib}}}}}}}}}^{n}(t)\right\rangle\right\}\right).$$
(14)

where n labels the vibronic states, \(\left\vert {\psi }_{{{{{{{{\rm{el-vib}}}}}}}}}^{n}(t)\right\rangle\) is the vibronic state (assumed unknown), \({{{{{\widetilde{{{{{\mathcal{A}}}}}}}}}}}_{mk}^{(2)l}(n,{n}^{{\prime} };\tau )\) are the modified ADMs, and t is the arrival time of the probe after the second excitation pulse. These modified ADMs consider the angular momentum transfer by the vibronic excitation photon and require the vibronic transition dipole moments of the corresponding vibronic state. One then follows the above analysis procedure for each time t. In such an experiment, one should measure the ensemble anisotropy without the vibronic excitation pulse to find the best-fit ADMs. Supplementary Note 2 further describes our separation of timescale approximation and provides the derivations for Eqs. (13) and (14).

The isotropic method uses only the C000(q, t) term, similar to conventional analyses. Since \({\widetilde{{{{{\mathcal{A}}}}}}}_{00}^{(\alpha )0}(n,n^{\prime};t,\tau )\) becomes a constant absorbed by \({{{{{{{\mathcal{I}}}}}}}}\), this method can be applied to single (Eq. (13)) and double pulse (Eq. (14)) experiments. The C000(q, t) term only implicitly depends on the pairwise angles through ΔRμν. This is in contrast to the explicit pairwise angle dependence in the higher order Clmk(q) terms. Our statistical treatment likely provides adequate pairwise angle resolution because we have more pairwise distances than are required to specify a unique structure.

For a Raman-inducing precursor pulse, one will likely use a combination of the isotropic and time-separable methods. For fast dynamics, one would use the time-separable method for small windows shortly following the rotation time τ. Longer-lived dynamics can be retrieved by the isotropic method. When retrieving \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\), in either case, one initiates the MHA with the vibronic ground state Θ* parameters at the first temporal measurement. For each subsequent time step one initiates MHA with the Θ* parameters from the previous time step.

Electronic and vibrational excited state wavepackets bifurcate into multiple states, e.g. at conical intersections, causing P(R, t|Θ*, C) to bifurcate as well. We account for these different states by

$$P({{{{{{{\bf{R}}}}}}}},t| {{{{{{{{\bf{\Theta }}}}}}}}}^{* },C)=\mathop{\sum }\limits_{i}^{{N}_{{{{{{{{\rm{ex}}}}}}}}}}{c}_{i}P({{{{{{{\bf{R}}}}}}}},t| {{{{{{{{\bf{\Theta }}}}}}}}}_{i}^{* },C)$$
(15)

where Nex is the number of excited state distributions with appreciable population. Conical intersections will induce bifurcations that spawn a new distribution that adds to Nex. In this way we consider this method to be fully data-driven since we can change our theoretical description (ci) based on data alone.
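
A minimal numerical sketch of the sum in Eq. (15) is given below for a single pairwise distance. It assumes, purely for illustration, that each component is a one-dimensional normal distribution and that the populations c_i sum to 1 so that the mixture stays normalized. The component values and function names are hypothetical.

```python
import math

def normal_pdf(r, mean, width):
    """One-dimensional normal density, standing in for one component."""
    return math.exp(-0.5 * ((r - mean) / width) ** 2) / (width * math.sqrt(2.0 * math.pi))

def mixture_pdf(r, components):
    """Population-weighted sum over components given as (c_i, mean_i, width_i)."""
    total_weight = sum(c for c, _, _ in components)
    if not math.isclose(total_weight, 1.0):
        raise ValueError("populations c_i are assumed to sum to 1")
    return sum(c * normal_pdf(r, m, w) for c, m, w in components)

# Two hypothetical excited state distributions after a bifurcation.
components = [(0.7, 1.2, 0.05), (0.3, 1.6, 0.10)]

# Riemann-sum check that the mixture remains normalized on [0.5, 2.5] Angstrom.
step = 0.001
area = sum(mixture_pdf(0.5 + step * i, components) * step for i in range(2001))
print(area)
```

Adding a new entry to `components` corresponds to incrementing Nex when the data call for another populated state.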

Thus far we have only considered diffraction consistent with the independent atom approximation and all the equations above have been derived under this approximation. Recently, diffraction beyond the independent atom approximation has been observed in both electron43 and x-ray diffraction44. Under such conditions, this method must be modified by either re-deriving the above equations to consider these effects or by accounting for this signal in the Clmk(q) coefficients. For MeV electron diffraction, inelastic scattering is limited to the low q < 1 Å−1 region and can be easily removed from the Clmk(q) coefficients. For x-ray diffraction beyond the independent atom approximation, contributions from excited Rydberg states create a constant offset after the initial signal turn-on that spans the entire q-range24,44. Due to the diffuse nature of the Rydberg state this signal does not vary appreciably in time and can be subtracted out.

Conclusion

We have shown that our method can approximate Ψ(R)2 with P(R|Θ*, C) for the vibronic ground states of NO2 and N2O. In simulation, we retrieve ~0.5 mÅ resolution for NO2. From measured N2O UED data, we retrieve ~40 mÅ resolution despite a short q range of [3.5, 7.25] Å−1 and very poor SNR. Compared to PDF-based methods, this returns the labeled pairwise distances and angles with 50 and 100–1000 times better resolution in measurement and simulation respectively. In spite of similar bond distances and atomic scattering amplitudes for NO2 and N2O, our method distinguishes these distances. We begin to resolve the 〈NTNC〉 and 〈NCO〉 distances in our low SNR and narrow q range UED measurement. These results are highly encouraging and illustrate the viability of our Bayesian inference approach. They also inspire further expansion into excited state dynamics. The code repository45 contains the algorithms used for this work and instructions on how to reproduce these results. It also contains instructions on how to run this analysis and templates for applying this method to new molecules.

This Bayesian inference approach is best suited for gas-phase ultrafast diffraction instruments that have high SNR such as high repetition-rate free-electron laser facilities, e.g. LCLS-II-HE. Resolution improves considerably faster with increasing SNR than with extending the q range beyond ~8 Å−1. Nevertheless, larger q ranges improve resolution for widths of Ψ(R)2 and diminish false correlations between Θ parameters.

Our general method has the potential to become commonplace for ultrafast gas-phase diffraction measurements due to its broad applicability and its independence from complex excited state simulations. In this work, we validated its use for standard pump-probe setups. One can extend this method to excited state dynamics either with or without anisotropy. Our isotropic method is well suited for current pump-probe setups that generally focus on the isotropic component. This method greatly benefits from deterministic anisotropy that can either be induced by impulsive Raman or by the dipole moment selection from the excitation pulse. Beyond ultrafast gas-phase diffraction, one can apply this general framework to other classes of experiments, e.g. the previously mentioned photo-electron experiments27,28,29,31,32. This is done by deriving the molecular frame response (Eq. (2)) and applying this Bayesian inference approach.

Given its broad applicability, high resolution, amenability to various measurements, and independence from complex molecular dynamics simulations, our method has the potential to effectively turn ultrafast gas-phase molecular diffraction into a discovery-oriented technique. This method can retrieve a unique molecular structure distribution for general molecules with 10 mÅ resolution. Moreover, because our method is parameterized by Θ, we have the opportunity to expand the scope of ultrafast gas-phase diffraction into previously inaccessible measurements. For instance, we demonstrated the use of this parameterization to measure the width of Ψ(R, t)2; this width is important in the excited state where single structures lose their meaning. This method unlocks our ability to study larger and more complex systems that are currently too difficult to simulate.

Methods

Our method can be subdivided into three principal concepts. Firstly, we use ensemble anisotropy, described by the ADMs, to access the molecular frame by projecting the data onto anisotropic components. Secondly, we select a model, \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\), to approximate Ψ(R)2 and develop our statistical approach to solve for Θ using Bayesian inference. That is, through the statistical nature of our measurement we use Bayesian inference to effectively invert the diffraction signal for Θ. Lastly, we take our statistical description and use MCMC techniques to solve for \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) to retrieve the optimal Θ parameters (Θ*). The code used for this analysis45 can be run to reproduce the following results or adapted for other molecules.

Extracting molecular frame information

We describe our analysis procedure for a system given an induced deterministic ensemble anisotropy under experimental conditions at the SLAC MeV ultrafast electron diffraction facility (UED)6. Our generic pump-probe setup is similar to most ultrafast diffraction setups, consisting of an 800 nm Ti:Sapphire pump laser and a 120 fs FWHM electron bunch probe. For the simulated NO2 results, we consider using a single 10 TW/cm2 800 nm pump pulse to impulsively induce a coherent rotational wave packet and probing it within a window of high anisotropy variation: [37.5, 41.5] ps. For the measured N2O sample, a train of 8 identical 800 nm pulses (40 fs duration and \(5\times 1{0}^{12}\) W/cm2 irradiance) separated by full quantum revivals induced such a rotational wavepacket46. We measured the first field free full quantum revival over a window of ~3 ps. We masked q regions [0, 3.5] Å−1 and above 7.25 Å−1 due to ellipticity in the imaging of the diffraction pattern and poor signal-to-noise, respectively. Linearly polarized pump pulses induce azimuthal symmetry, which sets m = 0 in Eq. (1) (P\(({\phi }_{{{{{{{{\rm{I}}}}}}}}}^{({{{{{{{\rm{lf}}}}}}}})},t)=1/2\pi\)), while the Raman excitation of the wavepacket requires l being even in Eq. (1).

We define anisotropy in two equivalent ways and quantify it through the ADMs. Firstly, anisotropy is defined by a non-zero projection of the measured diffraction pattern onto any \({Y}_{l}^{m}\) with even l > 0 for a given Δq range. Secondly, anisotropy exists when there is a non-zero \({{{{{{{{\mathcal{A}}}}}}}}}_{mk}^{l}(t)\) for l > 0. To calculate the ADMs, one must know the rotational constants (A, B, C) and ideally the centrifugal distortion (D) constants, as well as the differential polarizability, which can be calculated from the known ground state structure or measured from Raman spectroscopy. For N2O, we used the measured rotational constants47,48 to model the rotational wavepacket for the fitted ensemble temperature and laser intensity described in Supplementary Note 1. We note other methodologies to calculate the ADMs33,34,49. Supplementary Note 1 describes both our calculation of the ADMs and our search for the best-fit ADMs.

We access the molecular pairwise distances and angles in the molecular frame. Using the ADMs and the Independent Atom Approximation, we relate measured lab frame anisotropy in diffraction patterns, 〈I(q, t)〉, to the molecular structure

$$ \langle I({{{{{\mathbf{q}}}}}},t) \rangle = {{{{{\mathcal{I}}}}}} \Bigg(\mathop{\sum}\limits_\mu |f_\mu(q)|^2 + \mathop{\sum}\limits_{\mu,\nu : \mu \neq \nu} {{{{{\rm{Re}}}}}} \Bigg\{ f_\mu(q) f^*_\nu(q) \mathop{\sum}\limits_l 4 \pi i^l \mathop{\sum}\limits_{m, k} (-1)^{k} \underbrace{Y^{m}_l \left(\theta^{({{{{{\rm{lf}}}}}})}_q, \phi^{({{{{{\rm{lf}}}}}})}_q \right)}_{\substack{{{{{{\rm{Lab}}}}}}\,{{{{{\rm{Frame}}}}}}}} \langle{{\Psi}(t)|\underbrace{D^l_{m k} \left(\phi_{{{{{{\rm{I}}}}}}}^{({{{{{\rm{lf}}}}}})}, \theta_{{{{{{\rm{I}}}}}}}^{({{{{{\rm{lf}}}}}})}, \chi_{{{{{{\rm{I}}}}}}}^{({{{{{\rm{lf}}}}}})} \right)}_{\substack{{{{{{\rm{Ensemble}}}}}}\,{{{{{\rm{Anisotropy}}}}}}}} \underbrace{j_l(q {{\Delta}} R_{\mu\nu}) Y_l^{-k} \left(\theta_{\mu\nu}^{({{{{{\rm{mf}}}}}})}, \phi_{\mu\nu}^{({{{{{\rm{mf}}}}}})}\right)}_{\substack{{{{{{\rm{Molecular}}}}}}\,{{{{{\rm{Frame}}}}}}\,{{{{{\rm{Structure}}}}}}}}}|{{{\Psi}}(t)}\rangle \Bigg\} \Bigg)$$
(16)

In Eq. (16), derived in Supplementary Note 2, fμ(q) is the scattering amplitude of the μth atom, jl(qΔRμν) are the spherical Bessel functions of the first kind, \({{{{{{{\mathcal{I}}}}}}}}\) is the diffraction beam intensity, and the momentum transfer vector is given by \({{{{{{{\bf{q}}}}}}}}=[q,{\theta }_{q}^{({{{{{{{\rm{lf}}}}}}}})},{\phi }_{q}^{({{{{{{{\rm{lf}}}}}}}})}]\). The difference vector \({{\Delta }}{{{{{{{{\bf{R}}}}}}}}}_{\mu \nu }={{{{{{{{\bf{R}}}}}}}}}_{\mu }-{{{{{{{{\bf{R}}}}}}}}}_{\nu }=[{{\Delta }}{R}_{\mu \nu },{\theta }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})},{\phi }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})}]\) is the molecular frame pairwise distance and angles between the μth and νth atoms, illustrated in Fig. 1b. Equation (16) shows how the ensemble anisotropy connects the lab frame to the molecular frame structure. Directly accessing the molecular frame pairwise angles \(({\theta }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})},{\phi }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})})\) requires anisotropy and is otherwise inaccessible through the PDF and isotropic contributions alone. This is evident by isolating the isotropic component (l = 0, m = 0, k = 0) which sets \({Y}_{0}^{0}({\theta }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})},{\phi }_{\mu \nu }^{({{{{{{{\rm{mf}}}}}}}})})=1/(2\sqrt{\pi })\).
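The role of anisotropy in Eq. (16) can be illustrated with a minimal numerical sketch of the molecular frame structure term \(j_l(q\Delta R_{\mu\nu})Y_l^{-k}\) for k = 0. The function names and the values of q and ΔR below are illustrative assumptions, not part of the analysis code used in this work; only the closed forms of j0, j2, Y00, and Y20 are standard.

```python
import math

def j0(x):
    """Spherical Bessel function of the first kind, order 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def j2(x):
    """Spherical Bessel function of the first kind, order 2 (closed form, x > 0)."""
    return (3.0 / x**3 - 1.0 / x) * math.sin(x) - 3.0 * math.cos(x) / x**2

def y00(theta):
    """Y_0^0: a constant, independent of the pairwise angle."""
    return 1.0 / (2.0 * math.sqrt(math.pi))

def y20(theta):
    """Y_2^0 as a function of the polar angle theta."""
    return math.sqrt(5.0 / (16.0 * math.pi)) * (3.0 * math.cos(theta) ** 2 - 1.0)

def structure_term(l, q, delta_r, theta_mf):
    """j_l(q * dR) * Y_l^0(theta_mf) for l = 0 or l = 2 (k = 0 only)."""
    if l == 0:
        return j0(q * delta_r) * y00(theta_mf)
    if l == 2:
        return j2(q * delta_r) * y20(theta_mf)
    raise ValueError("this sketch only covers l = 0 and l = 2")

# Illustrative values (assumed): q in 1/Angstrom, delta_r in Angstrom.
q, delta_r = 3.0, 1.2
iso = [structure_term(0, q, delta_r, th) for th in (0.3, 1.2)]
aniso = [structure_term(2, q, delta_r, th) for th in (0.3, 1.2)]
```

In this sketch the l = 0 term is identical for both pairwise angles, whereas the l = 2 term changes with the angle, which is why the isotropic signal alone cannot constrain the molecular frame angles.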

For our method, we describe optimal representations of the lab and molecular frames used in Eq. (16). The molecular frame is defined by the molecule’s principal moments of inertia before laser excitation, with \(\hat{\mathbf{z}}^{(\mathrm{mf})}\), \(\hat{\mathbf{x}}^{(\mathrm{mf})}\), and \(\hat{\mathbf{y}}^{(\mathrm{mf})}\) corresponding to the principal moments of inertia in decreasing order: A, B, and C, respectively. This necessitates knowledge of the rovibronic ground state structure when one is measuring an excited rovibronic structure. When looking at the ΔRμν contribution, we isolate the μth and νth atoms while ignoring other atoms and translate the atom pair such that Rν defines the origin. This is highlighted in Fig. 1b where the nitrogen is translated to the origin. This translation allows us to define the pairwise angles and derive Eq. (16). Since we are concerned with a difference in locations ΔRμν, Eq. (16) is invariant under such molecular frame translations. In the lab frame, the laser polarization defines \(\hat{\mathbf{z}}^{(\mathrm{lf})}\) and the propagation direction of the probe pulse defines \(\hat{\mathbf{y}}^{(\mathrm{lf})}\). The measured signals in the lab frame, on a 2D detector, are defined by the detector parameters \(q=|\mathbf{q}|\) and the azimuthal angle \(\theta^{(\mathrm{d})}\) defined by \(\hat{\mathbf{z}}^{(\mathrm{lf})}\). Supplementary Note 2 describes how to rewrite q in terms of the detector coordinates. For small-angle scattering in UED, \(\theta_{q}^{(\mathrm{lf})}\approx \theta^{(\mathrm{d})}\) and \(\phi_{q}^{(\mathrm{lf})}\approx 0\).

The primary difficulty of working with Eq. (16) comes from the expectation value including both the ensemble anisotropy and molecular frame structure. We want to separate the ensemble anisotropy into the ADMs. This isolates the time-dependent molecular structure term that we would like to retrieve. By doing this, we only require more tractable molecular rotation simulations with respect to the known rovibronic ground state structure in order to retrieve the time-dependent molecular structure. Otherwise, as Eq. (16) is written, it requires a priori knowledge of exactly the unknown time-dependent structures for which we are solving. In this work, we describe various ways to do this under common experimental conditions.

Focusing on the vibronic ground state of NO2, we can separate the ADMs and molecular structure contribution in Eq. (16) by applying a rigid rotor approximation. Equation (2) is the general form, which we adapt to our specific case by setting m = 0 and replacing \({\theta }_{q}^{({{{{{{{\rm{lf}}}}}}}})}\approx {\theta }^{({{{{{{{\rm{d}}}}}}}})}\) and \({\phi }_{q}^{({{{{{{{\rm{lf}}}}}}}})}\approx 0\). The resulting lab frame measurements are shown in Fig. 3a.

To retrieve \(P\left(\mathbf{R}\,|\,\mathbf{\Theta},C\right)\), we first isolate the molecular frame structure terms from Eq. (2) with a series of fits. The first fit removes the initial diffraction beam intensity (\(\mathcal{I}\)), described in Supplementary Note 8. The second fit projects out the measured lab frame anisotropy \(\left(Y_{l}^{m}\left(\theta_{q}^{(\mathrm{lf})},\phi_{q}^{(\mathrm{lf})}\right)\right)\) from Eq. (2) by fitting the angular dependence of the measured diffraction.

$$B^m_{l}(q, t) = {\int}_{0}^{\pi} \Big\langle I \left({{{{{\mathbf{q}}}}}}(q,\theta^{({{{{{\rm{d}}}}}})}),t \right) \Big \rangle_{{{{{{\rm{rigid}}}}}}} Y_l^{m} \left({\theta^{({{{{{\rm{lf}}}}}})}_{q}} \left(q,\theta^{({{{{{\rm{d}}}}}})}\right), {\phi^{({{{{{\rm{lf}}}}}})}_q} \left(q,\theta^{({{{{{\rm{d}}}}}})}\right) \right) \sin \left({\theta^{({{{{{\rm{lf}}}}}})}_{q}}\left(q,\theta^{({{{{{\rm{d}}}}}})}\right) \right) d\theta^{({{{{{\rm{d}}}}}})} \\ = {{{{{\mathcal{I}}}}}} \, \mathop{\sum}\limits_{\mu,\nu : \mu \neq \nu} {{{{{\rm{Re}}}}}} \Bigg\{ f_\mu(q) f^*_\nu(q) \frac{32 \pi^3 i^l}{2l+1} (-1)^{k} \langle{\Psi}(0)|\underbrace{j_{l}{(q {\Delta} R_{\mu\nu})} Y_{l}^{-k} \left(\theta_{\mu\nu}^{({{{{{\rm{mf}}}}}})}, \phi_{\mu\nu}^{({{{{{\rm{mf}}}}}})}\right)}_{{{{\rm{Molecular}}}}\,{{{\rm{Frame}}}}\,{{{\rm{Structure}}}}}|{{{\Psi}}(0)}\rangle \underbrace{\left.{{{{\mathcal{A}}}}}^l_{m k}(t)\right|_{{{{\rm{rigid}}}}}}_{{{{\rm{Anisotropy}}}}} \Bigg\}$$
(17)

This yields the time (t) and q dependent \({B}_{l}^{m}(q,t)\) coefficients shown in Fig. 3b. The third fit isolates the molecular frame information by fitting out the time dependence of \({B}_{l}^{m}(q,t)\) with the simulated ADMs, \({{{{{{{{\mathcal{A}}}}}}}}}_{mk}^{l}(t)\). The resulting coefficients, Clmk(q), relate measured data to the molecular frame pairwise structure. Here, Mlmk(q) are the modified Clmk(q) coefficients that compensate for the rapid q−4 falloff in the electron scattering amplitudes. Figure 3c shows the retrieved Mlmk(q) for both the simulated and measured data. For the N2O data, the poor signal-to-noise precludes all contributions except C200(q). Depending on the data quality and degree of orthogonality in the ADMs, one may need to employ regularization to retrieve physical fit values. Regularization adds a fitting cost to extraneous coefficients, thus minimizing the impact of non-orthogonal ADMs. Supplementary Note 3 provides a further discussion on fitting the ADMs and regularization.
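The third fit, including the effect of regularization, can be sketched as a ridge (L2-penalized) least-squares fit of a time trace to two basis functions. This is a minimal illustration under stated assumptions: the two cosine traces are hypothetical stand-ins for simulated ADMs, the function name `ridge_fit_two` and the penalty value are invented for the example, and the regularization actually used in this work is described in Supplementary Note 3.

```python
import math

def ridge_fit_two(basis_a, basis_b, signal, lam=0.0):
    """Fit signal(t) ~ ca * basis_a(t) + cb * basis_b(t).

    Minimizes sum (signal - ca*a - cb*b)^2 + lam * (ca^2 + cb^2) by solving
    the 2x2 regularized normal equations with Cramer's rule.
    """
    saa = sum(a * a for a in basis_a) + lam
    sbb = sum(b * b for b in basis_b) + lam
    sab = sum(a * b for a, b in zip(basis_a, basis_b))
    sas = sum(a * s for a, s in zip(basis_a, signal))
    sbs = sum(b * s for b, s in zip(basis_b, signal))
    det = saa * sbb - sab * sab
    ca = (sas * sbb - sab * sbs) / det
    cb = (saa * sbs - sab * sas) / det
    return ca, cb

# Hypothetical stand-ins for two simulated ADM time traces (not real ADMs).
times = [0.05 * i for i in range(200)]
adm_a = [math.cos(t) for t in times]
adm_b = [math.cos(2.0 * t) for t in times]

# Synthetic, noiseless B(q, t) at a single q built from known coefficients.
signal = [1.5 * a - 0.4 * b for a, b in zip(adm_a, adm_b)]

c_plain = ridge_fit_two(adm_a, adm_b, signal)           # no regularization
c_ridge = ridge_fit_two(adm_a, adm_b, signal, lam=5.0)  # penalized fit
```

Without the penalty the known coefficients are recovered; with it, the coefficient vector is shrunk toward zero, which is the mechanism that suppresses extraneous coefficients when the basis functions are not orthogonal.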

The standard error of the mean σlmk(q) for each Clmk(q) is calculated from a distribution of measured Clmk(q) coefficients. For the N2O data, Supplementary Note 5 describes the data processing and retrieval of σlmk(q). For the NO2 simulation, we add Poisson noise to the diffraction patterns and propagate that noise through the lab frame anisotropy and ADM fit (see Supplementary Note 5).
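As a small illustration of the quantity σlmk(q), the sketch below computes a standard error of the mean per q bin from several independent estimates of a coefficient. It assumes the sample standard deviation (N − 1 normalization) and uses made-up numbers; the actual procedure for the N2O data is the one given in Supplementary Note 5.

```python
import math
import statistics

def standard_error_of_mean(values):
    """Sample standard deviation divided by sqrt(N)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def sem_per_q(coefficient_sets):
    """coefficient_sets[i][j] is the coefficient from data subset i at q bin j."""
    n_q = len(coefficient_sets[0])
    return [
        standard_error_of_mean([row[j] for row in coefficient_sets])
        for j in range(n_q)
    ]

# Made-up coefficients from five data subsets at two q bins.
subsets = [
    [1.0, 10.0],
    [2.0, 10.5],
    [3.0, 9.5],
    [4.0, 10.0],
    [5.0, 10.0],
]
sigma = sem_per_q(subsets)
```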

Applying Bayesian Inference

We approximate \(|\Psi(\mathbf{R})|^{2}\) with the probability distribution \(P\left(\mathbf{R}\,|\,\mathbf{\Theta},C\right)\), which is parameterized by Θ and conditioned on the observed Clmk(q) coefficients. This requires one to choose a functional form of \(P\left(\mathbf{R}\,|\,\mathbf{\Theta},C\right)\) dependent on the system’s state and the desired degree of accuracy. Depending on the accuracy and precision required of the results, this choice demands varying degrees of a priori knowledge. For example, one may choose a multivariate delta function for a single molecule response, a normal distribution to model the ground vibrational states, or harmonic oscillator eigenfunctions to describe arbitrary individual vibrational states.

The Θ parameters include the 3Natom − 6 structural degrees of freedom (Ndof) needed to define a unique molecular structure, and the width parameters in the case of \({P}^{({{{{{{{\mathcal{N}}}}}}}})}\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\). Here, Θ has the minimal number of parameters needed to define \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\), and adding redundant parameters can significantly alter one’s results.

Having isolated the molecular frame structure terms (Clmk(q)) and chosen \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\), we apply Bayesian Inference to address the diffraction inverse problem35,36,50 by effectively inverting Clmk(q) to approximate \(| \Psi \left({{{{{{{\bf{R}}}}}}}}\right){| }^{2}\). With Bayes rule,

$$P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)=\frac{P\left(C| {{{{{{{\bf{\Theta }}}}}}}}\right)P\left({{{{{{{\bf{\Theta }}}}}}}}\right)}{P(C)}$$
(18)

we use the statistical nature of our measurement to analytically relate the desired Θ parameters to the measured Clmk(q). In Eq. (18), \(P\left(\mathbf{\Theta}|C\right)\) is the posterior distribution we wish to build. The likelihood \(P\left(C|\mathbf{\Theta}\right)\) relates the measured data to the Θ parameters and is the probability of observing Clmk(q) given the parameters Θ. Here, \(C_{lmk}^{(\mathrm{calc})}(q,\mathbf{\Theta})\) are the calculated Clmk(q) coefficients, and σlmk(q) are the standard errors of the means for Clmk(q). The prior, \(P\left(\mathbf{\Theta}\right)\), contains our a priori knowledge of the system and, in this work, is used only to constrain Θ to physical values (e.g., Θ > 0 and ∠ONO < π), because we do not assume any prior knowledge or simulations of the system. Calculating the marginal likelihood P(C) is generally, and in our case, intractable. Further intuition regarding how the statistical nature of our measurement allows us to invert for Θ is described in ref. 36.
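The pieces of Eq. (18) can be sketched as an unnormalized log posterior. This is a minimal illustration under explicit assumptions: the likelihood is taken to be an independent Gaussian in each coefficient with widths σlmk(q), the prior is flat within the physical bounds quoted above, and `toy_forward` is a hypothetical isotropic forward model with unit scattering amplitudes for a bent triatomic. None of these names come from the analysis code of this work.

```python
import math

def j0(x):
    """Spherical Bessel function of the first kind, order 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def toy_forward(theta, q_values):
    """Hypothetical forward model: two N-O pairs plus one O-O pair, l = 0 only."""
    r_no, angle = theta
    r_oo = 2.0 * r_no * math.sin(angle / 2.0)
    return [2.0 * j0(q * r_no) + j0(q * r_oo) for q in q_values]

def log_likelihood(c_meas, c_calc, sigma):
    """Assumed Gaussian likelihood, up to an additive constant."""
    return -0.5 * sum(((m - c) / s) ** 2 for m, c, s in zip(c_meas, c_calc, sigma))

def log_prior(theta):
    """Flat prior on physical values: bond length > 0 and 0 < angle < pi."""
    r_no, angle = theta
    return 0.0 if (r_no > 0.0 and 0.0 < angle < math.pi) else -math.inf

def log_posterior(theta, c_meas, sigma, forward):
    """log P(theta | C) up to the constant -log P(C), which is never computed."""
    lp = log_prior(theta)
    if lp == -math.inf:
        return lp
    return lp + log_likelihood(c_meas, forward(theta), sigma)

# Synthetic, noiseless "measurement" generated from an assumed true structure.
q_values = [0.5 + 0.5 * i for i in range(20)]
truth = (1.2, 2.34)
c_meas = toy_forward(truth, q_values)
sigma = [0.01] * len(q_values)
forward = lambda theta: toy_forward(theta, q_values)
```

Because only differences of this log posterior are needed later, the intractable marginal likelihood P(C) never has to be evaluated.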

Given the functional forms of \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\), \(P\left(C| {{{{{{{\bf{\Theta }}}}}}}}\right)\), and the presumed functional form of \(P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)\), we now find the globally optimal Θ parameters (Θ*) by building \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) and finding its mode. To converge on the mode of \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\), one must use the correlations between the Θ parameters by building \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) in the full Θ-space rather than sampling each parameter individually. Consequently, we must next address the curse of dimensionality.

Solving for the high dimensional model parameters Θ

We retrieve \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) with the Metropolis-Hastings algorithm (MHA) from the following system of equations:

$${C}_{lmk}(q)=\int\,{H}_{lmk}\left(q,{{{{{{{\bf{R}}}}}}}}\right){\left\vert \Psi \left({{{{{{{\bf{R}}}}}}}}\right)\right\vert }^{2}d{{{{{{{\bf{R}}}}}}}}$$
(19)
$${C}_{lmk}^{({{{{{{{\rm{calc}}}}}}}})}(q,{{{{{{{\bf{\Theta }}}}}}}})=\int\,{H}_{lmk}\left(q,{{{{{{{\bf{R}}}}}}}}\right)P\left({{{{{{{\bf{R}}}}}}}}\right\vert \left.{{{{{{{\bf{\Theta }}}}}}}},C\right)d{{{{{{{\bf{R}}}}}}}}$$
(20)
$$H_{lmk} \left(q, {{{{{\mathbf{R}}}}}} \right) = \, {{{{{\mathcal{I}}}}}} {{{{{\rm{Re}}}}}} \left\{ (-1)^{k} \frac{32 \pi^3 i^l}{2l+1} \right. \\ \times \mathop{\sum}\limits_{\mu,\nu : \mu \neq \nu} |f_\mu(q)| |f_\nu(q)| j_l(q{{\Delta}} R_{\mu\nu}) Y_l^{-k} \left(\theta_{\mu\nu}^{({{{{{\rm{mf}}}}}})}, \phi_{\mu\nu}^{({{{{{\rm{mf}}}}}})}\right)\bigg\}.$$
(21)

We note the high dimensionality and complexity of Eq. (20), which is a system of order 10 equations, each with order 100 terms, embedded in an order 100-dimensional space of measurements in q. This must be evaluated on a Θ-dimensional space of all possible molecular structures and width parameters. The MHA is chosen for its ability to retrieve probability distributions from high dimensional integral equations50,51 like Eq. (20).
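To make the structure of Eq. (20) concrete, the sketch below evaluates the integral for a single atom pair with an assumed normal distribution over the pair distance. It is a deliberately reduced illustration: the kernel of Eq. (21) is replaced by j0(qR) alone (beam intensity, scattering amplitudes, and prefactors dropped), the integral is one-dimensional, and the numerical values are arbitrary.

```python
import math

def j0(x):
    """Spherical Bessel function of the first kind, order 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def normal_pdf(r, mean, width):
    """Normalized Gaussian density in the pair distance r."""
    return math.exp(-0.5 * ((r - mean) / width) ** 2) / (width * math.sqrt(2.0 * math.pi))

def c_calc_single_pair(q, mean, width, n_steps=2001):
    """Midpoint-rule estimate of the integral of j0(q R) N(R; mean, width) dR.

    The integration range is mean +/- 6 * width, which holds essentially all
    of the Gaussian weight.
    """
    lo = mean - 6.0 * width
    hi = mean + 6.0 * width
    dr = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        r = lo + (i + 0.5) * dr
        total += j0(q * r) * normal_pdf(r, mean, width) * dr
    return total

# Arbitrary illustrative values: q in 1/Angstrom, distances in Angstrom.
narrow = c_calc_single_pair(6.5, 1.2, 1e-4)  # nearly a delta function
wide = c_calc_single_pair(6.5, 1.2, 0.1)     # finite width damps the signal
```

In the narrow limit the result approaches j0(q R0), the single-structure response, while a finite width reduces the amplitude at this q; this is the sensitivity that lets the width parameters in Θ be retrieved alongside the mean structure.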

The MHA is designed to efficiently and preferentially sample regions of Θ-space proportional to the agreement with data, spending the vast majority of its time sampling regions of high probability (agreement). The MHA builds \(P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)\) by accumulating Θ parameters based on their relative posteriors

$$\frac{P\left({{{{{{{{\bf{\Theta }}}}}}}}}^{{\prime} }| C\right)}{P\left({{{{{{{\bf{\Theta }}}}}}}}| C\right)}=\frac{P\left(C| {{{{{{{{\bf{\Theta }}}}}}}}}^{{\prime} }\right)}{P\left(C| {{{{{{{\bf{\Theta }}}}}}}}\right)}$$
(22)

where Θ and \(\mathbf{\Theta}^{\prime}\) are both physical, and the prior and the marginal likelihood cancel out. We note Eq. (22), and hence the MHA, is theory independent and is analogous to a random walk guided by the relative agreement of neighboring Θ parameters to the data. For instance, if the likelihood of Θ is 2 times larger than that of \(\mathbf{\Theta}^{\prime}\), the MHA will sample twice as many structures around Θ as around \(\mathbf{\Theta}^{\prime}\). Similarly, if the likelihood for Θ is 1000 times larger than for \(\mathbf{\Theta}^{\prime}\), then the MHA will effectively remove structures around \(\mathbf{\Theta}^{\prime}\) from the search space. Reference 50, the MHA Python package50 used in this work, and ref. 36 give detailed descriptions of combining Bayesian inference and the MHA. Supplementary Note 4 describes our use of the MHA and Bayesian inference in greater detail and how one can introduce physical intuition, or a priori knowledge, into the MHA.
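The acceptance rule implied by Eq. (22) can be sketched with a generic random-walk Metropolis-Hastings sampler. This is a textbook illustration, not the package used in this work: the function names, the symmetric Gaussian proposal, the step sizes, and the two-parameter Gaussian toy posterior are all assumptions made for the example, and the starting point is assumed to be physical.

```python
import math
import random

def metropolis_hastings(log_post, theta0, step, n_samples, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta = list(theta0)
    lp = log_post(theta)
    chain = []
    for _ in range(n_samples):
        proposal = [t + rng.gauss(0.0, s) for t, s in zip(theta, step)]
        lp_new = log_post(proposal)
        # Accept with probability min(1, P(proposal | C) / P(theta | C)).
        # Unphysical proposals have log posterior -inf and are never accepted.
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            theta, lp = proposal, lp_new
        chain.append(list(theta))
    return chain

def toy_log_posterior(theta):
    """Toy posterior: Gaussian around (1.2, 2.3), restricted to physical values."""
    r, angle = theta
    if not (r > 0.0 and 0.0 < angle < math.pi):
        return -math.inf
    return -0.5 * (((r - 1.2) / 0.05) ** 2 + ((angle - 2.3) / 0.1) ** 2)

chain = metropolis_hastings(toy_log_posterior, [1.0, 2.0], [0.05, 0.1], 20000)
kept = chain[2000:]  # discard an initial burn-in segment
mean_r = sum(s[0] for s in kept) / len(kept)
mean_angle = sum(s[1] for s in kept) / len(kept)
```

Only the ratio of posteriors enters the acceptance step, so the sampler never needs P(C), and the accumulated chain approximates the posterior in the full parameter space rather than one parameter at a time.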

This method ultimately yields the following three results: a distribution of Θ parameters (the posterior \(P\left(\mathbf{\Theta}|C\right)\)), the optimal set of model parameters (Θ*), and a parameterized probability of molecular structures \(P\left(\mathbf{R}\,|\,\mathbf{\Theta}^{*},C\right)\). For each individual Θ parameter, where the ith parameter is denoted as Θi, we calculate its resolution as the standard deviation of the projection of \(P\left(\mathbf{\Theta}|C\right)\) onto Θi. This resolution, σΘ, is the one-dimensional standard deviation after marginalizing over all other parameters, which removes the correlations between Θ parameters. That is, if one randomly draws some parameters Θ from \(P\left(\mathbf{\Theta}|C\right)\), the distribution of parameter Θi will have a width of σΘ. In this work, we focus on how Bayesian inference and Eq. (16) effectively invert data for \(P\left(\mathbf{R}\,|\,\mathbf{\Theta}^{*},C\right)\) via an unambiguous and sharp \(P\left(\mathbf{\Theta}|C\right)\). It is this \(P\left(\mathbf{\Theta}|C\right)\) and its width (resolution) that are our figures of merit for the inversion. The accuracy of Θ* depends on one’s method for finding the mode, of which there are many. The precision of Θ* is a function of its local region. The mean and mode of said marginalized distribution will likely not correspond to Θ*, since Θ* is the mode of the full Θ-space distribution. We find Θ* via a simple mode search algorithm described in Supplementary Note 6.
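The resolution defined above can be sketched directly from posterior samples: marginalizing over the other parameters amounts to taking the standard deviation of one column of the sample list. The correlated two-parameter samples below are synthetic and chosen only to show that the marginal width ignores correlations; the helper name `marginal_widths` is an assumption for this example.

```python
import random
import statistics

def marginal_widths(samples):
    """Standard deviation of each parameter after marginalizing over the rest."""
    n_params = len(samples[0])
    return [statistics.stdev([s[i] for s in samples]) for i in range(n_params)]

# Synthetic, strongly correlated samples: b follows a with a small scatter of 0.1.
rng = random.Random(1)
samples = []
for _ in range(5000):
    a = rng.gauss(0.0, 1.0)
    b = a + rng.gauss(0.0, 0.1)
    samples.append((a, b))

widths = marginal_widths(samples)
```

Here the second parameter is known to within about 0.1 once the first is fixed, yet its marginal width is close to 1, which illustrates why the marginalized widths are conservative and why the mode must be sought in the full parameter space.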

The measured q range, the induced rotational wavepacket, and the σlmk(q) are vital in determining the width, shape, and parameter correlations of \(P\left(\mathbf{\Theta}|C\right)\). To investigate such dependencies, we first define a \(|\Psi(\mathbf{R})|^{2}\) distribution for NO2 to calculate Clmk(q). Figure 4a and Table 1 show and describe this distribution, respectively. Measuring more diffraction patterns increases the signal-to-noise ratio (SNR) by reducing σlmk(q), which scales as \(1/\sqrt{N}\). Here, the SNR is the geometric mean of C000(q)/σ000(q) over 0.5 < q < 4 Å−1. Figure 4b illustrates the Clmk(q) coefficients used in this analysis with the following SNRs based on previous UED52 and x-ray12 diffraction experiments. Unless otherwise stated, the standard configuration of experimental parameters for our NO2 results is a q range of [0.5, 10] Å−1, a SNR of 100, a pump fluence of 1 J/cm2 and a 100 K ensemble temperature.
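The SNR definition above can be written out as a short sketch. The coefficient and error arrays are made up, the function name is an assumption, and taking the absolute value of C000(q) before forming the geometric mean is an added safeguard for this example rather than something stated in the text.

```python
import math

def snr_geometric_mean(q, c000, sigma000, q_min=0.5, q_max=4.0):
    """Geometric mean of |C000(q)| / sigma000(q) over q_min < q < q_max."""
    ratios = [
        abs(c) / s
        for qi, c, s in zip(q, c000, sigma000)
        if q_min < qi < q_max
    ]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Made-up coefficients on a q grid from 0.25 to 10 inverse Angstrom.
q = [0.25 * i for i in range(1, 41)]
c000 = [100.0 / (1.0 + qi) for qi in q]
sigma_1 = [c / 50.0 for c in c000]               # errors giving an SNR of 50
sigma_4 = [s / math.sqrt(4.0) for s in sigma_1]  # four times as many patterns

snr_1 = snr_geometric_mean(q, c000, sigma_1)
snr_4 = snr_geometric_mean(q, c000, sigma_4)
```

With the \(1/\sqrt{N}\) scaling of the errors, quadrupling the number of patterns doubles the SNR in this toy example.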