Applying Bayesian Inference and deterministic anisotropy to retrieve the molecular structure $|\Psi(\boldsymbol{R})|^2$ distribution from gas-phase diffraction experiments

Currently, our general approach to retrieving molecular structures from ultrafast gas-phase diffraction heavily relies on complex ab initio electronic or vibrational excited state simulations to make conclusive interpretations. Without such simulations, inverting this measurement for the structural probability distribution is typically intractable. This creates a so-called inverse problem. In this work, we develop a broadly applicable method that addresses this inverse problem by approximating the molecular frame structure $|\Psi(\boldsymbol{R}, t)|^2$ distribution independent of these complex simulations. We retrieve the vibronic ground state $|\Psi(\boldsymbol{R})|^2$ for both simulated stretched NO$_2$ and measured N$_2$O. From measured N$_2$O, we observe 40 mAngstroms coordinate-space resolution from 3.75 inverse Angstroms reciprocal space range and poor signal-to-noise, a 50X improvement over traditional Fourier transform methods. In simulated NO$_2$, typical to high signal-to-noise levels predict 100--1000X resolution improvements, down to 0.1 mAngstroms. By directly measuring the width of $|\Psi(\boldsymbol{R})|^2$, we open ultrafast gas-phase diffraction capabilities to measurements beyond current analysis approaches. This method has the potential to effectively turn gas-phase ultrafast diffraction into a discovery-oriented technique to probe systems that are prohibitively difficult to simulate.


I. INTRODUCTION
Ultrafast molecular gas-phase diffraction, from either x-rays [1,2] or electrons [3][4][5][6], is a vital tool for retrieving time-dependent molecular structures.In elastic molecular gas-phase diffraction experiments, x-rays or electrons scatter off of electrons and nuclei, with differing proportionality.Each pairwise atomic distance creates a pattern of scattered x-rays or electrons as a function of their transverse momentum q.The measured diffraction pattern is the sum of all such contributions, this is orientationally averaged over the lab frame ensemble distribution.We lose pairwise directional information and thus the ability to explicitly distinguish individual atomic distances.Consequently, directly inverting diffraction patterns for the molecular structure is generally intractable, this is a so-called inverse problem.Typically, we avoid this inverse problem and retrieve both the molecular structures and the molecular frame orientations by simulating the forward excited state process.These are generally time-dependent ab initio electronic and vibrational excited state simulations that explore a large parameter space (rovibration, structure, and electronic state) with trajectory bifurcations due to effects like conical intersections [7][8][9][10].We refer to such simula-tions as complex simulations, that are typically validated through comparisons with measured diffraction patterns or pair-distribution functions (PDFs -a weighted histogram of pairwise distances).Consequently, ultrafast gas-phase diffraction is generally limited by the ability to perform these complex simulations.We aim to expand diffraction measurements for high-resolution reconstructions of molecular structure probability distribution |Ψ(R, t)| 2 without relying on complex molecular dynamics simulations by effectively solving this inverse problem with a statistical interpretation.
A variety of studies sought to reduce reliance on complex simulations, but are either limited in the systems they address or quickly run into the curse of dimensionality.Fourier transforming the time dependence exposes dissociative and vibronic signals [11][12][13] but it is insensitive to classes of isomerizations.Methods employing ensemble anisotropy have garnered much interest [14][15][16][17][18][19][20][21][22][23] yet they struggle to get sub-Angstrom resolution and the full 3d structure for generic molecular structures.Optimization methods, while capable of exposing large-scale motion, are susceptible to local minima [21].Pattern matching measured data against sampled isomers [24][25][26] becomes intractable for moderately large molecules due to the curse of dimensionality.For example, a molecule with N atoms atoms has 3N atoms − 6 degrees of freedom.To independently sample each degree of freedom 10 times would require 10 3Natoms−6 structures, becoming intractable for molecules with 7 or more atoms.Simulations reduce the structure-space of isomers to select, but this trade-off requires previous knowledge [24] that potentially imparts biases.
We employ insights from molecular ensemble anisotropy methods, applied statistics, and machine learning principles to address the inverse problem and the curse of dimensionality to approximate the molecular structure probability density |Ψ(R, t)| 2 .It is important to note that instead of sampling individual molecular structures and comparing single structures to the measured data, we are sampling entire |Ψ(R, t)| 2 probability distributions.We access the molecular frame by decomposing measured data onto anisotropic components.Then, we iteratively approximate |Ψ(R, t)| 2 with a statistical approach uniquely suited for high repetition-rate diffraction facilities.We observe that resolution strongly improves with signal-to-noise much faster than increasing the q range beyond moderate values.Unlike the PDF approach, this method retrieves the molecular distances and angles required to define a unique molecular structure.
(1) These ADMs describe the ensemble of molecular frame orientations with respect to the lab frame.When calculating the ADMs, the l, m, and k are difference and sum of quantum numbers between rotational eigenstates, respectively for the total angular momentum, the projection onto the lab frame z-axis, and the projection onto the molecular frame z-axis.These ADMs transform the lab frame into the molecular frame by decomposing the measured lab frame anisotropy into C lmk (q) coefficients, which are dependent on molecular frame pairwise distances and angles (θ (mf) µν and φ (mf) µν ) shown in Fig. 1b.The PDF is not directly sensitive to these angles.After impulsively aligning the molecular ensemble, Fig. 2 illustrates how transient anisotropy (panels b and c) provides constraints on these Euler angles and consequently the molecular frame (panels d-g).For example, at 39.25 ps the anisotropy provides simultaneous constraints on θ (the molecular frame azimuthal plane) is highly constrained.At 39.85 ps the ensemble is well localized in θ (lf) I , resolving measurements along the molecular frame ẑ.Here, P φ (lf) I is uniform due to cylindrical symmetry imparted by a linearly polarized pulse.
To effectively invert the molecular diffraction pattern and approximate |Ψ(R, t)| 2 , we use Bayesian Inference.Bayesian Inference describes a class of statistical inference techniques using Bayes's Theorem to update one's model based on observed data [36].We first approximate |Ψ(R, t)| 2 as the probability distribution P (R, t| Θ, C), which is parameterized by the molecular structure degrees of freedom Θ.Using Bayesian Inference, we then relate P (Θ|C) to the measured molecular diffraction pattern.With this framework, we use Markov-chain Monte Carlo (MCMC) techniques to build P (Θ|C) and tackle the curse of dimensionality by efficiently sampling structures most consistent with the measured C lmk (q).This method is unbiased and naturally avoids regions in our sampling space that are inconsistent with the C lmk (q).We retrieve P (R, t| Θ, C) with neither the PDF nor complex molecular dynamics simulations since we will analytically relate the molecular frame pairwise distances and angles to the C lmk (q).Further intuition is provided in Supplementary Note 4 and Ref. [37].
Instead of complex molecular dynamics simulations this method has fewer simulation requirements.In this method's simplest form, when probing structural dynamics it only requires the much more tractable simulation of the rovibronic ground state structure to define the molecular frame.When measuring the equilibrium vibronic ground state, one does not require a priori knowledge of the structure they wish to find.This is because each sampled structure will define a new molecular frame.When using anisotropy components, we require timedependent rotational simulations for the ADMs.This requires rotational constants and molecular polarizability, all of which can be measured or calculated from the rovibronic ground state structure.When applying this method to excited states, we require the transition dipole, which is also measured or calculated from the rovibronic ground state structure.As discussed later, depending on the desired accuracy, one must select a functional form for P (R, t| Θ, C) based on a priori knowledge of the excitation or use normal distributions as a "first-order" approximation.
In this manuscript, we validate these principles by retrieving |Ψ(R)| 2 for the vibronic ground states of both simulated NO 2 and measured N 2 O rotational wavepackets.Here NO 2 , an asymmetric top, serves as a test case to show our method's broad capabilities and behavior under various experimental conditions.Furthermore, we validated these capabilities with measured N 2 O data from the ultrafast MeV electron diffraction facility at SLAC (UED).We chose these molecules to specifically be amenable to conventional methods since triatomics do not suffer significantly from the curse of dimensionality.In this lower dimensional realm, we benchmark and validate our method against conventional methods with intentions to later expand to larger molecules.In the following, all simulations and equations correspond to ultrafast electron diffraction experiments but are easily extended to x-ray diffraction.
In this work, we rigorously and qualitatively describe this method in addition to quantitatively benchmarking both its advantages and shortcomings.We provide intuition and mathematically describe how induced anisotropy accesses the molecular frame structural angles (θ (mf) µν and φ (mf) µν ) and how to retrieve this molecular frame structure using Bayesian Inference.We evaluate this method on simulated and measured data, showing how P (R| Θ, C) significantly improves upon the traditional Fourier limited PDF.Firstly, P (R| Θ, C) unambiguously defines a unique molecular mean structure without complex molecular dynamics simulations.This is generally not possible from the PDF alone.Secondly, we report pairwise distance resolutions of order 10 m Å and down to 0.1 m Å from measured and simulated data, respectively.These resolutions are respectively a factor of 50 and 1000 times smaller than their corresponding PDF resolutions.Thirdly, we investigate this method's behaviors and systematic errors as a function of experimental factors and analysis choices.We find this procedure depends more strongly on signal-to-noise than it does by extending measured momentum transfer.Fourthly, we demonstrate how this method expands ultrafast gasphase diffraction experiments to quantitatively measure additional parameters, such as the width of |Ψ(R, t)| 2 .Lastly, we describe how one can apply this method to excited state dynamics.With these advancements, this method has the potential to expand ultrafast gas-phase diffraction into a more discovery-oriented technique, one that is free of complex excited state simulation limitations and is applicable to currently inaccessible molecular systems.

II. METHODS
Our method can be subdivided into three principal concepts.Firstly, we use ensemble anisotropy, described by the ADMs, to access the molecular frame by projecting the data onto anisotropic components.Secondly, we select a model, P (R| Θ, C), to approximate |Ψ(R)| 2 and develop our statistical approach to solve for Θ using Bayesian Inference.That is, through the statistical nature of our measurement we use Bayesian Inference to effectively invert the diffraction signal for Θ.Lastly, we take our statistical description and use MCMC techniques to solve for P (Θ|C) to retrieve the optimal Θ parameters (Θ * ).The code used for this analysis [38] can be run to reproduce the following results or adapted for other molecules.

A. Extracting Molecular Frame Information
We describe our analysis procedure for a system given an induced deterministic ensemble anisotropy under experimental conditions at the SLAC MeV ultrafast electron diffraction facility (UED) [6].Our generic pumpprobe setup is similar to most ultrafast diffraction setups, consisting of an 800 nm Ti:Sapphire pump laser and a 120 fs FWHM electron bunch probe.For the simulated NO 2 results, we consider using a single 10 TW/cm 2 800 nm pump pulse to impulsively induce a coherent rotational wave packet and probing it within a window of high anisotropy variation: [37.5, 41.5] ps.For the measured N 2 O sample, a train of 8 identical 800 nm pulses (40 fs duration and 5 × 10 12 W/cm 2 irradiance) separated by full quantum revivals induced such rotational wavepacket [39].We measured the first field free full quantum revival over a window of 3 ps.We masked q regions [0, 3.5] Å−1 and above 7.25 Å−1 due to ellipticity in the imaging of the diffraction pattern and poor signal-to-noise, respectively.Linearly polarized pump pulses induce azimuthal symmetry, which sets m = 0 in Eq. 1 (P(φ (lf) I , t) = 1/2π), while the Raman excitation of the wavepacket requires l being even in Eq. 1.
We define anisotropy in two equivalent ways and quantify it through the ADMs.Firstly, anisotropy is defined by a non-zero projection of the measured diffraction pattern onto any Y m l with even l > 0 for a given ∆q range.Secondly, anisotropy exists when there is a nonzero A l mk (t) for l > 0. To calculate the ADMs, one must know the rotational (A, B, C) and ideally the centrifugal distortion (D) constants, as well as the differential polarizability, which can be calculated from the known ground state structure or measured from Raman spectroscopy.For N 2 O, we used the measured rotational constants [40,41] to model the rotational wavepacket for the fitted ensemble temperature and laser intensity described in Supplementary Note 1.We note other methodologies to calculate the ADMs [34,35,42].Supplementary Note 1 describes both our calculation of the ADMs and our search for the best-fit ADMs.
We access the molecular pairwise distances and angles in the molecular frame.Using the ADMs and the Independent Atom Approximation, we relate measured lab frame anisotropy in diffraction patterns, I(q, t) , to the molecular structure I , χ In Eq. 2, derived in Supplementary Note 2, f µ (q) is the scattering amplitude of the µ th atom, j l (qr) are One first measures the difference diffraction pattern (∆ I(q, t) ), given by Eq. 3 (row a).Removing the detector angular dependence, one retrieves B m l (q, t) of Eq. 4 (row b).Removing the time-dependent ensemble anisotropy (ADMs) yields the molecular frame M lmk (q) coefficients Eq. 5 (row c).All as described in the text.We note that in the N2O data (right) we have limited visibility of data due to experimental limitations illustrated by the hashes.
the spherical Bessel functions of the first kind, I is the diffraction beam intensity, and the momentum transfer vector is given by q = [q, θ (lf) µν ] is the molecular frame pairwise distance and angles between the µ th and ν th atoms, illustrated in Fig. 1b.Equation 2 shows how the ensemble anisotropy connects the lab frame to the molecular frame structure.Directly accessing the molecular frame pairwise angles (θ requires anisotropy and is otherwise inaccessible through the PDF and isotropic contributions alone.This is evident by isolating the isotropic component (l = 0, m = 0, For our method, we describe optimal representations of the lab and molecular frames used in Eq. 2. The molecular frame is defined by the molecule's principal moments of inertia before laser excitation with the ẑ(mf) , x(mf) , and ŷ(mf) corresponding to the principle moments of inertia in decreasing order: A, B, and C respectively.This necessitates knowledge of the rovibronic ground state structure when one is measuring an excited rovibronic structure.When looking at the ∆R µν contribution, we isolate the µ th and ν th atoms while ignoring other atoms and trans-late the atom pair such that R ν defines the origin.This is highlighted in Fig. 1b where the nitrogen is translated to the origin.This translation allows us to define the pairwise angles and derive Eq. 2. Since we are concerned with a difference in locations ∆R µν , Eq. 2 is invariant under such molecular frame translations.In the lab frame, the laser polarization defines ẑ(lf) and the propagation direction of the probe pulse defines ŷ(lf) .The measured signals in the lab frame, on a 2D detector, are defined by detector parameters q = |q| and the azimuthal angle θ (d) defined by ẑ(lf) .Supplementary Note 2 describes how to rewrite q in terms of the detector coordinates.For small angle scattering at UED θ The primary difficulty of working with Eq. 2 comes from the expectation value including both the ensemble anisotropy and molecular frame structure.We want to separate the ensemble anisotropy into the ADMs.This isolates the time-dependent molecular structure term that we would like to retrieve.By doing this, we only require more tractable molecular rotation simulations with respect to the known rovibronic ground state structure in order to retrieve the time-dependent molecular structure.Otherwise, as Eq. 2 is written, it requires a priori knowledge of exactly the unknown time-dependent struc-tures for which we are solving.In this work, we describe various ways to do this under common experimental conditions.
Focusing on the vibronic ground state of NO 2 , we can separate the ADMs and molecular structure contribution in Eq. 2 by applying a rigid rotor approximation: This yields the time (t) and q dependent B m l (q, t) coefficients shown in Fig. 3b.The third fit isolates the molecular frame information by fitting out the time dependence of B m l (q, t) with the simulated ADMs, A l mk (t).The resulting coefficients, C lmk (q), relate measured data to the molecular frame pairwise structure.
Here, M lmk (q) are the modified C lmk (q) coefficients that compensate for the rapid q −4 falloff in the electron scattering amplitudes.Figure 3c shows the retrieved M lmk (q) for both the simulated and measured data.For the N 2 O data, the poor signal-to-noise precludes all contributions except C 200 (q).Depending on the data quality and degree of orthogonality in the ADMs, one may need to employ regularization to retrieve physical fit values.Reg-ularization adds a fitting cost to extraneous coefficients, thus minimizing the impact of non-orthogonal ADMs.Supplementary Note 3 provides a further discussion on fitting the ADMs and regularization.The standard error of the mean σ lmk (q) for each C lmk (q) is calculated from a distribution of measured C lmk (q) coefficients.For the N 2 O data, Supplementary Note 5 describes the data processing and retrieval of σ lmk (q).For the NO 2 simulation, we add Poisson noise to the diffraction patterns and propagate that noise through the lab frame anisotropy and ADM fit (see Supplementary Section Supplementary Note 5).

B. Applying Bayesian Inference
We approximate |Ψ(R)| 2 with the probability distribution P (R| Θ, C), which is parameterized by Θ and conditioned on the observed C lmk (q) coefficients.This requires one to choose a functional form of P (R| Θ, C) dependent on the system's state and the desired degree of accuracy.Depending on the desired accuracy and precision of the desired results, this requires varying degrees of a priori knowledge.For example, one may choose a multivariate delta function for a single molecule response, FIG. 4. Simulated NO2 data at various experimental conditions For simulated NO2 we defined a |Ψ(R)| 2 distribution, from which we calculated the C lmk (q) under various experimental conditions.Panel a shows the simulated NO2 distribution that we use to calculate the simulated NO2 responses (C lmk (q) and M lmk (q)).Panel b shows M lmk (q) for various signal-tonoise ratios (SNR) for the case of an ensemble temperature of 100 K and kick fluence of 1 J/cm 2 .Panel c shows two ADM dependencies: pump strength (constant ensemble temperature of 25 K) on the left and temperature (constant pump fluence of 1 J/cm 2 ) on the right.
a normal distribution to model the ground vibrational states, or harmonic oscillator eigenfunctions to describe arbitrary individual vibrational states.
Having isolated the molecular frame structure terms (C lmk (q)) and chosen P (R| Θ, C), we apply Bayesian Inference to address the diffraction inverse problem [36,37,43] by effectively inverting C lmk (q) to approximate |Ψ (R) | 2 .With Bayes rule, we use the statistical nature of our measurement to analytically relate the desired Θ parameters to the measured C lmk (q).In Eq. 12, P (Θ|C) is the posterior distribution we wish to build.The likelihood P (C|Θ) relates the measured data to the Θ parameters and is the probability of observing C lmk (q) given the parameters Θ Here, C lmk (q, Θ) are the calculated C lmk (q) coefficients, and σ lmk (q) are the standard errors of the means for C lmk (q).The prior, P (Θ) contains our a priori knowledge of the system, and in this work is used to con-strain Θ to physicality (e.g., Θ > 0 and ∠ONO < π).This is because we do not assume any prior knowledge or simulations of the system.Calculating the marginal likelihood P (C) is generally, and in our case, intractable.Further intuition regarding how the statistical nature of our measurement allows us to invert for Θ is described in Ref. [37].
Given the functional forms of P (Θ|C), P (C|Θ), and the presumed functional form of P (R| Θ, C), we now find the globally optimal Θ parameters (Θ * ) by building P (Θ|C) and finding its mode.To converge on the mode of P (Θ|C), one must use the correlations between the Θ parameters by building P (Θ|C) in the full Θspace rather than sampling each parameter individually.Consequently, we must next address the curse of dimensionality.
C. Solving for the high dimensional model parameters Θ We retrieve P (Θ|C) with the Metropolis-Hastings algorithm (MHA) from the following system of equations: We note the high dimensionality and complexity of Eq. 15, which is a system of order 10 equations, each with order 100 terms, embedded in an order 100-dimensional space of measurements in q.This must be evaluated on a Θ-dimensional space of all possible molecular structures and width parameters.The MHA is chosen for its ability to retrieve probability distributions from high dimensional integral equations [43,46] like Eq. 15.
The MHA is designed to efficiently and preferentially sample regions of Θ-space proportional to the agreement with data, spending the vast majority of its time sampling regions of high probability (agreement).The MHA builds P (Θ|C) by accumulating Θ parameters based their relative posteriors where Θ and Θ ′ are both physical, and the prior and the marginal likelihood cancel out.We note Eq. 17, and hence the MHA, is theory independent and is analogous to a random walk guided by the relative agreement of neighboring Θ parameters to the data.For instance, if the likelihood of Θ is 2 times larger than Θ ′ , the MHA will sample twice as many structures around Θ than Θ ′ .Similarly, if the likelihood for Θ is 1000 times larger than for Θ ′ , then the MHA will effectively remove structures around Θ ′ from the search space.Reference [43] The MHA python package [43] used in this work and Ref. [37] give detailed descriptions of combining Bayesian Inference and the MHA.Supplementary Note 4 describes our use of the MHA and Bayesian Inference in greater detail and how one can introduce physical intuition, or a priori knowledge, into the MHA.This method ultimately yields the following three results; a distribution of Θ parameters (the posterior P (Θ|C)), the optimal set of model parameters (Θ * ), and a parameterized probability of molecular structures P (R|Θ * , C).For each individual Θ parameter, where the i th parameter is denoted as Θ i , we calculate its resolution as the standard deviation of the projection of P (Θ|C) onto Θ i .This resolution, σ Θ , is the onedimensional standard deviation after marginalizing over all other parameters, which removes the correlations between Θ parameters.That is, if one randomly draws some parameters Θ from P (Θ|C), the distribution of parameter Θ i will have a width of σ Θ .In this work, we focus on how Bayesian Inference and Eq. 2 effectively invert data for P (R|Θ * , C) via an unambiguous and sharp P (Θ|C).It is this P (Θ|C) and its width (resolution) that are our figures of merit for the inversion.The accuracy of Θ * depends on one's method for finding the mode, of which there are many methods.The precision of Θ * is a function of its local region.The mean and mode of said marginalized distribution will likely not correspond to Θ * , since Θ * is the mode of the full Θ-space distribution.We find Θ * via a simple mode search algorithm described in Supplementary Note 6.
The measured q range, the induced rotational wavepacket, and the σ lmk (q) are vital in determining the width, shape, and parameter correlations of P (Θ|C).To investigate such dependencies we first define a |Ψ(R)| 2 distribution for NO 2 to calculate C lmk (q). Figure 4a and Table I show and describe this distribution, respectively.Measuring more diffraction patterns increases the signal-to-noise ratio (SNR) by reducing σ lmk (q) which scales as 1/ √ N .Here, the SNR is the geometric mean of C 000 (q)/σ 000 (q) between 0.5 < q < 4 Å−1 .Figure 4b illustrates the C lmk (q) coefficients used in this analysis with the following SNRs based on previous UED [47] and x-ray [12] diffraction experiments.Unless otherwise stated, the standard configuration of experimental parameters for our NO 2 results is a q range of [0.5, 10] Å−1 , a SNR of 100, a pump fluence of 1 J/cm 2 and a 100 K ensemble temperature.

III. RESULTS
Both the simulated NO 2 and measured N 2 O diffraction patterns are from the SLAC UED facility.Elastic electron diffraction is sensitive to the nuclei and diffraction from electronic transience occurs within the removed low q region.Using the independent atom approximation we are only concerned with the nuclear structure.Our stretched NO 2 molecule is simulated in the ground vibrational state due to its altered structure and we observe that 99.99% of the N 2 O molecules occupy the vibrational ground state (Supplementary Note 9).The normal distribution, P (N ) (R| Θ, C), is a good description of both our NO 2 and N 2 O vibronic ground state systems as it is the ground state eigenfunction of the harmonic oscillator.For N 2 O, our ADM simulations account for centrifugal distortion.In our main result, we illustrate our method's efficacy by retrieving P (N ) (R|Θ * , C) from both simu-FIG.7. Effects of varying the measured q range on P (N ) (Θ|C) Varying the measured q range affects false correlations in P (N ) (Θ|C) for NO2; a larger reciprocal space provides more information and dampens false correlations.Panel a shows the 1d and 2d projections of P (N ) (Θ|C) for a limited q range of [0.5, 5] Å−1 .The red dashed lines illustrate Θ * , while the black "X" and solid lines indicate the ground truth values.Panel b shows the corresponding P (N ) (R|Θ * , C).Similarly, panel c shows the 1d and 2d projections of P (N ) (Θ|C) for the broader q range of [0.5, 20] Å−1 .Panel d shows the corresponding P (N ) (R|Θ * , C). Panel e shows the correlation between all Θ parameters as a function of q range.We note the decrease in correlations with larger q, where panels a and b illustrate how the width and false correlations in P (N ) (Θ|C) decrease with higher q. a b c FIG. 8. Systematic errors from selecting incorrect |Ψ(R)| 2 distributionsThe P (δ) (Θ|C) distribution suffers from a q dependent systematic error stemming from the false assumption that a single structure describes the results measured from an ensemble.Here we show the 1d projections of P (N ) (Θ|C) (section a) and P (δ) (Θ|C) (section b) as a function of the measured q range (panel c) and a signal-to-noise ratio (SNR) of 400.Each column indicates a different q range starting at 0.5 Å−1 with the end of said q range indicated by the rightmost border of that column.The dashed lines are the ground truth values.The bottom plot in panel c is the simulated C200(q) coefficient used for both posteriors and is intersected by black lines that indicate the upper q range of each column.lated NO 2 and measured N 2 O C lmk (q) coefficients.After, we further investigate our method's behavior and sensitivity to varying experimental conditions for the simulated NO 2 system.Finally, we observe how our Bayesian Inference method significantly improves real-space resolution.

A. Molecular structure distribution retrieval
To retrieve P (N ) (R|Θ * , C), we first built the posterior P (N ) (Θ|C), shown in Fig. 5 for simulated NO 2 (a) and measured N 2 O data (c).Panels b and d show P (N ) (R|Θ * , C) for NO 2 and the simulated PDF for N 2 O, respectively.Tables I and II give the extracted Θ * (the most probable Θ parameters) and σ Θ , respectively, for N 2 O and NO 2 .For the NO 2 simulation, the SNR is 400.For NO 2 , P (N ) (Θ|C)'s resolution (σ Θ ) for the nuclear distances and angles is 0.5 m Å and fully encompasses the ground truth values.Despite the largely flat ∠ONO distribution, Θ * still converges on the ground truth values.For N 2 O data, the retrieved P (N ) (Θ|C) encompasses the previously measured results of the vibronic ground state [44,45].The resolution of this distribution is of order 10 m Å even with our limited q range of [3.5, 7.25] Å−1 and the very poor SNR.Moreover, the retrieved ∠NNO is π and we resolve the 50 m Å difference between the N T N C and N C O bond distances (Table II).The retrieved widths σ N T N C and σ (∠NNO) are unphysical due to the limited q range, as discussed later.Compared to the PDF (Fig. 5d), with a 2 Å Fourier resolution, this method improves resolution by a factor of 50.In the PDF, the missing low and high q components produce ringing artifacts in this inverse Fourier transform because of the incomplete Fourier space.This confuses the PDF results as they are not positive definite and falsely indicate population at large distances.
We observe (Fig. 5a and c) that Θ * does not correspond to the mean or mode of most 1-dimensional projections of P (N ) (Θ|C).This is due to the nonlinearity and correlations of P (Θ|C) in Θ space.This illustrates the importance of finding Θ * in this correlated space since the structure parameters are indeed correlated.

B. Exploring experimental effects and systematics
The measured q range is a critical component of gasphase ultrafast diffraction, determining the information content and the PDF's resolution.When expanding this range, Figs.6a and 7, we observe resolution (σ Θ ) improvements only until 8 Å−1 , after which it plateaus.This indicates that after a modest q range our method is not very sensitive to further increases.The false correlations between Θ parameters (Fig. 7e), still, continue to decline as we increase this range.The plotted correlation in Fig. 7e is between all 6 Θ parameters.The correlations seen in Figs.7a and c are termed false correlations since the simulated |Ψ(R)| 2 is a multivariate normal distribution with a diagonal covariance matrix.Increasing the measured reciprocal range q provides more information about the system and reduces these correlations, seen in Figs.7a, c, and e.
The input Θ parameters are those used to simulate the NO2 C lmk (q) coefficients with a signal-to-noise ratio (SNR) of 400.
Retrieved molecular frame structure parameters for measured N2O We provide the optimal molecular frame pairwise distance and angle parameters (Θ * ) for the measured N2O dataset.The Θ * parameters correspond to the mode of P (Θ|C).The resolution of Θ * (σ Θ ) is the standard deviation of the 1d uncorrelated projection of P (N ) (Θ|C).We also provide the corresponding literature values for the vibronic ground state [44,45], denoted as Θ * Literature .
pairwise distances and angles.This strong and continuous dependence indicates that our method is sensitive to SNR due to our statistical interpretation.Although P (N ) (Θ|C) becomes more peaked, the general shape from the correlations does not change since higher SNR improves resolution but does not add more information, in terms of the q range.Increasing the induced rotational coherence and lowering the ensemble temperature rapidly improves resolution (Fig. 6c and d) similar to increasing SNR.In Fig. 6c, the gas was at 25 K while varying the rotational coherence.In Fig. 6c, the pump fluence was 1 J/cm 2 while varying the ensemble temperature.Increasing the rotational coherence and decreasing the temperature increases the magnitude and complexity of the ADMs (Fig. 4c).This is because higher average pump fluences induce larger rotational coherence and lowering the ensemble temperature diminishes the spread of initial rotational states that incoherently interfere.The result is an increase in signal, a larger SNR, and consequently the similarly continuous behavior in Fig. 6b.
Generally, when varying the q range, SNR levels, pump fluence, and ensemble temperature we find the pairwise distances' σ Θ to be of order 1 m Å; for the width parameters, σ Θ is order 10 m Å.Our retrieved Θ * values are generally within a relative error of 10 −7 and 10 −3 from the ground truth values for structural and width parameters, respectively.This resolution is often 100 times better than PDF-based methods because our statistical treatment is highly sensitive to SNR.
Aside from experimental parameters, we investigate systematics induced by incorrectly selecting the functional form of P (R| Θ, C).We assert the simulated NO 2 vibronic ground state |Ψ(R)| 2 distribution is a multivariate normal distribution (Fig. 4a).We evaluate both P (N ) (Θ|C) and P (δ) (Θ|C) on this simulation, and in Fig. 8 we compare their 1d projections as a function of q range.The P (N ) (Θ|C) distribution consistently encompasses the correct values, but the P (δ) (Θ|C) distribution fails to do so for q ranges of [0.5,7.5],[0.5,10], and [0.5,12.5]Å−1 .This is because P (δ) (R| Θ, C) assumes a single molecule response can describe a signal averaged over an ensemble of structures.With increasing q ranges, P (δ) (Θ|C) converges in an unstable fashion on the ground truth (Fig. 8b), unlike the smooth convergence in P (N ) (Θ|C).We note that for NO 2 , retrieving P (δ) (Θ|C) is 100 times faster than P (N ) (Θ|C), which respectively take order 10 s to 1 minute and 1 hour to 1 day on 10 CPUs.This is because P (δ) (R| Θ, C) doesn't have to sum over structures in Eq. 15.Supplementary Note 7 and Ref. [37] provides plots and further discussion of these results.

C. Effects of Bayesian Inference
Our method retrieves the labeled pairwise distances with 100 times better resolution than the PDF.This is due to our statistical treatment using Bayesian Inference where each lmk and q contribution is itself an independent probability distribution; each is an experiment of its own.The MHA discrimination power grows exponentially with more C lmk (q), which increases the magnitude of the negative exponent in the relative ratio of likelihood functions P (C|Θ) (Eq.13).Our method therefore heavily relies on σ lmk (q) and C lmk (q) (seen in Fig 6b .Statistical noise increases σ lmk (q), making P (Θ|C) wider (Fig 6b ), while systematic errors in C lmk (q) shift the centriod of P (Θ|C) (Fig 5c).Supplementary Note 5 describes our method for consistently accounting for both statistical and systematic errors.The PDF error adds in quadrature in σ lmk (q); its scale is set by the largest error bar and disproportionately suffers from poorly measured data points.Conversely, MHA amplifies the contribution of high precision measurements while reducing con-tributions from poorly measured data points by weighting each term in the likelihood by 1/σ lmk (q) (Eq.13).
Our Bayesian Inference approach expands the utility of gas-phase ultrafast diffraction to measure previously inaccessible variables.Given P (R| Θ, C) is a generic function parameterized by Θ, one can introduce variables through Θ by selecting a P (R| Θ, C) that depends on them.Here, we expanded the measurable parameters of gas-phase ultrafast diffraction to include the width of |Ψ(R)| 2 in P (N ) (R| Θ, C), shown in Fig. 5 and given in Table I.Depending on one's system and desired accuracy, a priori knowledge is needed to select the form of P (R| Θ, C), e.g.harmonic oscillator eigenstates for vibrational excited states.Outside of the vibronic ground state, P (N ) (R| Θ, C) is a "first-order" measurement of the |Ψ(R)| 2 width.It also reduces the systematic effects of assuming a single structure (P (δ) (Θ|C)) as illustrated in Fig. 8.This was the case for our measured N 2 O data where our q range of [3.5, 7.25] Å−1 is insufficient to resolve the width of |Ψ (N2O) (R)| 2 .Therefore, the widths become nuisance parameters used to avoid these systematic errors.Still, P (δ) (Θ|C) is accurate on the 10 m Å scale and runs 100 times faster than P (N ) (Θ|C).Therefore, P (δ) (R| Θ, C) serves as an intermediate test analysis before switching to the normal or any other distribution.For very large molecules with many degrees of freedom, P (δ) (R| Θ, C) may be the only tractable method.
The MHA performs an unbiased search through Θ space guided by the C lmk (q) coefficients and correlates each Θ parameter.Our method is model independent and does not suffer from model bias as might be a concern for conventional methods.Limited q range artificially introduces correlations between Θ parameters.Since Θ is the minimal set of parameters to define P (R| Θ, C), we expect the parameters to be uncorrelated.Figure 7 shows how adding information by extending the q range decreases false correlations.For the N 2 O data, we observe these false correlations between N T N C and N C O (Fig. 5c).Simultaneously evaluating all Θ parameters leverages well-resolved parameters to constrain poorly resolved parameters.For example, the long OO bond (or ∠ONO) in our asymmetric NO 2 is the best constrained parameter as it produces the most q oscillations.The MHA removes structures where the two NO distances are inconsistent with the well-resolved OO distance.These correlations similarly help find Θ * , as observed with N 2 O, where the P (N ) (Θ|C) uncorrelated widths do not distinguish the N T N C and N C O bonds but Θ * does.
The width of P (Θ|C) (σ Θ ) relies heavily on SNR rather than increasing q range (Fig. 6b), which is ideal since it is generally prohibitively difficult to change the q range at ultrafast diffraction facilities and easier to reduce the SNR by taking more measurements [48].This is because smaller σ lmk (q) makes it less probable for the MHA to visit Θ parameters with larger residuals.For the PDF, the resolution is 2π/∆q, or 1.26, 0.63, and 0.31 Å for q ranges of 5, 10, and 20 Å−1 respectively, which is roughly 100 to 1000 times larger than our observed resolution for simulated NO 2 at typical to high SNR, respectively.For the measured N 2 O data with a very poor SNR and 0.04 Å resolution, we observe a 50X improvement over the 1.7 Å Fourier resolution.This agrees with our simulated results that have more than a factor of 2 better SNR and indicates we may observe these 100-1000X improvements in future measurements.Our method, therefore, lends itself well to high repetition-rate machines, such as the upcoming LCLS II.We note that increasing the q range above 8 Å−1 has a larger effect on the width parameters (Fig. 6a).

IV. DISCUSSION
In the following, we provide intuition about and describe how this method is able to approximate |Ψ(R)| 2 while significantly improving upon real-space resolution.We first provide intuition for how induced anisotropy accesses the molecular frame structural angles θ (mf) µν and φ (mf) µν .We then provide a brief intuitive discussion, that compliments the Methods section, of how our Bayesian Inference approach inverts I(q, t) for Θ while improving upon resolution.Finally, we introduce methods to evaluate excited electronic state dynamics.

A. The Role of Anisotropy
To provide intuition for the distinct angular terms, we condense and label the reference frames from Eq. 2 I , χ Equation 18 highlights the anisotropic contributions at each level of this method.The molecular frame structure component separates into pairwise distance (j l (q∆R µν )) and angular (Y −k l (θ µν )) terms.The former governs the q dependence and the latter is the angular decomposition of the molecular structure which acts as a scaling parameter.The ensemble anisotropy D l mk φ I , χ (lf) I acts as a key from the measured lab frame anisotropy (Y m l (θ q )) to the molecular frame structure by coupling these two reference frames.Similar derivations [49][50][51] exist but do not stress the dependence on the 3d molecular frame coordinates; Ref. [49] is not treated fully quantum mechanically as done here in Supplementary Note 2. Anisotropy is required for our method to have an explicit dependence on the pairwise angles.Without anisotropy, C 000 (q) has no explicit angular dependence (Eq.5), just like the PDF.
Stronger impulsive alignment produces a broader coherent rotational wavepacket which exhibits higher amplitude signals with more variations (Fig. 4c).Larger amplitude ADMs improve C lmk (q) SNR by lifting higher order coefficients up out of the noise, resulting in similar resolution improvements to only increasing SNR, shown in Fig. 6c.Increasing the number of C lmk (q) coefficients improves the θ µν ) (Eq. 5).One can produce fast signal variations with an initially broad hot thermal ensemble.Writing coherence onto hotter molecular ensembles produces weak but fast varying ADMs, shown in Fig. 4c. Figure 6d shows how quickly the resolution worsens at higher temperatures.When fitting the ADMs to B m l (q, t), one ideally measures particular points that include two separate regions where the ADMs have high variation and sufficiently before and after the prominent anisotropy signal where their magnitude dampens.One need not strictly measure the entire transient rotational signal.
To simulate the ADMs one will need to measure the rotational constants or calculate them from the vibronic ground state structure.Measured constants remove structural biases potentially induced by calculating these coefficients from a simulated or presumed structure and decouple the rotational signal from the MHA sampling.When simulating or inducing molecular tumbling is prohibitively difficult, one may use the induced anisotropy from the dipole alignment of the initial photo-excitation.This method can be made more general as our Bayesian Inference approach does not require anisotropy and is applicable to the traditionally used isotropic component.

B. Bayesian Inference and the MHA
With Bayesian Inference, we use data to effectively invert I(q, t) for Θ.We use the C lmk (q) coefficients to independently constrain P (Θ|C), from which we find Θ * to parameterize P (R|Θ * , C).The P (R|Θ * , C) distribution, which approximates |Ψ(R)| 2 , provides the most probable (and unique) molecular structure.Traditionally, the PDF, being the inverse Fourier transform of q M 000 (q), is at best a weighted histogram of unlabeled pairwise distances from which one generally cannot obtain a unique structure.Since our measurements necessarily exclude q all the way to 0, and the strong signal drop-off limits high q measurements, our q range is always limited.These limitations obfuscate the PDF interpretations by introducing sinusoidal systematics that result in negative probabilities, e.g. in Fig 5d where we do not expect any distance above 2.3 Å.Therefore, we typically simulate |Ψ(R)| 2 with a priori knowledge and validate simulation against the measured PDF.Our method instead uncovers the globally optimal parameters (Θ * ) from the data for a given P (R| Θ, C).This requires only the initial vibronic ground state structure, simulations of the coherent rotational wavepacket when using C lmk (q) for l > 0, and for excited state dynamics one additionally needs relevant transition dipole moments.As made clear by comparing Figs.5b and d, the P (R|Θ * , C) distribution is significantly more information-rich than the PDF, e.g. it provides the 3d molecular structure and width of the |Ψ(R)| 2 .This method thus has the potential to shift ultrafast diffraction to a discovery method applicable even to systems that extend beyond the scope of theory.
We find that building P (Θ|C) to later find its mode (Θ * ) and its resolution (σ Θ ) is more informative and robust than using a gradient-based optimization routine to find Θ * and its precision.In either case, an optimization routine is used to find Θ * , but given P (Θ|C) our method starts near the global minima and is more robust to local minima.If either routine finds a local minima, one can avoid reporting misleading results by citing the resolution of P (Θ|C) (σ Θ ) as its error.Since σ Θ is the standard deviation of all Θ parameters consistent with the data, it is a conservative estimate that very likely encompasses the global minimum.The precision, used by an optimization routine, is determined by the loss landscape around Θ * and is unaware of the entire Θ distribution.The P (Θ|C) distribution can also inform the experimentalist which values are best measured, which ones are correlated, and potentially how to improve the experimental apparatus through the false correlations and widths in Figs. 5 (a and c) and 7 (a and c).One does so by varying experimental parameters, in simulation, to determine how isolated and resolved Θ parameters become.

C. Outlook and potential Extension to Excited State Dynamics
Our method is broadly applicable to diffraction experiments with laser excitation, including dynamics from excited electronic states.Laser excitation imparts one or more units of angular momentum providing at least C 20k (q).From low SNR N 2 O data we see the C 200 (q) alone recovers 40 m Å resolution.The primary difficulty with extending our method to excited states dynamics lies in isolating the ADMs in rovibronically coupled systems at sufficiently long timescales.Since the principle moments of inertia change with the structure, one must reorient the altered excited state structure by adding three molecular frame Euler angles to the Θ parameters (Supplementary Note 2).The generally much wider excited state |Ψ(R, t)| 2 dampens C lmk (q) coefficients and reduces the need for extended q.We discuss two variants to isolate the ADMs, a time-separable method and an isotropic method.
The time-separable method introduces a separation of time scales by assuming the ADMs are relatively stationary during the vibronic motion.This approximation is analogous to the Born-Oppenheimer approximation.For a single excitation pulse, the dipole selection rule introduces ensemble anisotropy independent of the difficulty to create a rotational wavepacket: Here, Ã(1)l m2m1 (n, n ′ ) are the ADMs calculated with the rovibronic ground state structure, the ground rovibronic transition dipole, and evaluated immediately after laser excitation.This requires knowledge of either the transition dipole moment or the Frank-Condon factor and the vibronic ground state dipole.
To further constrain P (Θ|C), one can couple to more C lmk (q) coefficients by introducing a precursor pulse that excites a rotational wavepacket.This precursor pulse, assumed to be a rotational Raman impulse, is chosen to have a negligible effect on the vibronic system thus maintaining consistency with our separation of timescale approximation.The Raman impulse first induces rotational coherence.Following the Raman impulse, the system evolves for a rotational time τ , at this point the vibronic excitation pulse arrives.One would measure the vibronic dynamics over a small window (t ≪ τ ).This is repeated for different orientations by scanning the delay τ over an appreciable portion of the rotational evolution.This window, measured by t, is typically of order picosecond or less such that the ADMs do not appreciably change.
The measured diffraction images are given by Eq. 20 where n labels the vibronic states, |ψ n el-vib (t) is the vibronic wavefunction (assumed unknown), Ã(2)l mk (n, n ′ ; τ ) are the modified ADMs, and t is the arrival time of the probe after the second excitation pulse.These modified ADMs consider the angular momentum transfer by the vibronic excitation photon and require the vibronic ground state transition dipole moment.One then follows the above analysis procedure for each time t.In such an experiment, one should measure the ensemble anisotropy without the vibronic excitation pulse to find the bestfit ADMs.Supplementary Note 2 further describes our separation of timescale approximation and provides the derivations for Eqs.19 and 20.
The isotropic method uses only the C 000 (q, t) term, similar to conventional analyses.Since Ã(α)0 00 (n, n'; t, τ ) becomes a constant absorbed by I, this method can be applied to single (Eq.19) and double pulse (Eq.20) experiments.The C 000 (q, t) term only implicitly depends on the pairwise angles through ∆R µν .This is in contrast to the explicit pairwise angle dependence in the higher order C lmk (q) terms.Our statistical treatment likely provides adequate pairwise angle resolution because we have more pairwise distances than are required to specify a unique structure.
For a Raman-inducing precursor pulse, one will likely use a combination of the isotropic and time-separable methods.For fast dynamics, one would use the timeseparable method for small windows shortly following the rotation time τ .Longer-lived dynamics can be retrieved by the isotropic method.When retrieving P (Θ|C), in either case, one initiates the MHA with the vibronic ground state Θ * parameters.For each subsequent time step one initiates MHA with the Θ * parameters from the previous time step.
Electronic and vibrational excited state wavepackets bifurcate into multiple states, e.g. at conical intersections, causing P (R, t|Θ * , C) to bifurcate as well.We account for these different states by where N ex is the number of excited state distributions with appreciable population.Conical intersections will induce bifurcations that spawn a new distribution that adds to N ex .In this way we consider this method to be fully data-driven since we can change our theoretical description (c i ) based on data alone.Thus far we have only considered diffraction consistent with the independent atom approximation and all the equations above have been derived under this approximation.Recently, diffraction beyond the independent atom approximation has been observed in both electron [52] and x-ray diffraction [53].Under such conditions, this method must be modified by either re-deriving the above equations to consider these effects or by accounting for this signal in the C lmk (q) coefficients.For MeV electron diffraction, inelastic scattering is limited to the low q < 1 Å−1 region and can be easily removed from the C lmk (q) coefficients.For x-ray diffraction beyond the independent atom approximation, contributions from excited Rydberg states create a constant offset after the initial signal turn-on that spans the entire q range [24,53].Due to the diffuse nature of the Rydberg state this signal does not vary appreciably in time and can be subtracted out.

V. CONCLUSION
We have shown that our method can approximate |Ψ(R)| 2 with P (R|Θ * , C) for the vibronic ground states of NO 2 and N 2 O.In simulation, we retrieve 0.5 m Å resolution for NO 2 .From measured N 2 O UED data, we retrieve 40 m Å resolution despite a short q range of [3.5, 7.25] Å−1 and very poor SNR.Compared to PDF-based methods, this returns the labeled pairwise distances and angles with 50 and 100-1000 times better resolution in measurement and simulation respectively.In spite of similar bond distances and atomic scattering amplitudes for NO 2 and N 2 O, our method distinguishes these distances.We begin to resolve the N T N C and N C O distances in our low SNR and narrow q range UED measurement.These results are highly encouraging and illustrate the viability of our Bayesian Inference approach.They also inspire further expansion into excited state dynamics.The code repository [38] contains the algorithms used for this work and instructions on how to reproduce these results.It also contains instructions on how to run this analysis and templates for applying this method to new molecules.
This Bayesian Inference approach is best suited for gasphase ultrafast diffraction instruments that have high SNR such as high repetition-rate free electron facilities, e.g.LCLS-II-HE.Resolution quickly improves with SNR considerably faster than if one increases q beyond 8 Å−1 .Nevertheless, larger q ranges improve resolution for widths of |Ψ(R)| 2 and diminish false correlations between Θ parameters.
Our general method has the potential to become commonplace for ultrafast gas-phase diffraction measurements due to its broad applicability and its independence from complex excited state simulations.In this work, we validated its use for standard pump-probe setups.One can extend this method to excited state dynamics either with or without anisotropy.Our isotropic method is well suited for current pump-probe setups that generally focus on the isotropic component.This method greatly benefits from deterministic anisotropy that can either be induced by impulsive Raman or by the dipole moment selection from the excitation pulse.Beyond ultrafast gasphase diffraction, one can apply this general framework to other classes of experiments, e.g. the previously mentioned photo-electron experiments [27-29, 32, 33].This is done by deriving the molecular frame response (Eq.2) and applying this Bayesian Inference approach.
Given its broad applicability, high resolution, amenability to various measurements, and independence from complex molecular dynamic simulations, our method has the potential to effectively turn ultrafast gas-phase molecular diffraction into a discovery-oriented technique.This method can retrieve a unique molecular structure distribution for general molecules with 10 m Å.Moreover, because our method is parameterized by Θ, we have the opportunity to expand the scope of ultrafast gas-phase diffraction into previously inaccessible measurements.For instance, we demonstrated the use of this parameterization to measure the width of |Ψ(R, t)| 2 ; this width is important in the excited state where single structures lose their meaning.This method unlocks our ability to study larger and more complex systems that are currently too difficult to simulate.

VI. DATA AVAILABILITY
The UED N 2 O data used in this analysis will be provided by the corresponding authors upon reasonable request.The simulated NO 2 data, C lmk (q), can be calculated by the supplied analysis code in Ref. [38].

VII. CODE AVAILABILITY
The code used in this analysis can be found in Ref. [38].
Here, one will find a detailed description of the code and how to run it in order to reproduce the NO 2 results.This repository also includes templates for one to apply this algorithm to new molecules.
where the Wigner D matrix is reduced to the spherical harmonics under cylindrical symmetry.Before the alignment pulse (t < 0) the molecular ensemble is in a thermal distribution of rotational |j, m and vibrational eigenstate |ν .Here, ν labels the vibrational harmonic oscillator state.We presume the alignment pulse intensity is not sufficient to change the thermal distribution, and the pulse width is long enough that vibrational Raman excitation is negligible.Consequently, we separate the rotational and vibrational wavefunctions.It is still important to consider the initial vibrational state as the moments of inertia, and therefore the rotational constants, will vary between vibrational states.The alignment pulse launches a rotational wavepacket by introducing a rotational coherence between eigenstates where j i and m i label the initial (t < 0) rotational eigenstate for a single molecule.This thermal ensemble is represented by the density matrix where each state is weighted by the Boltzmann distribution where we sum over the initial |j i , m i |ν i states, Z is the partition function, k b is the Boltzmann, and T is the temperature.Evaluating Eq.S1 with respect to our density matrix representation of our thermal ensemble, we find that A l m (t) for a symmetric linear rigid rotor is given by For N 2 O, we simulated A 2 0 (t) by solving the TDSE for the c jimiνijm (t) coefficients using a split step operator.The non-resonant excitation laser field induces the potential where E 0 (t) is the pulse field envelope and ∆α = α − α ⊥ is the molecule's differential polarizability.The total Hamiltonian becomes Numerically simulating the c jimiνijm (t) is easily done via the split step operator technique outlined in Ref. [54].This simulation, however, requires the alignment pulse intensity and the ensemble temperature.To find these values, we simulated many variations of A 2 0 (t) and compared them to the measured B 0 2 (q, t) coefficients, shown in Fig. S2.For each q bin, we fit the A 2 0 (t) to the time dependence and calculate a χ 2 (q) value.Our aggregate χ 2 value is a weighted average of these χ 2 (q) weighted by the temporal variance.We find that a temperature of 73 K and a laser intensity of 5 × 10 12 W/cm 2 provides the best fit. Figure S1 shows this χ 2 landscape and Fig. S2 shows the measured data we fit and the best fit A 2 0 (t).

B. Asymmetric Rigid Rotors
Asymmetric rigid rotors have three unique principal axes with A = B = C, such that I A < I B < I C .As a result, they have a fundamentally different energy level structure.In general, the energy eigenvalues may be determined analytically for each J using the D 2 symmetry group of the rigid rotor Hamiltonian.This renders the Hamiltonian matrix in the |jmk symmetric top basis block diagonal [55].Here k is the angular momentum quantum number corresponding to the projection of the angular momentum on the molecular frame ẑ.Writing the eigenstates in this basis yields, The asymmetric top eigenstates |jmτ each correspond to an energy eigenvalue E jmτ , and the spacing between eigenstates determines the field-free evolution of the rotational wavepacket excited by the alignment pulse from an initial state rotational state (i), The coefficients c jmτ are determined by solving the TDSE for the asymmetric rigid rotor in a non-resonant timedependent electric field.The field-matter interaction is typically mediated by the molecular polarizability, resulting in a series of Raman Transitions.Such a calculation has been detailed by several authors [56][57][58][59][60][61][62][63][64], so we do not discuss it here.The density matrix ρ j ′ m ′ τ ′ jmτ (t) can then be determined as discussed above for the linear molecule.Finally, the ADMs can be calculated from the density matrix transformed into the |jmk basis as follows, (S14) The resulting ADMs for our simulated NO 2 distribution as a function of ensemble temperature and the pump laser fluence is given in Fig. S3.

Supplementary Note 2. ANISOTROPY DERIVATION
We now show how deterministic anisotropy allows one to access molecular frame geometric information by coupling the lab and molecular frame.Using the Independent Atom model, the x-ray, or electron, diffraction intensity from a single molecule is given by where I is a scaling coefficient, I 0 is the initial intensity of the probe, R beam is the distance between the sample and where the electron was detected, R µ is the position of the µ th atom, and f µ (q) is either the electron scattering amplitude or x-ray form factor of the µ th atom.Here q is the momentum transfer imparted on either the electron or x-ray after scattering from the molecule.In the case of x-ray scattering, we assume one has already removed the anisotropic effects from Thomson Scattering.The difference in the scattered x-ray and electron wave functions accounts for the R −2 beam factor in I.The first term µ |f µ (q)| 2 is independent of the molecule's structure and is referred to as the atomic scattering contribution.The second term depends on the pairwise distances of atoms and is known as the molecular diffraction.
Our objective is to represent the lab frame diffraction pattern, parameterized by the momentum transfer q = |q| and the detector's azimuthal angle θ (d) , in terms of the molecular frame pairwise distances and angles ∆R µν = . This derivation focuses on a single (µ, ν) pair from the molecular diffraction sum in Eq.S15, where the ν th atom defines the origin as we rotate between the lab and various body reference frames.Figure S4 illustrates these various frames serving as an intuitive guide, with the ν th atom translated to the origin.Such translations are allowed since they cancel in the ∆R µν term.For our rotations, we use the conventions in Ref. [55].Unless otherwise stated, θ and φ represent the polar and azimuthal angles, respectively, in a spherical coordinate system.
We define the pairwise frame (pf) such that ẑ(pf) = ∆ Rµν , again emphasizing we translate the molecule such that the ν th atom is at the origin.The pairwise frame is shown in Fig. S4c.The exponential term in Eq.S15 is rewritten using the partial wave expansion exp (iq Here, j l (q∆R µν ) are the spherical Bessel functions of the first kind, Y m l (θ q ) are spherical harmonics, and θ (pf) q , φ (pf) q are the polar and azimuthal angles that define q in the pairwise frame.In the above equation, we determine the dependence on the labeled pairwise distance ∆R µν , one of our parameters of interest.
The molecular frame (mf) is defined by the molecule's principal moments of inertia, here the ẑ(mf) , ŷ(mf) , and x(mf) axes correspond to the moments with increasing rotational inertia.Figure S4b shows the molecular frame for NO 2 with the nitrogen translated to the origin.We rotate from the pairwise frame into the molecular frame, shown in Fig. S4 as green and orange, respectively.exp (iq The molecular frame angles φ µν and θ µν define the orientation of ∆r µν , where χ (mf) µν = 0 since ∆R µν is a vector.We stress the importance of these molecular frame structure angles (θ µν ) as they are needed, along with ∆R µν to define a unique molecular structure.With PDF methods alone, one only has access to unlabeled ∆R µν and generally cannot define a unique molecular structure.These molecular frame angles are the last two geometric parameters of interest.
To connect our molecular frame calculation to our measurement, we rotate into the lab frame (lf).The lab frame ẑ(lf) is defined as the polarization of the alignment laser ( ε), and ŷ(lf) is along the probe path and normal to the detector.

exp (iq
I , and χ (lf) I are the conventional Euler angles in the lab frame that describe the orientation of the molecule's principal moments of inertia with respect to the lab frame.
I , χ (S22) We have now expressed the measurable diffraction (Eq.S15) in terms of the pairwise molecular frame distances and angles, as well as the lab frame angles θ q , φ (lf) q that define q.In gas-phase diffraction experiments one measures an ensemble of molecules at different orientations, alignments, and possibly differing structures depending on the populated rovibronic states.One samples that ensemble at a variety of times relative to the evolving ensemble anisotropy, revealing the following observable, I , χ where Ψ(t) is the molecular ensemble wavefunction that describes both the rotational and vibronic dynamics of the system.This is the general expression for the diffraction intensity from the entire molecular ensemble.
We have derived the expected diffraction intensity in terms of the momentum transfer vector, but in an experiment we do not have direct access to θ (lf) q and φ (lf) q .Instead, we measure the lab frame diffraction signal on a 2d detector, parameterized by q = |q| and θ (d) .The detector lies in the x-z plane of the lab frame where θ (d) is with respect to ẑ(lf) .
Here, λ is either the deBroglie wavelength of the electron probe, or the x-ray wavelength, and α is the scattering angle rotated by π/2.For the 3.7 MeV electron probe at the SLAC Ultrafast Electron Diffraction facility [6] λ = 3.0×10 −3 Å and the above relations simplify to For x-ray diffraction at 10 keV this expression does not simplify due to larger x-ray scattering angles.Often, one uses a linearly polarized alignment pump pulse which induces cylindrical symmetry in the ensemble rotation wave packet, which results in m 2 = 0. Equation S22 is derived for an asymmetric top, for a symmetric top there is symmetry about the molecular frame z axis, which sets m = 0.
It is difficult to extract ∆R µν from Eq. S23 in its current form since rovibronic coupling may affect the timedependent anisotropy.With rovibronic coupling, to calculate the ensemble anisotropy we may be required to simulate the excited state with the complex excited state simulations we do not want to rely on.This coupling, therefore, may render the anisotropy calculation too difficult.Instead, we consider two methods to separate the ensemble anisotropy and the molecular frame pairwise terms by assuming the molecular structure is rigid over the measurement period.In doing so, we aim to separate the ensemble anisotropy from the molecular frame geometry.To do this, we decompose the ensemble anisotropy into the Axis Distribution Moments (ADMs) by projecting the ensemble of molecular frame orientations, with respect to the lab frame (Fig. S4a), onto the Wigner D matrices, I , χ I , χ Simulations of the rotational wavefunction for rigid symmetric and rigid asymmetric tops [65][66][67][68][69][70][71] produce good agreement with measured alignment signatures.To extract ∆R µν from Eq. S23 we consider two approximations: the typical rigid rotor approximation and a separation of time scales.

A. Rigid Rotor Approximation
We first consider the rigid rotor approximation, which assumes the molecular structure is constant throughout the rotational dynamics.This allows us to take the expectation value of the molecular structure (the molecular frame terms) with respect to the ground rovibronic state structure at t = 0. We may also calculate the ADMs with respect to the ground rovibronic state structure, which allows us to separate the ADMs from the molecular frame terms This approximation is useful when investigating the vibronic ground state structure of a molecule or when the change in the molecule's structure has a negligible impact on the moments of inertia.

B. A Separation of Timescales Approximation for Excited State Dynamics
The second approximation is a separation of time scales between the rotational and vibronic dynamics.The anisotropy signature, A l mk (t), lasts of order one to tens of picoseconds for molecules with a few to tens of atoms, respectively.When the vibration or isomerization occurs on a much faster timescale than the change in anisotropy, we can calculate the rotational dynamics with respect to the known ground rovibronic state structure rather than with the unknown excited state structure.This disparity in timescales is very common, and this approximation is analogous to the Born-Oppenheimer approximation.
We first consider the more general case of a double pump pulse experiment that first induces a rotational wavepacket and then launches a vibronic wavepacket.The first pulse increases the ensemble anisotropy and consequently the number of C lmk (q, t) coefficients.The second pulse further mixes the rotational states while exciting vibronic modes.Let τ denote the arrival time of the second vibration-inducing pulse after the first rotation-inducing pulse, and t is the elapsed time after the second pump pulse.
In our experiment, we initially start with a thermal ensemble often dominated by the vibronic ground state.This ensemble is made of initial rovibronic states, each indexed by (i), in the Born-Oppenheimer basis as prior to any pulses.After the alignment pulse, and before the vibration-inducing pulse, our coherent rotational state evolves as The vibration pump pulse induces the excited state dynamics, while the photon's angular momentum mixes the rotational states.We project the vibronically excited state onto the Born-Oppenheimer basis, We have again separated the ensemble anisotropy ( Ã(α)l m1m2 (n, n ′ ; t, τ )) from the molecular frame structure term, which includes all the vibronic dynamics.The modified ADMs, Ã(α)l mk (n, n ′ ; t, τ ), are analogous to the original ADMs, but now include the coherent rotational mixing from the vibronic inducing pulse.That is, each vibronic state will have its own rotational coherence that must be accounted for when calculating the ensemble anisotropy.Finally, plugging the molecular diffraction term (Eq.S43) into the full diffraction expression we get I(q) (2)  sep (t, τ ) = I Due to the difference in timescales between the rotational and vibrational dynamics, we further simplify Eq.S46.In its current form, Eq.S46 relies on updating the ensemble anisotropy calculation as the structure changes with vibration.This requires us to know what the structure will be at time t, which is what we are ultimately trying to solve for.Instead, when the change in ensemble anisotropy is negligible with respect to the timescale of the vibration we can hold the anisotropy constant In doing so, the ensemble anisotropy and vibronic structural dependence are completely separable.We, therefore, continue to calculate the ensemble anisotropy with respect to the ground rovibronic state structure.
In some cases, a single-pump pulse experiment is preferred over a two-pump pulse experiment when the setup is too difficult or when the anisotropy is difficult to induce or measure.In such a case, we do not initially induce a rotational wavepacket and our initial state is given by Eq.S30 instead of Eq.S31.Therefore, one does not sum over a coherent set of rotational states in Eq.S34 and Here, the ensemble anisotropy is imprinted immediately after the pulse by the interaction between the polarized laser and the excitation dipole.
(S51) Depending on the system, one may further improve this approximation by calculating the ensemble dynamics with respect to a reference structure for t > 0. In some cases, the vibronic transience may be on the timescale of the rotational transience.Once Eqs.S47 or S50 no longer hold at some time t there are two options.Firstly, one can use only C 000 (q, t) which does not rely on anisotropy and Eq.S46 will be exact.Secondly, one can continue calculating Ã(α)l m2m1 (n, n ′ ; t, τ ) with respect to a reference structure.For example, if one knows an excited state structure is similar to the ground rovibronic state one can continue to use Ã(α)l m2m1 (n, n ′ ; t, τ ).One must prove this through a priori knowledge or through the retrieved structures at earlier times.In the case that the dynamics do not deviate from some other known structure one may calculate the Ã(α)l m2m1 (n, n ′ ; t, τ ) with respect to this structure.
Supplementary Note 3. FITTING FOR B m l (q, t) AND C lmk (q), AND COMMON MISTAKES Our method relies heavily on two fitting procedures that will likely be the most important steps of the analysis as they define the C lmk (q) coefficients and σ lmk (q).Below, we describe how one performs these fits analytically by minimizing the χ 2 .These analytical methods, however, will struggle to fit the measured time dependence with ADMs if there is not enough anisotropy and/or there is poor SNR.We highly encourage one to explore molecule-specific systematics to C lmk (q) by fitting simulated diffraction patterns.One can employ L1 regularization techniques to improve these fits.Since the derivative of |x| is undefined at x = 0 and we do not know the sign of the B m l (q, t) and C lmk (q), one will need to employ coordinate or gradient descent methods when using L1 regularization.Gradient descent will be much slower for numerous fits and should be used if the analytical approach is insufficient.Coordinate descent is much faster than gradient descent but will likely be considerably slower as well.fits.
Minimize the χ 2 is the weighted least squares regression problem Here, Y is the data vector we wish to fit, the matrix X are the fit bases (features) that span the columns, µ sums over the detector pixels, and ν sums over the fit bases.The bases are scaled by the fit coefficients F and each data point's contribution to the fit is weighted by W, where We will discuss two common ways to solve Eq.S52 for the optimal fit coefficients.The first method uses the pseudoinverse to minimize Eq.S52 and is commonly referred to as the normal equation.
The second method sets Eq.S52 to 0 and uses the QR decomposition to invert X where √ W is the Cholesky decomposition and Eq S59 is the QR decomposition.The QR decomposition has a lower condition number and produces a more accurate F. In this work, we used the normal equation for the measured N 2 O data and found sufficient agreement with literature values.This may be a function of our poor SNR.We, however, encourage the reader to use Eq.S60 and the more accurate QR decomposition.
To retrieve the B m l (q, t) coefficients, we fit the measured data, I(q(θ (d) ), t , with the spherical harmonics, Y m l θ (lf) q , φ (lf) q .Where Eqs.S25 and S26 relate θ (lf) q and φ (lf) q in terms of θ (d) .Although the spherical harmonics are orthonormal, this orthonormality is broken by the finite sampling of our detector.To account for this now nonzero overlap between different bases, we fit the spherical harmonics to the data instead of projecting onto them.This is most noticeable at low q where one often has the best SNR and the fewest bins to resolve θ (d) .We note that this can still be necessary for the isotropic component due to the Jacobian.We use the trapezoidal rule to increase the orthonormality of our binned spherical harmonics structure terms and gain access to |Ψ(R)| 2 , as shown in Eqs.3-14.
We approximate |Ψ(R)| 2 by choosing a probabilistic model that best describes our data, which we denote as P (R| Θ, C).Our model P (R| Θ, C) is parameterized by Θ and dependent on the measured C lmk (q) coefficients, here denoted as C. We now rewrite Eq.S73 with our new model as Some possible forms of P (R| Θ, C) include a multidimensional delta function which is analogous to a single structure, a normal distribution of structures that would describe the vibronic ground state, or harmonic oscillator eigenfunctions to describe a vibrational wavefunction.In this work, we focus on the following P (R| Θ, C) and their corresponding Θ Θ (delta) = NO (1) , NO (2) , ∠ONO (S78) Θ (gauss) = NO (1) , σ NO (1) , NO (2) , σ NO (2) , ∠ONO , σ (∠ONO) .
Given our model P (R| Θ, C), we use Bayesian Inference and Markov Chain Monte Carlo (MCMC) techniques to find the optimal Θ parameters (Θ * ) that best describe the observed C lmk (q).Bayesian Inference encompasses methods that use Bayes' Theorem to update the hypothesis [36,43].The most time, and computationally, intensive step of this analysis is building the posterior P (Θ|C), which we define through Baye's Theorem P (Θ|C) = P (C|Θ) P (Θ) P (C) .(S81) Here, P (C|Θ) is the likelihood function which is the probability of measuring the data C given our selected model with the given Θ parameters.The likelihood probability plays the largest role in building the posterior and is how information from the data enters the analysis.This can be calculated by assuming each C lmk (q) measurement in q is its own experiment that results in a probability distribution.That is, given many measurements (N images ) one builds a distribution of events for C lmk (q) which quickly becomes a normal distribution, due to the Central Limit Theorem, with a mean and standard error of the mean.To calculate P (C|Θ) one must multiply all of these probabilities where σ lmk (q) is the standard error of the mean of C lmk (q).Since σ lmk (q) ∝ 1/ N images , the summation in Eq.S82 scales as N images .By measuring more photons or electrons, one exponentially sharpens the probability distribution P (Θ|C).As mentioned above, this assumes that each C lmk (q) is an independent measurement which is not the case with sufficiently large x-ray/electron beams which have widths larger than the detector pixels.In such a scenario, one must alter Eq.S82 to account for this lack of independence.
The prior probability, P (Θ), describes the likelihood of a given Θ.Since P (Θ) does not depend on data, it encapsulates our prior knowledge of the Θ parameters.Because we do not want to bias our search through Θ-space we define P (Θ) = e K(Θ) (S83) parameters that best describe our measurement.To do this, we must again address the curse of dimensionality since we are still searching within the N Θ -dimensional space, where N Θ is the number of Θ parameters.We re-emphasize again that we are interested in the Θ * that best describes our data which is given by the mode of P (Θ|C), which does not necessarily correspond to the mean of P (Θ|C).If one looks at a single parameter θ, the mean or mode of this uncorrelated distribution may not correspond to the value that would provide the highest P (Θ|C) value in the full Θ-space: illustrated in Fig. 7.One must therefore search the correlated Θ-space.Once the MHA has converged, P (Θ|C) may have significantly constrained Θ-space, but searching for the mode may still be infeasible for a simple grid search.Below we describe three methods to find Θ * using P (Θ|C) to help us overcome the curse of dimensionality.
The first and most simple way to find Θ * is to apply the MHA to the measured C lmk (q) coefficients in the same way as before, but significantly decrease σ lmk (q).One can make P (Θ|C) arbitrarily sharp, effectively zooming onto the mode, by artificially decreasing σ lmk (q).With small enough σ lmk (q) one can zoom into P (Θ|C) until it is adequately described by a quadratic, where the mean and the mode of the distribution will be the same.The danger of using this method is that one may fall into a local maximum by decreasing σ lmk (q) too quickly without being careful.For example, one's initial Θ guess may be close to a local maximum and the small σ lmk (q) will force the MHA into it and not sample outside of it.To avoid this, one must start the MHA in many different initial Θ states and gradually decrease σ lmk (q) to find the mode and rule out any local maximum.
The second method is to interpolate P (Θ|C) between the evaluated MHA points using a high dimensional Kernel Density Estimator (KDE).We note that one can use all the MHA points rather than the points in P (Θ|C) which are filtered by the auto-correlation time τ (AC) .This is because we are looking for the mode and not evaluating some function over the P (Θ|C) distribution.The primary difficulty with KDEs is finding the shape and width of the kernel.Generally, KDE methods do not perform well for problems in larger than three dimensions.More recently, there has been work to generalize KDEs to high dimensions [72].Calculating points in P (Θ|C) with a KDE will be very fast.Such quick evaluations may allow one to find the mode through simple optimization schemes like a basic grid search or gradient descent.
The third method, used in this paper, is a mixture of simple searching methods and calculating Θ * by a weighted average of the most likely MHA points.By considering only the N likely unique points with the highest likelihood probability we focus on the mode while disregarding tails of the P (Θ|C) distribution.Since we are only concerned with the most likely points, we look at all the points the MHA accepted.This differs from P (Θ|C), which takes MHA points separated by the auto-correlation time τ AC .We calculate Θ * by a weighted sum of the N likely Θ parameters Θ * = n∈{N likely } Θ (n) P Θ (n) |C n∈{N likely } P Θ (n) |C = n∈{N likely } Θ (n) P C|Θ (n)   n∈{N likely } P C|Θ (n)   (S87) where {N likely } denotes the set of indices corresponding to the N likely Θ parameters with the largest posterior, and the second equality only holds because we chose P (Θ (n) ) = P (Θ (m) ).Given the most recently calculated Θ * value, we alternate between a grid search where points are separated by 0, ±1, and ±1.5 standard deviations (σ ) and a random search.Here σ (MS) i is the one dimensional standard deviation of the i th Θ parameter taken over the distribution of the N likely Θs.After one iteration of the grid search, we randomly sample Θs from a normal distribution with mean Θ * and standard deviation σ (MS) .The point of this random sampling is to focus on the region of less than one standard deviation.This keeps the grid search from making Θ * roam too far from the globally optimal parameters.The grid and random sampling are then repeated until every parameter changes by < 3% for five consecutive times.At this time we switch to a random sampling method.We randomly sample values from a normal distribution again with mean Θ * and standard deviation σ (MS) i .We consider Θ * has converged when every parameter has changed < 0.01% for three consecutive random samplings, but require at least one value to change between samplings.
There are many ways to search for Θ * that generally trade between speed and accuracy.Our simple search method performed well for all our experimental variations when there were sufficient samples in P (Θ|C), which depends on width of P (Θ|C).For P (Θ|C) distributions sufficiently broader than ours, one may need a more advanced method.To calculate the precision of Θ * one must find the hyper curve in Θ space with minimal precision, as outlined in Ref. [73].thermal distribution, the ground vibrational state dominates with nearly 100% population (Table S1).For the rotational thermal distribution, the distribution is shifted from 0 with a mode at the n = 8, as shown in Fig. S10.

FIG. 1 .
FIG. 1. Correspondence between the lab and molecular frame Our analysis considers each pairwise distance independently and we define the origin of both the lab and molecular frames by one of the pairwise vectors.For the highlighted NO bond, the nitrogen atom (blue) defines the origin.The lab frame (panel a) is defined by the laser polarization (ẑ) and propagation direction (ŷ).The molecular frame (panel b) is defined by the molecule's rovibronic ground state principal moments of inertia, where the molecular A, B, and C axes define ẑ(mf) , ŷ(mf) , and x(mf) .Here the NO is described by ∆Rµν , θ (mf ) µν , and φ (mf ) µν which correspond to its distance, polar angle, and azimuthal angle respectively.One accesses the molecular frame by rotating the lab frame by the Euler angles θ (lf) I , φ (lf) I , and χ (lf) I .

FIG. 2 .
FIG. 2. Axis distribution moments and ensemble anisotropy The Axis distribution moments (ADMs) encapsulate the ensemble anisotropy which provides various constraints on the molecular frame as a function of time.The ADMs are parameterized by the three angular momentum quantum numbers l, m, and k which correspond to the total angular momentum, the projection along the lab frame (ẑ) axis, and projection along the molecular frame (ẑ) axis respectively.Panel a shows the square norms of the ADMs.Panel b and c show these normalized ADMs, highlighting their time dependence.Panels d and e show the time-dependent ensemble anisotropy probability distribution for θ (lf) I and χ (lf) I , respectively.Panels f and g show illustrative line-outs of these Euler angle distributions for θ (lf) I and χ (lf) I , respectively, with isotropy indicated by the dashed lines.

FIG. 3 .
FIG. 3. Analysis to access the molecular frame signalTo access the molecular structure term, in the molecular frame, one must remove the lab frame anisotropy dependence and fit onto the ADMs.For the NO2 simulation (left) and N2O data (right), we illustrate the analysis steps.One first measures the difference diffraction pattern (∆ I(q, t) ), given by Eq. 3 (row a).Removing the detector angular dependence, one retrieves B m l (q, t) of Eq. 4 (row b).Removing the time-dependent ensemble anisotropy (ADMs) yields the molecular frame M lmk (q) coefficients Eq. 5 (row c).All as described in the text.We note that in the N2O data (right) we have limited visibility of data due to experimental limitations illustrated by the hashes.

FIG. 5 .FIG. 6 .
FIG. 5. Retrieving P (N ) (Θ|C), P (N ) (R| Θ, C), and the molecular structure parameters We successfully retrieve the multivariate posterior P (N ) (Θ|C) for NO2 and N2O from which we find Θ * .The axes of panels a and b are the Θ parameters: the mean and standard deviations of the pairwise distances and angles that define P (N ) (R| Θ, C).Panel a shows the 1d and 2d projections of P (N ) (Θ|C) distributions for the simulated NO2 response.The recovered P (N ) (R|Θ * , C) (panel b) is what we compare to the simulated |Ψ(R)| 2 in Fig. 4a.The red dashed lines indicate the retrieved mode (Θ * ), while the black "x" and solid black lines indicate the ground truth, respectively.Panel c shows the 1d and 2d projections P (N ) (Θ|C) distributions for N2O data, though only using the C200(q) contribution.The black "X" and solid black lines indicate previously measured values for N2O [44, 45].For comparison, panel d shows the simulated Pairwise Distribution Function (PDF) from the same q range.

FIG. S1 .
FIG. S1.Best fit ensemble temperature and Raman pump pulse intensity We show the χ 2 fit value between simulated Axis Distribution Moments (ADMs) and the N2O temporal variations.We vary the ADMs by changing the molecular ensemble's temperature and the pump beam intensity.The lowest χ 2 value is marked by the white dot.

FIG. S3 .
FIG. S3.Axis Distribution Moments as a function of ensemble temperature and pump laser fluence The Axis Distribution Moments (ADMs) vary as a function of pump fluence and ensemble temperature.The left column varies the pump fluence with a constant temperature of 25 K.The right column varies the temperature with a constant fluence of 1 J/cm 2 .
FIG. S4.Correspondence between the pairwise, molecular, and lab frames Our analysis considers each pairwise distance independently and we define the origin of both the lab and molecular frames by one of the pairwise vectors.For the highlighted NO bond, the nitrogen atom (blue) defines the origin.The lab frame (panel a) is defined by the laser polarization (ẑ) and propagation direction (ŷ).The molecular frame (panel b) is defined by the molecule's rovibronic ground state principal moments of inertia, where the molecular A, B, and C axes define ẑ(mf) , ŷ(mf) , and x(mf) .The pairwise frame (panel c) is simply defined by the pairwise vector along the (ẑ) axis with a pairwise length of ∆Rµν .To rotate the pairwise frame into the molecular frame one first rotates this vector by the polar molecular frame pairwise angle θ (mf ) µν and then by the azimuthal φ (mf ) µν .To access the lab frame from the molecular frame, one rotates the molecule by the lab frame Euler angles θ (lf) I , φ (lf) I , and χ (lf) I .

FIG. S6 .
FIG. S6.Removing systematic and statistical experimental contributions Both systematic and statistical errors must be carefully addressed in measured data.Here, we show the data after accounting for noise and systematic effects.Panel a (gray) shows the original and smoothed data and its comparison to the simulated data before and after subtracting the offset.Panel b shows the Fourier power spectrum (Pair Distribution Function) of the data (black) and the applied a low-pass filter (blue).

TABLE S1 .
Initial sample thermal distribution of vibrational states We show the thermal distribution for the first 4 vibrational states in the N2O gas sample.
FIG. S10.Initial sample thermal distribution of rotational statesWe show the thermal distribution of rotational states from the measured N2O sample before impulsive Raman excitation.