Multi-photon above threshold ionization of multi-electron atoms and molecules using the R-matrix approach

We formulate a computationally efficient time-independent method based on the multi-electron molecular R-matrix formalism. This method is used to calculate transition matrix elements for the multi-photon ionization of atoms and molecules under the influence of a perturbative field. The method relies on the partitioning of space which allows us to calculate the infinite-range free-free dipole integrals analytically in the outer region, beyond the range of the initial bound wave function. This approach is valid for an arbitrary order, that is, any number of photons absorbed both in the bound and the continuum part of the spectrum (below- and above-threshold ionization). We calculate generalized multi-photon cross sections and angular distributions of different systems (H, He, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hbox {H}_{{2}}$$\end{document}H2, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hbox {CO}_{{2}}$$\end{document}CO2) and validate our approach by comparison with data from the literature.

structure of all bound and continuum wave functions involved while avoiding the computationally expensive time evolution of the wave function.
The one-photon molecular ionization problem has been formulated within the stationary R-matrix theory 47 and implemented within the molecular package UKRmol+ 48 . It requires calculation of the final stationary photoionization state, which is the solution of the full Schrödinger equation with the incoming-wave boundary conditions. Two-photon ( 2 + 0 ) cross sections for photon energies below the single-photon ionization threshold have been calculated for molecular hydrogen 45,49 by a similar approach.
In this work we generalize the molecular R-matrix photoionization methodology to all photon orders, see Fig. 1. The principal problem of the calculation is the evaluation of the free-free matrix elements. In the R-matrix formulation of photoionization this aspect is simplified by splitting of the coordinate space into the inner and outer region. In the inner region, exact exchange and multi-electron correlation effects are included using configuration interaction. In the outer region, where a single electron moves in the multipole potential of the residual molecule, the problem is treated using analytic techniques.
After a brief review of the computational method in the next two sections, we present results for multi-photon ionization of helium, molecular hydrogen and carbon dioxide across a continuous range of photon energies probing REMPI and non-resonant MPI. The full exposition of the theory is given in Appendix A. It is split into two parts. For clarity, in Appendices A.1 and A.2, we first discuss the (N + 1)-photon ionization, whose description requires fewer conceptual extensions of the one-photon ionization problem than the general N + M process. In Appendix A.3 we describe the general (N + M)-photon process.

Overview of the molecular R-matrix method
A detailed description of the R-matrix theory and its molecular implementation has been given elsewhere 45,47,48,50 .
Here we limit ourselves to the definition of the key quantities needed for the development of the theory presented in this work.
The basic idea of this method is to divide the space by a sphere of radius r = a (often called "R-matrix radius") into an inner region, where multi-electron interactions including exchange are important, and an outer region, where exchange and correlation between the continuum electron and the residual target are negligible. The R-matrix, constructed in the inner region, is the Green's function for the one-electron outer-region problem evaluated on the boundary and provides the link between both regions. In the following we denote all spin-space coordinates of the ( N + 1)/N electrons by X N+1 /X N .
In the outer region the (N + 1)-electron wave function can be written as a sum of direct products of the bound wave functions of the N-electron residual molecule and the one-electron wave functions of the continuum electron: where φ Ŵ p p are the channel wave functions 45 of space symmetry Ŵ p defined as the residual N-electron state coupled to the real spherical harmonic X l p ,m p (r N+1 ) and spin σ N+1 of the continuum electron in the outer region; F p (r) is the radial channel wave function of the continuum electron.
In the inner region any solution E of the time-independent Schrödinger equation for energy E can be expressed as a linear combination of the R-matrix eigenstates ψ k www.nature.com/scientificreports/ where the form of the A k (E) coefficients depends on the choice of the asymptotic boundary conditions for the outer-region solution 47 . The (N + 1)-electron eigenstates ψ k are expressed on the Close-Coupling level in terms of "continuum configurations" A � N i η ij and " L 2 configurations" χ N+1 m as where A indicates the antisymmetrization operation, η ij (r N+1 , σ N+1 ) are continuum spin-orbitals dependent on the position vector r N+1 and spin σ N+1 with a non-zero amplitude on the R-matrix sphere, χ N+1 m are configurations not containing continuum orbitals. The summation over i runs over the subset of all residual ion eigenstates included in the model; j runs over those continuum orbitals that are coupled by symmetry to the respective residual ion states and m runs over configurations generated from the molecular orbitals fully contained inside the inner region. The coefficients c ijk and b mk are obtained by diagonalizing the Hamiltonian using equation where L is the Bloch operator 48,50

Multi-photon ionization
In this section atomic units are used exclusively. In the leading-order perturbation theory (LOPT) the total generalized K-photon ionization cross section for a fixed orientation of the molecule 51 is proportional to the squared magnitude of the K-photon transition matrix element 51 and has dimension length 2K × time K−1 . Here D c k is the projection of the dipole operator along the polarization c k of the k-th absorbed photon, i is the bound initial state of the molecule with energy E i , that is usually sufficiently well described by one of the inner-region R-matrix eigenstates, � (−) f k f is the final stationary photoionization state as defined in 47 and Ĝ (+) k is the Green's operator of the full Hamiltonian Ĥ of the system, evaluated at energy E i + kω, For simplicity, we denote the aggregated energy of the first k absorbed photons as kω , implying that all have the same energy, but in principle the energies of the photons can be different and the product kω replaced by a sum of their energies.
The formula (7) for the transition matrix element is equivalent to solution of K − 1 inhomogeneous Schrödinger equations for stationary intermediate states � (+) i+jω , also discussed in 44 , followed by evaluation of the final dipole transition, The boundary conditions for the intermediate states are chosen to correspond to the physical outgoing-wave solution. As long as the combined photon energies are insufficient to ionize the target, the boundary condition is asymptotically zero. In the R-matrix formulation the bound intermediate states (excited states of the molecule) are accurately represented by the R-matrix eigenstates as long as the R-matrix radius is large enough to contain them fully: see 45 where the N + 0 cross sections for H 2 were calculated this way.
Once the combined energy of j photons exceeds the ionization threshold, the situation changes. The righthand side that drives the equation for the first unbound intermediate state � (+) i+jω is proportional to a bound state limited to the inner region, allowing us to use a purely outgoing solution in the outer region, where Z is the residual ion charge, H + l (η, ρ) is the Coulomb-Hankel function, X lm (r) is the real spherical harmonic, p is a state of the residual ion and the index p labels the asymptotic one-electron photoionization channels accessible after absorption of j photons. In writing (11) we have neglected higher multipoles of the electron-molecule interaction. However, the theory can be formulated to take them into account too, by means of an asymptotic expansion 50 . Nevertheless, in photoionization processes of neutral molecules the Coulomb interaction dominates and the higher multipoles can be safely neglected. It is very practical to consider closed channels too when the channel thresholds are approached from below, so that not only the continuum states but also highly excited bound states can "leak" from the inner region. For charged residual ions, the radial function of a closed channel is the exponentially decreasing real Whittaker function 52 , where κ p = −2E p is the magnitude of the imaginary momentum associated with the closed channel p.
The full exposition of the theory is given in Appendix A. Here we briefly note that the R-matrix method makes it possible to write equations for the intermediate states in the inner region that automatically contain the desired boundary condition in the outer region. The inner region solutions can be then unambiguously extended into the outer region by means of the special functions H + p and W p . The transition elements (9) needed to calculate the cross sections are then calculated as a sum of contributions from the inner and outer region, employing the expansions (2) and (1). In the outer region, this requires integration of highly oscillatory integrals from the boundary between the regions all the way to infinity. This integration is managed using asymptotic forms of the Coulomb (or Whittaker) functions and repeated integration by parts; see Appendix C for details. When, for a given photon energy, the R-matrix radius chosen is insufficient for use of the asymptotic theory, numerical integration can be used for some radial interval as well.

Results and discussion
As the first application of the newly developed method we choose the hydrogen atom, where semi-analytic calculations have been performed by other authors long ago, see Fig. 2a. We note that in this case all "asymptotic" forms discussed above and in Appendix A are actually exact solutions valid throughout all space which allows us to test our implementation (a custom one-electron code) and validate the presented R-matrix theory. After separation of the angular degrees of freedom, this problem is one-dimensional. The radial basis in the inner region consisted of 1000 equally spaced radial B-splines and extended up to radius a = 500 a.u. There are no surprises, the results obtained perfectly match the old calculations by Klarsfeld 53 and Karule 54 . In the calculations for H, He and H 2 (see below), we have not used the continuation of the bound intermediate states by means of Eq. (12). Instead, a large enough R-matrix radius was used to contain these states. Resulting datasets for this and other discussed calculations are available in tabular form in the Supplementary Information to this article.
f p (r) = a p W Z/κ p ,l p +1/2 (2κ p r) ,   48 . This required implementing the multi-channel version of our multi-photon approach into the UKRmol+ codes. The molecular R-matrix package was used to construct and diagonalize the inner-region Hamiltonian (4) for all required irreducible representations of the target's point group symmetry (here restricted to D 2h as the largest available Abelian point group). Then, transition dipole elements between the R-matrix eigenstates are calculated for use in d inn , Eq. (28), and in the evaluation of the right-hand sides of the equations for the intermediate states, Eq. (22). Apart from this, the package also provides the transition dipole elements between the ionic states in (30), the boundary amplitudes w kp , the real Gaunt's coefficients for Eq. (32) and the necessary components of the final stationary photoionization state: the photoionization coefficients A (−) pj for Eq. (24) and the K-matrices needed for construction of the S-matrix in Eq. (26).
The size of the inner region was set to a = 100 a.u. As basic building blocks of the wave function of the residual ion we used Hartree-Fock orbitals of He + calculated in Psi4 55 with the Gaussian basis set d-aug-cc-pVDZ. For the centre-of-mass-centred continuum basis we used the partial wave expansion up to ℓ = 4 and a radial basis set consisting of 200 evenly spaced B-splines. A full CI model was used for the L 2 expansion in Eq. (3). To evaluate the outer region integrals we used 5 terms of the expansion (48) and P = 3 in Eq. (50). The results are shown in Fig. 2b and compared with the available calculated results [31][32][33] . We can see a perfect agreement with earlier calculations below the first core excited resonance, i.e. resonant transition He + (1s)-He + (2s) in the residual ion, at around 41 eV. The deviation occurring there and for higher photon energies is a direct consequence of the chosen Gaussian basis set, which is spatially limited and cannot represent diffuse excited states of the ion. Additionally, our calculations don't account for the double ionization channels opening at 39.5 eV. These are expected limitations of the molecular package, which however do not contradict the validity of the presented approach.
The narrow intervals of energies around the one-electron ionization thresholds (in every channel) are the only problematic regions for this method. Below the threshold, one should see a series of Feshbach resonances converging to the threshold. This is difficult to represent accurately because the highly doubly-excited states responsible for these resonances are increasingly spatially extended requiring a very large R-matrix sphere to represent them sufficiently accurately. Additionally, the asymptotic expansion of the Coulomb-Hankel functions, written in terms of the argument ρ = kr , becomes inapplicable close to the threshold (from above or from below) unless an extremely large R-matrix radius is used. Nevertheless, sufficiently far away from the threshold the theory works flawlessly and the results are very satisfying.
In Fig. 3 we compare one-, two-, three-and four-photon cross sections for ionisation of H 2 by a field polarised parallel to the molecular axis, with up to 3 photon absorptions in the continuum (high-energy section of the last curve). The repeated Feshbach resonance patterns below the thresholds of three-, two-and one-photon ionisation corresponding to excitation of a metastable state by several photons are well observable. There resonances correspond to intermediate excitations of the neutral molecule into one of the higher lying neutral bound states, ionized by the remaining photons. As before, in the vicinity of the thresholds the present method does not provide good results and the data were omitted from the plots. For this calculation we used static exchange model with the Hartree-Fock orbitals of the ion H + 2 built from the atomic Gaussian basis set cc-pVDZ, R-matrix radius a = 150 a.u. and a continuum basis formed of 225 uniformly spaced radial B-splines and partial wave expansion up to ℓ = 4 . The positions of the nuclei of both the initial neutral molecule and the residual ion were fixed at the equilibrium internuclear distance of H 2 , which is 1.4 atomic units. One-, two-, three-and four-photon generalized cross section for below-and above-threshold ionisation of the hydrogen molecule by a field polarised parallel with the molecular axis. The panels to the right provide details (from top to bottom) of the two-, three-and four-photon data around the multi-photon ionisation thresholds. The purple, green, blue and yellow vertical chain lines mark the one-, two-, three-and four-photon thresholds, respectively.  Fig. 4 we present calculated cross sections for two-photon ionization of H 2 employing a full CI model with the aug-cc-pVTZ basis set that was used also in 45 (called "ATZ" there). The polarization of the field was chosen parallel to the molecular axis ( Fig. 4a) or perpendicular to it (Fig. 4b). The radius of the inner region was set to a = 100 a.u. in order to converge the results around the core-excited resonances well. To represent the continuum we included 150 evenly spaced B-splines of order 6 for partial waves up to ℓ = 6 . The curve is interrupted at several energies around the one-and two-photon ionization thresholds (the latter are not marked), where the asymptotic expansion of Coulomb functions does not converge properly. The transition responsible for a core-excited resonance occurs within the residual ion rather than in the photo-electron, and so the momenta of the continuum functions in Eq. (29) are then very similar. This leads to problems with the asymptotic integrations, see in particular Eq. (58), where the convergence of higher terms depends also on the assumption that |k − k ′ |r ≫ 1 in the denominator. However, extending the R-matrix radius and possibly reducing the number of terms in the asymptotic expansion in Eq. (50) (we used P = 3 in this calculation) mitigates this deficiency close to the resonances. Below the one-photon ionization threshold, the cross sections are compared to calculations of Apalategui and Saenz 41 and Morales et al. 40 with almost perfect agreement, disregarding small energy (horizontal) shifts arising from different relative energies of the ground state and the residual ion as obtained from the different methods. The only apparent disagreement is visible before the second Feshbach resonance in the first plot (parallel polarization), where the results of Apalategui and Saenz lack of the sharp "shoulder" below 15 eV visible in results of Morales et al. as well as in ours. This region was shown in 45 to be sensitive to the molecular structure model used in the calculation. As in the case of helium, the description and energies of the higher excitations of H + 2 manifesting as core-excited resonances are also most likely limited by the atomic basis set and the resonances might shift to lower energies when a more diffuse basis set is used. As the cross section is dominated by the contributions of the resonances (both core-excited and Feshbach ones), shifts in their positions translate to significant changes in the overall magnitude of the cross section. Figure 5a-c shows the molecular-orientation-averaged data: the total isotropic cross section σ 0 and the asymmetry parameters β 2 and β 4 , respectively, calculated by transforming the matrix elements (7) to the spherical basis and performing averaging over molecular orientations parametrized by Euler angles as in 42,44 ; see Appendix D. Alternatively, the isotropic cross section σ 0 can be obtained directly in the real spherical harmonic basis and molecular frame by averaging only over polarizations ǫ of the photons and disregarding the dependence on the direction of the photoelectron momentum, by use of the formula 56,57 The results below the one-photon ionization thresholds are compared to other published calculations 39,41,42 . We see a good agreement with the calculation of Apalategui and Saenz 41 ; the small systematic difference in magnitude of the cross section before the first resonance is discussed in Appendix D, where we show that it probably comes from a typo in the codes of Apalategui and Saenz evaluating their formula for the orientational average. We also observe a qualitative agreement with the calculations of Ritchie and McGuire 39 and Demekhin et al. 42 (13)   Multi-electron target: CO 2 . Finally, we calculate photoelectron angular distributions for the two-photon ionization of a comparatively larger molecule: CO 2 , see Fig. 6. Here we used a high-quality electronic model of the molecule introduced in 47,58 , which was already demonstrated to yield very accurate one-photon cross section. The target model is based on the use of molecular orbitals of the neutral molecule in its equilibrium geometry (C-O distance 1.1621 Å) optimized using the complete active space self-consistent field method (CASSCF) with cc-pVTZ basis. Here we focus on the low-energy region with photon energies up to 25 eV (i.e. total absorbed energy up to 50 eV) in which the continuum partial wave expansion converged for ℓ = 6 . R-matrix radius of only a = 10 a.u. was sufficient and allowed to use Gaussian functions to represent the continuum orbitals. All molecular integrals were computed in quadruple precision using GBTOlib 48 which allowed to avoid numerical linear dependencies and retain all continuum orbitals in the basis. Convergent single-photon cross sections were obtained including 300 states of the residual ion in the Close-Coupling expansion (3), see 58 . The radius of a = 10 a.u. is mostly too small for the asymptotic integrals to be applicable. Therefore the necessary onedimensional outer-region integrations (34) have been performed numerically using the trapezoidal Romberg integration up to the radius b = 100 a.u. and analytically in the asymptotic part. Calculation with b = 200 a.u. was done for a subset of energies to check convergence away from the one-photon thresholds. To allow the intermediate bound states to extend beyond the inner region boundary we included closed channels (12) in the www.nature.com/scientificreports/ outer region wave function. As in the earlier one-photon calculation 58 , the energy of the ground state has been manually shifted by −1.1 eV to recover the experimental first ionization threshold 13.78 eV 59 .
In general, the partial isotropic cross sections for ionization into all four presented final states of CO + 2 exhibit a rapid rise with the photon energy, reaching the maximum one or two eV before the above-threshold ionization threshold, from which they decrease. Qualitatively, this rise and decrease in the below-threshold two-photon ionization cross section is somewhat similar to the behaviour of the one-photon photoabsorption cross section measurement 59 below the one-photon ionization threshold, see Fig. 7. The forest of resonances in the energy range 7-10 eV is associated with the two-photon thresholds of the residual ion states A, B, and C. The prominent pair of narrow isolated resonances that follows, located at approximately 11 eV, corresponds to intermediate excitation of the neutral molecule by the first photon to the lowest dipole-allowed states 1 + u and 1 u for parallel and perpendicular orientation of the molecule with respect to the polarization of the ionizing field, respectively. Right after these two resonances, further two-photon ionization channels open and the picture gets more complicated. The cross sections peak around the resonance associated with the excitation to the second 1 + u state of neutral CO 2 at 12.4 eV. Beyond the first one-photon threshold the cross sections decrease and exhibit further resonance structures corresponding to higher autoionizing excited states, as well as resonances associated with further one-photon thresholds. In this calculation, 300 ionic states were included in the model, Eq. (3), to correctly describe the polarization effects in the investigated range of energies. This is an order of magnitude more than the handful of states used for full CI calculations with He and H 2 , with the first excited final state not even Figure 6. Laboratory-frame photoelectron angular distribution parameters for below-and above-threshold two-photon ionization of CO 2 into its first four ionic states. The solid grey vertical line marks the first onephoton ionization threshold at 13.78 eV, while the dashed lines mark further calculated one-photon thresholds for states A, B, C and D, indicating narrow energy windows with inaccurate results. www.nature.com/scientificreports/ appearing before the ATI region. In the case of CO 2 these two-photon thresholds are much more densely spaced in energy and scattered all over the energy range of interest. From the computational perspective, the most demanding part of the molecular multi-photon calculations is typically the evaluation of the molecular integrals needed in UKRmol+ to construct the molecular Hamiltonian. Among the discussed calculations, the molecular integrals were the most resource-demanding in the case of the full-CI model of H 2 with R-matrix radius of 100 a.u., where their calculation took a little over 50 h on a 40-core machine, producing 120 GiB of data. The remaining preparatory steps in UKRmol+ are typically faster; the diagonalization of the Hamiltonian and evaluation of transition dipole elements between its eigenstates for full-CI H 2 took 6 h on the same machine. The evaluation of the two-photon ionization cross sections themselves for all 8000 distinct photon energies plotted in Fig. 5a-c and all ∼1700 final photoionization channels by the method proposed here amounted only to a little over 1 h on a common 10-core workstation.
However, the complexity of the calculation of multi-photon ionization amplitudes discussed in this article is strongly affected by the number of channels, as well as by the above-threshold photon absorption order. For instance, disregarding the preparatory calculations in UKRmol+, the evaluation of the one-/two-/three-/fourphoton ionization cross sections of H 2 in the static exchange model (with 12 outer region channels) for Fig. 3 took 1 s/1 s/16 s/1.5 h on a 10-core workstation, while the calculation of the two-photon ionization cross sections of CO 2 in Fig. 6 (up to ∼ 11,000 channels, 680 distinct photon energies) took around 40 h on a 36-core server machine, mostly because of the need for the numerical integration in the outer region due to the use of a very small R-matrix radius. For CO 2 , the UKRmol+ stage was comparatively fast, consisting of 40 minutes for integral calculation in quadruple precision on 36 cores, less than 1 h for Hamiltonian diagonalization and around 2 h for calculation of the K-matrices needed in the outer region. Generally, it is advisable to extend the R-matrix radius as much as possible, so that the numerical integration in the outer region is avoided (or at least minimized) and a high resolution in energies becomes computationally cheap. This is feasible in UKRmol+ thanks to the availability of B-spline orbitals for construction of the continuum basis.

Conclusion
In this article we have presented a method for the calculation of multi-photon ionization amplitudes and cross sections for arbitrary photon orders. The method is based on the R-matrix approach, which accurately solves the time-independent Schrödinger equation in the close vicinity of the target system (atom, molecule, molecular cluster, ...) and uses an analytic asymptotic expansion of the wave function elsewhere. The last point enables analytic treatment of the multi-dimensional oscillatory integrals that contribute to photon absorptions in above-threshold ionization. At the same time, the flexibility of the inner region allows us to treat complex targets using variational quantum-chemistry methods of the configuration-interaction type to describe the multi-electron dynamics.
Neglecting the channel coupling in the outer region, i.e. the long-range multipole potentials, is possible when the Coulomb interaction of the ejected electron with the residual ion is the dominant interaction. This narrows down the applicability of the method as presented here to situations where the ionized target has a non-zero charge or a negligible dipole moment; however, this is not a serious limitation, because most of the studied targets are neutral and the residual ions are singly charged. In any case the method can be straightforwardly extended to take into account channel coupling in the outer region by means of a multichannel asymptotic expansion of Figure 7. Summed two-photon isotropic partial ionization cross sections of CO 2 including the final states X, A, B, and C. The resonance structure in the below-threshold two-photon ionization is compared to belowthreshold one-photon absorption measurement of Chan et al. 59 The grey vertical line marks the one-photon ionization threshold. The first two resonances corresponding to dipole-allowed excitation of the neutral molecule to the singlet excited states 1 u and 1 u are labeled in the plot.  50 . The dipole integrals implied by this ansatz would have the same form as those worked out here. The three cornerstones of the method are the following: • Embedding of the purely outgoing boundary condition in the Schrödinger equation via Bloch operator (Appendix A). • Reduction of the effective rank of the inhomogeneous system of equations by means of the Sherman-Morrison-Woodbury formula (Appendix B). • General algorithm for evaluation of the multi-dimensional asymptotic Coulomb-Green's integrals (Appendix C).
For accurate results, this method presently requires the inner region to have a large enough size, so that the asymptotic formulas for Coulomb functions and their integrals can be applied. This can pose a problem close to thresholds, where linear momenta in some channels are small, the corresponding channel wave functions oscillate slowly and the asymptotic formulas require large distances to achieve the requested accuracy (or even validity). This can be tackled by further partitioning of the outer region into the "transition" part (say, for radii a to b ) and the "asymptotic" part (from b to +∞ ), where in the former one the integrals are evaluated numerically, while in the latter the analytic approach is used. In the presented results we have only used such splitting in the case of the two-photon ionization of CO 2 and aimed at large inner region radii otherwise. We validated the theory by comparison of the generalized multi-photon cross sections to available published data for atomic hydrogen, helium and molecular hydrogen. We also provide original results for two-photon ionization of CO 2 , below and above the ATI threshold. However, photoionization cross sections are not the only domain of applicability of this theory. The access to accurate transition amplitudes for polyatomic molecules can be used to study a range of interference phenomena (e.g. two-and multi-photon RABITT, electron vortices) using a time-independent approach, even though these have been traditionally the domain of time-dependent calculations [23][24][25] , or of approximative calculations that extend asymptotic wave functions all the way to the origin 21 . Our work opens the way to accurate calculations of multi-photon processes in multi-electron molecules. This is the direction of our further research.

Data availability
All data generated during this study are included in this published article and its Supplementary Information files.

Appendices A Multi-photon ionization in R-matrix approach. A.1 Incorporation of the boundary condition.
In the following, the channel angular momentum numbers l p m p will be collectively denoted by the index p so the Coulomb-Hankel function and the real spherical harmonic become H + p (kr) and X p respectively. We will not discuss the closed channels explicitly. The decreasing Whittaker function W p is simply the Coulomb-Hankel function H + p with positive-imaginary momentum and has the same asymptotic expansion as H + p . In the following text we do not distinguish these two functions and "open channels" can be understood to also include all weakly closed channels, whose penetration into the outer region needs to be considered to compensate for the finite inner region.
To embed the boundary condition (11) into the Schrödinger equation we proceed in the following way. First, we introduce the standard Bloch operator (5)  where the matrix elements of the Bloch operator between the inner region eigenstates ψ m and ψ n can be expressed as by means of the boundary amplitudes w pm and radial derivatives w ′ pn of those eigenstates when projected on the one-electron outer channels, www.nature.com/scientificreports/ Now, we can take advantage of the partially known asymptotic form (10) of the sought-after wave function. The boundary amplitude and the derivative of the radial channel wave function computed from the inner-region solution (15) are Comparing these expressions with (11) and eliminating the unknown expansion coefficient a p then yields where the boundary logarithmic derivative � p (a) stands for Substituting (20) into the Schrödinger equation (16) leads to the final form This equation provides the correct inner region expansion coefficients of (15) compatible with the boundary condition (10). The outer region expansion coefficients a p can be evaluated from (19) and (11) as The solution of Eq. (22) can be optimized using the Sherman-Morrison-Woodbury formula. For details see Appendix B.
A.2 Free-free matrix elements. The final (detector) wave function can be expressed in a similar way as the intermediate states. In the inner region we can write the partial wave expansion of (2), where the partial-wave photoionization coefficients A (−) pk follow from the one-photon ionization theory 47 , σ p = arg Ŵ(ℓ p + 1 − iZ/k f ) is the Coulomb phase shift and k f the final momentum of the photoelectron after ionization into the residual state f . In the outer region it is where the radial function has the form 50 and S qp is the element of the S-matrix coupling the energy-accessible outer channels q and p . In the case of absorption of only one photon in the continuum (e.g. absorption of the second, IR photon in the RABITT scheme) the evaluation of the dipole matrix element (9) can be split into inner region and outer region contributions, The symbol D cqk is the matrix element of the component c of the dipole operator between the states q and k of the residual ion and partial waves X q and X k of the ejected electron, .
. To evaluate the contribution to the dipole matrix element from the outer region we need the integrals where s 1 and s 2 stand for any combination of + and −. These integrals can be evaluated for large values of a by replacing the Coulomb-Hankel functions by their asymptotic series, using the method of repeated integration by parts 61,62 . This method is explained in detail in Appendix C.
A.3 Higher-order above-threshold ionization. When we go to higher orders (or energies) and it becomes possible that one or more photons leading to a given intermediate state are absorbed in the continuum, the righthand side of Eq. (14) is non-zero in the outer region or even polynomially diverges with radius. This behaviour is transferred also to the solution itself, see Fig. 8. The method then needs a generalization. Now, there is a source term both in the inner region and in the outer region. The value of the wave function at the R-matrix region boundary is a combination of these two sources. The inner region source, as before, generates an outgoing Coulomb-Hankel wave in each accessible channel. In addition, given that the outer region channels are not coupled, the contribution from the outer region to channel p can be directly obtained employing the Coulomb-Green's function where F p (k p r) is the regular Coulomb function 52 . The Eq. (35) is only valid for open channels. For energy-forbidden channels p one should use the negative-energy Green's function involving the real Whittaker functions 63 . In this work, though, we disregard contribution of closed channels to the 3-photon (and higher) ionization and limit ourselves to large R-matrix radii instead in such cases. For this reason we do not use the negative-energy Green's function, even though the extension is straightforward. The additional term that reflects the effect of the outer region source in the inner region does not depend on the sought expansion coefficients c n and hence can be transferred to the right-hand side, yielding the final, fundamental equation of this method The radial integration in �ψ m |D j |� (+) i+(j−1)ω � is limited to the inner region since the state ψ m is restricted to it. The additional term on the right-hand side compensates for this truncation. Given the updated boundary formula (37), the formula for the outer region expansion coefficients has to be generalized to Together with (36), we now have the complete prescription for the intermediate state (given by c n and a p ) in the whole configuration space that can be used in the evaluation of the final transition dipole elements, or as a seed for intermediate states of even higher order. Of course, when the right-hand side of (14) is limited to the inner region, then β p = 0 and the equations in A.3 simplify to their restricted forms introduced earlier in A.1.
Evaluation of d out,p (as well as of β p , which has the same form) gets progressively more and more complicated with the increasing order, requiring integrals of dimension increasing with the number of photons absorbed in the continuum, e.g. for 3 and 4 photons absorbed above the one-photon ionization threshold: etc. For simplicity, the radial and momentum-dependent arguments have been suppressed, but it should be clear that the Green's function separates parts of the expression that depend on adjacent r i and r i+1 . These nested integrals arise because the high-order intermediate states entering the formulas (9) and (39) no longer have the simple asymptotic proportional to a p H + p and contain, as the above-threshold absorption order increases, an increasing number of integrals over the Green's function, see Eq. (36). Evaluation of the required multi-dimensional integrals is discussed in Appendix C.
The overall asymptotic complexity of the method is exponential, O(exp N) , in the number of absorptions below the ionisation threshold due to the exponentially rising number of possible combinations of absorbed photon polarizations, and even factorial, O(M!) , in the number of absorptions in the continuum, above the ionisation threshold, as discussed in Appendix C. However, as the practical applications of perturbative theory (36) f p (r) = a p H + p (k p r) + +∞ a g (+) p (r, r ′ )�� p X p |D c j |� (+) i+(j−1)ω �dr ′ .
(37) f p (a) = a p H + p (k p a) + β p F p (k p a) , .  www.nature.com/scientificreports/ are generally limited to low orders anyway, the rapidly rising complexity for high-order above-threshold ionisation is not particularly worrying.

B Efficient solution of "diagonal plus low rank" equation.
In the standard R-matrix theory of collisions the inversion of the inner-region Hamiltonian, Eq. (4), needs to be performed only once. However, since equation (22) needs to be solved for each photon energy independently, it pays off to implement this solution efficiently. As written, the matrix of the set of equations has rank equal to the number of inner region eigenstates, which may be a very large number N . Solution of such equation has the asymptotic complexity O(N 3 ) . Fortunately, it has a very specific form, sometimes called "diagonal plus low rank" matrix, where the low rank refers to the second term of the matrix in Eq. (22) being constructed from boundary amplitude matrices, which have one of the dimensions equal to the number of photoionization channels n , while typically n ≪ N . In this case, one can make use of the Sherman-Morrison-Woodbury formula 64 , which allows expressing the solution of the N-by-N system using the solution of an n-by-n one, where N is the number of inner region eigenstates and n is the number of the outer region channels. In this case the solution is Note that the rank of the inner inversion in (46)

C Evaluation of multi-dimensional Coulomb-Green dipole integrals. Multi-dimensional
integrals of products of coordinate powers, Coulomb-Hankel functions and Coulomb-Green's functions from a large radius a to infinity in each coordinate can be expressed as a sum of "triangular" integrals over a < r 1 < r 2 < · · · < r M < +∞ , in order to fix the assignment of the coordinates in the Green's function (35). Then, the regular Coulomb functions F l are rewritten using Coulomb-Hankel functions H ± l : All Coulomb-Hankel functions are in turn expanded using the asymptotic series 52 w h e r e a = 1 + l + isη , b = −l + isη , η = −Z/k , ρ = kr , φ l (η, ρ) = ρ − η log(2ρ) − πl/2 + σ , σ = arg Ŵ(l + 1 + iη) and s is either +1 or −1 . As a result, the final integrand in (39) will become a product of oscillatory exponentials and (possibly negative) coordinate powers. In general, such integral does not have a well defined value. We circumvent this by introduction of an auxiliary damping factor exp(−cr i ) with positive constant c into the integrand for each coordinate r i as suggested in 65 . Following the integration we perform the limit c → 0+ ; the value of the integral stays finite and well-defined. All formulas in this section are to be understood in this limit. The prototypical one-dimensional integral with θ(r) = s 1 θ 1 (r) + s 2 θ 2 (r) , where θ i (r) = k i r + Z log(2k i r)/k i − πl i /2 + σ i , can be evaluated by repeated integration by parts (limited to P iterations) as done in 61,62 , leading to where γ (r) = i/θ ′ (r) and the terms T t (r) satisfy the recursive definition The primes indicate derivatives with respect to r . For evaluation of the terms T t , the derivatives of f (r) at r = a need to be known, which is easy for the one-dimensional case (49), where f (r) is simply a (small, or even negative) power of r . The sum in (50) is asymptotic series similar to (48); higher terms are proportional to high derivatives of negative powers of the coordinate and to high derivatives of the phase θ(r) at large distances, which www.nature.com/scientificreports/ rapidly decrease in magnitude with each new term, provided that the evaluation radius r = a is sufficiently large. For modest radii, though, considering fewer terms leads to better results. Nested "triangular" integrals of the type arising in absorption of at least two photons in the continuum can be calculated similarly by recursion, based on the observation that and so for a two-dimensional integral That is, the integrand at any level has the same form "function times exponential" and the same approach can be used to evaluate the integral as for lower orders. For the second-order integral, the only additional information needed to evaluate it is the knowledge of the value and derivatives of f 2 = f 2 g 1 at r = a , which can be precalculated before evaluating the second integral. The need of P derivatives of f 2 translates, via the chain rule, to the need to know P derivatives of g 1 , which in turn just raises the number of derivatives of f 1 that need to be precalculated beforehand, because derivatives of the terms T t are then needed, too, The algorithm for evaluation of the nested "triangular" exponential integral of arbitrary dimension N then begins by calculation of derivatives of f i (r) up to some high order dependent on M and P , followed by recurrent evaluation of terms T t and the integrals I (and their needed derivatives) for subsequently higher dimensions.
Note that, as would be expected from the multi-dimensional nature of the problem, while the asymptotic computational complexity for calculation of a single "triangular" integral using this approach is manageable O(M 3 P 2 ) , the computational complexity of the whole algorithm is asymptotically proportional to M! . This scaling is a result of the r 1 < r 2 < ... < r M unique orderings (permutations) are required. The corresponding number of independent triangular integrals to be calculated is proportional to 2 M for all splittings of the regular Coulomb function F l into the Coulomb-Hankel functions H ± l , and also up to n M ( n being the typical number of channels in an irreducible representation) for evaluation of a multi-dimensional integral for each possible (dipole coupled) sequence of intermediate state outer region channels. Recall that M refers to the number of photons absorbed in the continuum. Consequently, the problem is computationally tractable only for modest above-threshold ionisation orders.
When the R-matrix radius r = a is too small for the asymptotic procedures to be valid, it is possible to employ numerical integration between r = a and some sufficiently large r = b that is suitable for application of the asymptotic methods. The numerical integration typically needs to be employed for slowly oscillating functions only. Outside of the (a, b) M hypercube, e.g. in (a, b) M−1 × (b, +∞) , it can be combined with the analytic formulas.
Even though they are not necessary-as we have just outlined a general algorithm-for convenience and illustration we include below the explicit forms of the one-and two-dimensional integrals. The one-dimensional case is where u = s 1 k 1 + s 2 k 2 and v = Z(s 1 /k 1 + s 2 /k 2 ) . The coefficient a mpq is defined recursively as and a mpq = 0 for invalid indices ( p < 0 , q < 0 or q > p ). For two dimensions, where addit i o n a l l y φ(r) = s 3 θ 3 (r) + s 4 θ 4 (r) , γ (r) = i/(φ ′ (r) + θ ′ (r)) , ũ = s 1 k 1 + s 2 k 2 + s 3 k 3 + s 4 k 4 , ṽ = Z(s 1 /k 1 + s 2 /k 2 + s 3 /k 3 + s 4 /k 4 ) and m = m 1 + m 2 , we have    with the initial condition bm pq000 = 1 , and bm pqstt ′ = 0 whenever t < 0 , t > s , t ′ < 0 or t ′ > s. D Orientation averaging. The laboratory-frame differential cross section of ionization of a sample of randomly oriented molecules by absorption of n photons with given laboratory-frame polarizations p 1 , . . . , p n ( p i is equal to 0 for linear polarization or ±1 for a circular polarization) is and has the form of a Legendre expansion where σ 0 = b 0 is the isotropic generalized integral cross section, β L = b L /σ 0 are the dimensionless asymmetry parameters, and the quantities b L are Here, M fi,l f m f ,m n ...m 1 are terms of the partial wave expansion of (7); that is, l f and m f are the molecular-frame angular quantum numbers of the partial wave, while m 1 , . . . , m n label the components of the dipole operator in spherical basis in the molecular frame. The expression for b L was obtained by: expressing the molecular-frame expansion of M fi in terms of the laboratory-frame quantum numbers m ′ f and p 1 , . . . , p n , expressing the product Y l f m ′ f Y * f µ ′ f that arises in Eq. (64) as a series in Y LM ; recursively combining the rotational matrices associated with a partial wave with those associated with the dipole operator; and eventually using the orthogonality of the rotational matrices.
In the main text we mention a discrepancy in Fig. 5a between the orientation-averaged isotropic cross section for two-photon ionization of H 2 calculated in this work and the results of Apalategui and Saenz 41 . The authors of the latter article break down the total cross section into components σ a , σ b , . . . , σ g , some of which they explicitly plot. In Fig. 9 we reproduce those components and demonstrate that if we flip the sign of the interference term σ f with respect to the definition (7f) in the reference, we obtain their results (as far as our calculated transition matrix elements permit); cf. thin broken blue curve to thick solid blue curve in Fig. 9. With the exact definition of σ f in the article, however, we obtain our results; the two red curves in Fig. 9 lie on top of each other. Based on this observation, together with the fact that we obtained our results also in Cartesian basis via independent formula (13), which does not treat σ f separately, we conclude that the authors of the earlier calculation used an incorrect sign of σ f when evaluating the final orientation averaged cross section and that the agreement between (62) B m 1 m 2 pqs (r) =γ (r) P s=0 i s r m 2 s t,t ′ =0 bm pqstt ′ (ur) s−t v t (ũr) s−t ′ṽ t ′ (ũr +ṽ) 2s (ur + v) s , bm pqstt ′ = (m − p − q − s − t − t ′ + 1)bm pq(s−1)tt ′ + (m + p − q − t − t ′ + 2)bm pq(s−1)(t−1)t ′ + (m − p − q + s − t − t ′ + 1)bm pq(s−1)t(t ′ −1)