## Introduction

One of the major achievements in the modeling of ultrafast nonlinear processes is a framework for numerical simulation of second-order nonlinear processes based on only two assumptions: unidirectionality of propagation and the paraxial approximation1. An even more fundamental approach, which drops the paraxial approximation and treats the nonlinear polarization (and possibly free currents) in a general way, can be derived from the vectorial Unidirectional Pulse Propagation Equation (UPPE)2,3. These models enable numerical simulation of nonlinear processes in the presence of frequency-dependent diffraction, dispersion and spatial walk-off. They do not, however, immediately enable noncollinear beam propagation. Still, as they were the most accurate models available, they have been used for noncollinear simulations of Optical Parametric Chirped Pulse Amplifiers (OPCPA)4,5,6,7. To preserve the accuracy of the calculations, however, only a small range of noncollinearity angles has been considered (below 1° 7 and several degrees for tilted-front pulses6).

Accurate as they are these models are also resource consuming. In the case of OPCPA the challenge comes from the fact that the interacting pulses are not only extremely broadband but can also be stretched up to nanosecond durations8,9,10, therefore, their representation requires huge numbers of points in the simulation grid. Consequently, the full 3D models are used only when the signal pulses are moderately chirped6,7,11. In other cases the numerical reality forces simplifications and, therefore, additional assumptions. For example, three dimensional simulations with neglected higher order dispersion and diffraction are often in use12,13,14. Another common practice applied to lower the memory requirements is to reduce the dimensionality. Thus two dimensional simulations with one of the Cartesian dimensions dropped15,16,17, sometimes with no diffraction and no walk-off included18,19, are used. Noncollinear OPA geometry is known to provide very large bandwidths20,21,22, the simpler–collinear propagation model is, however, often used to model its operation. A good example of this approach are the two dimensional models with cylindrical coordinates23. Beams in such simulations are intrinsically collinear. To account for noncollinearity ad hoc spatial overlap parameters are sometimes included24,25. Dispersionless models with spatial effects26 and finally, one dimensional simulations with dispersion treated up to the second order27, or even with no dispersion28,29,30 can also, without question, be of value in some applications.

The extra assumptions pointed out above are well justified when general qualitative results are of interest. However, a more accurate modeling is always tempting as it may reveal some subtle effects obscured by too crude approximations. Recently, some of us have shown that the use of a correct ab initio approach to nonlinear pulse interaction simulations applied to a design of a nonlinear optical device can result in the threefold efficiency increase with respect to the efficiency of previous solutions and an unprecedented reduction in the size of the device31. Therefore, a search for more accurate simulation methods and mathematical tools which could reduce the computational requirements for these methods is justified.

The need for a robust noncollinear simulation is even more pronounced in the case of broadband frequency mixing at large angles. Fluorescence up- and down-conversion processes with 10°–30° angles32,33,34 can serve as a good example. Apparently, to this day no propagation simulation approach to these problems has been attempted.

Yet another area of interest for noncollinear propagation is a variety of four-wave mixing (FWM) processes. Here degenerate and two-color resonant FWM35,36,37, Raman based38,39,40 and (with a proper model for the medium nonlinear polarization response) 2D-IR41 experiments can be considered. While these experiments are often performed in a noncollinear configuration the literature on appropriate propagation models is minimal or nonexistent.

Although the direct solution of Maxwell equations with finite difference time domain (FDTD) methods can, in principle, be considered as an alternative to unidirectional methods, it suffers from significant flaws42. First, it requires resolving the whole simulation domain of interest (not only the vicinity of the pulse) with a resolution high enough for the features of the pulse itself to be resolved. Additionally, the time step is limited by the spatial grid resolution (Courant condition). Therefore, a supercomputer cluster instead of a laptop, and several days instead of minutes, are required for a single simulation performed with FDTD. Also, FDTD presents complications when nonlinear effects (especially Raman scattering) have to be implemented43. Finally, the accuracy of FDTD is much lower than that of the spectral (unidirectional) methods44.

In the present paper we describe a novel method for numerical simulations of noncollinear pulse propagation and nonlinear interaction using just the unidirectionality approximation. In this method the angle between the interacting beams is limited only by the unidirectionality principle and, therefore can exceed 140° (depending on the beam size). The method relies on expressing the separate UPPE describing propagation of each of the beams in a single common spatial coordinates system. In this new system the interaction of pulses through nonlinear terms can easily be calculated while the rotated linear terms assure accurate propagation. Our approach is valid as long as the volumes of the spectral space used for representation of the particular interacting pulses with the same polarization do not overlap.

Our method enables simulation of noncollinearly propagating pulses. Some preparation of the input pulses (initial conditions) is, however, required. We, therefore, present a Fourier transform based 3D rotation procedure inspired by the image processing techniques45. This technique can be easily applied to rotations of a complex number based electric field. It is more than three orders of magnitude faster than the linear 3D interpolation even for relatively small grids. For large grids the Fourier rotation becomes an enabling tool as the rotation time can be reduced from days (as in case of interpolation) to minutes. It is also expected to be more accurate than the common interpolation techniques45.

To demonstrate the capabilities of our method, we present a number of linear and nonlinear propagation examples:

• linear interference of ultrashort pulses.

• optical switching through cross-focusing of beams crossing each other at a 90° angle.

• fluorescence up-conversion in a BBO crystal, where we compare type II and type I configurations (with mutual angles between the interacting beams of 19° and 26°, respectively).

• a degenerate four-wave mixing experiment in the boxcar configuration for which we observe spectral narrowing as the angle between the beams is increased.

## UPPE Framework

The most accurate unidirectional model of propagation in nonlinear media, derived from Maxwell equations with a single assumption of unidirectionality is the one based on the vectorial UPPE2,3:

$${\partial }_{z}{E}_{s}^{p}=i{k}_{z}^{p}{E}_{s}^{p}+{e}_{s}^{p}{{\bf{e}}}^{p}\frac{\tilde{\omega }}{2{\varepsilon }_{0}{c}^{2}{k}_{z}^{p}}(i\tilde{\omega }{{\bf{P}}}^{NL}-{\bf{j}}),\,s=x,y\,,$$
(1)

with $${{\bf{E}}}^{p}(\tilde{\omega },{k}_{x},{k}_{y},z)=({E}_{x}^{p},{E}_{y}^{p},{E}_{z}^{p})$$ the complex electric field of polarization mode p represented in Fourier space, $$\tilde{\omega }$$ the optical frequency, and kx, ky the spatial frequencies or wavevector $$({{\bf{k}}}^{p}(\tilde{\omega },{k}_{x},{k}_{y},{k}_{z}^{p}))$$ components. The UPPE describes propagation of the electric field along z or, more specifically, only the part of the electric field propagating towards positive values of z, as this is a unidirectional equation. Here ep = Ep/|Ep| is a unit polarization vector and |·| denotes the length of a vector. The wavevector component along the propagation axis, $${k}_{z}^{p}$$, is related to the frequency $$\tilde{\omega }$$, to kx and ky, and to the refractive index $$n(\tilde{\omega },{k}_{x},{k}_{y})$$ via the dispersion relation:

$${k}_{z}^{p}=\sqrt{{(\frac{\mathop{\omega }\limits^{ \sim }{n}_{p}(\mathop{\omega }\limits^{ \sim },{k}_{x},{k}_{y},{k}_{z}^{p})}{c})}^{2}-{\mathop{k}\limits^{ \sim }}_{x}^{2}-{\mathop{k}\limits^{ \sim }}_{y}^{2}},$$
(2)

For birefringent materials, Eq. (1) can be written down separately for each of the polarization modes, i.e. the ordinary (p = o) and extraordinary (e) modes in uniaxial materials or the slow (s) and fast (f) modes in biaxial materials. In a homogeneous medium a representation of two modes is also required, and one can select p = 1, 2 in this case. The dependence of $${k}_{z}^{p}$$ on the optical frequency and on the wavevector components (kx, ky) is responsible for dispersion and diffraction, respectively. The additional dependence of the refractive index on kx, ky present in biaxial materials is responsible for spatial walk-off (double refraction). All the above-mentioned linear effects are treated exactly in the UPPE. $${{\bf{P}}}^{NL}={{\bf{P}}}^{NL}({\bf{E}},\tilde{\omega },{k}_{x},{k}_{y},z)$$ and $${\bf{j}}={\bf{j}}({\bf{E}},\tilde{\omega },{k}_{x},{k}_{y},z)$$ are the nonlinear part of the polarization vector and the free current vector46,47. The UPPE is solved for the electric field components perpendicular to the z axis. The missing Ez component of the electric field vector, required for the calculation of PNL and j, can, however, be obtained from Ex and Ey2,48.
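The dispersion relation of Eq. (2) can be illustrated for the simplest case of an isotropic medium, where the refractive index depends on frequency only. The sketch below is not taken from the paper: the function names, the choice of fused silica, and the units (μm, fs) are assumptions made for illustration, and the Sellmeier coefficients are the commonly used Malitson values.

```python
import numpy as np

C = 0.299792458  # speed of light in um/fs


def n_fused_silica(wavelength_um):
    """Sellmeier refractive index of fused silica (Malitson coefficients)."""
    l2 = wavelength_um ** 2
    return np.sqrt(1.0
                   + 0.6961663 * l2 / (l2 - 0.0684043 ** 2)
                   + 0.4079426 * l2 / (l2 - 0.1162414 ** 2)
                   + 0.8974794 * l2 / (l2 - 9.896161 ** 2))


def k_z(omega, kx, ky):
    """Eq. (2) for an isotropic medium; omega in rad/fs, kx and ky in rad/um.

    The result is complex: a purely imaginary value marks an evanescent
    component for which no propagating wave solution exists.
    """
    n = n_fused_silica(2.0 * np.pi * C / omega)
    return np.sqrt((omega * n / C) ** 2 - kx ** 2 - ky ** 2 + 0j)


omega_800 = 2.0 * np.pi * C / 0.8          # angular frequency at 800 nm
kz_on_axis = k_z(omega_800, 0.0, 0.0)      # propagating, real
kz_evanescent = k_z(omega_800, 20.0, 0.0)  # kx above n*omega/c, imaginary
```

Evaluating `k_z` on a full (ω, kx, ky) grid gives the frequency dependence (dispersion) and the transverse-wavevector dependence (diffraction) of the linear phase in one array.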

From the numerical point of view it is convenient to use electric field envelope instead of Ep:

$${\tilde{{\bf{E}}}}^{p}(t)={\tilde{{\bf{A}}}}^{p}(t){{\bf{e}}}^{-i{\omega }_{R}t}+c.c,$$
(3)

where ωR is a certain reference frequency (usually the central frequency of the pulse). This substitution is equivalent to variable change (shift operation) in the Fourier space, the Fourier transform of Eq. (3) reads:

$${{\bf{E}}}^{p}(\tilde{\omega })={{\bf{A}}}^{p}(\tilde{\omega }-{\omega }_{R})={{\bf{A}}}^{p}(\omega ),$$
(4)

where $$\omega =\tilde{\omega }-{\omega }_{R}$$ is the detuning from the pulse central frequency.

## Idea of the Method

The central idea of our work is to represent each of the interacting pulses in a separate discrete cuboid grid and write a rotated UPPE for each of the pulses separately (see Fig. 1(a).). The grids’ nodes have to overlap exactly in the spatio-temporal domain so that the nonlinear mixing terms can be calculated. Therefore, each of the interacting beams, propagating along axes zl (with l = 1, 2...) in a xl − yl − zl system, will require solution of its own UPPE represented in a new coordinate system x′ − y′ − z′:

$${\partial }_{z^{\prime} }{A^{\prime} }_{s}^{p}=i\frac{1}{{c}_{\theta }}({K^{\prime} }^{p}-\eta {c}_{\theta }\kappa ^{\prime} -{s}_{\theta }{c}_{\varphi }{k^{\prime} }_{x}-{s}_{\theta }{s}_{\varphi }{k^{\prime} }_{y}){A^{\prime} }_{s}^{p}+\frac{1}{{c}_{\theta }}{Q}^{^{\prime} p}{P}^{^{\prime} NL,p}.$$
(5)

where the first term describes the linear and the second term nonlinear effects, respectively. The definitions of variables used in Eq. (5), together with the equation derivation are given in Methods section.

The flow chart of the simulation is presented in Fig. 2. for an example of three wave mixing. First, preparation of the initial conditions: the electric field envelopes (Ai, i = 1, 2, 3) in the t′ − x′ − y′ space is required. Often it is interesting to simulate propagation of complicated shape of electric field. It can involve Hermite and Laguerre spatial modes as well as secant, sinc, super-Gaussian or just arbitrary temporal shape known from SPIDER or FROG measurement of an experimental pulse. While, discretization of such shapes in arbitrarily rotated coordinates (t′ − x′ − y′) is difficult it is easy in non rotated grids. It is, therefore, of essence to find a procedure for arbitrary rotation of electric field within discretized grids. Moreover, this procedure is required at the end of simulation for back rotation of the fields and retrieval of output temporal and spatial profiles. In the Methods’ “Arbitrary Fourier rotation” section we present such a procedure based on Fourier transform and shear operations. This method can be faster than interpolation by as much as 3 orders of magnitude (see Fig. 3). An example of rotated pulse propagation is presented in Fig. 4.
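The idea of rotating a sampled complex field with Fourier-domain shears can be illustrated in two dimensions. The sketch below uses the standard three-shear factorization of a 2D rotation known from image processing, with each shear applied through the Fourier shift theorem; it is an assumed, simplified stand-in and not the 3D procedure of the Methods section. The function names, grid size and test field are hypothetical.

```python
import numpy as np


def shear(field, slope, coords):
    """Return h[y, x] = field(x - slope * y, y).

    Each row (fixed y) is shifted along x by slope * y using the
    Fourier shift theorem, so no interpolation kernel is needed.
    """
    k = 2.0 * np.pi * np.fft.fftfreq(field.shape[1], d=coords[1] - coords[0])
    phase = np.exp(-1j * k[None, :] * slope * coords[:, None])
    return np.fft.ifft(np.fft.fft(field, axis=1) * phase, axis=1)


def fourier_rotate(field, angle, coords):
    """Rotate field[y, x] counterclockwise by `angle` (radians) on a square grid."""
    t, s = np.tan(angle / 2.0), np.sin(angle)
    out = shear(field, -t, coords)        # shear along x
    out = shear(out.T, s, coords).T       # shear along y (axes swapped)
    return shear(out, -t, coords)         # shear along x again


n = 128
coords = np.arange(n, dtype=float) - n // 2
X, Y = np.meshgrid(coords, coords)        # X varies along axis 1, Y along axis 0
# Complex test field: Gaussian spot centred at (20, 0) with a linear phase
field = np.exp(-((X - 20.0) ** 2 + Y ** 2) / (2.0 * 4.0 ** 2)) * np.exp(0.3j * X)

rotated = fourier_rotate(field, np.deg2rad(30.0), coords)
intensity = np.abs(rotated) ** 2
cx = np.sum(X * intensity) / np.sum(intensity)   # centroid after rotation
cy = np.sum(Y * intensity) / np.sum(intensity)
energy_in = np.sum(np.abs(field) ** 2)
energy_out = np.sum(intensity)
```

Because every shear is a multiplication by unit-modulus phases in Fourier space, the discrete energy of the field is preserved, and the cost is a handful of FFTs rather than a per-point interpolation. The field must stay well inside the periodic grid during the intermediate shears.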

In the process of simulation the linear phase term (kz from Eq. (1) or Kp from Eq. (5)) has to be represented in the x′ − y′ − z′ system for each of the beams. To this end a third coordinate system, the xC − yC − zC system of the crystal optical axes, has to be considered. For an example see Fig. 5, where κ = ω n(ωR, 0, 0)/c = ω nR/c is the normalized detuning from the reference frequency. Here $${k}_{z}^{e}$$ was calculated through Eq. (2) for a highly birefringent YVO4 crystal for an angle between the z and zC axes of 48° (maximum walk-off). Dispersion shows up as a slight deviation from linear growth of the phase along the vertical axis in Fig. 5(a). Diffraction manifests itself through the circle-like behaviour of kz along the horizontal axis. Finally, the presence of walk-off reveals itself in a slight asymmetry of the colour pattern around the vertical axis. There are no propagating wave solutions for $${k}_{x}^{2}+{k}_{y}^{2} > {(n\tilde{\omega }/c)}^{2}$$ (black color). In birefringent media the calculation of $${k}_{z}^{p}$$ requires an iterative procedure involving multiple transitions between the three coordinate systems. This procedure is described in the “Calculation of the linear phase term” section. Calculation of the nonlinear polarization also requires adequate vectorial treatment, which is given in the section “Calculation of nonlinear coefficients”.

In each step of the simulation the nonlinear polarization is calculated in the x′ − y′ − t′ space. This requires an inverse 3D Fourier transformation and a back transformation of the mixing result to the $${k^{\prime} }_{x}$$ − $${k^{\prime} }_{y}$$ − κ′ space. Finally, Eq. (5) is solved by an ordinary differential equation solver, e.g. an integrating factor or exponential Runge-Kutta method.
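The structure of one z-marching step (nonlinear term evaluated in real space, linear phase applied exactly in Fourier space) can be sketched with a fixed-step, fourth-order integrating-factor Runge-Kutta scheme in the interaction picture. This is a simplified illustration and not the adaptive Dormand-Prince IFRK45 solver used in this work; the one-dimensional grid, the toy linear operator `lin` and the Kerr-like nonlinearity are assumptions chosen so that the result can be compared with a known analytic solution.

```python
import numpy as np


def ifrk4_step(a_hat, lin, nonlinear, h):
    """One step of d(a_hat)/dz = i*lin*a_hat + nonlinear(a_hat).

    The linear part is integrated exactly through exp(i*lin*h/2) factors,
    the nonlinear part with classical fourth-order Runge-Kutta weights.
    """
    half = np.exp(0.5j * lin * h)
    a_int = half * a_hat
    k1 = half * nonlinear(a_hat)
    k2 = nonlinear(a_int + 0.5 * h * k1)
    k3 = nonlinear(a_int + 0.5 * h * k2)
    k4 = nonlinear(half * (a_int + h * k3))
    return half * (a_int + h * (k1 + 2.0 * k2 + 2.0 * k3) / 6.0) + h * k4 / 6.0


n_points, gamma = 64, 0.5
k = 2.0 * np.pi * np.fft.fftfreq(n_points, d=0.1)
lin = -0.5 * k ** 2          # toy linear phase, stands in for the bracket of Eq. (5)


def nonlinear(a_hat):
    """Kerr-like term: go to real space, mix, go back to Fourier space."""
    a = np.fft.ifft(a_hat)
    return np.fft.fft(1j * gamma * np.abs(a) ** 2 * a)


a_hat = np.fft.fft(np.ones(n_points, dtype=complex))   # uniform field, amplitude 1
steps, length = 50, 1.0
for _ in range(steps):
    a_hat = ifrk4_step(a_hat, lin, nonlinear, length / steps)
a_out = np.fft.ifft(a_hat)   # analytic result for this test: exp(i*gamma*length)
```

For the uniform test field the linear term vanishes and the field only acquires the nonlinear phase γ·z, which makes the accuracy of the step easy to verify.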

Our approach is valid as long as the volumes of the spectral space (kx − ky − ω) used for the representation of the particular interacting pulses with the same polarization do not overlap (see Fig. 1(b)). Note that this limitation is not as strict as the one imposed on collinear propagation, where the optical spectra of the interacting pulses cannot overlap49. In our case extremely broadband pulses can be used as long as the sum of their divergence angles is smaller than the noncollinearity angle by a certain factor (1.5 for a separation of 3σ for Gaussian beams). We have verified that this condition is easily fulfilled for a standard NOPA setup.

## Method’s Accuracy

First, the results of forward propagation (θ = 0°) for beam sizes above 10 μm, for which the paraxial approximation is valid, have been compared with SNLO50 and Hussar software31. The results were found to be in perfect agreement. Then the results of propagation for various values of θ have been compared to the results of the forward propagation. Both linear and nonlinear (SHG) propagation through a 5 mm BBO crystal have been tested for various beam sizes (3–300 μm) and pulse durations (3–300 fs). The linear propagation tests have also been performed for a constant width of 10 μm and pulse durations down to 1.15 fs (1.4 cycle), and for various widths (w0 ≤ 1 μm) and a constant duration of 10 fs in 1 mm of fused silica. The nonlinear propagation requires a multistep solver; thus, the Integrating Factor Runge-Kutta 4, 5 Dormand-Prince method (IFRK45)51 with adaptive step-size control52 has been used. The linear problem can be solved exactly in a single propagation step, as in this case $${E}_{s}^{p}(z+{\rm{\Delta }}z)={E}_{s}^{p}(z){e}^{i{k}_{z}^{p}{\rm{\Delta }}z}$$. Thus, the Exponential Euler method (EE)53, which does not subdivide the steps into smaller parts, was used. Both the Integrating Factor and the Exponential methods are specially designed for problems with a large linear term.
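The single-step property of the linear problem can be demonstrated with a minimal one-dimensional example. The sketch assumes a monochromatic Gaussian beam in vacuum (n = 1), one transverse dimension and hypothetical grid parameters; it propagates the beam over one Rayleigh length in a single step and in two half steps, and measures the beam radius from the second moment of the intensity.

```python
import numpy as np

wavelength, w0 = 0.8, 10.0                 # um; vacuum propagation assumed
k0 = 2.0 * np.pi / wavelength
n_points, dx = 2048, 0.25                  # grid size and spacing (um)
x = (np.arange(n_points) - n_points // 2) * dx
kx = 2.0 * np.pi * np.fft.fftfreq(n_points, d=dx)
kz = np.sqrt(k0 ** 2 - kx ** 2 + 0j)       # imaginary kz damps evanescent components


def propagate(field, dz):
    """Exact linear step: multiply the spatial spectrum by exp(i*kz*dz)."""
    return np.fft.ifft(np.fft.fft(field) * np.exp(1j * kz * dz))


def beam_radius(field):
    """1/e^2 intensity radius of a centred Gaussian-like beam (second moment)."""
    intensity = np.abs(field) ** 2
    return 2.0 * np.sqrt(np.sum(x ** 2 * intensity) / np.sum(intensity))


field0 = np.exp(-(x / w0) ** 2).astype(complex)
z_rayleigh = np.pi * w0 ** 2 / wavelength
one_step = propagate(field0, z_rayleigh)
two_steps = propagate(propagate(field0, z_rayleigh / 2), z_rayleigh / 2)
radius_ratio = beam_radius(one_step) / w0  # paraxial theory predicts sqrt(2)
```

The one-step and two-step results coincide to rounding error, which is why no step subdivision is needed in the linear case. For a 10 μm waist the radius ratio is close to the paraxial value √2.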

In the case of the rotated UPPE method the ultimate limitation appears when the pulse contains components that would propagate in the negative z′ direction. These components cannot be propagated with a unidirectional model and, thus, the error must grow near the 90° limit. In fact, we have verified that for θ ≤ 60° (corresponding to a 120° mutual angle for a two-beam case) the error in beam waist, pulse duration and energy was below 3·10−3 for the single-step linear cases (for pulses longer than 1.15 fs the errors are below 10−5). For the nonlinear case the error depends on the step size. We have found, however, that for a given distance it can easily be reduced down to 10−4 for reasonable step sizes, i.e. a multiple of the light wavelength (grid size 1024(t) × 1024(x)). A detailed description of the tests of the method is presented in the supplementary material.

## Examples

For each of the examples, two z-marching schemes were used: IFRK45 and an EE equipped with Richardson extrapolation52 for automatic step size selection. The step sizes were selected automatically so that relative errors below 10−6 in each step were assured. The sizes and densities of the grids were increased to the point at which the relative differences in the parameters resulting from consecutive simulations (energies, beam waists and pulse durations) were below 10−3 (usually much less). Additionally, the spectral, spatial and temporal profiles were compared visually. This convergence occurred for grid sizes that enabled simulation on a laptop computer with 16 GB of RAM. The results have, however, been confirmed with simulations using larger grids run on a PC with 64 GB of RAM.

### Pulse interference

The noncollinear but linear propagation has a potential application for solving interference problems. Here an example interference pattern for two pulses crossing each other at a mutual angle of 120° is visualized. The pulses are created by propagation of two 3 fs FWHM Gaussian pulses with 3 μm waist and central wavelength of 800 nm through 20 μm of ZnSe. A “top view” of the initial pulse and the result of propagation are presented in Fig. 6(a,b) respectively. Three stages of interference are presented in Fig. 6(c–e), the Supplementary Movie 1. also presents the evolution of the interference pattern during pulse propagation.

### Optical Switching through XPM

Cross-phase modulation and the resulting cross-focusing have been considered as a candidate mechanism for optical switching in dielectric media54,55,56,57,58. It has been obtained experimentally in gases59, where the band of a supercontinuum of a probe beam was controlled through interaction with the pump. In that work, apart from phase modulation, the Raman effect played an important role. Beam direction switching through attraction of beams traveling in the same direction has been studied for coaxial54 and displaced beams55. It has been studied extensively in fibers56, also for supercontinuum generation purposes57, and in fiber gratings58. Cross-focusing of coaxial beams has also been extensively studied in the context of plasma physics60,61,62.

In the present section a cross-focusing effect occurring between two beams with polarizations along the y axis (see Fig. 7.) crossed at 90° is studied. The nonlinear polarization for SPM and XPM can in this case be written as:

$${\tilde{P}}_{NL}^{\mathrm{1/2}} \sim {n}_{2}(|{\tilde{A}}_{\mathrm{1/2}}{|}^{2}+\frac{2}{3}|{\tilde{A}}_{\mathrm{2/1}}{|}^{2}){\tilde{A}}_{\mathrm{1/2}},\,{\rm{with}}\,{n}_{2}={e}_{y}{\chi }_{yyyy}^{\mathrm{(3)}}{e}_{y}{e}_{y}{e}_{y}.$$
(6)

where indices 1 and 2 refer to the interacting pulses.

The simulation of cross-focusing of two Gaussian pulses (10 fs FWHM and 33 μm of waist and wavelength of 800 nm) in 50 μm-thick fused silica plate ($${n}_{2}=3\times {10}^{-20}\frac{{{\rm{m}}}^{{\rm{2}}}}{{\rm{W}}}$$) is performed. The energy of one of the pulses (the pump) is varied while the energy of the second pulse (probe) is set to a constant value of 1 μJ. Figure 8(a) presents the relative change of the beam width at three points located 4, 8 and 12 mm away from the pulse crossing point. Apparently even for such small interaction length (equal to the beam size) a significant change in size by around 10% can be observed 4 mm from the crossing point for the pump energy of 6.4 μJ. The “top view” spatial profile of the probe beam at this location for 0.1 and 6.4 μJ energy of the pump pulse is presented in Fig. 8(b,c) respectively. Figure 9 presents the dependence of the beam waist (spot-size in the beam focus) and the change in the beam divergence for different pump pulse energies.

If the probe pulse were focused into another medium, supercontinuum generation could be switched on optically, as in the presence of the pump pulse the spot size in the beam focus changes by as much as 7%, giving an 11% increase in the intensity. For pump pulse energies above 10 μJ the pump becomes strongly affected by SPM, resulting in a distorted probe pulse.

### Highly noncollinear up-conversion

In this section we present the first ever 3D numerical simulation of the fluorescence up-conversion process under large mutual angles of the fluorescence and gate beams. The search for optimal conditions for efficient and broadband up-conversion in spectroscopy has been the subject of a lively debate lasting at least since the beginning of the century32,33,34,63. Here, broadband type I (ooe) and type II (eoe) up-conversion processes in a BBO crystal are compared. Fluorescence in the range from 500 nm to 900 nm and a gate beam at 1020 nm were selected. For the type I interaction the following geometry is assumed: θG = 31.9°, θF = 58° and φ = 90° (deff = 1.4 pm/V), with indices G and F corresponding to the gate and fluorescence beams, respectively. For type II: θG = 34.3°, θF = 53.4° and φ = 0° (deff = 1.18 pm/V). The mutual beam angles are 26° and 19°, respectively. A gate pulse with a Gaussian temporal profile (100 fs FWHM) and an energy of 11 μJ in a Gaussian beam was selected to simulate the conditions of the experimental setup at IPC PAS63. The beam waist was assumed to be w0 = 0.5 mm. The fluorescence was modelled by a pulse with a super-Gaussian (flat top) spectrum ranging from 510 nm to 865 nm. The fluorescence pulse was chirped with 50 fs2 to a duration of 100 fs (FWHM). Figure 10(a) presents the three interacting pulses (see also Supplementary Movie 2). The up-converted light arises where the gate pulse overlaps with the fluorescence. In a real-life experiment the gating pulse has to be tilted to achieve a high temporal resolution of the setup. In the present paper pulses without tilt are used to highlight the basic properties of the new propagation simulation method.

The spectral power density of the up-converted signal (ISum) normalized to the fluorescence spectral power density (IF) for three thicknesses of the BBO crystal and for the two phase-matching types are presented in Fig. 10(b). The inverse of this quantity is the “correction factor”34 for retrieving fluorescence from the up-converted light. The type II phase-matching delivers both: more efficiency and a broad spectrum up-conversion. This confirms its superiority over type I phase-matching for 1020 nm gate and is consistent with results obtained experimentally for 1340 nm gate34. Note, that for the 150 μm crystal the quantum up-conversion efficiency exceeds 0.1.

### Degenerate four-wave mixing

Degenerate four-wave mixing between pulses at 1550 nm in a box car configuration is considered in the present section. Two narrowband pump pulses (super-Gaussian temporal profile of the sixth order with FWHM of 1 ps) are sent into a highly nonlinear 5 mm ZnSe crystal (n2 ≈ 10−18 m2/W64) together with a broadband (super-Gaussian spectral profile of the third order with FWHM of around 100 nm) signal pulse stretched to around 1.1 ps through introduction of 16 000 fs2 of dispersion.

In the case of collinear propagation the different spectral parts of the signal pulse constantly overlap with the two long pump pulses, as all the pulses have the same group velocities. Through FWM a fourth (idler) pulse is created. The idler pulse is broadband and its phase is similar to that of the signal pulse with, however, an opposite sign (see inset in Fig. 11(b)). This feature of FWM is well known and is sometimes used for phase imprinting, especially for the purposes of “time lens” realization65,66.

When the box car configuration is used the three input beams cross at the center of the crystal. At this point different temporal parts of the pulses overlap at different times. Fronts of the pulses interact, as they reach the crossing point (where the spatial overlap occurs) first. Then the temporal centers and finally tails of the pulses have the chance to interact.

The situation suggests that, just as in the collinear case, all the frequencies will be amplified in a similar way. This is, however, not the case, because the idler pulse is generated mostly in the area where the other three beams overlap. Figure 11(a) presents the pulse intensities at the beginning of the sample; the idler pulse (red) appears in the overlap area of the pump (green and blue) and the signal (orange) pulses. At the beginning the front parts of the pulses have the highest overlap; therefore, the red frequencies of the signal pulse are amplified the most. Later on, the center frequency and, finally, the “bluest” frequency of the signal pulse are amplified as the center parts and the tails of the pulses interact, respectively. The central parts of the pulses interact for the longest time; therefore, the spectrum of the idler pulse becomes narrower with respect to the collinear case (see Fig. 11(c)). Moreover, the generated idler pulse walks off spatially in a constant manner, as its direction is defined through the phase-matching relation (see also the Supplementary Movie 3). As a consequence the idler pulse becomes spatially stretched (see Fig. 11(a)) and tilted, with the pulse front perpendicular to the average propagation direction of all the pulses. Obviously, the overall conversion efficiency of the process will diminish when the angle between the beams is increased, as the interaction length is decreased (see Fig. 11(b) for the energy of the generated idler pulse). A simple non-resonant FWM was considered in our model. There is, however, a premise that the boxcar configuration should be used with caution also in resonant spectroscopic experiments. A proper simulation for particular problems can be performed when the 3D propagation model is combined with an appropriate model for the nonlinear response of the medium (spectroscopic sample).

We estimate that performing an analogous simulation with FDTD would require ~3500 times more RAM (above 1 TB) and ~85 times more computational steps, which can translate into days of simulation instead of 1 minute with the current approach.

## Conclusions

We have presented a novel method for numerical simulation of noncollinear pulse propagation and nonlinear interaction. We have found that the method works properly even for mutual angles between the propagation directions of the interacting pulses as high as 140°; it is limited only by the unidirectional approximation. The technique has been tested on linear and nonlinear propagation examples.

We have also presented a novel method for arbitrary 3D rotation of the complex electric field. In comparison to interpolation, our method offers a speed-up of three orders of magnitude.

Our simulations have shown that optical switching through cross-focusing is possible even for very short interaction lengths, i.e. even in the case of perpendicular pulse routes. We have shown that type II phase-matching delivers both higher efficiency and a broader up-conversion spectrum and is, thus, the preferable method for fluorescence up-conversion in BBO. Finally, we have shown that an increase of the noncollinearity angle in a degenerate four-wave mixing experiment can lead to spectral narrowing of the generated signal pulse.

## Methods

### UPPE in rotated coordinates

It is convenient to use three Euler angles and to represent the actual pulse direction as the result of three consecutive rotations: around the z′ axis by δ (Rz(δ)), around the y′ axis by θ (Ry(θ)) and again around z′ by ϕ (Rz(ϕ)). The “normalized frequency” (κ) and time ($$\zeta =-\,\frac{c\tau }{{n}_{R}}$$) are used to simplify the notation. Apart from the rotation, a moving reference frame is desirable in the propagation model67,68. The latter also requires a change of variables to the “local normalized time” ζ′ = ζ − ηz, where η = c/nRv and v is the velocity of the window. The change of variables is thus defined by first a rotation and then a transition to a moving reference frame:

$$(\begin{array}{c}x^{\prime} \\ y^{\prime} \\ z^{\prime} \\ \zeta ^{\prime} \end{array})=(\begin{array}{cccc}1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & -\eta & 1\end{array})(\begin{array}{cccc}{c}_{\varphi }{c}_{\theta }{c}_{\delta }-{s}_{\varphi }{s}_{\delta } & -{c}_{\varphi }{c}_{\theta }{s}_{\delta }-{c}_{\delta }{s}_{\varphi } & {c}_{\varphi }{s}_{\theta } & 0\\ {s}_{\varphi }{c}_{\theta }{c}_{\delta }+{c}_{\varphi }{s}_{\delta } & -{s}_{\varphi }{c}_{\theta }{s}_{\delta }+{c}_{\delta }{c}_{\varphi } & {s}_{\varphi }{s}_{\theta } & 0\\ -{c}_{\delta }{s}_{\theta } & {s}_{\delta }{s}_{\theta } & {c}_{\theta } & 0\\ 0 & 0 & 0 & 1\end{array})(\begin{array}{c}x\\ y\\ z\\ \zeta \end{array}),$$
(7)

or:

$$(\begin{array}{c}x^{\prime} \\ y^{\prime} \\ z^{\prime} \\ \zeta ^{\prime} \end{array})=(\begin{array}{cccc}{c}_{\varphi }{c}_{\theta }{c}_{\delta }-{s}_{\varphi }{s}_{\delta } & -{c}_{\varphi }{c}_{\theta }{s}_{\delta }-{c}_{\delta }{s}_{\varphi } & {c}_{\varphi }{s}_{\theta } & 0\\ {s}_{\varphi }{c}_{\theta }{c}_{\delta }+{c}_{\varphi }{s}_{\delta } & -{s}_{\varphi }{c}_{\theta }{s}_{\delta }+{c}_{\delta }{c}_{\varphi } & {s}_{\varphi }{s}_{\theta } & 0\\ -{c}_{\delta }{s}_{\theta } & {s}_{\delta }{s}_{\theta } & {c}_{\theta } & 0\\ {c}_{\delta }{s}_{\theta }\eta & -\eta {s}_{\delta }{s}_{\theta } & -\eta {c}_{\theta } & 1\end{array})(\begin{array}{c}x\\ y\\ z\\ \zeta \end{array}).$$
(8)

where the shorthand notation sα = sin α, cα = cos α has been used. With this in mind, the derivative ∂z from the UPPE can be expressed as:

$${\partial }_{z}={\partial }_{z}x^{\prime} {\partial }_{x^{\prime} }+{\partial }_{z}y^{\prime} {\partial }_{y^{\prime} }+{\partial }_{z}z^{\prime} {\partial }_{z^{\prime} }+{\partial }_{z}\zeta ^{\prime} {\partial }_{\zeta ^{\prime} }={c}_{\varphi }{s}_{\theta }{\partial }_{x^{\prime} }+{s}_{\varphi }{s}_{\theta }{\partial }_{y^{\prime} }+{c}_{\theta }{\partial }_{z^{\prime} }-\eta {c}_{\theta }{\partial }_{\zeta ^{\prime} }$$
(9)

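which, after Fourier transform ($${\partial }_{\zeta ^{\prime} }\to -i\kappa ^{\prime}$$, $${\partial }_{x^{\prime} /y^{\prime} }\to i{k}_{x^{\prime} /y^{\prime} }$$) becomes: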

$${\partial }_{z}=i{c}_{\varphi }{s}_{\theta }{k}_{x^{\prime} }+i{s}_{\varphi }{s}_{\theta }{k}_{y^{\prime} }+{c}_{\theta }{\partial }_{z^{\prime} }+i\eta {c}_{\theta }\kappa ^{\prime}$$
(10)
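The change of variables of Eqs. (7) and (8) and the coefficients of Eq. (9) can be checked numerically. In the sketch below the function names and the test angles are arbitrary choices for illustration; the rotation is built as Rz(ϕ)Ry(θ)Rz(δ) from the standard elementary rotation matrices and extended to the four variables (x, y, z, ζ).

```python
import numpy as np


def rot_z(a):
    """Elementary rotation about the z axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_y(a):
    """Elementary rotation about the y axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def change_of_variables(phi, theta, delta, eta):
    """4x4 matrix of Eq. (8): rotation followed by the moving-frame shift."""
    rotation = np.eye(4)
    rotation[:3, :3] = rot_z(phi) @ rot_y(theta) @ rot_z(delta)
    moving_frame = np.eye(4)
    moving_frame[3, 2] = -eta
    return moving_frame @ rotation


phi, theta, delta, eta = 0.3, 0.7, -0.4, 1.1   # arbitrary test values
M = change_of_variables(phi, theta, delta, eta)

# Third column: derivatives of (x', y', z', zeta') with respect to z,
# i.e. the coefficients multiplying the primed derivatives in Eq. (9).
dz_coefficients = M[:, 2]
expected = np.array([np.cos(phi) * np.sin(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(theta),
                     -eta * np.cos(theta)])
```

Note that the angle δ does not enter the coefficients of Eq. (9): a rotation of the beam about its own axis does not change the direction of propagation.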

To simplify the notation in this section UPPE (Eq. (1)) is written in the contracted form:

$${\partial }_{z}{A}_{s}^{p}=i{K}^{p}{A}_{s}^{p}+i{Q}^{p}{P}^{NL,p}$$
(11)

with:

$${K}^{p}(\kappa ,{k}_{x},{k}_{y})={k}_{z}^{p}(\frac{c\kappa }{{n}_{R}}+{\omega }_{R},{k}_{x},{k}_{y}),\,{Q}^{p}(\kappa ,{k}_{x},{k}_{y})=\frac{{(\frac{c\kappa }{{n}_{R}}+{\omega }_{R})}^{2}}{2{\varepsilon }_{0}{c}^{2}{k}_{z}^{p}(\frac{c\kappa }{{n}_{R}}+{\omega }_{R},{k}_{x},{k}_{y})}{e}_{s}^{p},$$
(12)

and PNL,p = epPNL, where the envelope defined by Eq. (3) was used.

With the above described variable change rotated UPPE yields:

$${\partial }_{z^{\prime} }{A^{\prime} }_{s}^{p}=i\frac{1}{{c}_{\theta }}({K^{\prime} }^{p}-\eta {c}_{\theta }\kappa ^{\prime} -{s}_{\theta }{c}_{\varphi }{k^{\prime} }_{x}-{s}_{\theta }{s}_{\varphi }{k^{\prime} }_{y}){A^{\prime} }_{s}^{p}+\frac{1}{{c}_{\theta }}{Q^{\prime} }^{p}{P^{\prime} }^{NL,p}$$
(13)

where $${K^{\prime} }^{p}/{Q^{\prime} }^{p}/{A^{\prime} }_{s}^{p}/{P^{\prime} }^{NL,p}(\kappa ^{\prime} ,{k}_{x^{\prime} },{k}_{y^{\prime} })$$ represent the rotated $${K}^{p}/{Q}^{p}/{A}_{s}^{p}/{P}^{NL,p}$$. The procedure for calculating these variables is described in the following sections.
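A quick plausibility check of the linear term can be made under one assumption that is ours: the wavevector components in the two frames are related by the same rotation as the coordinates in Eq. (8). The symbol $${k}_{z^{\prime} }$$, the z′ component of the plane-wave wavevector, is notation introduced here. The third column of the rotation block gives the beam-frame z unit vector in simulation coordinates, so the beam-frame longitudinal wavenumber is the projection

$${K^{\prime} }^{p}={c}_{\varphi }{s}_{\theta }{k}_{x^{\prime} }+{s}_{\varphi }{s}_{\theta }{k}_{y^{\prime} }+{c}_{\theta }{k}_{z^{\prime} }.$$

Substituting this into the bracket of the linear term of Eq. (13) leaves

$$\frac{1}{{c}_{\theta }}({K^{\prime} }^{p}-\eta {c}_{\theta }\kappa ^{\prime} -{s}_{\theta }{c}_{\varphi }{k}_{x^{\prime} }-{s}_{\theta }{s}_{\varphi }{k}_{y^{\prime} })={k}_{z^{\prime} }-\eta \kappa ^{\prime} ,$$

i.e. each plane wave is advanced along z′ by the z′ component of its wavevector, reduced by the ηκ′ term that originates from the ζ′ row of Eq. (8).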

### Transitions between coordinate systems

In multi-pulse propagation simulations it is convenient to set the direction of one of the beams with respect to the simulation coordinate system and to define the directions of all the pulses with respect to the (often birefringent) crystal. We therefore give expressions for transitions between the three coordinate systems:

$$({{\bf{x}}}_{B},{{\bf{y}}}_{B},{{\bf{z}}}_{B})={R}_{BS}\,({{\bf{x}}}_{S},{{\bf{y}}}_{S},{{\bf{z}}}_{S}),$$
(14)
$$({{\bf{x}}}_{C},{{\bf{y}}}_{C},{{\bf{z}}}_{C})={R}_{CS}\,({{\bf{x}}}_{S},{{\bf{y}}}_{S},{{\bf{z}}}_{S}),$$
(15)
$$({{\bf{x}}}_{B},{{\bf{y}}}_{B},{{\bf{z}}}_{B})={R}_{BC}\,({{\bf{x}}}_{C},{{\bf{y}}}_{C},{{\bf{z}}}_{C}),$$
(16)

where (xB, yB, zB), (xC, yC, zC) and (xS, yS, zS) represent the coordinate systems of the beam, the crystal and the simulation, respectively.

If the orientation of the beam with respect to the simulation coordinate system is known and defined by the angles ϕB, θB and δB, then:

$${R}_{BS}={R}_{{z}_{S}}({\varphi }_{B}){R}_{{y}_{S}}({\theta }_{B}){R}_{{z}_{S}}({\delta }_{B}),$$
(17)

where Ra(α) describes a rotation about axis a by an angle α. If the orientation of the crystal with respect to the simulation coordinate system is known and defined by the angles ϕC, θC and δC, then:

$${R}_{CS}={R}_{{z}_{S}}({\varphi }_{C}){R}_{{y}_{S}}({\theta }_{C}){R}_{{z}_{S}}({\delta }_{C}),$$
(18)

and:

$${R}_{BC}={R}_{CS}^{-1}{R}_{BS}$$
(19)

Note that RBS and RCS represent rotations around the axes of the system (xS, yS, zS) in which the matrices themselves are represented (xS, yS, zS form an identity matrix); thus, they have a simple classical form, similar to that of the upper left part of the second transformation matrix in Eq. (7).

If, on the other hand, the crystal orientation with respect to the simulation frame (ϕC, θC and δC) and the beam orientation with respect to the crystal (ϕ, θ and δ) are known, then:

$${R}_{BS}={R}_{{z}_{C}}(\varphi ){R}_{{y}_{C}}(\theta ){R}_{{z}_{C}}(\delta ){R}_{CS}.$$
(20)

Note that the matrix $${R}_{{z}_{C}}(\varphi ){R}_{{y}_{C}}(\theta ){R}_{{z}_{C}}(\delta )$$ has a much more complicated form. This is because (xC, yC, zC) are not, in general, parallel to the unit vectors defining the simulation coordinate system. These vectors are, however, available as the columns of the RCS matrix, and the form of the matrix defining a rotation about an arbitrary vector can be found in reference69.

Finally, if beam propagation direction (ϕB, θB, δB) and the material orientation with respect to the beam (ϕ, θ, δ) are known:

$${R}_{CS}={({R}_{{z}_{B}}(\varphi ){R}_{{y}_{B}}(\theta ){R}_{{z}_{B}}(\delta ))}^{-1}{R}_{BS},$$
(21)

and the directions of (xB, yB, zB) are the columns of the RBS matrix. The matrix RBS and the angles ϕ, θ, δ are required for the calculation of the linear propagation phase term K′p, as will be shown in the next section. The angles ϕ, θ, δ can be obtained from RBC70.
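The transitions of Eqs (17)–(20) can be illustrated with a short numerical sketch. It is an illustration under stated assumptions, not the implementation used in this work: the function names and sample angles are hypothetical, rotations about arbitrary axes use the Rodrigues formula, and the angle extraction assumes θ strictly between 0 and π.

```python
import numpy as np

def axis_rotation(axis, angle):
    """Rodrigues formula: rotation by `angle` about the vector `axis`."""
    u = np.asarray(axis, dtype=float)
    u = u / np.linalg.norm(u)
    cross = np.array([[0.0, -u[2], u[1]],
                      [u[2], 0.0, -u[0]],
                      [-u[1], u[0], 0.0]])
    return (np.cos(angle) * np.eye(3) + np.sin(angle) * cross
            + (1.0 - np.cos(angle)) * np.outer(u, u))

def euler_zyz(phi, theta, delta):
    """R_z(phi) R_y(theta) R_z(delta) about the frame's own unit vectors, as in Eqs (17)-(18)."""
    ez, ey = [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]
    return axis_rotation(ez, phi) @ axis_rotation(ey, theta) @ axis_rotation(ez, delta)

def angles_from_zyz(r):
    """Recover (phi, theta, delta) from a z-y-z rotation matrix (0 < theta < pi assumed)."""
    theta = np.arccos(np.clip(r[2, 2], -1.0, 1.0))
    phi = np.arctan2(r[1, 2], r[0, 2])
    delta = np.arctan2(r[2, 1], -r[2, 0])
    return phi, theta, delta

# Crystal orientation in the simulation frame, Eq. (18); sample angles only.
r_cs = euler_zyz(0.4, 0.3, 0.1)

# Beam orientation given relative to the crystal, Eq. (20): the rotation axes
# are the crystal axes, taken as the columns of R_CS.
phi, theta, delta = 0.25, 0.5, -0.2
x_c, y_c, z_c = r_cs.T
r_bs = (axis_rotation(z_c, phi) @ axis_rotation(y_c, theta)
        @ axis_rotation(z_c, delta) @ r_cs)

# Eq. (19), followed by recovery of the beam angles relative to the crystal.
r_bc = np.linalg.inv(r_cs) @ r_bs
recovered = angles_from_zyz(r_bc)
```

With these conventions the recovered angles coincide with the (ϕ, θ, δ) used to build RBS, since RBC then reduces to a plain z-y-z Euler matrix.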

### Calculation of the linear phase term

The iterative procedure for calculating $${k}_{z}^{p}$$ for forward pulse propagation has been described before1. In the case of noncollinear propagation the procedure differs slightly.

First, for each set of discrete simulation coordinates (κ′, kx′, ky′) a corresponding set of coordinates in the beam reference frame (κ, kx, ky) is calculated through the rotation Eq. (14). This operation creates three matrices of κ, kx and ky values, each indexed by κ′, kx′ and ky′.

Then, the matrices θ(κ, kx, ky) and ϕ(κ, kx, ky), the angles defining the direction of propagation of each of the plane waves (defined by κ, kx, ky) with respect to the crystal orientation, are initialized to the given values defining the general beam direction (θ0 and ϕ0). Again, θ and ϕ are matrices indexed by κ′, kx′ and ky′; the corresponding values of κ, kx and ky are, however, known from the previous step.

At this point the iterative method starts. Based on the values of θ and ϕ, the values of the refractive index np = np(ω̃, θ, ϕ) are calculated (where $$\tilde{\omega }=\frac{c\kappa }{{n}_{R}}+{\omega }_{R}$$ is the optical frequency). The refractive index can be calculated from the Sellmeier formula and the properties of the refractive index ellipsoid71. Finally, the length of the wavevector ($$|{{\bf{k}}}^{p}|=\frac{\tilde{\omega }{n}_{p}}{c}$$) and the linear phase term ($${k}_{z}^{p}=\sqrt{|{{\bf{k}}}^{p}{|}^{2}-{k}_{x}^{2}-{k}_{y}^{2}}$$) can be calculated.

Now a set of wavevectors: $$({k}_{x},{k}_{y},{k}_{z}^{p})$$ for all the plane waves describing the pulse in the beam reference frame is known. With the inverse of rotation Eq. (16) it is transformed to the crystal coordinate system $$({k}_{x}^{C},{k}_{y}^{C},{k}_{z}^{pC})$$.

The values of θ and ϕ can now be updated through:

$$\theta =\arctan (\frac{\sqrt{{k}_{x}^{C2}+{k}_{y}^{C2}}}{{k}_{z}^{pC}}),\,\varphi =\arctan (\frac{{k}_{y}^{C}}{{k}_{x}^{C}})$$
(22)

and the next iteration can be started.

The iteration can be stopped when the change of the $${k}_{z}^{p}$$ value becomes negligible (around 20 iterations are sufficient to obtain a relative accuracy of 10⁻¹³ for standard birefringent materials).
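The iteration described above can be sketched for a single frequency as follows. This is a toy illustration under loud assumptions, not the Hussar implementation. The medium is a uniaxial crystal with constant, roughly BBO-like indices standing in for a Sellmeier formula. Only the extraordinary wave is treated. The matrix `beam_to_crystal` is assumed to map beam-frame vector components to crystal-frame components. All names are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s

def n_extraordinary(theta, n_o, n_e):
    """Index ellipsoid result for the extraordinary wave in a uniaxial crystal."""
    return 1.0 / np.sqrt(np.cos(theta) ** 2 / n_o ** 2 + np.sin(theta) ** 2 / n_e ** 2)

def iterate_kz(omega, kx, ky, beam_to_crystal, theta0, n_o, n_e,
               tol=1e-13, max_iter=50):
    """Fixed-point iteration for k_z of every plane wave (kx, ky) at frequency omega."""
    theta = np.full_like(kx, theta0, dtype=float)  # initial guess: general beam direction
    phi = np.zeros_like(theta)
    kz_old = np.zeros_like(theta)
    for n_iter in range(1, max_iter + 1):
        k_abs = omega * n_extraordinary(theta, n_o, n_e) / C
        kz = np.sqrt(k_abs ** 2 - kx ** 2 - ky ** 2)
        # Wavevectors in the beam frame, then expressed in the crystal frame.
        k_crystal = np.tensordot(beam_to_crystal, np.stack([kx, ky, kz]), axes=1)
        # Angle update of Eq. (22); arctan2 keeps the correct quadrant.
        theta = np.arctan2(np.hypot(k_crystal[0], k_crystal[1]), k_crystal[2])
        phi = np.arctan2(k_crystal[1], k_crystal[0])
        if np.max(np.abs(kz - kz_old) / kz) < tol:
            break
        kz_old = kz
    return kz, theta, phi, n_iter

# Illustrative setup: beam axis tilted by theta0 from the optic axis, in the x-z plane.
theta0 = np.deg2rad(23.0)
c0, s0 = np.cos(theta0), np.sin(theta0)
beam_to_crystal = np.array([[c0, 0.0, s0], [0.0, 1.0, 0.0], [-s0, 0.0, c0]])
omega = 2.0 * np.pi * C / 532e-9
axis = np.linspace(-5e5, 5e5, 5)  # transverse wavenumbers, rad/m
kx, ky = np.meshgrid(axis, axis, indexing="ij")
n_o, n_e = 1.675, 1.556  # assumed constants, not measured data
kz, theta, phi, n_iter = iterate_kz(omega, kx, ky, beam_to_crystal, theta0, n_o, n_e)
```

In a full simulation the same loop would run over the complete (κ′, kx′, ky′) grid, with the refractive index evaluated from the material's dispersion formula at each frequency.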

### Calculation of nonlinear coefficients

The coefficient matrix Qp(κ′, kx, ky) can already be calculated from Kp and Eq. (12). We will describe the treatment of the nonlinear coefficient PNL,p using second-order nonlinearity as an example.

Note first that, in the absence of nonlinearity, the electric field vector of a particular mode (o, e, s, f) in a birefringent medium is uniquely defined by ω̃, kx and ky. The procedure for finding epC, the electric field vector direction in the crystal coordinate system, is known72,73 and implemented in the Hussar software31. We will assume that this vector does not change due to nonlinearity. We have verified that, for a “worst case crystal” with the linear properties of highly birefringent YVO4 and the Kerr constant of 10⁻¹⁸ m²/W characteristic of highly nonlinear ZnSe64, illuminated with an intensity of 3400 GW/cm² (the damage threshold intensity of the highly resistant BBO crystal for 25 fs pulses at a wavelength of 800 nm73), the actual change of the ee components is around 1%. For more common conditions this error will not exceed the one resulting from the currently obtainable accuracy of refractive index measurements, 10⁻⁴ (see the refractive index measurement references in73). The total electric field can, therefore, be decomposed into:

$${{\bf{E}}}^{C}={{\bf{E}}}^{pC}+{{\bf{E}}}^{qC}=|{{\bf{E}}}^{p}|{{\bf{e}}}^{pC}+|{{\bf{E}}}^{q}|{{\bf{e}}}^{qC}$$
(23)

with p ≠ q. Moreover, $${f}_{x}^{p}(\omega ,{k}_{x},{k}_{y})=|{{\bf{E}}}^{p}|/{E}_{x}^{p}=1/{e}_{x}^{p}$$ and the corresponding $${f}_{y}^{p}$$ are also uniquely defined. Therefore, one can write:

$${{\bf{E}}}^{C}={f}_{r}^{p}{E}_{r}^{p}{{\bf{e}}}^{pC}+{f}_{s}^{q}{E}_{s}^{q}{{\bf{e}}}^{qC}$$
(24)

with r, s = x, y. Then, for a medium with second-order nonlinearity characterized by the nonlinear susceptibility χ(2):

$${P^{\prime} }^{NL,p}={{\bf{e}}}^{pC}{{\bf{P}}}^{NL}={{\bf{e}}}^{pC}{\varepsilon }_{0}{\chi }^{\mathrm{(2)}}{{\bf{E}}}^{C}{{\bf{E}}}^{C}={\varepsilon }_{0}{{\bf{e}}}^{pC}{\chi }^{\mathrm{(2)}}({f}_{r}^{p}{E}_{r}^{p}{{\bf{e}}}^{pC}+{f}_{s}^{q}{E}_{s}^{q}{{\bf{e}}}^{qC})({f}_{r}^{p}{E}_{r}^{p}{{\bf{e}}}^{pC}+{f}_{s}^{q}{E}_{s}^{q}{{\bf{e}}}^{qC})$$
(25)

which, for SHG (q + q → p) and SFG (p + q → p), becomes:

$$=\,{\varepsilon }_{0}{\chi }_{{\rm{deff}}}(\omega ,{k}_{x},{k}_{y}){f}_{s}^{q2}\,{E}_{s}^{q2},\,\,\,\,{\chi }_{{\rm{deff}}}={{\bf{e}}}^{pC}{\chi }^{\mathrm{(2)}}{{\bf{e}}}^{qC}{{\bf{e}}}^{qC}$$
(26)

and

$$=\,2{\varepsilon }_{0}{\chi }_{{\rm{deff}}}(\omega ,{k}_{x},{k}_{y}){f}_{r}^{p}{f}_{s}^{q}\,{E}_{r}^{p}{E}_{s}^{q},\,{\chi }_{{\rm{deff}}}={{\bf{e}}}^{pC}{\chi }^{\mathrm{(2)}}{{\bf{e}}}^{pC}{{\bf{e}}}^{qC},$$
(27)

respectively, where χdeff(ω, kx, ky) represents the effective nonlinear coefficient. We have verified that for a set of 18 nonlinear crystals (including the most popular ones, such as BBO, BiBO and LBO), if the x axis is selected along the polarization vector of the wave propagating exactly along the z axis ($$\hat{x}\parallel {{\bf{E}}}^{p}({\tilde{\omega }}_{532{\rm{nm}}},{k}_{x}=0,\,{k}_{y}=0)$$), the deviation of the coefficients $${f}_{s}^{p}$$ from unity is less than 10⁻² for Gaussian beams with a waist above 1.4 μm at 532 nm (divergence of ~7°) and less than 10⁻³ for beam widths above 4.5 μm (divergence of ~2°). In practice it is therefore safe to assume $${f}_{s}^{p}=1$$, as current methods of measuring χ(2) (or the experimentalist's d tensor) give results with an accuracy of 5–10% at best73.
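The contraction defining the effective coefficient in Eq. (26) can be written as a single tensor product. The sketch below is an illustration only: the function names are hypothetical and the d22, d31 and d33 values are arbitrary illustrative numbers, not measured data. It expands a contracted 3 × 6 d matrix of a 3m-class crystal into a full third-rank tensor and contracts it with an extraordinary output vector and two ordinary input vectors. Walk-off of the extraordinary field vector is neglected.

```python
import numpy as np

# Contracted (Voigt) index -> pairs of Cartesian indices (j, k).
VOIGT_PAIRS = {0: [(0, 0)], 1: [(1, 1)], 2: [(2, 2)],
               3: [(1, 2), (2, 1)], 4: [(0, 2), (2, 0)], 5: [(0, 1), (1, 0)]}

def full_tensor(d_matrix):
    """Expand a 3x6 contracted d matrix into the full 3x3x3 tensor d_ijk."""
    d = np.zeros((3, 3, 3))
    for i in range(3):
        for l, pairs in VOIGT_PAIRS.items():
            for j, k in pairs:
                d[i, j, k] = d_matrix[i, l]
    return d

def effective_coefficient(e_out, tensor, e_in1, e_in2):
    """Contraction e_out . tensor : e_in1 e_in2, the form used in Eqs (26)-(27)."""
    return float(np.einsum("i,ijk,j,k->", e_out, tensor, e_in1, e_in2))

# Illustrative numbers only (arbitrary units), with Kleinman symmetry d15 = d31.
d22, d31, d33 = 2.2, 0.08, 0.04
d_3m = np.array([[0.0, 0.0, 0.0, 0.0, d31, -d22],
                 [-d22, d22, 0.0, d31, 0.0, 0.0],
                 [d31, d31, d33, 0.0, 0.0, 0.0]])

theta, phi = 0.4, 0.3  # propagation direction in the crystal frame, radians
e_ordinary = np.array([np.sin(phi), -np.cos(phi), 0.0])
e_extraordinary = np.array([-np.cos(theta) * np.cos(phi),
                            -np.cos(theta) * np.sin(phi),
                            np.sin(theta)])

d_eff = effective_coefficient(e_extraordinary, full_tensor(d_3m),
                              e_ordinary, e_ordinary)
```

For the polarization vectors chosen here the contraction reduces to the familiar closed form d31 sin θ − d22 cos θ sin 3φ, which gives a convenient check of the index bookkeeping.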

### Arbitrary Fourier rotation

Here, we describe a convenient way of rotating an arbitrarily shaped pulse without the use of interpolation, which is error-prone and time consuming when applied to a 3D case. The inspiration for the method comes from raster image rotation, well known in computer graphics74, and its less known implementation with the Fourier transform45. The method is based on a shear operation, which can be performed through a 1D Fourier transformation of the electric field E(x, y, ζ) to a mixed space, multiplication by a phase factor and back transformation to (x, y, ζ). The phase factor has to depend linearly on the Fourier space variable as well as on one of the remaining real variables. Two shear operations are required for a single rotation. The definitions of the operators and the corresponding Fourier transform operations are listed in Table 1.
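The shear operation can be illustrated in two dimensions. The sketch below is not the two-shears-plus-scaling construction used in this work. As a simpler stand-in it uses the classic three-shear decomposition of a rotation, which needs no scaling, and checks the result against an analytically rotated Gaussian. Function names, grid size and beam parameters are assumptions made for the example.

```python
import numpy as np

def shear_x(field, x, y, a):
    """Return E(x - a*y, y): 1D FFT along x, phase linear in kx and y, inverse FFT."""
    kx = 2.0 * np.pi * np.fft.fftfreq(x.size, d=x[1] - x[0])
    phase = np.exp(-1j * a * np.outer(kx, y))
    return np.fft.ifft(np.fft.fft(field, axis=0) * phase, axis=0)

def shear_y(field, x, y, b):
    """Return E(x, y - b*x): 1D FFT along y, phase linear in ky and x, inverse FFT."""
    ky = 2.0 * np.pi * np.fft.fftfreq(y.size, d=y[1] - y[0])
    phase = np.exp(-1j * b * np.outer(x, ky))
    return np.fft.ifft(np.fft.fft(field, axis=1) * phase, axis=1)

def rotate_three_shear(field, x, y, alpha):
    """Rotate the field pattern by alpha (radians) using three FFT shears."""
    t = np.tan(alpha / 2.0)
    out = shear_x(field, x, y, -t)
    out = shear_y(out, x, y, np.sin(alpha))
    return shear_x(out, x, y, -t)

def gaussian(xg, yg):
    """Off-centre elliptical Gaussian used as a test pattern (arbitrary units)."""
    return np.exp(-((xg - 1.0) ** 2 / 1.0 ** 2 + (yg - 0.5) ** 2 / 0.6 ** 2))

# Periodic grid; axis 0 is x and axis 1 is y.
n, half_width = 128, 8.0
x = (np.arange(n) - n // 2) * (2.0 * half_width / n)
y = x.copy()
xg, yg = np.meshgrid(x, y, indexing="ij")

alpha = np.deg2rad(20.0)
rotated = rotate_three_shear(gaussian(xg, yg), x, y, alpha)

# Analytic reference: E'(r) = E(R^-1 r) for a counter-clockwise rotation by alpha.
reference = gaussian(np.cos(alpha) * xg + np.sin(alpha) * yg,
                     -np.sin(alpha) * xg + np.cos(alpha) * yg)
max_error = np.max(np.abs(rotated - reference))
```

Because each shear is an exact shift of a band-limited signal, the error stays near machine precision as long as the field is negligible at the grid edges. The price is the periodic wrap-around typical of Fourier-based methods.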

A traditional 3D rotation can be constructed from three rotations: Rz(φ)Ry(θ)Rz(δ). The three Euler angles corresponding to consecutive rotations around the z, y and z axes (by δ, θ and φ, respectively) are used to achieve complete freedom of pulse manipulation. Figure 4 presents an example: a temporally Gaussian pulse with a Hermite-Gaussian spatial mode propagated in the linear regime after rotation. In our case, however, another construction will also have to be used, i.e. Rz(φ′)Rx(θ′)Rz(δ′), with the second rotation performed around the x axis. The rotations described above can be constructed in the following way:

$$E^{\prime} (x,y,\zeta )=\mathop{\underbrace{{T}_{{k}_{x},y}({g}_{\phi })\,{T}_{{k}_{y},x}({f}_{\phi })}}\limits_{{R}_{z}(\phi )}\,\,\mathop{\underbrace{{T}_{{k}_{x},z}(d)\,{T}_{{k}_{z},x}(a)}}\limits_{{R}_{y}(\theta )}\,\,\mathop{\underbrace{{T}_{{k}_{x},y}(g)\,{T}_{{k}_{y},x}(f)}}\limits_{{R}_{z}(\delta )}SE(x,y,\zeta )$$
(28)

with the parameter values as follows:

$$\begin{array}{c}{S}_{x}={({c}_{\theta }{c}_{\varphi }{c}_{\delta })}^{-1},\,{S}_{y}={c}_{\varphi }{c}_{\delta },\,{S}_{z}={c}_{\theta },\,a=-\,{c}_{\theta }{s}_{\theta }{c}_{\varphi },\,d={s}_{\theta }{({c}_{\theta }{c}_{\varphi })}^{-1},\\ f={c}_{\theta }{c}_{\varphi }^{2}{c}_{\delta }{s}_{\delta },\,g=-\,{s}_{\delta }{({c}_{\theta }{c}_{\varphi }^{2}{c}_{\delta })}^{-1},\,{f}_{\phi }={c}_{\varphi }{s}_{\varphi },\,{g}_{\phi }=-{s}_{\varphi }{({c}_{\varphi })}^{-1}\end{array}$$
(29)

or

$$E^{\prime} (x,y,\zeta )={T}_{{k}_{x},y}({g}_{\phi })\,{T}_{{k}_{y},x}({f}_{\phi })\,\mathop{\underbrace{{T}_{{k}_{z},y}(j)\,{T}_{{k}_{y},z}(h)}}\limits_{{R}_{x}(\theta )}\,{T}_{{k}_{x},y}(g)\,{T}_{{k}_{y},x}(f)\,S\,E(x,y,\zeta )$$
(30)

with:

$$\begin{array}{c}{S}_{x}={({c}_{\varphi }{c}_{\delta })}^{-1},\,{S}_{y}={c}_{\theta }{c}_{\varphi }{c}_{\delta },\,{S}_{z}={c}_{\theta }^{-1},\,j=-\,{c}_{\theta }{s}_{\theta }{c}_{\varphi },\,h={s}_{\theta }{({c}_{\theta }{c}_{\phi })}^{-1},\\ f={c}_{\theta }{c}_{\phi }^{2}{c}_{\delta }{s}_{\delta },\,g=-{s}_{\delta }{({c}_{\theta }{c}_{\phi }^{2}{c}_{\delta }{s}_{\delta })}^{-1},\,{f}_{\phi }={c}_{\varphi }{s}_{\varphi },\,{g}_{\phi }=-{s}_{\varphi }{({c}_{\varphi })}^{-1}\end{array}$$
(31)

It is worth noting that the order of the shear operations for each rotation can be reversed (e.g. $${T}_{{k}_{x},y}(g)\,{T}_{{k}_{y},x}(f)\to {T}_{{k}_{y},x}(f^{\prime} )\,{T}_{{k}_{x},y}(g^{\prime} )$$). In such a case, however, the parameters of the shears, as well as of the scaling, have to be recalculated. The parameters in Eqs (29) and (31) were calculated with the help of a symbolic Matlab tool. The rotation procedure has been verified, first on a geometrical object (cuboid) with the use of matrix operations (see the second column in Table 1), then on a 3D matrix representing the electric field with the use of the Fourier transform operations.

The use of the Fourier transform for rotations is limited to angles of around 60° 45. Any rotation by an angle α > 45° can, however, be decomposed into a trivial 90° rotation and a rotation by α − 90°. Thus, to perform rotations by larger angles we propose the following procedure:

• if φ ∈ [0°, 45°] ∪ [315°, 360°[ use standard rotation (second rotation around y axis).

• if φ ∈ [45°, 135°] set φ → φ − 90°, δ → δ + 90°, θ → −θ, perform second rotation around x axis.

• if φ ∈ [135°, 225°] set φ → φ − 180°, θ → −θ, use standard rotation.

• if φ ∈ [225°, 315°] set φ → φ − 270°, δ → δ + 270°, perform second rotation around x axis.

Then, if necessary, bring δ back into the [0°, 360°[ range by adding or subtracting 360°, and:

• if δ ∈ [0°, 45°] ∪ [315°, 360°[ use E(−y, x, ζ)

• if δ ∈ ]45°, 135°] replace E(x, y, ζ) with E(−y, x, ζ)

• if δ ∈ ]135°, 225°] replace E(x, y, ζ) with E(−x, −y, ζ)

• if δ ∈ ]225°, 315°] replace E(x, y, ζ) with E(y, −x, ζ)

The scheme presented above can be used for θ ∈ [0°, 60°]. For θ > 60° a scheme involving 90° rotations in the x − ζ and y − ζ planes would be required. Small corner regions of the rotated surface are affected by artifacts arising from the periodic nature of the Fourier transform algorithm45. This, however, is a minor concern when the electric field is concentrated in the center of the x − y − ζ space, which is the usual case for an optical pulse.
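The idea of separating a large in-plane angle into trivial quarter turns and a small residual rotation can be sketched generically. The helper names below are hypothetical, and the sign convention of the quarter turn is that of NumPy's `rot90`, which need not coincide with the convention of the list above. The sketch illustrates the principle only and is not a transcription of the scheme.

```python
import numpy as np

def split_angle(alpha_deg):
    """Split alpha into n quarter turns plus a residual in [-45, 45) degrees.

    alpha is recovered as 90 * n + residual, modulo 360 degrees.
    """
    alpha = alpha_deg % 360.0
    n = int(np.floor((alpha + 45.0) / 90.0)) % 4
    residual = (alpha + 45.0) % 90.0 - 45.0
    return n, residual

def quarter_turns(field, n):
    """Apply n exact 90-degree turns in the x-y plane (axes 0 and 1), no interpolation."""
    return np.rot90(field, k=n, axes=(0, 1))

# Small test pattern on a grid symmetric about zero; the last axis plays the role of zeta.
grid = np.linspace(-2.0, 2.0, 5)
xg, yg, zg = np.meshgrid(grid, grid, np.array([0.0, 1.0]), indexing="ij")
field = xg + 10.0 * yg + 100.0 * zg

n, residual = split_angle(100.0)    # one quarter turn plus a 10 degree residual
turned = quarter_turns(field, n)    # equals the pattern E(y, -x, zeta) on this grid
```

Only the residual angle then needs to be handled by the Fourier shears, which keeps the shear parameters within the range where the method performs well.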

The speed advantage of the Fourier-transform-based rotation over 3D interpolation (MATLAB’s griddata function) for different grid sizes is presented in Fig. 3. The comparison has been performed on a single “interlagos” class node of the Hydra cluster of the Interdisciplinary Centre for Mathematical and Computational Modelling. For each grid size 10 rotations were performed. For a grid with a size of 4096 × 256 × 256 the rotation through interpolation takes 20 hours, while the same rotation performed with the Fourier transform approach takes around 46 seconds (around 1500 times faster).