Addressing nonlinearities in Monte Carlo

Monte Carlo is famous for accepting model extensions and model refinements up to infinite dimension. However, this powerful incremental design is based on a premise which has severely limited its application so far: a state-variable can only be recursively defined as a function of underlying state-variables if this function is linear. Here we show that this premise can be alleviated by projecting nonlinearities onto a polynomial basis and increasing the configuration space dimension. Considering phytoplankton growth in light-limited environments, radiative transfer in planetary atmospheres, electromagnetic scattering by particles, and concentrated solar power plant production, we prove the real-world usability of this advance in four test cases which were previously regarded as impracticable using Monte Carlo approaches. We also illustrate an outstanding feature of our method when applied to acute problems with interacting particles: handling rare events is now straightforward. Overall, our extension preserves the features that made the method popular: addressing nonlinearities does not compromise on model refinement or system complexity, and convergence rates remain independent of dimension.

Time-average of chemical conversion rate at some earth location p  Figure 1 | Solar thermochemical reduction of zinc oxide: conversion rate over years. a, Solar-driven high temperature thermochemical cycles processes, commonly based on metal oxides reduction, are an alternative to fossil fuel-based method for H 2 generation. Their practical interest depends however upon their lifetime average productivity. Here we focus on thermal reduction of zinc oxide, as the first part of a two step water splitting cycle. Photons emitted from the sun are reflected on heliostats and concentrated at the entrance of the chemical reactor in which ZnO dissociation is carried out. Solar power Z Y absorbed by the reactor at a given instant Y of lifetime determines the non linear chemical conversion rate f (Z Y ) of the reaction ZnO → Zn + 1 2 O. b, Here we address the estimation of the annual solar-plant's conversion rate C(p) at different earth locations p, by averaging the instantaneous conversion rate f (Z Y ) over the statistics of sun position and incident Direct Normal Irradiance (DNI, which fluctuates with weather). The Monte Carlo estimation combines the thermochemical knowledge of the non-linear kinetics of zinc oxide dissociation with the description of radiative transfer from the sun to a multiplereflection central receiver solar plant. The instantaneous thermal-power Z Y collected by the receiver at moment Y of the year is the average of the contribution X|Y of sun emissions along every optical paths. This solar power Z Y received at the entrance of the chemical reactor is used to reduce the zinc oxide with a thermochemical conversion rate f (Z Y ). c, Two typical days showing the received sun power (DNI, Direct Normal Irradiance) which yields Z Y after concentration by heliostats, and the conversion rate f (Z Y ); the corresponding conversion rate from DNI to f (Z Y ) is indicated in the inset. d, Estimated solar-plant's annual conversion rate in three earth locations.

SI1 -Solar thermochemical reduction of zinc oxide averaged over the year
The problem (see Extended Data Figure 1). Solar-driven high temperature thermochemical cycles processes, commonly based on metal oxides reduction 1,2 , are an alternative to fossil fuelbased method for H 2 generation. Here we focus on thermal reduction zinc oxide, as the first part of a two step water splitting cycle. Photons emitted from the sun are reflected on heliostats and concentrated at the entrance of the chemical reactor in which ZnO dissociation is carried out. Solar power received by the reactor at a given instant determines the chemical conversion rate of the reaction ZnO → Zn + 1 2 O. The present Monte Carlo estimation of the solar-plant's annual conversion rate C associates the thermochemical knowledge of the non-linear kinetics of zinc oxide dissociation 3,4 to the description of radiative transfer on a multiple-reflection solar receiver 5 . The nonlinearity lies in the instantaneous coupling between photon transport and zinc-oxide reduction.
The random variable X is the contribution of an optical path from the sun to the entrance of the chemical reactor. Its expectation E X|Y (X|Y) is the instantaneous thermal-power fraction collected at a given moment Y of the year 6 . Solar power is used to reduce the zinc oxide with a nonlinear conversion rate f (E X|Y (X|Y)) (see 3,4 ).
Non-Linear Monte Carlo formulation The annual conversion rate C is reformulated as its Tay where the B X i |Y are Bernoulli random variables: In the end, the reformulation is: where the Monte Carlo weight function w is with u ∈ N and C 1 , C 2 , C 3 , C 4 , α and x 0 known constants.

Algorithm
Step 1 Uniform sampling of a moment y of the year and index initialisation n = 1.
Step 2 Sampling of the n-th optical path contributing to thermal-power collection at instant y and computation of its contribution x n (as presented in 5, 6 ).
Step 3 Computation of the probability P n in Eq. 2 and sampling of a realisation b n of the Bernoulli random variable B Xn|Y .
Step 4 If b n = 0, the algorithm loops to step 2 after incrementation of n. Else, the procedure is terminated and the weightŵ(n) computed according to Eq. 4. Figure 1 are obtained for a 1 MW solar plant with a 80 m high central receiver tour and 160 heliostats arranged in a radial stagered layout (nueen method). Values of the conversion-rate parameters C 1 , C 2 , C 3 , C 4 and α are given in 3,4 .

Simulated configuration Results shown in Extended Data
Meteonorm DNI database. Differential scattering cross-section averaged over scatterer's orientations Differential scattering cross-section Extended Data Figure 2 | Wave scattering by a complex-shaped and optically-soft scatterer (cyanobacterium Arthrospira). a, An incident plane wave with propagation direction e i and wave number k is scattered by the cyanobacterium. The cyanobacterium is large compared to 1/k and has low relative refractive index (optically-soft scatterer). Therefore, the scalar wave approximation is used : the field Ψ resulting from the interaction between the incident wave and the scattering potential U is solution of the scalar wave equation (∇ 2 + k 2 − U )Ψ = 0. This wave scattering problem is identical to that of high-energy potential scattering studied by L.I. Schiff in the context of quantum mechanics. Under Schiff's approximation, the complex scattering-amplitude S Y (far-field region) in the forward directions depends on the scatterer orientation Y through its projected surface on the plane of the incident direction e i . b, Here we address the estimation of W (e s ) the single-scattering differential cross-section in direction e s for a scatterer ensemble, by averaging the differential scattering cross-section f (S Y ) = |S Y | 2 over the statistics of orientations Y (independent scattering regime). The Monte Carlo estimation of W (e s ) = E Y (f (S Y )) at small scattering angles θ combines the description of waves Ψ propagation with the non-linear formulation of the power that they carry |Ψ| 2 (Poynting vector magnitude). The scattering amplitude S Y results from the interference of secondary waves contributions X|Y originating from the projected surface of the scatterer with orientation Y. This scattering amplitude determines the differential cross-section f(S Y ) for that orientation. c, For a given direction e s (θ = 2 • ), the scatterer orientation can greatly affect the wave shape, resulting in huge variations of the transmitted power after scattering (e.g. by five orders of magnitude from α = 140 • to α = 170 • ). d, After averaging over an isotropic orientation distribution, the crosssection depends only on θ.

SI2 -Wave scattering
This example is fully detailed in Charon et al. 2016 7 .
The problem (see Extended Data Figure 2). Here we address the solution of Schiff's approximation 8 , also known as the anomalous diffraction approximation 9 , for an incident plane wave with propagation direction e i and wave number k scattered by the scattering potential U that takes value U in inside a large domain compared to 1/k and 0 outside (this domain corresponds to the shape of the where b is the unit vector along the projection of e s on the plane containing D P|Y and A Y is the area of D P|Y . Given the orientation Y of the scatterer, the conditional expectation S Y (e s ) = is the complex scattering amplitude in direction e s 7-10 . For the scatterer orientation Y, the far-field scattering diagram is given by the differential scattering cross- We address the Monte Carlo computation of W (e s ) = E Y (Ŵ Y (e s )) which isŴ Y (e s ) averaged over the statistics of the scatterer orientation Y 9, 10, 12 . The full expression is then where the configuration spaces are the orientation vectors of the scattering potential with respect to e i (D Y ) and the projected surface of the scattering potential seen from the incident direction e i (D P|Y ).
Non-Linear Monte Carlo formulation W (e s ) is reformulated based on the definition of two independent and identically distributed location random-variables P 1 |Y and P 2 |Y: where the Monte Carlo weight function w is: with X Pq|Y (e s ) defined in Eq. 5.
Algorithm The sampling procedures of the Monte-Carlo algorithm are then: Step 1 Isotropic sampling of an orientation y of the scattering potential.
Step 2 Uniform sampling of the first location p 1 on the projected surface defined by y and computation of the corresponding crossing length l(p 1 ): the realisation x 1 of X P|Y (e s ) is computed according to Eq. 5.
Step 3 Uniform sampling of the second location p 2 on the projected surface defined by y and computation of the corresponding crossing length l(p 2 ): the realisation x 2 of X P|Y (e s ) is computed according to Eq. 5.
Step 4 Computation of the weightŵ = w(p 1 , p 2 , e s ) according to Eq. 8:ŵ = x 1 x 2 + Biomass growth-rate averaged over the culture volume for biomass concentration C Extended Data Figure 3 | Phytoplankton growth in lightlimited environments. a, Phytoplankton is put to grow in a continuous stirred tank photobioreactor, an enclosed and perfectly controlled environment insuring optimal pH and temperature conditions, as well as non-limiting CO 2 and minerals supplies to micro-algae: photosynthesis is only light-limited. Light is provided by 979 light-diffusing optical fibres F immersed within the phytoplankton culture. The fibres insure a quasi-uniform light-flux density on the totality of their surface. This diluted light-input triggers an artificially sustained algal bloom with high photosynthetic efficiency and high biomass growth-rate R. The local rate of photon absorption A Y at location Y determines the non-linear photosynthetic-response f(A Y ) of cells at that location. b, Here, we address the Monte Carlo estimation of R(C) the biomass growth-rate in the culture volume as a function of biomass concentration C, by av-eraging the local growth-rate f (A Y ) over locations in the volume. The Monte Carlo estimation of R(C) = E Y (f (A Y )) combines the available knowledge of the non-linear photosynthetic growth-rate of a single cell to the description of radiative transfer within the multiple-scattering and absorbing micro-algae suspension with concentration C. The rate of photon absorption A Y at location Y is the average of the contributions X|Y of every multiple-scattering optical paths from fibres to Y. c, Photons absorbed by a phytoplankton-cell are non linearly converted within the photosynthetic units and the Z-scheme, leading to a spatial profile of the biomass growth-rate f(A Y ). d, The full-tank growth-rate is shown to depend upon the biomass concentration and indicates the optimal concentration allowing the largest biomass production rate. We note that R(C) is usually denoted by < r x > (C x ) in the photobioreactor literature.

SI3 -Phytoplankton growth in light-limited environments
The problem (see Extended Data Figure  is the rate of photon absorption at location Y, for the biomass concentration C.
Photons absorbed by a micro-algae are converted within the photosynthetic units and the Z-scheme 17,18 , leading to the local biomass growth-rate 11, 14 where α, β, K and K r are constant parameters that depend on the studied microorganism (see 14 for values in the case of Chlamydomonas Reinhardtii).
Due to light absorption and scattering by micro-algae, the field A Y is heterogeneous within the volume (less light farther from the fibres and for higher concentrations) and so is the local photosynthetic rate r Y (C). We address the Monte Carlo computation of R(C) = E Y (r Y (C)), the local photosynthetic rate r Y (C) averaged over locations Y of the culture volume. The full expression is then Then, we expand the non-linear function in Eq. 9 around x 0 chosen as an upper bound of X i (C)|Y.
Finally, the infinite power series is statistically formulated thanks to the discrete random variable N the order of Taylor expansion: where the B X i (C)|Y are Bernoulli random variables: In the end, the reformulation is: where the Monte Carlo weight function w is with u ∈ N and α, K, β, K r , x 0 and C known constants.

Algorithm
Step 1 Uniform sampling of a location y within the culture volume and index initialisation n = 1.
Step 2 Sampling of the n-th optical path contributing to absorption at y: the realisation x n of X Γn(C)|Y (C) is computed according to 15 .
Step 3 Computation of the probability in Eq. 12, P = x n /x 0 , and sampling of a realisation b n of the Bernoulli random variable B Xn(C)|Y .

Simulated configuration Results shown in Extended Data
Optical thickness from altitude H Y to TOA at frequency ν Y along Γ Y X|Y Contribution to τ Y of one absorption-transition L X (T X , P X , C X , ν Y ) at altitude H X c, Dealing with the non-linearity of Beer extinction is the leading question of band-average radiative transfer. There is a double difficulty : i) because of sharp absorption-lines, even in the narrowest bands, the optical thickness varies of orders of magnitude with frequency (compare τ Y at three frequencies, c-1, c-2, c-3); ii) because of the variations of pressure and composition with altitude, the spatial dependance of τ Y is difficult to handle, and so is the diversity of its exponential translation : f (τ Y ) ≈ 1 − τ Y in c-3 whereas it is highly non-linear c-1. d, When applied to a typical state of earth atmosphere, our NLMC algorithm addresses successfully the simulation of band-averaged outgoing IR-radiation (zenith angle, ∆ν = 25cm −1 ) without the need of pre-processing the state-transition database (here Hitran) : for each sampled path, several state-transitions are sampled in the database, depending on the sampled order of the Taylor expansion of the exponential (using the null-collision approach). In red, the band-averaged specific intensity for several bands covering the whole infrared ; in blue, the part of this intensity due to photons emitted at the surface (close to red at frequencies where the atmosphere is nearly transparent).

SI4 -Atmospheric radiative transfer: top-of-atmosphere specific intensity (from earth toward the outer space)
This example is fully detailed in Galtier et al. 2015 19 . S Y is the source associated with the emission Y (Planck's function at the local temperature times an emissive power that depends on concentration, pressure and temperature). The sum over the N t transitions j x is thus not handled by standard Monte Carlo algorithms, despite the fact that this sum has all the features inviting to make use of statistical approaches: N t is huge, typically of the order of 10 6 , and the deterministic pre-calculations can be computationally very demanding (and are to be re-performed when testing each new spectroscopic assumption). No attempt has been made so far to address this sum statistically, together with the photon-path statistics because these two statistics are combined via the non-linearity of the exponential extinction. Step 1 sample a frequency,

Non-Linear Monte Carlo formulation
Step 2 sample an emission-altitude, Step 3 sample an order n of the development of the exponential, Step 4 sample n paired absorption-altitudes, transitions.
Null-collision reformulation Instead of implementing this solution in this straightforward manner, we chose to use null-collisions 20 . The same quantities are sampled, but the order is different, leading to quite intuitive physical pictures. The first step is still to sample a frequency, but then we implement a backward tracking algorithm, starting from the observation altitude. We sample a first collision-altitude as if photons were emitted at the observation location in the downward direction in an homogeneous atmosphere. Then we sample a transition and a statistical test is made to determine whether the algorithm stops at this altitude or continues (a Bernoulli test to determine whether the collision is a true one or a null-collision). If it continues, this first collision-altitude and first transition have the status of one of the n absorption-altitudes and transitions in the above presented algorithm. Then a next collision-location is sampled, together with a next transition, etc.
When the algorithm stops after n collisions, the final altitude is interpreted as the emission-altitude.
So n and the emission-altitude are not sampled first: they are sampled by the successive Bernoulli tests as in a standard backward-tracking multiple-scattering algorithm.

The final algorithm
Step 1 Initialisation of the current altitude at H x ← T OA.
Step 2 Uniform sampling of a frequency ν y in the considered infrared band (or the whole infrared).
Step 3 Exponential sampling of a travelled distance d before absorption for a photon of frequency ν y travelling from H x in the backward direction within a virtual homogeneous atmosphere of absorption coefficientk νy .
Step 4 If the travelled distance leads the photon to hit the surface, H x ← 0 and the algorithm jumps to Step 7. Otherwise H x ← H x − d.
Step 6 Bernoulli trial of probability P a = h νy ,jx kν y P(jx) to decide whether the algorithm jumps to Step 7 or Step 3.
Step 7 H y ← H x and the algorithm stops withŵ = ∆νB y where B y is the Planck function at frequency ν y for the atmospheric temperature at H y . Fig. 4 have been obtained using the HITRAN spectroscopic database. All other parameters are given in Galtier et al. 2015 19 .

Simulated configuration The simulation results in Extended Data
Computational performance As far as computational costs are concerned, our computations did not require to first scan HITRAN and pre-compute absorption coefficients at all altitudes and all frequencies before running the Monte Carlo code for transfer since our Monte Carlo code handles both simultaneously despite the nonlinearity of the exponential extinction. The computational benefit is very significant when studying the effects of new spectroscopic data or new line-shape assumptions. Otherwise, for earth applications and fixed spectroscopic assumptions, there is no problem associated to the pre-computation of absorption coefficients and to their tabulation as function of molecular composition and thermodynamic state. But for combustion or astrophysics applications, thinking in particular of exoplanets, the diversity of compositions is extremely wide, temperatures can by high, imply the use of much larger spectroscopic databases, including hot lines, and pre-computation/tabulation of absorption coefficients is a today's challenge in itself.
Our nonlinear Monte Carlo suppresses this need 19 .

SI5 -Gas kinetics
The problem We consider a gas of interacting particles, with collisions following Maxwell model, and a cross section σ( c, c * ) = κ 4π c− c * . Particles are confined within an harmonic static trap of pulsation ω so that acceleration at position r is spring-like: a( r) = −ω 2 r.
We follow the distribution function f ( r, c, t) at location r, velocity c at time t. We consider it known at some time t I , either constrained at local equilibrium, according to: or, out of equilibrium, according to: where n I denotes density, u I denotes the mean velocity and c q,I denotes the mean square speed at r at time t I ; p N (µ,σ 2 ) denotes probability density function of a Gaussian random variable, with expectation µ and variance σ 2 .
For the case in fig. 3a, we set ω = 0, and n I , u I and c q,I are set homogeneous, corresponding to the case studied par Krook and Wu 21,22 . Starting from f BKW , the gas relaxes to equilibrium f LEQ .
Adimensional time is κn I t and adimensional distribution function is 1 For the case in fig. 3b, we set ω = 2π, and n I , u I and c q,I are set heterogeneous to correspond to one state of the undamped oscillation (breathing mode 23 ): where ∆c 2 q is the maximal deviation of c 2 q from its equilibrium value c 2 q,EQ .
With these initial values, and starting from local equilibrium f LEQ , the local equilibrium remains unbroken and the gas displays the undamped oscillation for any value of the cross section.
Conversely, starting from out of equilibrium f BKW , any positive value of cross section will lead to dampened oscillations.
The usual PDE expression of this model, given above, can be translated into its Fredholm counterpart, following: with where r b and c b are location and velocity corresponding to the ballistic path through r at time t with velocity c, such as: and r b (t) = r, c b (t) = c.
The product f ( r b (t ), c α (t ), t )f ( r b (t ), c β (t ), t ) has been treated following the NLMC expansion exposed in Methods.
The exponential term is handled using null collisions technique, as exposed in SI4. The upper bound for cross section is set at the value it takes at the center of the gas cloud at time of maximal contraction of the undamped oscillation : Algorithm According to eq. 19, three random variables are defined for the Monte Carlo estimates: T for t , U for u and C * for c * , with: over the unit sphere p C * ( c * ) = p N ( u Boltz ,c 2 q,Boltz ) over velocity space (21) where u Boltz et c q,Boltz are respectively the mean velocity and the mean square speed at the considered time, and set to the values predicted in the case of undamped oscillation.
f ( r, c, t) can be estimated by this recursive algorithm: Initialisation Sample a date t .
Recursion termination: If t t I return f ( x I , c I , t I ).

Recursion
Sample a velocity c * and a unit vector u.
Compute the ballistic solution x b (t ) and c b (t ). Estimate If Q ∈ [0, 1], then set P ← Q else set P ← Q 2Q−1 .
Sample a standard uniform r.
If r > P, then estimate f b ← f ( x b (t ), c b (t ), t ) and return 1−Q 1−P f b , Estimate Estimate f β ← f ( x b (t ), c β (t ), t ).