Inverse problem for multi-body interaction of nonlinear waves

The inverse problem is studied in multi-body systems with nonlinear dynamics representing, e.g., phase-locked wave systems, standard multimode and random lasers. Using a general model for four-body interacting complex-valued variables we test two methods based on pseudolikelihood, respectively with regularization and with decimation, to determine the coupling constants from sets of measured configurations. We test statistical inference predictions for increasing number of sampled configurations and for an externally tunable temperature-like parameter mimicing real data noise and helping minimization procedures. Analyzed models with phasors and rotors are generalizations of problems of real-valued spherical problems (e.g., density fluctuations), discrete spins (Ising and vectorial Potts) or finite number of states (standard Potts): inference methods presented here can, then, be straightforward applied to a large class of inverse problems. The high versatility of the exposed techniques also concerns the number of expected interactions: results are presented for different graph topologies, ranging from sparse to dense graphs.

(0) i (J ) − λ||J|| 1 and L(x, J ), we invert the functional signs and minimize, in order to use well established packages available. For the minimization of -PL with 1 -regularization we adopted L1General [1]: a set of MATLAB routines implementing several of the available unconstrained minimization strategies for solving 1regularization problems. In particular, the results here presented were obtained with the spectral projected gradient method [1]. For the minimization of −L(x, J ) in the decimation procedure, we use, instead, Open Optimization Library (OOL) [2], which is a set of constrained optimization codes written in C. The results here pre- * Correspondence to alessia.marruzzo@uniroma1.it sented were obtained using the spectral projected gradient method for convex set [3]. The decimation code using the OOL routine has been ad hoc developed with parallel programming on GPU's, reducing the running times of numerical processing with respect to C++ codes on CPU up to a factor 5 for the largest analyzed systems, N = 32. In terms of timing, the parallel CUDA code outperforms the MATLAB routines up to a factor 30 for N = 16.
Inverse algorithm To evaluate the inverse of the Fisher Information matrix, defined by where with a, b we indicate two possible quadruplets node i might belong to (see next paragraph), we first evaluate its Cholesky decomposition, being I i ab a symmetric, positive define square matrix, and then its inverse through GSL libraries. Pseudolikelihood function From Eq. (1) in the main text, the likelihood function of a mode amplitude configuration a, given a coupling set J , is readily introduced as To construct the likelihood function of a j , i. e., the probability distribution of a single variable given the values of all the other variables a \j , we first rewrite Eq. (1) in an equivalent way: defining the complex-valued local effective fields Then, we separate the contributions from a given variable a i from all contributions not involving a i , only a \i : In terms of this decoupling the partition function reads: with i. e., Eq. (9) of the main text. In order to effectively carry out the sum over a i values one has to recall that the phasors satisfy a global spherical constraint j |a j | 2 = N , with constant . Once all a \i are given, then, the value of |a i | is fixed, and ai simply reduces to an integral on the angular phase variable φ i ∈ [0 : 2π[. Using Eqs. (2) and (3) of this section the pseudolikelihood function of the values of a i , biased by a \i values, can be, eventually, written as In order to find the best estimates of the interaction parameters given a data set of M mode amplitude configurations, we minimize the opposite log-pseudolikelihood functional −L i : with respect to the coupling parameters, exploiting the explicit knowledge of the derivatives where we denoted Rewriting the complex amplitude in polar coordinates a i = A i e ıφi we have the following expression for the marginal (we do not write the J dependence) where and I 0 (x) is the modified Bessel function of the first kind.
As mentioned in the main text, the polar coordinates are most useful in the cases of intensity equidistribution among the modes, A i 1, ∀ i, and of quenched amplitudes, i.e., when the A i dynamics is quenched on the time scales of the φ's dynamics. In the latter case all A's are taken care of by rescaling the coupling constants as When the couplings are considered real-valued, the polar expressions of the local effective fields, cf. Eq. (2) in the main text, can be rewritten by substituting Eq. (2) with and the log-pseudolikelihood functional L i and its gradient, cf. Eqs. (4,5) in this section, simplify to To determine the interaction network of non-linear wave systems we use and compare two techniques: the 1regularization PLM [4,5] and the decimation PLM [6].
1 -regularization In this approach we add a 1 norm contribution for each coupling to be inferred to the logpseudolikelihood in order to keep the coupling values from diverging during the minimization procedure: The positive regularizer λ must be small in order to prevent the modification of the landscape of L i . We take a pseudolikelihood that is intensive in M but small enough values of λ for all M 's.
Within this method, for each mode i all couplings involving i are inferred in one apart iteration. The same quadruplet J ijkl can, thus, be inferred more times, proceeding by minimizing likelihood functions for different, though interacting, modes i, j, k and l. Nothing prevents the values of apart reconstructions to be the same. Eventually, thus, all inferred values of the same coupling are averaged in order to enforce the original symmetry.
δ thresholding A further improvement in the reconstruction of the topology can be achieved, as suggested in Ref. [5], by setting to zero all couplings whose estimate is below a threshold value δ. However, the choice of the δ value might be delicate since there are many cases in which there is no clear gap between the zero and the non-zero couplings [6]. If the probability distributions of the estimators are known, we can overcome this problem developing a more accurate hypothesis testing scheme. P(Ĵ) thresholding It can be seen that, as M → ∞, the probability distribution P(Ĵ) of the maximum PL esti-matorĴ converges to a Gaussian distribution centered around the true value of the coupling and with variance estimated by the diagonal elements of the inverse of the Fisher information matrix [7]. The elements, I i ab , of the Fisher information matrix are defined through: where with a, b we indicate two possible quadruplets including node i, i.e., a = {i, j, k, l}, b = {i, j , k , l }. Then, knowing the distribution, we can determine, for every estimated value J * a , if it is compatible with a Gaussian centered in zero, i.e., if the hypothesis for the true coupling to be zero might or not be rejected.
The hypothesis testing can be schematically developed as follows.
(i) Once the maximum points of the PL are found, we evaluate the inverse of the Fisher information matrix, Eq. (9); from Eq. (7), we have: where with B (x) we indicate: The diagonal terms of the inverse matrix are taken as estimates for the variances, σ * a , of the estimator distributions.
(ii) We assume every couplings to be zero; hence, we expect every estimated value J * a to be compatible with the N (0, σ * a ) distribution. We, then, construct a confidence interval C n that should contain the estimated value J * a within a 97.5% probability.
(iii) If the inferred J * a is contained in C n we accept the zero hypothesis and the coupling is set to zero.
Optimal λ regularizer It is clear from Eq. (8), that as λ increases it increases the tendency to globally underestimate the interaction couplings. Within the hypothesis testing procedure described more and more couplings become compatible with a Gaussian centered in zero and are considered as irrelevant. Indeed, as noticed in Fig. 1, err J (λ) increases above some optimal valueλ that should depend on the number of configurations M available and on T . One procedure that might be adopted to determineλ consists in testing the PLM with 1 -regularization on data relative to a known system and use the optimum value found to solve the inverse problem of new systems. However, there might be cases in which there are no solved inverse problems available and, in general, λ could depend on the details of the problem. Cross-Validation (CV) techniques are often applied to determine the best value of λ on supervised learning algorithms. On this problem of model inference, we can also think of implementing a CV method with the following scheme: the observed configurations are divided in two sets, a training and a validating set; the training set is then used to fit the model, i.e., determine the interaction couplings as a function of the trial value for the regularizer; we then run a Monte-Carlo, with these inferred couplings, and look at equilibrium configurations from which we extract four point correlations functions, C mc ijkl ; these correlations are then compared with those ones obtained from the configurations of the validating set, C val ijkl ; the optimal λ is taken as the value that minimize the distance among C mc ijkl and C val ijkl . The hypothesis testing procedure explained in the previous paragraph could also be considered as a tool to determineλ. Indeed, within the PLM, every coupling J ijkl has four estimators J * a and we evaluate the σ * a for each one. It might be that not all the four estimated values have the same compatibility with a Gaussian centered in zero since, due to the lack of configurations, one value might be overestimated. Increasing λ, the number of quadruplets for which the hypothesis test does not give the same answer for every estimated value decreases. We observed that the minimum λ value for which there are no quadruplets that answer differently to the hypothesis test is very close, within less than an 8% difference, toλ.
PLM Decimation In the decimation procedure we maximize the total log-likelihood for all modes starting from L max defined on a full graph and inferring all the values of the couplings on that network. Sorting the couplings by their absolute value and taking away (decimating) the N 0 smallest we are left with a network of N q − N 0 = xN q non-zero couplings. A new inference iteration, including minimization of L(x) and sorting, will allow to decimate further another group of the smallest couplings. And so on and so forth, as far as the couplings of the decimated model network are more than the couplings of the true system. Indeed, as far as the number of inferred parameters is larger than the true number of parameters the pseudolikelihood is not expected to change and will stick to its maximum value L max (overfitting). The indication of the true number of couplings will be given, indeed, right by the x value of the fraction of non-decimated couplings across which L starts decreasing because the number of inferred parameters is less than the real number of parameters (underfitting). The minimum conceivable L(x) is for the non-interacting system: L min = L(0). To enhance the determination of the true network connectivity x * , below which statistical interpolation becomes less reliable, Decelle and Ricci-Tersenghi [6] introduced the tilted Table 1: Total number of quadruplets in all simulated networks used to yield data to be tested by our pseudolikelihood inference methods. We stress that using the total likelihood, cf. Eq. (10), the symmetry of coupling constants under indices permutation is automatically enforced. Random Graphs We report the distributions of connectivities in the various graphs that we used to test the reconstruction methods exposed in the main manuscript for the complex spherical model (SM), cf. Eq. (1), and the XY model, cf. Eq. (4), at finite N . In the XY models the x axis is the connectivity per variable. In the SM the total number of couplings scales like N 3 : the x axis represents, in this case, the connectivity per variable node rescaled by N 2 .
The distributions in the Erdos-Renyi-like graphs and in the Mode-Locked-like graphs in pairwise interacing systems are known to tend to Poissonian distributions in the thermodynamic limiti N → ∞ [8], though the MLgraph convergence is much slower for increasing N . As a comparison, in the right panels of Figs. 1, 2 we plot the relative Poissonian distributions with the same average. Finite size effects are clearly strong for the small simulated sizes. The actual number of quadruplet couplings N q for each simulated instance is reported for all consid- ered models in Tab. 1. Finite size critical temperature We refer several times to a critical region in T for which inference works better than at higher or lower T . In particular, the reconstruction error is up to one order of magnitude smaller and the network topology is correctly reconstructed with the need of less configurations data. In Fig. 3 we display the energy plots of the models used to produce equilibrium data with Exchange Monte Carlo simulations. In the thermodynamic limit N → ∞ they should display a true critical point behavior where a phase transition occurs from a paramagnetic-like phase to a phase-locked or to a spinglass phase. For the 4XY model on ER sparse graphs these temperatures are analytically known, as T c = 0 for N q /N = 1 and T c > 0 for N q /N = 2 [9]. For finite sizes the mathematical discontinuity is approximated by a steep, though smeared, descent as T decreases. This step becomes steeper and steeper as N increases, eventually reaching the limit of a true singularity representing a true phase transition. Indeed, for finite systems, no actual phase transition occurs. However, one can identify finite size pseudocritical points that, for systems of increasing size N , form a series converging to the true critical point. Operatively, we take as finite size critical points T c (N ) the peak of the specific heat dE/dT from E N (T ) data in Fig. 3. For the simulated systems whose energy is displayed in Fig. 3 the T c (N ) are reported in Tab. 2.