Experimental adaptive Bayesian estimation of multiple phases with limited data

Achieving ultimate bounds in estimation processes is the main objective of quantum metrology. In this context, several problems require measurement of multiple parameters by employing only a limited amount of resources. To this end, adaptive protocols, exploiting additional control parameters, provide a tool to optimize the performance of a quantum sensor to work in such limited data regime. Finding the optimal strategies to tune the control parameters during the estimation process is a non-trivial problem, and machine learning techniques are a natural solution to address such task. Here, we investigate and implement experimentally for the first time an adaptive Bayesian multiparameter estimation technique tailored to reach optimal performances with very limited data. We employ a compact and flexible integrated photonic circuit, fabricated by femtosecond laser writing, which allows to implement different strategies with high degree of control. The obtained results show that adaptive strategies can become a viable approach for realistic sensors working with a limited amount of resources.

Achieving ultimate bounds in estimation processes is the main objective of quantum metrology. In this context, several problems require measurement of multiple parameters by employing only a limited amount of resources. To this end, adaptive protocols, exploiting additional control parameters, provide a tool to optimize the performance of a quantum sensor to work in such limited data regime. Finding the optimal strategies to tune the control parameters during the estimation process is a non-trivial problem, and machine learning techniques are a natural solution to address such task. Here, we investigate and implement experimentally for the first time an adaptive Bayesian multiparameter estimation technique tailored to reach optimal performances with very limited data. We employ a compact and flexible integrated photonic circuit, fabricated by femtosecond laser writing, which allows to implement different strategies with high degree of control. The obtained results show that adaptive strategies can become a viable approach for realistic sensors working with a limited amount of resources.
Quantum sensing devices are among the most promising quantum technologies. Their implementation relies on the use of quantum probes to attain enhanced performances in the estimation of one or more parameters compared to classical ones. Quantum metrology aims at identifying the best strategy able to provide this quantum advantage [1][2][3][4][5][6][7]. This is achieved by carefully tailoring the probe state, the interaction, and the measurement, in order to extract the information on the relevant parameter, and by the optimal choice of the estimator through data post-processing [8]. When performing a single parameter estimation, the optimal strategy is unequivocally identified through the saturation of the Cramér-Rao bound (CRB), which establishes the maximum achievable precision on the measured parameter [9]. The CRB is asymptotically saturated with the number of resources employed to probe the system during the measurement. Conversely, the realisation of quantum sensors, able to perform estimations in realistic scenarios, poses two constraints to sensing devices: the resources to be used for probing are limited, and systems can show high complexity, often involving more than one parameter. Due to the finite number of available resources, it becomes of paramount importance to optimize the estimation protocols in order to reach the sought accuracy bounds employing the smallest number of resources possible. Such problem has been explored recently with several theoretical analysis. A possible approach towards the protocol optimization is that of exploiting adaptive strategies. These have been successfully employed in single-parameter estimation [10][11][12][13][14][15][16][17][18]. In this regard, machine learning (ML) approaches have provided a significant speed up in the saturation of the ultimate bounds [17,[19][20][21].
On the other hand, measuring multiple parameters at once might be necessary in complex systems characterized by a set of parameters, where a time or spatial dependency can prevent * fabio.sciarrino@uniroma1.it the successful realization of subsequent single-parameter estimations. The parameters considered can span from multiple phases [22][23][24], to phase and phase diffusion in frequencyresolved phase measurements [25][26][27], and phase and loss in absorbing systems [28]. In other instances where a system depends solely on one parameter, a multiparameter approach could still be favourable as other parameters can be interrogated as a control to monitor the quality of the sensor itself [29][30][31]. In general, the saturable bounds for quantum multiparameter strategies are not as defined as in the single parameter case, and trade-offs in the achievable precision for each of the parameters have to be sought [32][33][34][35][36].
In this context, it becomes of paramount importance to identify both a suitable estimation scenario and a corresponding platform for an experimental investigation of adaptive multiparameter estimation protocols. A notable scenario to investigate is multiphase estimation [22,24,[37][38][39][40][41][42][43][44][45][46]. Not only such scenario provides a benchmark for multiparameter quantum metrology, but it has a plethora of practical applications in quantum imaging [32,34]. A fundamental step is to find a suitable experimental platform to realize multiphase estimation. A viable solution is provided by integrated photonics, which enables the implementation of complex circuits with reconfiguration capabilities [47][48][49][50][51][52] with applications ranging from quantum simulation, to computation, and communication. This platform represents a promising system for optical quantum metrology, since interferometers with several embedded phases can be employed as a benchmark platform to study multiparameter estimation problems. In this direction, first results on multiphase estimation with quantum input probes have been recently reported [22], using a threearm interferometer fabricated by femtosecond laser writing [53,54]. Here, we report on multiphase estimation experiments performed with an integrated platform using different adaptive protocols. We identify the strategy providing better performances in terms of optimal estimation and computational resources. In particular, we have experimentally tested the proposed approach by feeding with single-photon states an integrated three-mode interferometer realized by the femtosecond laser writing technique. The employed technique is a Bayesian learning protocol which exploits advantages of Montecarlo approach, such as its independence of integration space dimensions [55]. This solution seems to be ideal for adaptive multiparameter problems, where complex optimizations could involve multiple integrations. Here we employ experimentally for the first time an adaptive strategy optimized in the limited data regime for multiphase estimation. Using this algorithm, we demonstrate that the convergence to the CRB can be experimentally achieved after only few photons. Importantly, such convergence is achieved for both the simultaneously estimated phases. Our results improve research for identifying optimal learning strategy and finding experimental platforms suitable to test multiparameter estimation problems also in the limited data regime.

Bayesian multiparameter estimation
In multiparameter estimation, the aim is to measure simultaneously an unknown set of parameters x = (x 1 , . . . , x n ) by reaching the maximum precision allowed by the amount of resources employed in the process. In general, the set of parameters is encoded within the evolution of a system, either described through a unitary operator U x or a more general map L x . The value of the unknown parameters x can be estimated by preparing a suitable probe state ρ and sending it to evolve throughout the system. Information on the unknown parameters can be retrieved by measuring the output state ρ x with a set of measurement operators {Π d }, where d = 1, . . . , m represents the set of possible outcomes. Such process is then repeated N times to improve precision in the estimation process. After N probes have been prepared and measured, the obtained sequence of measurement outcomes d = (d 1 , . . . , d N ) has to be converted in a set of parameters estimatesx through a suitably chosen function x =x(d 1 , . . . , d N ). A possible choice of estimator is provided by Bayesian protocols. This class of estimators is based on encoding the initial knowledge on the parameters in a probability function p(x), called prior distribution, which is updated according to the Bayes rule at each step of the estimation protocol. The posterior distribution after N probes reads p(x|d) = N −1 p(d|x)p(x), where p(d|x) is the likelihood function of the system expressing the conditional probability of obtaining the measurement sequence d for given values of the parameters x, and N is a normalization constant. Then, the mean of the posterior distribution can be exploited as the estimate of the unknown parametersx i = x i p(x|d) i dx i . Bayesian protocols present several important properties. In particular, it can be shown that such approach is asymptotically unbiased, meaning that the estimated values converge to the true values when N is large enough. This is related to the quadratic loss L(x,x;w) = iw i (x i −x i ) 2 , whose average value over all measurement sequences d is commonly employed as a figure of merit to quantify the convergence of the estimation process. The coefficientsw i can be chosen to reflect different weights between the parameters, while for equally relevant parameters they can be set asw i = 1.
Hereafter, we will consider this latter scenario and thus define the quadratic loss as L(x,x) = i (x i −x i ) 2 . Furthermore, in a Bayesian framework the posterior distribution also provides a confidence region for the parameters estimates, which is represented by the covariance matrix Cov(x) of p(x|d). This latter figure of merit is obtained for each single estimation experiment composed of a sequence of N probes, and has no counterpart in frequentist approaches [56]. In general, Bayesian bounds for both the quadratic loss and the covariance matrix depend on the amount of a priori knowledge p(x) available [16,[56][57][58]. Asymptotically for large values of N , corresponding to the regime where the amount of information acquired in the estimation process far exceeds the a priori knowledge, the covariance matrix satisfies the Cramér-Rao inequality Cov(x) ≥ F −1 /N , where F is the Fisher information matrix [59] and thus F −1 corresponds to its inverse. Such quantity also provides an asymptotic bound for the quadratic loss as L(x,x) ≥ Tr[F −1 ]/N . Adaptive protocols can be employed when, besides the set of unknown parameters x, the user has access to an additional set of control parameters c = (c 1 , . . . , c l ) that can be changed throughout the estimation process. More specifically, after each of the N probes is sent and measured, the acquired knowledge is employed to change the values of c for the next probe to maximize the extraction of information in the sub-sequent measurement. Within a Bayesian framework, such knowledge is encoded in the posterior distribution. Hence, after each step of the estimation protocol, the user can decide the values of the control parameters c starting from p(x|{c}, d) (see Fig. 1a). Adaptive protocols represent a relevant tool in phase estimation process. Indeed, the adoption of adaptive strategies becomes a crucial requirement even in the singleparameter case to optimize the algorithm performances [10-12, 15, 17, 19, 20, 55, 60, 61], with the aim of achieving the ultimate bounds provided by the Cramér-Rao inequality for small values of N [17]. Furthermore, in more complex systems characterized by a phase-dependent Fisher information matrix, adaptive strategies become crucial to reach equal performances for all values of the unknown parameter(s) [31].

Adaptive protocols for multiarm interferometers
Given the general scenario described in the previous section, it is crucial to identify and test experimentally protocols to saturate the ultimate bounds with a very limited number of probes. In this context, multiarm interferometers represent a benchmark platform to perform simultaneous estimation of multiple phases. The platform is schematically shown in Fig. 1b, and represents the m-mode generalization of a Mach-Zehnder interferometer in the multimode regime [22,43,62,63]. More specifically, it is composed by a sequence of a first multiport splitter, employed to prepare the probe state, a series of phase shifts between all the optical modes, and a second multiport splitter which defines the output measurement. Both multiport splitters can be in principle designed according to appropriate decompositions [64,65] to implement any linear unitary transformation. The internal phase shifts can be divided in two layers. The first one φ = (φ 1 , . . . , φ n ) corresponds to the unknown parameters to be measured, while the second one Φ = (Φ 1 , . . . , Φ n ) takes the role of the control parameters for adaptive estimation; we note that in our implementation the number of controls l = n. Here, n = m − 1 is the number of independent parameters, since one of the phases is considered as the reference mode. Both the unknown parameters and the control ones contribute to the overall phase differences ∆φ = (∆φ 1 , . . . , ∆φ n ) within the interferometer.
We study different adaptive protocols for Bayesian learning of the unknown phases of this platform injected by a singlephoton state, by focusing both theoretically and experimentally on the three-mode scenario (m = 3) with two independent parameters (n = 2). More specifically, we choose both for state preparation and state measurement transformation a balanced tritter described by unitary matrix U with |U i,j | 2 = 1/3, ∀(i, j) [66]. Injecting a single photon on input port 1 corresponds to generating a sequence of probe states of the form |ψ in = 3 −1/2 (|1, 0, 0 + |0, 1, 0 + |0, 0, 1 ), which represents a single-photon state exiting in the balanced superposition of the three modes. The Fisher information matrix in this scenario shows a phase-dependent profile F(∆φ 1 , ∆φ 2 ), meaning that without adaptive strategies the asymptotic precision will be different depending on the actual phase val-ues. In particular, by looking at the inverse of F, we obtain min ∆φ1,∆φ2 Tr(F −1 ) 3.866, which is obtained for six different phase pairs (∆φ 1 ,∆φ 2 ). For those pairs, minimum asymptotic quadratic loss is achieved.
Bayesian protocols require in general expensive computational resources, due to the need of evaluating complex integrals to determine the normalization constant N , as well as the estimated values and their corresponding covariance matrices. A possible solution is to perform a discretization of the parameters space, thus converting integrals to sums. In this case, the bin size has to be chosen depending on the minimum error expected at the end of the estimation process. However, such solution becomes quickly unmanageable when the number of parameter increases, since such discretization has to be performed in a n-dimensional space. A different solution has been explored in [55] for Bayesian learning problems by using a Sequential Monte Carlo (SMC) approach. Indeed, Monte Carlo methods seems to be a natural solution, due to their capability of reaching convergence independently from the integration space dimension. The SMC method approximates the infinite dimensional support φ with a finite number M of elements φ i , called particles, with associated probability weights w i . The error in the approximation can be arbitrarily reduced by increasing the number of particles, leading to a trade-off between computational time and accuracy of the approximation. In the context of Bayesian analysis, any distributionp(φ) in the particles approximation is expressed as We now consider the case of an initial prior knowledge p(φ) corresponding to a uniform distribution. In the particles scenario, this prior information is approximated by a set of M randomly drawn pair of phases φ i with equal weights w i = 1/M to satisfy the normalization condition ( M i=1 w i = 1). During the experiment, the information about the unknown phases φ is updated according to the Bayes rule after each measurement outcome d. In the particle approximation, having fixed control phases, this corresponds to updating the particle weights as w i → w i p(d|φ i , Φ), while keeping the particles {φ i } unchanged. The estimation of φ is then provided by the expectation value of the posterior distribu- [55], the particle approximation needs some additional steps to avoid the introduction of further errors throughout the estimation process. In particular, after a few iterations the nonzero weights will be mostly concentrated on a small subset of {φ i }, reducing the validity of the approximation. To avoid such effect, it is possible to employ resampling techniques [67]. More specifically, when the particle weights become too concentrated according to a given threshold condition, a new set of particles {φ i } is generated by adding a small random perturbation to the original particles (see Supplementary information IA for more details). The weights are then reset to w i = 1/M , and the estimation process restarts. Within this framework, we now have to define the adaptive rule to determine the value of the control parameters at each step depending on the actual knowledge. More specifically, at each step of the estimation process one has to decide the control parameters c (here, the additional phases Φ) for the next probe.
To this end, we consider different strategies.
A first approach (i) is based on choosing the control phases according toφ + Φ δφ, where δφ = argmin ∆φ1,∆φ2 Tr(F −1 ). This strategy looks to set the interferometer phases ∆φ to those values leading to a minimum bound for L(φ,φ) according to the Cramér-Rao inequality. While this approach is tailored to work in the asymptotic regime of large N , its performances are not guaranteed to be optimal for small N . An upside of this approach is that setting the control parameters does not require complex optimization steps, since an analytic rule can be easily defined.
In order to devise a strategy working in the small N regime, one can consider a second strategy (ii) which is specifically tailored to work for all values of N . To this end, we adapted the protocol described in [55] to the multiparameter scenario implemented by our system. By this approach, the choice of the control phases is performed to optimize a given figure of merit, known as utility function (U ). Canonical choices for U are information gain or quadratic loss. In our case, we choose U (φ) = Tr[Cov(φ)]. Hence, at each step the minimization algorithm finds the best control phases Φ that, averaged over all possible measurement outcomes, leads to a minimum value for the sum of the parameters confidence intervals. This is thoroughly discussed in Sec. IB of the Supplementary Information. Given that this method relies on numerical optimization steps, it is more expensive in terms of computational resources than the previous strategy based on the Fisher information matrix. Conversely, it provides the advantage of searching the optimal control phases for all values of N , thus covering the limited data regime where asymptotic approaches may not be the proper choice.
We have then performed numerical simulations to characterize the performances of the two algorithms. More specifically, we have sampled N ph = 100 random pairs of phases For each pair, we simulated N exp = 100 estimation processes where N = 100 single-photon probes are sent in the interferometer. The results are shown in Fig. 2. We first tested the performances of both algorithms (i) and (ii). We observe that, concerning strategy (i), the protocol fails to approach the Cramér-Rao bound even for N ∼ 100. This is related to the periodicity of the likelihood function, which presents multiple equivalent points. Approach (i) seeks for setting the phase differences ∆φ to a fixed point, and it is not able to resolve such periodicity issue. Better results are obtained by applying at each step a random (but known) set of control phases (iii), which shows better convergence while not reaching the Cramér-Rao bound. However, the application of this strategy is capable of resolving the multiple periodicity. One can then consider a modified version (i') of the asymptotic protocol (i), where the first K control phases are drawn from a uniform distribution, while for N > K the strategy works as (i). Numerical evidence shows that the best choice for this parameter is K ∼ 20. We observe that, with this modified strategy, the Cramér-Rao bound is approached for N ∼ 50. Better results are obtained with the optimized strategy (ii), in particular in the small N regime. For N > 60, we observe that both strategies (i') and (ii) provide similar performances since the exper- iment progressively approaches a large N scenario where the Fisher information matrix defines the system sensitivity. In this work we experimentally implement the optimal strategy to guarantee a faster convergence of the estimation process.

Integrated circuit for multiphase estimation
The platform employed in this experiment is an integrated three-arm interferometer. This system has been employed in Ref. [22] for the simultaneous estimation of two relative phase shifts φ = (φ 1 , φ 2 ) between the arms of a three mode interferometer (Fig. 3). We first discuss the circuit layout and parameters, while we subsequently describe the working condition used for the multiphase estimation experiments reported below. The platform is a three-arm interferometer realized in a glass chip through femtosecond laser writing [53,54]. The interferometer, optimized for operation at λ = 785 nm, is implemented by two cascaded tritters (three-mode beam splitters) A and B interspersed with phase shifters. Each tritter is decomposed in a 2-D planar configuration [64] consisting of three balanced directional couplers and one phase shifter φ A T (φ B T ) for tritter A (B). These phase shifters, as well as those placed between the two tritters, can be tuned by means of the thermo-optic effect, using microresistors that are patterned in a thin gold layer covering the chip surface. When an electrical current is applied to the resistor, an optical path change on the waveguide is induced by the dissipated heat [68]. In par-ticular, let us consider the dissipated power P i = R i I 2 Ri on resistor R i subjected to a current I Ri , where we also include that the value of the resistor depends on the current due to its temperature change. The two induced relative phase shifts ∆φ = (∆φ 1 , ∆φ 2 ) between the arms of the interferometer with respect to the reference mode, have the following general dependence on the dissipated powers: where j = 1, 2 and φ j0 stands for the static phases of the interferometer. Parameters α ji and α N L j,i=k are the linear and quadratic response coefficients relative to the dissipated power P i , respectively, while α N L j,i =k represent the nonlinear coefficients associated to the product of the two powers P i and P k to include cross-talk effects. In our device 8 independent resistors are present (Fig. 3). Resistors R A and R B are exploited to tune tritter phases φ A T and φ B T , respectively. Conversely, resistors R 1 , R 4 along mode 1, R 2 , R 5 along mode 2 and R 3 , R 6 along mode 3, are employed to tune the internal relative phase shifts of the interferometer, according to (1). The operations of tritters A and B are described through the unitary evolutions U A and U B , respectively, while the action of each phase shifter along mode i is described through a unitary matrix P S i (i = 1, 2, 3). The overall evolution U tot of the interferometer is given by In order to characterize the relevant parameters necessary to fully describe the evolution of the interferometer, we measure the output probabilities when single photons are injected along input 1, tuning the current applied on each resistor. The probabilities have been theoretically modeled by modifying the ideal expression with additional terms, taking into account non-ideal visibilities and dark counts of the detectors. In this way, we performed an overall fit of all the measured probabilities to determine the 58 chip parameters (Supplementary Information II) and finely reconstruct the likelihood probability p(d|∆φ) of our system.
According to the scheme of Fig. 1b, the unknown phases to be estimated are the pairs (φ 1 , φ 2 ), relative to the chosen reference arm φ ref . The 8 resistors allow us to finely tune and control all the relevant phase shifts of the interferometer. The tritters phases can be tuned and are chosen in order to maximize the sensitivity of the interferometer. Using single photon probes, the optimal configuration for our interferometer employs mode 1 as input and mode 2 as reference. In this case, the trace of the inverse of Fisher Information matrix, minimized over all possible internal phases, is Tr[(F exp ) −1 ] = 4.2. The unknown phases φ = (φ 1 , φ 2 ) are tuned by means of resistors R 4 , R 5 and R 6 , according to (1), while the control phases Φ = (Φ 1 , Φ 2 ) are tuned by resistors R 1 and R 2 (see Methods).

Experimental adaptive multiphase estimation
We perform the experiment by continuously adapting the present tunable circuit following the optimized Bayesian-SMC method [strategy (ii)]. This allows us to achieve best attainable estimation with a limited number of resources. The probes are heralded single photons at 785 nm generated by a degenerate type-II SPDC process inside a BBO crystal, pumped by a pulsed 392.5 nm laser. A photon from each pair is sent through the circuit, entering in input 1, and acts as probe, while the other photon acts as the trigger for the heralding process (see Fig. 3). An event is then recorded as the coincidence between the trigger detector and one of the three outputs of the circuit. The interaction of the probe with the chip operator encodes information about φ onto its state. Finally, the result of the measurement is collected and used to identify the optimal settings for the next experimental step. The phases φ to be estimated can be chosen by setting the currents flowing in three resistors R 4 , R 5 , R 6 (see Methods). In order to test the protocol over different estimation experiments, we have identified N ph = 15 pair of phases uniformly distributed (Fig. 4). Resistors R 1 , R 2 are used to tune the control phases necessary for the adaptive strategy. After the first event, where currents I R1 , I R2 are chosen at random, we implement strategy (ii): optimal control phases Φ are calculated by minimizing the expected posterior variance. The nearest available control currents I R1 , I R2 , limited by the precision of our power supply (Keithley 2230), are calculated and effective control phases are applied to the device. The calculation of the prior distribution for each step is made through the particle approximation. A uniform grid of n = 2000 pairs of phases (Fig. 5a) is assumed as initial set for the prior distribution. This choice is performed yo avoid any possible harmful periodicity during the estimation process. Examples of prior information evolution during an experiment are reported in Fig. 5b-d. In Fig. 5 c the resampling step is shown, where particles with zero weight of the previous step (Fig. 5b) are rearranged in more significant locations (see Sec. IA of Supplementary Information for more details). Each pair is estimated N exp = 100 times, adopting N = 100 resources (pho-tons) as for the numerical simulations discussed above. Single experiments are reported in details in Sec. II of Supplementary information. Algorithm performances are shown in Fig.  6. A first evaluation consists in averaging the experimental quadratic loss for each pair of phases over all N exp independent runs. As a result, the overall quadratic loss L(φ,φ) saturates the CRB with a limited number of resources, in agreement with the numerical simulations described above. Furthermore, saturation occurs both for off-and diagonal matrix elements of the CRB. In particular, the latter show that the CRB is reached with similar performances in the estimation of both phases. This result is a fundamental feature for multiparameter metrology tasks when both parameters are treated equally. In our case the resulting difference in estimation of the two parameters is less than 10%, when compared to the sensitivity bound. Furthermore, a heuristic estimation of the convergence time to saturate CRB can be calculate by studying the difference L(φ,φ) − Tr[(F exp ) −1 ]/N . A characteristic time can be computed by using a + b exp −(N/τ N ) as fit function, with a, b, τ N ∈ R the fitting parameters. The value obtained for τ N is τ fit N = 5.6, which underlines the good performance of the adaptive adopted technique in using small number of probes. Another significant property of Bayesian approach is the ability to provide the statistical error in each step of the estimation process, calculated as the variance of the posterior distribution. Final estimated pairs fall on average within the error from true set values of phases (Fig.  4). All these experimental results demonstrate the quality of Bayesian-SMC strategy, confirming it as largely suitable for multiparameter estimation problems. Implementation of this strategy has been enabled onlyby the high reconfigurability of our employed integrated device, which highlights the fundamental role of an appropriate platform for metrology tasks which involve more than one parameter.

DISCUSSION
Multiparameter estimation is a fundamental problem for the realization of realistic quantum sensors in several scenarios. In this task, there are still several open problems and a comprehensive framework has yet to be defined. Hence, it is crucial to identify an experimental platform versatile enough to address different possible approaches. Multiphase estimation provides an ideal scenario with different practical applications. Furthermore, it represents a testbed for different multiparameter estimation protocols. Applying these to real world scenario requires a further step, that is, the optimization of the available resources, so as to attain the minimum reachable uncertainties after a sufficiently small number of measurements. This can be achieved by implementing adaptive strategies.
Here, we have reported the first experimental implementation of a multiphase Bayesian adaptive protocol on an integrated platform, optimized to operate in the limited data regime. We have reviewed different adaptive strategies and selected the one optimizing the cost function given by the trace of the covariance matrix. This has been employed to perform several simultaneous estimations of uniformly dis- Red shaded regions represents the interval where L(φ,φ) relative to each single phase (averaged on its Nexp runs) can be found. b, Analysis of diagonal elements of CRB by comparing quadratic loss relative to the single phase of the estimated pair L(φi,φi) (with i = 1, 2) (blue triangles) and Tr[(Fexp) −1 ]/(2N ) (blue solid line). The algorithm shows symmetric optimal performances for estimation of both parameters, by using the same amount of resources. This feature is highlighted by the inset panel, where the ratio ∆L(φ,φ) between the difference of the two estimations and the bound value is reported. c, Analysis of phase correlations by comparing off-diagonal terms of F −1 exp (see Sec. III of Supplementary Information) and [(Fexp) −1 ]12/N (green solid line). d, Estimation of convergence time (τN ) to CRB. The value can be estimated by fitting the distance between the averaged L(φ,φ) and CRB, after N > 2. The adopted fit function is a+b exp (−N/τN ), with a, b, τN ∈ R the fitting parameters, leading to τN = 5.6. The choice of this function is performed to provide a reasonable estimation of τN , as the number of probes necessary to approach the CRB. these techniques can be directly generalized for multi-photon quantum probes which would provide insight on the achievable quantum accuracy limit. At the same time, the algorithm here described can be applied to more complex integrated platforms, which enable optimized extraction of information. Further perspectives include the study of different multiparameter scenarios, as well as practical applications to quantum sensing of delicate samples [69].

Tuning of circuit parameters for adaptive two-phase estimation
We discuss in more details how we exploit the phases in our interferometer. The pair (φ 1 , φ 2 ) represents the unknown phases relative to a reference arm with phase φ ref (Fig. 1b). All the relevant phases of the circuit can be finely tuned by means of 8 resistors.
The first step performed aimed at finding the optimal choice for the tritter phases φ A T , φ B T to maximize the sensitivity of the interferometer. To achieve this goal, we first evaluate the Fisher information matrix F exp associated to the device from the experimentally estimated parameters, Then, we numerically minimize Tr[(F exp ) −1 ] over all possible values of φ A T , φ B T and internal phases ∆φ, in the allowed range of dissipated powers (the upper threshold being ∼ 1 W, to avoid possible damages to the resistors). We identify such minimum for all combinations of possible inputs and reference arms. The best scenario for our interferometer corresponds to use mode 2 as reference mode and arm 1 as input mode for single photons, with the following values of phases: φ A T = 1.49 rad, φ B T = 0.72 rad, ∆φ 1 = −3.07 rad and ∆φ 2 = 0.34 rad. In this working point, the trace of the inverse of Fisher Information matrix is Tr[(F exp ) −1 ] = 4.2. We now have to assign each resistor R i (i = 1, . . . , 6) to tune both the unknown phase shifts φ = (φ 1 , φ 2 ), and the control phases Φ = (Φ 1 , Φ 2 ) for the adaptive algorithms. More specifically, we choose to employ resistors R 4 , R 5 and R 6 to tune φ. Conversely, the control phases Φ are those modified by dissipating power in R 1 and R 2 . Hence, considering (1) as ∆φ = φ+Φ, we find the following expressions: with j = 1, 2. Note that, in principle, only 4 resistors would be sufficient to tune independently the 4 phase shifts (2 unknown and 2 controls). However, we employed 5 resistors in order to obtain large tunability of the device within limits of the damage threshold of each resistor.

I. ADAPTIVE BAYESIAN ESTIMATION BASED ON SEQUENTIAL MONTE CARLO APPROACH
As explained in the main text, the machine learning technique exploited to realize two-phase estimation experiments is based on approximating the prior probability distribution support with M discrete particles [1]. More specifically, a probability weight w i is associated to each i−particle by keeping normalization M i=1 w i = 1. Then, the posterior distribution is updated according to the Bayes rule. In this section we discuss two aspects of this technique: the resampling strategy, and the utility function chosen to tune the control parameters. Finally, we report some experimental data to provide an overview of the overall estimation process.
A. Resampling strategy Fig. 5a of the main text shows that the initial support is uniformly covered by particles. During the estimation protocol, the updating process according to the Bayes rule modifies the particles weights towards more likely phases values. Greater weights are attribute to particles closer to the estimated values. Conversely, distant particles tend to zero weights according to the normalization rule (Fig. 5b of the main text), thus bringing no useful information. Furthermore, the estimation sensitivity of the unknown phase pairs is limited by the initial density of particles around the true values. To solve these problems, resampling technique can be adopted to update particles in more significant positions. Following Ref. [1], the employed resampling technique is based on introducing, at particular steps of the process, perturbations on the particles ({φ}) to move them on more likely position ({φ }), according to the posterior probability. First, M particles are selected randomly following the prior distribution. Then, the selected particles are moved randomly according to the multivariate distribution defined by: The various distribution peaks are generated by displacing the original estimated pair (µ) in the direction of each φ i , of a quantity µ i = aµ + (1 − a)φ i . The parameter a is the resampling parameter, that we set to a = 0.98 as suggested in Ref. [1]. The covariance matrix (Σ) is calculated by multiplying (1 − a 2 ) with the covariance matrix of the initial particles. As a result, the particles are rearranged by increasing the density around the estimated phase values. Then, the weights of new particles are set to a uniform distribution (w i = 1/M ), and the learning process restarts. Resampling is performed when the following condition is fulfilled:

B. Utility function
The utility function U defines the figure of merit that is employed to tune the control parameters Φ during the estimation process. Canonical choices for the utility function are the information gain or the quadratic loss. In our case * fabio.sciarrino@uniroma1.it arXiv:2002.01232v1 [quant-ph] 4 Feb 2020 we chose to minimize the expected variance of the posterior distribution after each step. Given the prior distribution {w}, each output d of three possible cases will update the posterior following the Bayes rule, and a precise overall variance can be assigned to that specific output. The overall variance is computed by tracing the covariance matrix associated to the posterior distribution: U (d|{w}) = Tr[Cov(φ|d, {w})]. The expected variance U (Φ) is computed by averaging this quantity over the probability to obtain that specific output p(d|Φ). More specifically, the utility function reads: where p(d|Φ) is given by ). Finally, we note that p(d|φ, Φ) represents the likelihood of the system, which is discussed in the Sec. II. A suitable characterization of the device is thus crucial to correctly apply the algorithm.

C. Adaptive two-phase estimation experiments
In Fig. 1 we report some examples of multiphase estimation experiments performed on our device, by using the adaptive protocol. In particular, we report the overall quadratic loss L(φ,φ) obtained for six different pairs, which is estimated N exp = 100 independent times. The mean L(φ,φ) over all N exp runs demonstrates that the Cramér-Rao bound (CRB) is saturated for all pairs of phases, after a limited number of data. The CRB is computed as Tr[(F exp ) −1 ], where F exp is the reconstructed Fisher Information matrix of the device (see Sec. III). Our results show that all the estimated phases are close to the true set phases within one standard deviation, thus confirming the efficiency of the adopted learning algorithm. The estimation error is directly provided by the Bayesian protocol through the covariance matrix of the posterior distribution.

II. LIKELIHOOD ESTIMATION OF THE CIRCUIT
A crucial step for the implementation of phase estimation protocols is a precise characterization of the circuit. Modeling of its action comprises 58 parameters that have to be evaluated.
The 14 static parameters are beam-splitter transmittivities t i (i = 1, . . . , 6), the relative phase shifts with zero applied current φ j0 (j = 1, 2) and φ k T (k = A, B), and the two independent visibilities with constant background noise. Conversely, the dynamic parameters are the remaining 44 response coefficients α ji and α N L jik [Eq. (1) of the main text]. Of those parameters, 24 describe the action of the internal resistors R i (i, j = 1, . . . , 6) and 4 describe the tritter resistors R j (j = A, B). Indeed, the phases φ A T and φ B T have been considered independent from the power dissipated on other resistors, since cross-talk effects with the remaining elements of the device are negligible. The last 16 parameters describe the dependence of the 8 resistors on the applied currents. We measured the three output probabilities P (1 → i) (i = 1, 2, 3), injecting single photon along input 1 of the circuit. More details on the employed characterization method can be found in Ref. [2].
We performed different measurements of the output probabilities by fixing the currents on the resistors relative to the unknown phases (R 3 , R 4 and R 5 ), and by varying those applied on the resistors associated to feedback phases (R 1 and R 2 ). For instance, when no current is applied on resistors R 3 , R 4 and R 5 , we estimate the chip parameters by fitting the probabilities as function of the currents applied on R 1 and R 2 . The obtained value of the χ 2 is 1428 with 1200 experimental points. The comparison between the fit function and the experimental data is shown in Fig.  2. Note that the values of the normalized χ 2 is below 1.3 for each data set.
The probability function, that depends on the fitted parameters, represents the likelihood associated to the circuit and is the basis of the Bayesian protocol described in the main text.

III. FISHER INFORMATION MATRIX
After the characterization of all parameters, we reconstructed the Fisher Information matrix F exp . Then, we optimized the quantity Tr[(F exp ) −1 (∆φ 1 , ∆φ 2 , φ A T , φ B T )] over the phases ∆φ 1 , ∆φ 2 , φ A T , φ B T , in the range of total dissipated power permitted by the circuit. Indeed, a total dissipated power greater than 1 W could damages to the resistors. We found that the minimum value of this quantity is reached when the single photons are injected along input 1, arm 2 of the circuit is chosen as a reference, and the values of tritter and internal phases are the following: φ A T = 1.49 rad, φ B T = 0.72 rad, ∆φ 1 = −3.07 rad and ∆φ 2 = 0.34 rad. The inset shows the average estimated phases (red dots), the true phases to be estimated (blue dot) and the confidence interval associate to the covariance matrix (orange region).
Fixing these conditions we reconstruct the Fisher Information matrix, obtaining: Hence, estimating the two phases with N probes, the bound over the sum of the quadratic losses is: