## Introduction

The basic principles of quantum mechanics strongly limit the processes that can be performed in nature. The linearity of quantum operations prohibits the deterministic creation of perfect copies of unknown quantum states1. Information encoded in non-orthogonal quantum states can be perfectly recovered only at the expense of allowing a probability of error2,3,4. Measurements, a fundamental tool for understanding the natural world, are also limited. In marked contrast with the classical realm, a measurement on a quantum system alters the quantum state of the system, which prevents further information gain. Consequently, an ensemble of many identically prepared systems is necessary to obtain information about a quantum system, and quantum measurements become statistical in nature5,6,7. Furthermore, quantum mechanics establishes through the quantum Cramér–Rao bound8,9,10 a fundamental limit on the best accuracy that a measurement or estimation process can achieve as a function of the ensemble size.

The accuracy limit also bounds the estimation of unknown quantum processes and states. These are central problems in the theory of quantum measurements11 and play a key role in the control of quantum systems, the benchmarking of quantum technologies12,13,14, and quantum metrology15. Today there is a large collection of methods11,16,17,18,19,20,21,22,23,24,25 for estimating unknown quantum states, which are collectively known as quantum tomographic methods. These are based on the post-processing of data acquired through the measurement of a set of positive operator-valued measures in an ensemble of N identically, independently prepared copies of the unknown state to be estimated. Recently, there has been a great deal of research activity aimed at designing tomographic methods with increasing accuracy, in particular methods that saturate the quantum accuracy limit. From the theoretical and experimental point of view, it has been shown that a two-stage adaptive tomographic method achieves the quantum accuracy limit in the case of unknown mixed states of a single qubit26,27. Unfortunately, this result does not hold in the higher dimensional case28.

Here, we report an adaptive tomographic method that asymptotically approaches the quantum accuracy limit in the important case of estimating unknown pure quantum states in high dimensions. This is an instance of multi-parameter estimation, where due to the information trade-off among incompatible observables the progress has been slow. In the case of pure states the quantum accuracy is limited by the Gill–Massar lower bound29. The method is based on the concatenation of the Complex simultaneous perturbation stochastic approximation (CSPSA), a recently proposed iterative stochastic optimization method on the field of the complex numbers30, and Maximum likelihood estimation (MLE), a well known statistical inference method31. The method here proposed reaches the quantum accuracy limit after a small number of iterations, typically of the order of 8, for all inspected dimensions. Thereby, the method makes an optimal use of the ensemble size and surpasses the estimation accuracy of known methods for pure-state tomography32,33,34,35. Moreover, the method also surpasses the estimation accuracy of any tomographic method designed to estimate mixed states via separable measurements on the ensemble of equally prepared copies.

## Method

The accuracy achieved in the estimation of the parameters of a quantum state can be studied by means of three fundamental inequalities. These are the Cramér–Rao inequality36 $${\mathcal {C}}\ge {\mathcal {I}}^{-1}$$, the quantum Cramér–Rao inequality36,37 $${\mathcal {C}}\ge {\mathcal {J}} ^{-1}$$, and the Gill–Massar inequality29 $$Tr(\mathcal {IJ}^{-1})\le d-1$$, where $$d$$, $${\mathcal {C}}$$, $${\mathcal {I}}$$, and $${\mathcal {J}}$$ are the dimension of the Hilbert space, the covariance matrix, the classical Fisher information matrix, and the quantum Fisher information matrix, respectively. These inequalities allow one to deduce lower bounds for several metrics of accuracy, such as infidelity or mean square error, considering the impact of the ensemble size. As accuracy metric we employ the infidelity38,39 $$I(|{\tilde{\psi }}\rangle ,|{\hat{\psi }}\rangle )=1-|\langle {\tilde{\psi }}|{\hat{\psi }}\rangle |^2$$ between an unknown state $$|{\tilde{\psi }}\rangle$$ and its estimate $$|{\hat{\psi }}\rangle$$. The estimation accuracy is given by the expectation or mean value $${\bar{I}}(|{\tilde{\psi }}\rangle )$$ of the infidelity with respect to all possible estimates $$|{\hat{\psi }}\rangle$$ of a fixed unknown state $$|{\tilde{\psi }}\rangle$$, that is,

\begin{aligned} \bar{I}(|{\tilde{\psi }}\rangle )={\mathbb {E}}[I(|{\tilde{\psi }}\rangle ,|{\hat{\psi }}\rangle )||{\hat{\psi }}\rangle ]=\int I(|{\tilde{\psi }}\rangle ,|{\hat{\psi }}\rangle )p(|{\hat{\psi }}\rangle )d{\hat{\psi }}, \end{aligned}
(1)

where $$p(|{\hat{\psi }}\rangle )$$ is the probability density function of obtaining the estimate $$|{\hat{\psi }}\rangle$$. Depending on the general characteristics of the estimation process, the fundamental inequalities lead to various lower bounds for the estimation accuracy. For the estimation of pure states, which have $$2(d-1)$$ independent parameters, the ultimate quantum estimation accuracy is given by the inequality $$\bar{I}(|{\tilde{\psi }}\rangle )\ge {\bar{I}}_p$$, with $${\bar{I}}_{p}=(d-1)/N$$29,40,41 the Gill–Massar lower bound. This state-independent bound can be used as a benchmark to assess tomographic methods.
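As a minimal numerical sketch of the two quantities just introduced, the following Python fragment evaluates the infidelity between two pure states and the Gill–Massar benchmark $$(d-1)/N$$. The function names `infidelity` and `gill_massar_pure` are illustrative choices, not part of the original work, and the states are assumed to be given as vectors of complex amplitudes.

```python
import numpy as np

def infidelity(psi_true, psi_est):
    """Infidelity 1 - |<psi_true|psi_est>|^2 between two pure states."""
    psi_true = np.asarray(psi_true, dtype=complex)
    psi_est = np.asarray(psi_est, dtype=complex)
    # Normalize so that unnormalized amplitude vectors are accepted too.
    psi_true = psi_true / np.linalg.norm(psi_true)
    psi_est = psi_est / np.linalg.norm(psi_est)
    overlap = np.vdot(psi_true, psi_est)  # vdot conjugates its first argument
    return 1.0 - abs(overlap) ** 2

def gill_massar_pure(d, n_copies):
    """Benchmark (d - 1) / N for the mean infidelity of pure-state estimation."""
    return (d - 1) / n_copies

if __name__ == "__main__":
    print(infidelity([1, 0], [1, 1]))   # overlap 1/sqrt(2), so infidelity 1/2
    print(gill_massar_pure(4, 1000))    # d = 4, N = 1000 copies
```

A tomographic method can then be assessed by comparing its empirical mean infidelity against `gill_massar_pure(d, N)` for the same total ensemble size.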

Our main goal is designing a tomographic method for pure quantum states in high dimensions that achieves an estimation accuracy equal to $${\bar{I}}_{p}$$. The design of the proposed tomographic method originates from observing42,43,44,45 that an unknown and fixed pure quantum state $$|{\tilde{\psi }}\rangle$$ can be characterized as the minimizer of the infidelity in the Hilbert space of the estimate $$|{\hat{\psi }}\rangle$$, that is, $$I=0$$ when $$|{\hat{\psi }}\rangle =|{\tilde{\psi }}\rangle$$. Then, we can envision the use of an optimization method to iteratively drive a sequence of estimates toward decreasing values of the infidelity.

The choice of the optimization method requires certain consideration. Traditional optimization methods are based on the evaluation of higher derivatives of the function to be optimized. In the case of the infidelity this is not possible, since the derivatives depend on the state to be estimated, which is unknown. We also require a fast convergence rate from the optimization method. Furthermore, the implementation of the method should be as economical as possible from the point of view of computing and physical resources.

To deal with these situations we resort to the Complex simultaneous perturbation stochastic approximation. This method can optimize real valued functions of complex variables that also depend on unknown complex parameters. The infidelity is a non-holomorphic function, that is, it violates the Cauchy–Riemann conditions, which are necessary for the existence of a complex derivative. In this case, the usual approach is to optimize with respect to a real parametrization of the complex variables entering in the infidelity, that is, the complex probability amplitudes. This approach has some unwanted side effects. The elements of the real-valued gradient of the infidelity are in general more convoluted than those of a complex gradient formed by first order derivatives with respect to the initial complex variables. Additionally, any inherent structures present in the complex derivatives of the infidelity, which could be exploited to enhance the performance of optimization methods, are unused46. The CSPSA method has been designed to avoid these unwanted features. The main tool in the formulation of CSPSA is the Wirtinger complex calculus. For a target function $$f({\varvec{z}},{\varvec{z}}^*):{\mathbb {C}}^n\times {\mathbb {C}}^n\rightarrow {\mathbb {R}}$$ the Wirtinger derivatives are defined by47

\begin{aligned} \partial _{z_i}=\frac{1}{2}(\partial _{x_i}-i\partial _{y_i})~\mathrm{and}~\partial _{z_i^*}=\frac{1}{2}(\partial _{x_i}+i\partial _{y_i}), \end{aligned}
(2)

where $$x_i$$ and $$y_i$$ are the real and imaginary parts of $$z_i$$, respectively. These derivatives exist even if $$f({\varvec{z}},{\varvec{z}}^*)$$ is non-holomorphic. Extremal points of $$f({\varvec{z}},{\varvec{z}}^*)$$ are completely characterized by the conditions $$\partial _{z_i^*}f=0~\forall ~i=1,\dots ,d$$ or, equivalently, $$\partial _{z_i}f=0~\forall ~i=1,\dots ,d$$48,49,50. Thereby, the complex gradient is defined by $${\varvec{g}}=\partial _{{\varvec{z}}^*}f$$ with $$\partial _{{\varvec{z}}^*}=(\partial _{z_1^*},\dots , \partial _{z_d^*})$$. The CSPSA method is defined by the iterative rule30

\begin{aligned} \hat{\varvec{z}}_{k+1}=\hat{\varvec{z}}_k-a_k\hat{\varvec{g}}_k(\hat{\varvec{z}}_k,\hat{\varvec{z}}_k^*), \end{aligned}
(3)

where $$a_k$$ is a positive gain coefficient and $$\hat{\varvec{z}}_k$$ is the estimate of the minimizer $$\tilde{\varvec{z}}$$ of $$f({\varvec{z}},{\varvec{z}}^*)$$ at the k-th iteration. The iteration starts from an initial guess $$\hat{\varvec{z}}_0$$, which is randomly chosen. Instead of employing the complex gradient $${\varvec{g}}$$, CSPSA resorts to an estimator $$\hat{\varvec{g}}_k(\hat{\varvec{z}}_k,\hat{\varvec{z}}_k^*)$$ for the Wirtinger gradient $${\varvec{g}}$$ of $$f({\varvec{z}},{\varvec{z}}^*)$$ the components of which are defined by

\begin{aligned} {\hat{g}}_{k,i}=\frac{f(\hat{\varvec{z}}_{k+},\hat{\varvec{z}}_{k+}^*)+\epsilon _{k,+}-(f(\hat{\varvec{z}}_{k-},\hat{\varvec{z}}_{k-}^*)+\epsilon _{k,-})}{2c_k{\Delta }_{k,i}^*}, \end{aligned}
(4)

with $$\hat{\varvec{z}}_{k\pm }=\hat{\varvec{z}}_k\pm c_k{\varvec{\Delta }}_k$$, where $$c_k$$ is a positive gain coefficient and $$\epsilon _{k,\pm }$$ describes the presence of noise in the values of $$f(\hat{\varvec{z}}_{k\pm },\hat{\varvec{z}}_{k\pm }^*)$$. The components of the vector $${\varvec{\Delta }}_k\in {\mathbb {C}}^n$$ are identically and independently distributed random variables in the set $$\{\pm 1,\pm i\}$$. The gain coefficients $$a_k$$ and $$c_k$$ control the convergence of CSPSA and are chosen as

\begin{aligned} a_k=\frac{a}{(10k+1+A)^s},~~c_k=\frac{b}{(10k+1)^r}, \end{aligned}
(5)

where the values of $$a$$, $$A$$, $$s$$, $$b$$, and $$r$$ can be adjusted to increase the rate of convergence. The estimates $${\varvec{\hat{z}}}_k$$ provided by CSPSA converge asymptotically in mean to the minimizer $$\varvec{\tilde{z}}$$ of $$f$$, and $$\hat{\varvec{g}}_k$$ is an asymptotically unbiased estimator of the Wirtinger gradient.
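The update rule of Eq. (3), the gradient estimator of Eq. (4), and the gains of Eq. (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the target is a noise-free toy quadratic $$f=\Vert {\varvec{z}}-{\varvec{t}}\Vert ^2$$ (whose Wirtinger gradient with respect to $${\varvec{z}}^*$$ is $${\varvec{z}}-{\varvec{t}}$$) rather than the infidelity, the gain values are picked for the toy problem, and the names `gains`, `cspsa_gradient`, and `cspsa` are hypothetical.

```python
import numpy as np

def gains(k, a=3.0, A=0.0, s=1.0, b=0.1, r=0.166):
    """Gain sequences a_k and c_k with the functional form of Eq. (5)."""
    a_k = a / (10 * k + 1 + A) ** s
    c_k = b / (10 * k + 1) ** r
    return a_k, c_k

def cspsa_gradient(f, z, c_k, rng):
    """Single-sample estimate of the Wirtinger gradient, Eq. (4), without noise."""
    # Each component of Delta is drawn uniformly from {+1, -1, +i, -i}.
    delta = rng.choice(np.array([1, -1, 1j, -1j]), size=z.shape)
    f_plus = f(z + c_k * delta)
    f_minus = f(z - c_k * delta)
    return (f_plus - f_minus) / (2 * c_k * np.conj(delta))

def cspsa(f, z0, n_iter, rng, **gain_kwargs):
    """Iterate z_{k+1} = z_k - a_k * g_k, Eq. (3), starting from z0."""
    z = np.array(z0, dtype=complex)
    for k in range(n_iter):
        a_k, c_k = gains(k, **gain_kwargs)
        z = z - a_k * cspsa_gradient(f, z, c_k, rng)
    return z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = np.array([0.5 + 0.2j, -0.3j, 1.0 + 0j])    # toy minimizer (assumption)
    f = lambda z: float(np.sum(np.abs(z - target) ** 2))
    z0 = np.array([1.0 + 1.0j, 0.2 + 0.05j, -0.4 + 0.37j])
    z_final = cspsa(f, z0, 50, rng, a=0.15, b=0.1)
    print(f(z0), f(z_final))
```

For this toy quadratic, averaging many calls to `cspsa_gradient` at a fixed point approaches $${\varvec{z}}-{\varvec{t}}$$, which illustrates the unbiasedness stated above. In the tomographic setting the two function evaluations are replaced by measured infidelities.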

The use of an estimator $$\hat{\varvec{g}}_k(\hat{\varvec{z}}_k,\hat{\varvec{z}}_k^*)$$ for the Wirtinger gradient $${\varvec{g}}$$ allows CSPSA to optimize functions with unknown parameters, provided that the values of $$f(\hat{\varvec{z}}_{k\pm },\hat{\varvec{z}}_{k\pm }^*)$$ can be obtained and the unknown parameters remain constant along the optimization procedure. This is precisely the case of the infidelity. Considering an unknown pure quantum state $$|\psi ({\varvec{\tilde{z}}})\rangle$$ of a qudit given by

\begin{aligned} |\psi ({\varvec{\tilde{z}}})\rangle =\frac{1}{\sqrt{N}}\sum _{i=0}^{d-1}{\tilde{z}}_i|i\rangle , \end{aligned}
(6)

with $$N=\sum _{i=0}^{d-1}|{\tilde{z}}_i|^2$$, and an estimate $$|\psi ({\varvec{\hat{z}}})\rangle$$ of $$|\psi ({\varvec{\tilde{z}}})\rangle$$ given by

\begin{aligned} |\psi ({\varvec{\hat{z}}})\rangle =\frac{1}{\sqrt{K}}\sum _{i=0}^{d-1}{{\hat{z}}}_i|i\rangle , \end{aligned}
(7)

with $$K=\sum _{i=0}^{d-1}|{{\hat{z}}}_i|^2$$, the infidelity becomes

\begin{aligned} I(|\psi (\varvec{\tilde{z}})\rangle ,|\psi (\varvec{\hat{z}})\rangle )=1-\frac{|{\varvec{\hat{z}}}\cdot \tilde{\varvec{z}}^*|^2}{NK}. \end{aligned}
(8)

This function quantifies the deviation of the estimate $$|\psi ({\varvec{\hat{z}}})\rangle$$ from the true unknown state $$|\psi (\tilde{\varvec{z}})\rangle$$. In this function $$\varvec{\hat{z}}$$ is a set of complex variables and $$\varvec{\tilde{z}}$$ plays the role of a set of fixed unknown complex parameters. According to Eq. (4) CSPSA evaluates at each iteration k the infidelity $$I(|\psi (\varvec{\tilde{z}})\rangle ,|\psi (\varvec{\hat{z}})\rangle )$$ at $${\varvec{\hat{z}}}=\hat{\varvec{z}}_{k\pm }$$. These values can be obtained by projecting the unknown state $$|\psi (\varvec{\tilde{z}})\rangle$$ onto a d-dimensional orthonormal basis $$B_{k\pm }=\{|\psi _{i,k\pm }\rangle \}$$ (with $$i=0,\dots ,d-1$$) that contains the state $$|\psi (\hat{\varvec{z}}_{k\pm })\rangle$$. This procedure generates the detection statistics $$n_{i,k\pm }$$ that lead to the probability distributions $$p_{i,k\pm }=n_{i,k\pm }/\sum _jn_{j,k\pm }$$, where $$N_{est}=\sum _jn_{j,k\pm }$$ is the total number of copies of the unknown state employed in each projective measurement. These probability distributions are employed to obtain the estimates

\begin{aligned} I(|\psi (\varvec{\tilde{z}})\rangle ,|\psi (\hat{\varvec{z}}_{k\pm })\rangle )=1-p_{0,k\pm } \end{aligned}
(9)

for the infidelity, where we have assumed the convention $$|\psi _{0,k\pm }\rangle =|\psi (\hat{\varvec{z}}_{k\pm })\rangle$$. These estimates together with Eq. (3) generate the next estimate $$|\psi (\hat{\varvec{z}}_{k+1})\rangle$$ for the unknown quantum state $$|\psi (\tilde{\varvec{z}})\rangle$$. The CSPSA method can be understood as a generalization of the Simultaneous perturbation stochastic approximation (SPSA)51,52, a well known stochastic gradient-free optimization method working in the field of the real numbers. SPSA has been proposed42 and experimentally demonstrated44 as a tomographic method for pure quantum states. In this context, it has been shown30 that CSPSA achieves a higher convergence rate than SPSA.
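The measurement step behind Eq. (9) can be simulated as follows. This is a sketch under assumptions that are not taken from the original work: the orthonormal basis containing the probe state is completed at random through a QR decomposition (any completion would serve), the counts are drawn from a multinomial distribution, and the helper names `basis_containing` and `measured_infidelity` are hypothetical.

```python
import numpy as np

def basis_containing(state, rng):
    """Return a unitary whose column 0 equals `state` up to a global phase."""
    d = state.size
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m[:, 0] = state
    q, _ = np.linalg.qr(m)   # Gram-Schmidt-like orthonormalization of the columns
    return q

def measured_infidelity(true_state, probe_state, n_est, rng):
    """Estimate 1 - p_0 from n_est simulated projective measurements."""
    true_state = np.asarray(true_state, dtype=complex)
    probe_state = np.asarray(probe_state, dtype=complex)
    true_state = true_state / np.linalg.norm(true_state)
    probe_state = probe_state / np.linalg.norm(probe_state)
    basis = basis_containing(probe_state, rng)
    probs = np.abs(basis.conj().T @ true_state) ** 2   # Born probabilities
    probs = probs / probs.sum()                        # guard against rounding
    counts = rng.multinomial(n_est, probs)             # detection statistics n_i
    return 1.0 - counts[0] / n_est, counts

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    psi = np.array([1, 1j, 0.5, -0.2j])
    phi = np.array([1, 0, 0, 0])
    estimate, counts = measured_infidelity(psi, phi, 10**4, rng)
    print(estimate, counts)
```

Only `counts[0]` enters the infidelity estimate. The remaining entries of `counts` are the data that plain CSPSA leaves unused.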

CSPSA generates two probability distributions at each iteration, that is, a total of 2d probabilities. However, only two of them are employed to estimate the required values of the infidelity. The remaining $$2d-2$$ probabilities are not used by the algorithm. Thus, CSPSA generates a large amount of accumulated data that is simply discarded. Here, we show that precisely this information can be used to improve the convergence rate of CSPSA in such a way that the estimation accuracy reaches the Gill–Massar lower bound for the estimation of pure quantum states. This is accomplished by resorting to Maximum likelihood estimation, a well known statistical inference method that is extensively employed as a post-processing stage in quantum tomographic methods. MLE aims at estimating unknown parameters of a population from observed data. The underlying idea is to choose as estimator the maximizer of the probability of obtaining the observed data53,54. MLE was introduced in quantum tomography as a post-processing method31 to obtain physically acceptable quantum states. A quantum system that undergoes a measurement process described by the set of projectors $$\{|a_i\rangle \langle a_i|\}$$ has a likelihood function $$L(\rho )$$ given by

\begin{aligned} L(\rho )=\frac{N!}{\Pi _j n_j!}\Pi _i Tr(\rho |a_i\rangle \langle a_i|)^{n_i}, \end{aligned}
(10)

where the detection statistic of each projector is given by the number of counts $$n_i$$ and the total number of counts is given by $$N=\sum _in_i$$. If the quantum state prior to the measurement is $$\rho$$, then $$L(\rho )$$ is the total joint probability of registering data $$\{n_i\}$$. MLE is defined by the convex optimization problem

\begin{aligned} \arg \max _{\rho }L(\rho ),\, s.\,t.\, Tr(\rho )=1,\, \rho \ge 0. \end{aligned}
(11)

At this point we link together CSPSA and MLE. Employing the data $$\{n_{i,m\pm }\}$$ accumulated from iteration $$m=1$$ to $$m=k$$ we define the accumulated likelihood $$L_k(\rho )$$ by the expression

\begin{aligned} L_k(\rho )=\Pi _{m=1}^k\Pi _{\lambda =\pm }\Pi _{i=0}^{d-1}Tr(\rho |\psi _{i,m\lambda }\rangle \langle \psi _{i,m\lambda }|)^{n_{i,m\lambda }}, \end{aligned}
(12)

which is maximized in the set of pure states employing as starting guess the estimate $$|\psi (\hat{\varvec{z}}_{k+1})\rangle$$ provided by CSPSA. The refined estimate provided by MLE is then employed as starting guess for the next iteration with CSPSA. We refer to this procedure as the CSPSA-MLE tomographic method. The main steps of the CSPSA-MLE tomographic method have been summarized as pseudocode in Algorithm 1 above.
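The accumulated likelihood of Eq. (12) is most conveniently handled through its logarithm. The sketch below evaluates it for a pure-state candidate. It assumes that every measured basis is stored as a unitary matrix whose columns are the basis vectors and that the counts are stored as one array per basis. The names `accumulated_log_likelihood` and `expected_counts` are hypothetical, and the maximization itself (carried out over pure states in the method described above) is left to any suitable optimizer.

```python
import numpy as np

def accumulated_log_likelihood(state, bases, counts):
    """Logarithm of Eq. (12) evaluated at rho = |state><state|."""
    state = np.asarray(state, dtype=complex)
    state = state / np.linalg.norm(state)
    total = 0.0
    for basis, n in zip(bases, counts):
        probs = np.abs(basis.conj().T @ state) ** 2
        probs = np.clip(probs, 1e-300, None)   # avoid log(0) for unobserved outcomes
        total += float(np.sum(np.asarray(n) * np.log(probs)))
    return total

def expected_counts(state, bases, n_est):
    """Noise-free counts n_i = n_est * p_i, used here only for illustration."""
    state = np.asarray(state, dtype=complex)
    state = state / np.linalg.norm(state)
    return [n_est * np.abs(b.conj().T @ state) ** 2 for b in bases]

if __name__ == "__main__":
    bases = [np.eye(2, dtype=complex),
             np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)]
    true = np.array([np.cos(0.3), np.exp(0.7j) * np.sin(0.3)])
    counts = expected_counts(true, bases, 1000)
    guess = np.array([np.cos(0.35), np.exp(0.7j) * np.sin(0.35)])
    print(accumulated_log_likelihood(true, bases, counts))
    print(accumulated_log_likelihood(guess, bases, counts))
```

With noise-free counts the true state attains the largest log-likelihood among all candidates, which is the property that the MLE refinement exploits when real, finite-statistics counts are used instead.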

## Results

For a fixed unknown state, the CSPSA-MLE method exhibits three sources of randomness. Since there is no a priori information about the unknown state, the initial guess is chosen according to a Haar-uniform distribution. At each iteration, the vector $${\varvec{\Delta }}_k$$ is also randomly chosen and two measurements are performed on an ensemble of size $$N_{est}$$, which leads to finite statistics effects. Thus, each time the CSPSA-MLE method is employed to estimate a fixed unknown state $$|\psi (\varvec{\tilde{z}})\rangle$$, a different estimate $$|\psi (\varvec{\hat{z}})\rangle$$ is generated. In this scenario, the accuracy of the estimation procedure for a fixed unknown state $$|\psi (\varvec{\tilde{z}})\rangle$$ is given by the expectation value $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ of Eq. (1).

To study the performance of the CSPSA-MLE method several Monte Carlo experiments in the regime of a small number of iterations were carried out. A set $$\Omega _d$$ with $$2\times 10^2$$ pure quantum states $$|\psi (\varvec{\tilde{z}})\rangle$$ of a single d-dimensional quantum system (qudit), uniformly distributed on the unit hypersphere, was generated. Each state in $$\Omega _d$$ was reconstructed via the CSPSA-MLE method considering a number G of initial guesses, also uniformly distributed on the unit hypersphere, and R independent simulations for each fixed pair of unknown state and initial guess. At each iteration the values of $$I(|\psi (\tilde{\varvec{z}})\rangle , |\psi (\hat{\varvec{z}}_{k,\pm })\rangle )$$ were estimated considering a multinomial distribution on an ensemble of size $$N_{est}$$. For each state in $$\Omega _d$$, the mean, variance, median and interquartile range of the infidelity were estimated as functions of the number of iterations k for several values of the ensemble size $$N_{est}$$. Similar numerical experiments were performed via CSPSA without MLE for comparison purposes. Since the optimization of the gains is a computationally costly problem, we have resorted to the gains $$s=1$$ and $$r=0.166$$. These lead to a high rate of convergence for CSPSA in the regime of few iterations. The remaining coefficients have been set to $$A=0, a=3$$, and $$b=0.35,0.3,0.07,0.06,0.03$$ for $$N_{est}=10,10^2,10^3,10^4,10^5$$, correspondingly.
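A set such as $$\Omega _d$$ can be generated with a standard recipe: normalizing a vector of independent complex Gaussian amplitudes yields pure states uniformly distributed on the unit hypersphere. The sketch below uses this recipe, which is an assumption about how the sampling could be done and not a description of the code used for the reported simulations. The names `haar_random_states` and `B_BY_N_EST` are hypothetical, and the dictionary simply restates the values of $$b$$ quoted above.

```python
import numpy as np

# Gain coefficient b for each ensemble size N_est, as quoted in the text.
B_BY_N_EST = {10: 0.35, 10**2: 0.3, 10**3: 0.07, 10**4: 0.06, 10**5: 0.03}

def haar_random_states(n_states, d, rng):
    """Draw n_states pure states of dimension d, uniform on the unit hypersphere."""
    amps = rng.normal(size=(n_states, d)) + 1j * rng.normal(size=(n_states, d))
    return amps / np.linalg.norm(amps, axis=1, keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    omega_d = haar_random_states(200, 4, rng)    # 2 x 10^2 unknown states, d = 4
    guesses = haar_random_states(500, 4, rng)    # G = 500 initial guesses
    print(omega_d.shape, guesses.shape, B_BY_N_EST[10**3])
```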

Figure 1 displays a log-log plot of the mean infidelity $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ (stars), for two randomly chosen states $$|\psi (\varvec{\tilde{z}})\rangle$$ in $$\Omega _2$$, that is, for a single qubit, as a function of the number k of iterations and for several values of the ensemble size $$N_{est}$$. This mean infidelity is estimated as

\begin{aligned} {\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )=\frac{1}{RG}\sum _{\varvec{\hat{z}}}I(|\psi (\varvec{\hat{z}})\rangle ,|\psi (\varvec{\tilde{z}})\rangle ), \end{aligned}
(13)

with $$G=500$$ and $$R=20$$. Within the first 10 iterations the mean infidelity $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ exhibits a fast decrease that is followed by an asymptotic linear behavior. The decrease of $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ becomes more pronounced as $$N_{est}$$ increases. After 10 iterations the CSPSA-MLE method leads to a mean infidelity $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ approximately equal to $$10^{-2}, 7\times 10^{-4}, 5\times 10^{-5}, 5\times 10^{-6}$$ and $$5\times 10^{-7}$$, for increasing $$N_{est}$$. For the same states CSPSA without MLE yields after 10 iterations a mean infidelity $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ approximately equal to $$2.1\times 10^{-1}, 5.2\times 10^{-2}, 4.1\times 10^{-2}, 4.0\times 10^{-2}$$ and $$3.9\times 10^{-2}$$ (see Fig. 1 in30), for increasing $$N_{est}$$. Thus, the concatenation of CSPSA to MLE yields a mean infidelity that is roughly one to five orders of magnitude lower than the one provided by CSPSA alone. Let us note that this comparison considers the same type and amount of physical resources for both methods, that is, ensemble size $$N_{est}$$ for the estimation of the infidelity and total number 2dk of measurement outcomes. The estimation via CSPSA can reach mean infidelity values similar to those of the CSPSA-MLE method but at the expense of more iterations. For instance, with $$N_{est}=10^4$$ and after 10 iterations the CSPSA-MLE method delivers a mean infidelity of $$5\times 10^{-6}$$. A similar value can be achieved via estimation with CSPSA with $$N_{est}=10^4$$ after 100 iterations, which represents an increase of one order of magnitude in the total ensemble size N as well as in the total number of measurements. Thus, the concatenation of MLE to CSPSA provides a significant improvement in the rate of convergence and a large reduction of the required physical resources.

After 7 iterations and for $$N_{est}=10,10^2,10^3,10^4,10^5$$ the mean infidelity $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ asymptotically approaches a linear behavior. This resembles the Gill–Massar lower bound $${\bar{I}}_{p}$$ for the average infidelity reached in the estimation of pure states, where now $$N=2kN_{est}$$ is the total ensemble size employed after k iterations. This bound imposes a fundamental precision limit on the achievable mean infidelity: no method for the estimation of pure states attains a mean infidelity $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ lower than $${\bar{I}}_{p}$$. Dashed lines in Fig. 1 correspond to $${\bar{I}}_{p}$$ as a function of k for several values of $$N_{est}$$. Clearly, the CSPSA-MLE method delivers a mean infidelity of the same order of magnitude as $${\bar{I}}_{p}$$, albeit slightly higher. However, the gap between $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ and $${\bar{I}}_{p}$$ tends to close asymptotically as k increases. Thus, the infidelity provided by the CSPSA-MLE method tends to approach the Gill–Massar lower bound $${\bar{I}}_{p}$$ with a convergence rate that increases with $$N_{est}$$.

Figure 1 also displays the median infidelity $$\bar{M}(|\psi (\varvec{\tilde{z}})\rangle )$$ of $$I(|\psi (\varvec{\hat{z}})\rangle ,|\psi (\varvec{\tilde{z}})\rangle )$$ as a function of k for both randomly chosen states $$|\psi (\varvec{\tilde{z}})\rangle$$ and for several values of $$N_{est}$$. This exhibits a much faster decrease and an earlier onset of the asymptotic linear behavior. The shaded areas in Fig. 1 correspond to the interquartile range, which is divided into two areas above and below the median infidelity $$\bar{M}(|\psi (\varvec{\tilde{z}})\rangle )$$. In the linear regime, the Gill–Massar lower bound $${\bar{I}}_{p}$$ lies in the upper half of the interquartile range. This indicates that more than 50% of the reconstruction attempts lead to an infidelity lower than $${\bar{I}}_{p}$$. Nevertheless, the mean infidelity $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ is above the median $$\bar{M}(|\psi (\varvec{\tilde{z}})\rangle )$$. This points to the existence of a small fraction of realizations with high values of the infidelity, which raises the mean infidelity above the median infidelity. As the number of iterations increases the impact of these realizations on the value of the mean infidelity decreases and mean and median infidelity tend to reach similar values.

The main features exhibited in Fig. 1 are typical, that is, all randomly chosen states in $$\Omega _d$$ display a similar behavior. This is shown in Fig. 2 where mean $$\bar{{\mathbb {I}}}$$ (stars), median $$\bar{{\mathbb {M}}}$$ (dots) and interquartile range (shaded areas) of the mean infidelity $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ over all states in $$\Omega _d$$ as a function of k for several values of $$N_{est}$$ are depicted and compared to $${\bar{I}}_p$$ (dashed lines). The mean $$\bar{{\mathbb {I}}}$$ (stars) corresponds to the expectation value of $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ on the Hilbert space, that is,

\begin{aligned} \bar{{\mathbb {I}}}={\mathbb {E}}[{\bar{I}}(|{\tilde{\psi }}\rangle )||{\tilde{\psi }}\rangle ]=\int {\bar{I}}(|{\tilde{\psi }}\rangle )f(|{\tilde{\psi }}\rangle )d{\tilde{\psi }}, \end{aligned}
(14)

where $$f(|{\tilde{\psi }}\rangle )$$ is the probability density function of randomly and uniformly generating the unknown state $$|{\tilde{\psi }}\rangle$$. The mean $$\bar{{\mathbb {I}}}$$ is estimated as

\begin{aligned} \bar{{\mathbb {I}}}=\frac{1}{200}\sum _{|\psi (\varvec{\tilde{z}})\rangle \in \Omega _d}{\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle ). \end{aligned}
(15)

Figure 2 exhibits a fast decrease of $$\bar{{\mathbb {I}}}$$ until approximately iteration 7, which is followed by asymptotic linear behavior. The median $$\bar{{\mathbb {M}}}$$ shows a similar behavior, although it enters the asymptotic linear regime earlier than the mean $$\bar{{\mathbb {I}}}$$, at approximately iteration 5. As the number k of iterations increases mean $$\bar{{\mathbb {I}}}$$ and median $$\bar{{\mathbb {M}}}$$ overlap almost perfectly in the linear regime. The interquartile range is very narrow and nearly indistinguishable from these two quantities. This indicates that in the linear regime the mean infidelities $${\bar{I}}(|\psi (\varvec{\tilde{z}})\rangle )$$ of the states in $$\Omega _d$$ are concentrated in an extremely narrow interval around the mean $$\bar{{\mathbb {I}}}$$ and the median $$\bar{{\mathbb {M}}}$$. Thus, all states in $$\Omega _d$$ are estimated by the CSPSA-MLE method with an accuracy that is very close to $$\bar{{\mathbb {I}}}$$. Furthermore, the estimation accuracy $$\bar{{\mathbb {I}}}$$ provided by the CSPSA-MLE method tends to converge from above to $${\bar{I}}_{p}$$. Thereby, the CSPSA-MLE method produces a mean infidelity $$\bar{{\mathbb {I}}}$$ for any unknown pure state that approaches asymptotically the best possible estimation accuracy of pure states $${\bar{I}}_p$$ allowed by the laws of quantum mechanics. Figure 2 also displays the behavior of the mean $$\bar{{\mathbb {I}}}$$ (squares) and median $$\bar{{\mathbb {M}}}$$ (rhombuses) obtained with the use of CSPSA only, that is, without employing MLE to refine the guesses provided by CSPSA, for the case of $$N_{est}=10^5$$. Clearly, the CSPSA-MLE tomographic method outperforms the CSPSA tomographic method by at least 5 orders of magnitude. Therefore, the concatenation of MLE to CSPSA plays a key role in improving the accuracy of the estimation and bringing it closer to the Gill–Massar lower bound.

Once the CSPSA-MLE method enters the linear regime, after approximately 10 iterations in the inspected dimensions, it delivers an estimation accuracy $$\bar{I}(|{\tilde{\psi }}\rangle )$$ close to $${\bar{I}}_p$$. Thereby, we can write

\begin{aligned} N\approx \frac{d-1}{\bar{I}(|{\tilde{\psi }}\rangle )}, \end{aligned}
(16)

or since $$N=2N_{est}k$$ also

\begin{aligned} N_{est}k\approx \frac{d-1}{2\bar{I}(|{\tilde{\psi }}\rangle )}~~~\mathrm{for}~k\ge 10. \end{aligned}
(17)

Thus, in a given dimension d the CSPSA-MLE tomographic method achieves a predefined estimation accuracy with an ensemble size N that can be divided into 2k ensembles of size $$N_{est}$$ each. Thereby, we can employ a small value of $$N_{est}$$ and a large number of iterations or a large value of $$N_{est}$$ and a small number of iterations. The latter alternative provides a faster convergence rate. In this case, the estimate of the complex gradient is closer to the exact gradient and the convergence of CSPSA becomes similar to that of a complex formulation of a deterministic first-order iterative optimization algorithm. However, whether the first or second alternative is more appropriate also depends largely on the characteristics of the experimental platform where the estimation is realized.
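The trade-off between $$N_{est}$$ and k expressed by Eq. (17) can be turned into a small resource calculator. The sketch below assumes that the method operates in the linear regime, so that the approximation holds. The function names and the lower limit `k_min=10`, which mirrors the condition $$k\ge 10$$ of Eq. (17), are illustrative choices.

```python
import math

def predicted_infidelity(d, n_est, k):
    """Mean infidelity (d - 1) / (2 * n_est * k) predicted by Eq. (16) with N = 2 * n_est * k."""
    return (d - 1) / (2.0 * n_est * k)

def iterations_needed(d, target_infidelity, n_est, k_min=10):
    """Smallest k >= k_min for which Eq. (17) predicts the target infidelity."""
    k = (d - 1) / (2.0 * target_infidelity * n_est)
    return max(k_min, math.ceil(k))

if __name__ == "__main__":
    # Same target accuracy in d = 8, reached with different splits of the ensemble.
    for n_est in (10**2, 10**3, 10**4):
        k = iterations_needed(8, 3e-4, n_est)
        print(n_est, k, 2 * n_est * k)
```

For a qubit with $$N_{est}=10^4$$ and $$k=10$$, `predicted_infidelity` gives $$5\times 10^{-6}$$, in line with the value reported for the CSPSA-MLE method in Fig. 1.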

We can also compare the mean infidelity $$\bar{{\mathbb {I}}}$$ achieved by the CSPSA-MLE method with the Gill–Massar lower bound $${\bar{I}}_{m}$$ for the mean infidelity achieved in the estimation of full-rank mixed states via separable measurements on the ensemble of equally prepared copies, which is given by $${\bar{I}}_{m}=[(d+1)/2]^2{\bar{I}}_{p}$$. Tomographic methods designed to estimate unknown mixed states cannot achieve a better accuracy than $${\bar{I}}_{m}$$ as long as they resort to separable measurements on the ensemble of equally prepared copies. This bound departs quadratically from $${\bar{I}}_p$$ and $$\bar{{\mathbb {I}}}$$ as the dimension increases. Thus, tomographic methods that do not employ the a priori information about the purity of the unknown state cannot estimate pure states with an accuracy better than $${\bar{I}}_{m}$$. Thereby, the CSPSA-MLE method provides an advantage for pure state estimation over standard quantum tomography, two-stage standard quantum tomography, and methods such as the ones based on mutually unbiased bases, symmetric informationally complete positive-operator-valued measures, and equidistant states.
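To make the quadratic separation between the two bounds concrete, the ratio $${\bar{I}}_{m}/{\bar{I}}_{p}=[(d+1)/2]^2$$ can be tabulated for the dimensions inspected in this work. The short Python sketch below does only this arithmetic; the function names are illustrative.

```python
def gill_massar_pure(d, n_copies):
    """Pure-state bound (d - 1) / N."""
    return (d - 1) / n_copies

def mixed_to_pure_ratio(d):
    """Ratio [(d + 1) / 2]^2 between the mixed-state and pure-state bounds."""
    return ((d + 1) / 2) ** 2

def gill_massar_mixed(d, n_copies):
    """Mixed-state bound for separable measurements, [(d + 1) / 2]^2 * (d - 1) / N."""
    return mixed_to_pure_ratio(d) * gill_massar_pure(d, n_copies)

if __name__ == "__main__":
    for d in (2, 4, 8, 16):
        print(d, mixed_to_pure_ratio(d), gill_massar_mixed(d, 10**5))
```

The ratio takes the values 2.25, 6.25, 20.25, and 72.25 for $$d=2,4,8,16$$, so the advantage of exploiting the purity of the unknown state grows rapidly with the dimension.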

## Discussion

The accurate estimation of quantum states with a limited ensemble size is a difficult task. In the case of 2-dimensional pure quantum states, the best measurement strategy, that is, the one that leads to the (locally unbiased) estimator that saturates the quantum Cramér–Rao bound, is generally a function of the parameters of the unknown state itself. Thereby, the use of the optimal estimation strategy is unfeasible55. It is possible, however, to employ an adaptive strategy that approaches the quantum Cramér–Rao bound. This strategy consists of a sequence of measurements where each one is optimal for a given guess of the unknown state56,57. The multi-parameter estimation of an unknown d-dimensional quantum state, which is defined by $$2d-2$$ independent real numbers, cannot be carried out following a similar strategy since the optimal measurement strategy is unknown.

Instead, we have approached the estimation of unknown d-dimensional quantum states from the optimization of the metric used to characterize the accuracy of the estimation process. The optimization is solved by a method based on the concatenation of CSPSA to MLE. The proposed method allows for estimating pure quantum states with high accuracy. CSPSA drives a sequence of projective measurements toward the infidelity minimizer. MLE provides at each iteration a refinement of the estimates considering the accumulated data generated by all previous measurements. Monte Carlo experiments for dimensions $$d=2,4,8$$ and 16 indicate that the mean infidelity for a fixed arbitrary unknown state exhibits a fast decrease within a few iterations followed by an asymptotic linear trend. In the linear regime the attained mean infidelity closely approaches the fundamental limit on the accuracy established by the Gill–Massar lower bound for the mean infidelity of the estimation of pure states. Hence, the CSPSA-MLE method surpasses the accuracy of known tomographic methods for pure quantum states. The mean infidelity is also lower than the Gill–Massar lower bound for the infidelity of the estimation of mixed quantum states. Consequently, no tomographic method for mixed states can achieve a mean infidelity lower than the one attained by the CSPSA-MLE method. The median infidelity is also below the Gill–Massar lower bound for pure states. Therefore, more than 50% of the estimation attempts lead to lower infidelities than the Gill–Massar lower bound.

The CSPSA-MLE method exhibits a clear trade-off. The concatenation of CSPSA to MLE increases the rate of convergence of the mean infidelity, which leads to a decrease in the number of iterations and in the ensemble size. However, there is an increase in the computational complexity of the algorithm because in each iteration it is now necessary to solve the optimization problem corresponding to MLE. Recently, optimized methods for MLE in high dimension have been proposed58,59.

An experimental realization of the CSPSA-MLE method applied to states of a single 2-dimensional quantum system can be carried out with current experimental techniques44,57 for generating and measuring single-photon polarization states. The higher dimensional case can be demonstrated by means of experimental setups based on single photons and concatenated spatial light modulators32,60,61 or via integrated quantum photonics62,63. These two experimental platforms offer the possibility of performing electronically controlled adaptive measurements.

## Code availability

The source codes that support the results of this study are available from the corresponding author upon reasonable request.