Recursively Adaptive Quantum State Tomography: Theory and Two-qubit Experiment

Adaptive techniques have important potential for wide applications in enhancing precision of quantum parameter estimation. We present a recursively adaptive quantum state tomography (RAQST) protocol for finite dimensional quantum systems and experimentally implement the adaptive tomography protocol on two-qubit systems. In this RAQST protocol, an adaptive measurement strategy and a recursive linear regression estimation algorithm are performed. Numerical results show that our RAQST protocol can outperform the tomography protocols using mutually unbiased bases (MUB) and the two-stage MUB adaptive strategy even with the simplest product measurements. When nonlocal measurements are available, our RAQST can beat the Gill-Massar bound for a wide range of quantum states with a modest number of copies. We use only the simplest product measurements to implement two-qubit tomography experiments. In the experiments, we use error-compensation techniques to tackle systematic error due to misalignments and imperfection of wave plates, and achieve about 100-fold reduction of the systematic error. The experimental results demonstrate that the improvement of RAQST over nonadaptive tomography is significant for states with a high level of purity. Our results also show that this recursively adaptive tomography method is particularly effective for the reconstruction of maximally entangled states, which are important resources in quantum information.


I. INTRODUCTION
One of the central problems in quantum science and technology is the estimation of an unknown quantum state [1]. Quantum state tomography as the procedure of experimentally determining an unknown quantum state has become a standard technology for verification and benchmarking of quantum devices [2][3][4][5][6][7][8][9][10][11][12][13][14][15][16]. Two key tasks in quantum state tomography are data acquisition and data analysis. The aim of data acquisition is to devise appropriate measurement strategies to acquire information for reconstructing the quantum state. Then in the step of data analysis, the acquired data is associated with an estimate of the unknown quantum state using an estimation algorithm.
In order to enhance the efficiency in data acquisition, it is desired to develop optimal measurement strategies for collecting data. However, an optimal measurement strategy, which is only known for a few special cases [2,[17][18][19][20], depends on the state to be reconstructed. To circumvent this issue, many kinds of fixed sets of measurement bases are designed to be optimal either in terms of the average over a certain quantum state space [21][22][23][24][25] or in terms of the worst case in the quantum state space [12].
For instance, improved state estimation can be achieved by taking advantage of mutually unbiased bases (MUB) [21,25,26] and symmetric informationally complete positive operator-valued measures (SIC-POVM) [27,28]. For multi-partite quantum systems, MUB and SIC-POVM are difficult to experimentally realize since they involve nonlocal measurements.
How to efficiently acquire information of an unknown quantum state using simple measurements that are easy to realize experimentally remains open.
For data analysis in tomography, although many methods, such as maximum-likelihood estimation [4,[29][30][31][32], Bayesian mean estimation [33,34], least-squared inversion [35], have been used to reconstruct the quantum state, this task can be computationally intensive, and may take even more time than the experiments themselves. It has been reported in [6] that using the maximum-likelihood method to reconstruct eight-qubit took weeks of computation. Therefore, the development of an efficient data analysis algorithm is also a critical issue in quantum state tomography [12,36]. In [12], a recursive linear regression estimation algorithm was presented which is much more computationally efficient in the sense that it can greatly save the cost of computation as compared to the maximum-likelihood method with only a small amount of accuracy sacrificed.
For a given number of copies of the system, in order to improve the tomography accuracy by better tomographic measurements, a natural idea is to develop an adaptive tomography protocol where the measurement can be adaptively optimized based on data collected so far. Adaptive measurements have shown more powerful capability than nonadaptive measurements in quantum phase estimation [37][38][39], phase tracking [40], quantum state discrimination [41,42], and Hamiltonian estimation [43,44]. Actually, adaptivity has been proposed for quantum state tomography in various contexts [2,33,36,[45][46][47][48][49]. For example, the results on one qubit have demonstrated that adaptive quantum state tomography can improve the accuracy quadratically considering the infidelity index [36]. However, when generalizing their results to n-qubit systems, the adaptive tomography protocol will involve nonlocal measurements which are hard to realize in experiments.
In this paper, we combine the computational efficiency of the recursive technique of [12] with a new adaptive protocol that does not necessarily require nonlocal measurement to present a new recursively adaptive quantum state tomography (RAQST) protocol.
In our RAQST protocol, no prior assumption is made on the state to be reconstructed. The state estimate is recursively updated based on the current estimate and the new measurement data. Thus, we do not have to combine all the historical information with the new acquired data to update the estimate as the maximum-likelihood method. Thanks to the simple recursive estimation procedure, we can obtain the estimate state in a realtime way, and using the estimate we can adaptively optimize the measurement strategies to be performed in the following step. In our RAQST protocol, the measurement to be performed at each step is optimized upon the corresponding admissible measurement set determined by the experimental conditions.
It is first demonstrated numerically that our RAQST even with the simplest product measurements can outperform the tomography protocols using MUBs and the two-stage MUB adaptive strategy. For maximally entangled states, the infidelity can even be reduced to beat the Gill-Massar bound which is a quantum Cramér-Rao inequality [2].
Moreover, if nonlocal measurements are available, with our RAQST the infidelity can be further reduced. For a wide range of quantum states, the infidelity of our RAQST can be reduced to beat the Gill-Massar bound with a modest number of copies. We perform the two-qubit state tomography experiments using only the simplest product measurements, and the experimental results demonstrate that the improvement of our RAQST over nonadaptive tomography is significant for states with a high level of purity. This limit (very high purity) is the one relevant for most forms of quantum information processing.

II. RECURSIVELY ADAPTIVE LINEAR REGRESSION ESTIMATION
A linear regression estimation (LRE) method for quantum state tomography was proposed in [12], and the results have shown that the LRE approach has much lower computational complexity than the maximum-likelihood estimation method for quantum tomography.
Here, we further develop this LRE method to present a recursively adaptive quantum state tomography protocol that can greatly improve the precision of tomography.
We first convert a quantum state tomography problem into a parameter estimation problem of a linear regression model. Consider a d-dimensional quantum system with i=1 denote a set of Hermitian operators satisfying (i) Tr(Ω i ) = 0 and (ii) Tr(Ω i Ω j ) = δ ij , where δ ij is the Kronecker function. Using this set, the quantum state ρ to be reconstructed can be parameterized as where I is the identity matrix and θ i = Tr(ρΩ i ). Let Θ = (θ 1 , · · · , θ d 2 −1 ) T , where T denotes the transpose operation.
A quantum measurement can be described by a positive operator-valued measure (POVM) {E i } M i=1 , which is a set of positive semidefinite matrices that sum to the identity, i.e., E i ≥ 0 and M i=1 E i = I. In quantum state tomography, different sets of POVMs should be appropriately combined to efficiently acquire information of the unknown quantum state. Let M = j=1 M (j) denote the admissible measurement set, which is a union of POVMs determined by the experimental conditions. Each POVM is denoted as k=1 , elements of the POVM can be parameterized as When we perform the POVM M (j) on copies of a system in state ρ, the probability that we observe the result m is given by Assume that the total number of experiments is N , and we perform a measurement described by m denote the number of the occurrence of the outcome m from the n (j) measurement trials of M (j) . Letp(m|M (j) ) = n (j) m /n (j) , and e (j) m =p(m|M (j) ) − p(m|M (j) ). According to the central limit theorem, e (j) m converges in distribution to a normal distribution with mean 0 and variance [p(m|M (j) ) − p 2 (m|M (j) )]/n (j) . Using (2), we have the linear regression equations for m = 1, · · · , M (j) , Note thatp(m|M (j) ), γ Hence, the problem of quantum state tomography is converted into the estimation of the unknown vector Θ.
To give an estimate with a high level of accuracy, the basic idea of LRE is to find an estimateΘ t such that Here, being performed at the k-th step. The notation W A recursive LRE algorithm [12] can be utilized to find the solution ofΘ t . For completeness, we present the recursive LRE algorithm in Appendix A. Its basic idea is that one only needs to store the best estimate state so far, and then update it recursively using a bunch of new measurement results with a fixed setting. This is quite different from the maximum-likelihood estimation method since there one has to combine all the historical information with the new collected data to update the estimate, which is quite computationally intensive. It has been demonstrated in Fig. 1 of [12] that the recursive LRE tomography algorithm can greatly reduce the total cost of computation with only a small amount of accuracy sacrificed in comparison with the maximum-likelihood estimation method.
As demonstrated in Appendix B, when the number of copies N of the unknown quantum state becomes large, the only relevant measure of the quality of estimation becomes the mean squared error matrix E(Θ t − Θ)(Θ t − Θ) T . The mean squared error matrix depends upon the state ρ (i.e., Θ) to be reconstructed and the chosen POVMs. Thanks to the recursive algorithm, we can obtain the estimate of the state ρ recursively, and then adaptively optimize the POVM measurements that should be performed. By doing so, the accuracy of the tomography can be greatly improved. The details of how to adaptively choose POVMs are presented in Appendix C.
Using the solutionΘ t in (4) and the relationship in (1), we can obtain a Hermitian matrixμ with Trμ = 1. However,μ may have negative eigenvalues and be nonphysical due to the randomness of measurement results. In this work, the physical estimateρ is chosen to be the closest density matrix toμ under the matrix 2-norm. In standard state reconstruction algorithms, this task is computationally intensive [32]. However, we can employ the fast algorithm in [32] with computational complexity O(d 3 ) to solve this problem since we have a Hermitian estimateμ with Trμ = 1. It can be verified that pullingμ back to a physical state can further reduce the mean squared error [13].

III. NUMERICAL RESULTS
In this section we present the numerical results. First of all, we would like to stress two advantages of the recursive LRE method: (a) as we have demonstrated in [12], the recursive LRE method can greatly reduce the cost of computation in comparison with the maximum-likelihood method; (b) the recursive LRE algorithm is naturally suitable for optimizing measurements adaptively.
The argument for the advantage (b) can be explained as follows. For state tomography the optimal measurements generally depend upon the state to be reconstructed. By utilizing the recursive LRE algorithm, we can obtain the estimate of the real state in a computationally efficient way. Using the state estimate, the measurements to be performed can be adaptively optimized. In the following, we perform numerical simulations of two-qubit tomography using only the recursive LRE method while with six different measurement strategies: (i) standard cube measurements [22]; (ii) mutually unbiased bases (MUB) measurements; (iii) MUB half-half [36]; (iv) "known basis" [36]; (v) RAQST1: the admissible measurement set only contains the simplest product measurements; (vi) RAQST2: the admissible measurement set is not limited.
Each of the tomography protocols (iii)-(vi) consists of two stages. In the first stage, we all use the standard cube measurements. For the MUB half-half, we first perform standard cube measurements on N/2 copies and obtain a preliminary estimateρ 0 via LRE, and then measure the remaining half of copies so that one set of the bases is adaptively adjusted to diagonalizeρ 0 and it together with another four sets of bases constitutes a complete set of MUB as proposed in [36]. As compared to the MUB half-half, for the "known basis" [36], in the second stage, we perform a set of measurements so that one of the five bases of the MUB is the eigenbasis of the state to be reconstructed. Although it is impossible physically, this is a useful comparison. For the RAQST, we first perform standard cube measurements on N 1 copies and obtain a preliminary estimate. Then we adaptively optimize the measurement to be performed at each iteration step upon the corresponding admissible measurement set (see Appendix D). In RAQST1, the basic admissible measurement set is the standard cube measurement bases. At each iteration step, we add another set of product measurements obtained by solving a conditional extremum problem to the basic admissible measurement set (see Appendix E). In RAQST2, at each iteration step, the set of the eigenbases of the current estimate state is also added into the admissible measurement set. Note that the admissible measurement set in RAQST2 will involve nonlocal measurements in general if there are more than one particle. The details can be found in Appendix D.
For the RAQST, we need to specify N 1 , which is the number of copies measured in the first stage, and the number K of the iteration steps such that N = N 1 + K · N 2 , where N 2 is the number of copies for each POVM in the second stage. In principle, the number K of the iteration steps in the second stage may depend on the preliminary estimate in the first stage. For simplicity, in this work, we give empirical formulas depending only upon the total number N of the copies. Note that in RAQST1 and RAQST2, the admissible measurement sets are different, and so are their empirical formulas. For RAQST1, N  = N (0.8 − 0.01 log 10 N ), K (2) = ⌊1.5 log 10 N − 2⌋ where ⌊x⌋ returns the maximum integer that is less than or equal to x. Obviously the formula for the resource distribution for RAQST2 applies only when N is not too large.
We use Monte Carlo simulations to demonstrate the results. The figure of merit is the particularly well-motivated quantum infidelity [36], 1 − F (ρ,ρ) =   . It can be seen that the average infidelity of the static tomography protocols (i.e., (i) and (ii)) versus N is in the order of O(1/ √ N ). However, the Gill-Massar bound [2] for the infidelity in two-qubit state tomography is 75 4N . This can be obtained by combining the equations (5.29) and (A.8) in [19] (see Appendix F). It is clearly seen that, as compared to the static tomography protocols and the adaptive MUB half-half, the average infidelity using our RAQST protocol can be reduced to beat the Gill-Massar bound even only with the simplest product measurements. Furthermore, if there is no limitation on the admissible measurement set, the RAQST2 can outperform the "known basis" tomography, and the average infidelity of RAQST2 versus N can be significantly reduced to the order of the Gill-Massar bound, i.e., O(1/N ). Fig. 1(b) shows the histogram for RAQST over 200 randomly selected pure states and 200 maximally entangled states when the total number of copies is N = 10 4 for each random state. Random pure states are created using the algorithm in [50]. Since all the maximally entangled states are equivalent under local unitary operations, they are randomly selected by applying randomly generated local unitary operators [51] on the same maximally entangled states. We adopt the index Υ = C−A C−G to evaluate the performance of our RAQST protocol. Here, C and A represent the log 10 of the average infidelity between the corresponding estimate and the true state when the standard cube measurement bases and the RAQST are utilized, respectively, while G is the Gill-Massar bound. Note that if Υ > 0, our adaptive protocol surpasses the standard measurement strategy, while if Υ > 1, our adaptive protocol beats the Gill-Massar bound. From Fig. 1(b) we can see that our RAQST protocol is particularly effective for the class of maximally entangled states which are important resources in quantum information. Fig. 2(a) depicts average infidelity versus N for state ρ = 0.997 (|HV −|V H )( HV |− V H|) 2 + 0.003 I 4 , which has purity Tr(ρ 2 )=0.9955. Note that there are kinks in the four curves corresponding to the four different adaptive protocols (iii)-(vi). We can see that each of the four curves can be divided into three segments from left to right. In the first segment, the infidelity decreases quickly as N increases until the infidelity is reduced to the order of the small eigenvalues of the state to be reconstructed, then the curves go into the second segment where the infidelity decreases slowly. After the infidelity is smaller than the smallest eigenvalues, the infidelity decreases quickly again as N increases. This is because infidelity is hypersensitive to misestimation of small eigenvalues, as pointed out in [36]. Hence, we must accurately estimate the eigenvalues that appear to be zero. When the infidelity is in the order of the smallest eigenvalues, it will be hard to estimate them accurately, so the decay rate of the infidelity will become slow. Once the infidelity decreases to be smaller than the smallest eigenvalues, we can estimate them more accurately as N increases, and then the infidelity decreases quickly. It can be seen that our RAQST1 can beat the static tomography protocols and the adaptive MUB half-half protocol even with the simplest product measurements. The infidelity can be further reduced by using RAQST2, and when the total copies N ≥ 10 4.5 , the infidelity can be reduced to O(1/N ). Fig. 2(b) shows average infidelity versus different where α, β ≥ 0 and satisfy α + β = 1. The results show that when the states have a high level of purity, our RAQST1 with the simplest product measurements can beat the MUB protocol. However, as the state becomes more mixed (Tr(ρ 2 ) decreases), using MUB measurements for state tomography can do better than using the adaptive product measurements. This fact is due to the essential limit of product measurements on mixed states. As pointed out in [2], nonlocal measurements on a mixed state can extract more information. Thus, to reconstruct mixed states, it is better to use nonlocal measurements, e.g., MUB measurements. It is also clear that the infidelity achieved by using RAQST2 is much lower than that using MUB, and can beat the Gill-Massar bound for a wide range of quantum states.

IV. EXPERIMENTAL RESULTS
In this section, we report the experimental results using our RAQST protocol for two-qubit quantum state tomography.
Since it is hard to perform nonlocal measurements in real experiments, we only experimentally implement tomography protocols using (i) standard cube measurements and (v) RAQST1.
As shown in Fig. 3, the experimental setup includes two modules: state preparation (gray) and adaptive measurement (light blue). In the state preparation module, a pair of polarization-entangled photons with a central wavelength at λ =702.2 nm is first generated after the continuous Ar + laser at 351.1 nm with diagonal polarization pumps a pair of type I phase-matched β-barium borate (BBO) crystals whose optic axes are normal to each other [52]. The generation rate is about 3000 two-photon coincidence counts per second at a pump power of 60 mW. Half-wave plates at both ends of the two single mode fibers are used to control polarization. Then, one photon is either reflected by or transmits through a 50/50 beam splitter (BS). In the transmission path, a QWP is tilted to compensate the phase of the two-photon state for the generation of In the reflected path, three 446 λ quartz crystals and a half wave plate with 22.5 • are used to dephase the two-photon state into a completely mixed state I/4. The ratio of the two states mixed at the output port of the second BS can be changed by the two adjustable apertures for the generation of arbitrary Werner state in the form ρ = p (|HV −|V H )( HV |− V H|) 2 + (1 − p) I 4 . Note that since the coherence length of the photon is only 176 λ (due to the 4 nm bandwidth of the interference filter (IF)), much smaller than the optical path difference which is about 0.5 m, two states from the reflected and transmission path only mix at the second BS rather than coherently superpose. In the adaptive measurement module, the two-photon product measurements are realized by the combinations of quarter-wave plates, half-wave plates, polarizing beam splitters, single photon detectors and a coincidence circuit. The rotation angles of quarter-wave plates and half-wave plates can be adaptively adjusted by a controller according to the analysis of the collected coincidence data on a computer.
In the first experiment, as shown in Fig. 4(a), we realize RAQST1 and standard cube measurements tomography protocols for entangled states with a high level of purity with respect to different number of resources N ranging from 251 to 251189. First, we calibrate the true state ρ using RAQST1 with N = 10 7 copies so that the infidelity of the calibrated true state is even 10 times smaller than the estimate accuracy achieved at N = 251189 with RAQST1. The purity of the calibrated state is 0.983. Systematic error is crucial in the experiments. Beam displacers, which separate extraordinary and ordinary light, act as PBS and have an extinction ratio of about 10000:1. As the precision of rotation stages of QWPs and HWPs are 0.01 • , the rotation error is determined by the calibration error of optic axes, which is 0.1 • in our experiment. Phase errors of the currently used true zero-order QWPs and HWPs are 1.2 • , which dominate the systematic error of practically realized measurements. These error sources induce a systematic error to the estimate state, which can be characterized by its infidelity from the true state. The systematic error is in the order of 10 −3 when the error sources take the above values. For resource number N ≥ 10 3 , the systematic error is of the same scale as or even larger than the statistical error due to finite resources (N copies). To deal with this problem, we employ error-compensation measurements [53] to reduce the systematic error to the order of 10 −5 . In error-compensation measurement technique, multiple nominally equivalent measurement settings are applied to sub-ensembles such that the systematic errors can cancel out in first order. Tomography experiments using both RAQST1 and standard cube measurements are repeated 10 times for each number of photon resources.
In the second experiment, as shown in Fig. 4(b), we realize tomography protocols using RAQST1 and standard cube measurements for Werner states with purities ranging from 0. 25   changed by adjusting the apertures. Since the photon resource for each run of tomography protocols is only 10 4 , we use 10 6 copies to calibrate the true state. There are 40 experimental runs and 1000 simulation runs for each of nine Werner states. In each RAQST experiment, four adaptive steps are used to optimize the measurements. To ensure measurement accuracy, error-compensation measurements are also employed.
In both of these two experiments, our experimental results agree well with simulation results.
The improvement of RAQST1 protocol over standard cube measurements strategy is significant. According to the simulation results of MUB protocol and the experimental results of RAQST1, even only with the simplest product measurements, our RAQST1 can outperform the tomography protocols using mutually unbiased bases for states with a high level of purity. Taking into account the trade-off between accuracy and implementation challenge, from Fig. 2 and Fig. 4, RAQST using the simplest product measurement seems to be the best choice for reconstructing entangled states with a high level of purity.

V. SUMMARY AND PERSPECTIVES
We presented a new recursively adaptive quantum state tomography protocol using an adaptive LRE algorithm and reported a two-qubit experimental realization of the adaptive tomography protocol. In our RAQST protocol, no prior assumption is made on the state to be reconstructed. The infidelity of the adaptive tomography is greatly reduced and can even beat the Gill-Massar bound by adaptively optimizing the POVMs that are performed at each step. We demonstrated that the fidelity by using our RAQST with only the simplest product measurements can even surpass those by using mutually unbiased bases and the two-stage MUB adaptive strategy for states with a high level of purity.
Considering the trade-off between accuracy and implementation challenge, it seems that RAQST using the product measurements is the best choice for reconstructing the pure and nearly pure entangled states, which are the most important resources for quantum information processing. It is worth stressing that our RAQST protocol is flexible and extensible. For any finite dimensional quantum systems, once the admissible measurement set is given, we can utilize the adaptive measurement strategy to recursively estimate an unknown quantum state. As demonstrated by numerical results, if nonlocal measurements can be experimentally realized as the breakthrough of the technology, the admissible measurement set M can be enlarged, and our RAQST protocol can be better utilized accordingly. How to give a more effective empirical formula in the second stage is worthy of further exploring, where the formula depends upon the estimate state of the first stage. This is actually related to the tomography problem wherein some prior information is already known, e.g., pure entangled states, matrix-product states, low-rank states. By taking full advantage of the prior information, more efficient RAQST protocol may be designed. Our RAQST protocol may have wide applications in practical quantum tomography experiments.
Note added. After we completed the experiments, recently we became aware of a highly relevant work [54] taking a Bayesian estimation approach to realize two-qubit adaptive quantum state tomography using factorized measurements.

Acknowledgements
The authors would like to thank Huangjun Zhu for helpful discussions about the Gill-Massar bound for infidelity. The work was supported by the National Natural Science Foundation of China under Grants (Nos. 61222504, 11574291, 61374092 and 61227902)   is being performed at the k-th step. To facilitate the presentation, we rename them according to the natural order. Thus, the notationp(m|M (j k ) ), γ can be simplified with the corresponding sequence number n asp n , γ n,0 , Γ n , e n , and W n . Let X t = Γ 1 , · · · , Γ n , · · · , Γ Mt T , e t = e 1 , · · · , e n , · · · , e Mt T , W t = diag W 1 , · · · , W n , · · · , W Mt T .
Using the notation, the linear regression equations (3) can be expressed into a compact form The solution of (4) iŝ We now show how to recursively obtain the solution of Θ t . Define (A2) Using the matrix inversion formula (see, e.g., page 19 of [55]) for n = 1, · · · , M t , we have Q n = Q n−1 − a n Q n−1 Γ n Γ T n Q n−1 .
(A3) From (A1), (A2) and (A3), the recursive form ofΘ n can be obtained aŝ Θ n =Θ n−1 + a n Q n−1 Γ n (p n − γ n,0 d − Γ T nΘ n−1 ). (A4) Appendix B: Significance of the Mean squared error matrix As pointed out in [2], as the number of copies N becomes large, the only relevant measure of the quality of estimation becomes the mean squared error matrix Θ). To be specific, for a good estimation strategy, a reasonable expectation is that the elements of the mean squared error matrix decrease as Assume that f (Θ, Θ) is any smooth cost function that can measure how much the estimateΘ (ρ) differs from the true value Θ (ρ). From equations (1)-(3) in [2], there exist a function f 0 (Θ) and a positive semidefinite matrix C(Θ) such that the mean value of f (Θ, Θ) under a reasonable estimation strategy will decrease as Note that f 0 (Θ) and C(Θ) depend only on the cost function and the true state, while V (Θ, Θ) depends on the true state as well as the estimationΘ. Hence, from (B1), we can minimize the mean squared error matrix V (Θ, Θ) to minimize any smooth cost function by choosing appropriate POVMs and suitable estimation strategies. In the following, we only minimize Tr(V (Θ, Θ)) instead of V (Θ, Θ) itself for simplicity, although this is not equivalent.

Appendix C: Optimization of POVMs
Thanks to the recursive algorithm in Appendix A, we can improve the tomography accuracy by adaptively optimizing the POVMs to be performed in the following step. In order to give a criterion on how to optimize the POVMs, we first look at the mean squared error matrix ofΘ n . From (A1) and (A2), we have where Σ (n) is the covariance matrix of e n = (e 1 , · · · , e n ) T . As we have mentioned in Appendix B, we will minimize the trace of E(Θ n − Θ)(Θ n − Θ) T . This can be achieved by minimizing Q n . Here, we provide an intuitive explanation on why it is reasonable to minimize Q n . It can be seen that if the weighted matrix W n satisfies W −1 n ≈ Σ (n) when N becomes large, E(Θ n −Θ)(Θ n −Θ) T ≈ Q n . Recall that the weight of the i-th linear regression equation is approximately equal to the inverse of Σ (n) ii . Moreover, if e i and e j correspond to different POVMs, they are independent, and so Σ (n) ij = 0. Therefore, we can adaptively choose POVMs to minimize Q n .
Our RAQST protocol is presented as follows. It can be divided into two stages. In the first stage, we perform a standard linear regression estimation on N 1 copies with the standard cube measurement bases to get a prelimiarŷ Θ and Q [12]. Next in the second stage we set the initial value Q 0 = Q in (A3) andΘ 0 =Θ in (A4), and then utilize the remaining N − N 1 copies for adaptive linear regression estimation.
Suppose after s steps, we get Q Ms andΘ Ms where m=1 being performed at the k-th step. If s = 0, M s = 0. From (A3), we can see that Q Ms+1 ≤ Q Ms , and The remaining question is how to choose POVMs to improve the rate of decreasing. We can choose E at the (s + 1)-th step. By doing this, we can get M (js+1) linear regression equations. Thus, we can utilize (A3) and (A4) to get Q Ms+1 andΘ Ms+1 , where M s+1 = s+1 k=1 M (j k ) . The above procedure is repeated until all the copies have been measured.
Two points should be paid attention to the above RAQST protocol. The first one is when choosing E (js+1) i to maximize g Ms+1 , we cannot get the information of p Ms+1 in W Ms+1 because we have not really performed the experiments. From (3), we can use its estimatẽ to replacep Ms+1 . Another one is that given the total number of copies N , how to determine the number of copies N 1 used in the first stage and the number of adaptive steps in the second stage. The optimal values remain open. We can give empirical formulas when the dimension of quantum systems for tomography is given.

Appendix D: two-qubit RAQST
In two-qubit RAQST, we assume that the basic admissible measurement set at each iteration step is the standard cube measurement bases, wherein all the elements are one-dimensional projectors. If we choose a projector by minimizing g in (C1) as described above, it is easy to reconstruct the corresponding POVM. Actually, if the chosen projector is |ψ 1 ψ 1 |⊗|ψ 2 ψ 2 |, the corresponding POVM is {|ψ 1 ψ 1 |⊗|ψ 2 ψ 2 |, |ψ ⊥ 1 ψ ⊥ 1 |⊗ |ψ 2 ψ 2 |, |ψ 1 where |ψ ⊥ i is orthogonal to |ψ i . As pointed out in [36], in order to minimize infidelity, we must accurately estimate the small eigenvalues of the state to be reconstructed, particularly those nearly 0 eigenvalues. Inspired by this, at each iteration step we find a product projector that minimizesp. This projector can be found by a simple iteration algorithm since it is a standard conditional extremum problem. The details are given in Appendix E. By doing this, the value of g may become larger with this new projector. Thus, we add the corresponding POVM into the admissible measurement set at each iteration step in RAQST1. In RAQST2, we further add the set of the eigenbases of the current state estimate into the admissible measurement set. Note that the admissible measurement set in RAQST2 will involve nonlocal measurements in general.
We illustrate our algorithm in the case of two-qubit tomography. We first fix the basis set {Ω i } 3 i=0 for one-qubit, and then we can represent the one-qubit projector measurement for the j-th qubit {Π j 1 , Π j 2 } as Π j i = (π j i,0 , π j i,1 , π j i,2 , π j i,3 ) T , i, j = 1, 2. Here, the projector Π j 1 is orthogonal to Π j 2 . For a two-qubit system, we take the tensor product of {Ω i } 3 i=0 as the basis set. Then any two-qubit product projectors can be parameterized as π 2 j,l Ω l ).
The convergence of our algorithm is straightforward to prove. Note that it can be verified L k ≤ L k−1 during iterations. Moreover, sincep i,j ≥ 0, we have L ≥ 0. Therefore, the sequence {L k } indeed has a limit and the convergence is guaranteed accordingly. It is worth noting that ρ is actually unknown. Hence, we should take its current estimate instead of ρ in the above procedure.