Approaching optimal entangling collective measurements on quantum computing platforms

Entanglement is a fundamental feature of quantum mechanics and holds great promise for enhancing metrology and communications. Much of the focus of quantum metrology so far has been on generating highly entangled quantum states that offer better sensitivity, per resource, than what can be achieved classically. However, to reach the ultimate limits in multi-parameter quantum metrology and quantum information processing tasks, collective measurements, which generate entanglement between multiple copies of the quantum state, are necessary. Here, we experimentally demonstrate theoretically optimal single- and two-copy collective measurements for simultaneously estimating two non-commuting qubit rotations. This allows us to implement quantum-enhanced sensing, for which the metrological gain persists for high levels of decoherence, and to draw fundamental insights about the interpretation of the uncertainty principle. We implement our optimal measurements on superconducting, trapped-ion and photonic systems, providing an indication of how future quantum-enhanced sensing networks may look.

In this Supplementary Note we describe the computation of the single-copy Nagaoka, two-copy Nagaoka and Holevo bounds. For the qubit rotation estimation problem discussed in this work we consider the state $|0\rangle$ subject to rotations $\theta_x$ and $\theta_y$ about the x and y axes of the Bloch sphere respectively. The rotation operators are given by $R_x(\theta) = \begin{pmatrix} \cos(\theta/2) & -i\sin(\theta/2) \\ -i\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}$ and $R_y(\theta) = \begin{pmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{pmatrix}$. (1) After the rotations, the probe is subject to the decoherence channel. Assuming the rotations are small, the density matrix $\rho_\theta$ and its derivatives are given by Eq. (2), where $\epsilon$ is the decoherence strength and $\partial_i$ represents the partial derivative with respect to the parameter $\theta_i$. The parameters $\theta_x$ and $\theta_y$ can be estimated using a positive operator valued measure (POVM). A POVM is described by a set of non-negative operators, $\{\Pi_l\}$, which sum to the identity, $\sum_l \Pi_l = I$. Each measurement outcome occurs with a certain state-dependent probability, $p_l = \mathrm{tr}[\rho_\theta \Pi_l]$. The series of measurement results are used to construct unbiased estimators, $\hat{\theta}_x$ and $\hat{\theta}_y$, for the parameters of interest, $\theta_x$ and $\theta_y$. Our goal is to minimise the mean squared error (MSE) between the true $\theta$ and the estimated value $\hat{\theta}$. The MSE matrix $V$ is given by Eq. (3), where the sum is over all possible measurement outcomes. The Holevo and Nagaoka bounds are lower bounds for the trace of the MSE matrix when using collective and separable measurements respectively. The Holevo bound is obtained by solving the non-trivial minimisation problem of Eq. (4) [1,2], subject to the unbiased conditions of Eq. (5). The Holevo bound applies when we allow for collective measurements on infinitely many copies of the probe state. When restricted to separable measurements, the Nagaoka bound provides an upper limit on the attainable precision. For estimating two parameters, the Nagaoka bound on $\mathrm{Tr}[V]$ is given by Eq. (6) [3,4], where $[A, B] = AB - BA$ and the Hermitian matrices $X$ are subject to the same unbiased conditions as before.
The Hermitian matrices $X_x$ and $X_y$ of Eq. (7) simultaneously solve the optimisation problem for the Holevo bound and the single-copy Nagaoka bound. By substituting into Eq. (4) it is easily verified that these matrices give the Holevo bound of Eq. (8), where $v_{x(y)}$ is the variance in the estimate of $\theta_{x(y)}$. Similarly, by substituting into Eq. (6), the single-copy Nagaoka bound is given by $N_1 = 4/(1-\epsilon)^2$. (9) For computing the two-copy Nagaoka bound it is no longer sufficient to consider the state and derivatives in Eq. (2). To find the attainable precision when performing collective measurements on two copies of the quantum state simultaneously, we make the transformation $\rho \to \rho \otimes \rho$ and $\partial_i \rho \to \partial_i \rho \otimes \rho + \rho \otimes \partial_i \rho$. The Hermitian matrices which solve the two-copy Nagaoka bound then follow from the same optimisation. We note that these solutions are not unique and multiple optimal solutions were found for the two-copy Nagaoka bound. This solution gives the two-copy Nagaoka bound of Eq. (12). As both the Holevo and Nagaoka bounds can be computed efficiently [5,6] we can be sure our solutions are correct.
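As a numerical cross-check of the single-copy results, the sketch below evaluates the Holevo and Nagaoka objective functions, in the form in which they are commonly quoted in the literature, for one candidate pair of matrices. Everything in it is an assumption made for illustration: the depolarising form of the probe state (taken from the state preparation described later in this document), the name `eps` for the decoherence strength, the candidate matrices `x_x` and `x_y`, and all function names. At a decoherence strength of 0.5 these candidates give 12 and 16, matching the Holevo and single-copy Nagaoka values quoted in Supplementary Note 3.

```python
import numpy as np

# Pauli matrices
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)


def probe_model(eps):
    """Probe state and its derivatives at theta_x = theta_y = 0.

    Assumes rho = (1 - eps) |psi><psi| + (eps / 2) I with
    |psi> = R_y(theta_y) R_x(theta_x) |0>.
    """
    rho = (1 - eps) * np.diag([1.0, 0.0]).astype(complex) + eps / 2 * np.eye(2)
    d_x = -(1 - eps) / 2 * SY  # derivative with respect to theta_x
    d_y = (1 - eps) / 2 * SX   # derivative with respect to theta_y
    return rho, d_x, d_y


def is_locally_unbiased(rho, derivs, xs, tol=1e-12):
    """Check tr[rho X_j] = 0 and tr[d_j rho X_k] = delta_jk."""
    means_ok = all(abs(np.trace(rho @ x)) < tol for x in xs)
    jac = np.array([[np.trace(d @ x) for x in xs] for d in derivs])
    return means_ok and np.allclose(jac, np.eye(len(xs)), atol=tol)


def holevo_objective(rho, xs):
    """Tr[Re Z] + ||Im Z||_1 with Z_jk = tr[rho X_j X_k]."""
    z = np.array([[np.trace(rho @ a @ b) for b in xs] for a in xs])
    return np.trace(z.real) + np.linalg.norm(z.imag, ord="nuc")


def nagaoka_objective(rho, xs):
    """tr[rho X_1^2] + tr[rho X_2^2] + TrAbs(rho [X_1, X_2])."""
    x1, x2 = xs
    comm = x1 @ x2 - x2 @ x1
    quad = np.real(np.trace(rho @ (x1 @ x1 + x2 @ x2)))
    # TrAbs: sum of the absolute values of the eigenvalues
    tr_abs = np.sum(np.abs(np.linalg.eigvals(rho @ comm)))
    return quad + tr_abs


eps = 0.5
rho, d_x, d_y = probe_model(eps)
# Candidate matrices (an assumption, not recovered from the text)
x_x = -SY / (1 - eps)
x_y = SX / (1 - eps)

print(is_locally_unbiased(rho, [d_x, d_y], [x_x, x_y]))
print(holevo_objective(rho, [x_x, x_y]))
print(nagaoka_objective(rho, [x_x, x_y]))
```

The functions only evaluate the objectives at the supplied matrices; they do not perform the minimisation, so agreement with the quoted bounds indicates that the candidates are consistent with the reported optimum rather than proving optimality.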

Supplementary Note 2. POVMs saturating the Nagaoka bounds
In this Supplementary Note we present measurement strategies, i.e. POVMs and estimator functions, which saturate the single- and two-copy Nagaoka bounds. A POVM which saturates the single-copy Nagaoka bound is given by Eq. (13). The probability of each of the four outcomes is 1/4. By attaching an estimator coefficient $\xi_{j,k}$ to each measurement outcome, it is possible to construct unbiased estimators for the parameters we want to sense, $\hat{\theta}_j = \sum_k p_k \xi_{j,k}$, where $p_k$ is the probability of the $k$th POVM outcome occurring. From this particular POVM, we can construct unbiased estimators using the following estimator coefficients: $\xi_{x,2} = \xi_{y,3} = -\xi_{x,1} = -\xi_{y,4} = 2/(1-\epsilon)$ and $\xi_{i,j} = 0$ for all other $i$ and $j$. These estimators then give individual variances of $v_x = \frac{1}{4}(\xi_{x,1}^2 + \xi_{x,2}^2)$ and $v_y = \frac{1}{4}(\xi_{y,3}^2 + \xi_{y,4}^2)$, which gives a total variance of $v_x + v_y = 4/(1-\epsilon)^2$, coinciding with the single-copy Nagaoka bound $N_1$, Eq. (9). Alternatively, from the POVM, it is possible to compute the classical Fisher information without using the estimator coefficients. This measurement strategy is theoretically optimal when measuring probe states individually. However, it requires four measurement outcomes, meaning it is necessary to use an ancilla qubit to implement this POVM. In this case Naimark's theorem can be used to convert the POVM to a projective measurement in a higher dimensional Hilbert space [7], but this comes at the cost of increasing experimental complexity. This can be avoided by noting that this POVM is equivalent to measuring $\sigma_x$ half of the time and $\sigma_y$ half of the time, where $\sigma_x$ and $\sigma_y$ are the usual Pauli matrices. Thus we can simply split our measurement in two. We measure $\theta_x$ with half of the probe states using the POVM of Eq. (16) and $\theta_y$ with the remaining probe states using the POVM of Eq. (17). The new estimator coefficients are reduced by a factor of two, $\xi^*_{x,2} = \xi^*_{y,1} = -\xi^*_{x,1} = -\xi^*_{y,2} = 1/(1-\epsilon)$, the probability of each outcome is now 1/2 and each estimate uses half as many resources.
Therefore the variances in estimating $\theta_x$ and $\theta_y$ are given by $v_x = v_y = 2/(1-\epsilon)^2$, coinciding with the original measurement, Eq. (13). For the two-copy measurement we write the POVM as $\Pi_j = |\psi_j\rangle\langle\psi_j|$, where the states $|\psi_j\rangle$ are given in Eq. (20). The first three POVM outcomes occur with probability $(4 + \epsilon(\epsilon - 2))/12$ and the fourth outcome occurs with probability $\epsilon(2 - \epsilon)/4$. We use the estimator coefficients of Eq. (21), which include $\xi_{x,4} = 0$. The individual variances $v_x$ and $v_y$ then follow as before, and their sum coincides with the two-copy Nagaoka bound, Eq. (12). A geometrical interpretation of this POVM is provided in Supplementary Note 12.
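The arithmetic behind the single-copy strategies can be checked in a few lines. The sketch below is illustrative only: the function names are hypothetical, `eps` stands for the decoherence strength, variances are evaluated at zero rotation angle (where the estimators have zero mean), and the two-copy outcome probabilities are entered as read from the expressions quoted above.

```python
import numpy as np


def four_outcome_variances(eps):
    """Variances for the four-outcome POVM with coefficients +-2/(1 - eps)."""
    xi = 2.0 / (1.0 - eps)
    p = 0.25  # each of the four outcomes is equally likely
    v_x = p * (xi**2 + (-xi) ** 2)  # outcomes 1 and 2 carry theta_x
    v_y = p * (xi**2 + (-xi) ** 2)  # outcomes 3 and 4 carry theta_y
    return v_x, v_y


def split_measurement_variances(eps):
    """Variances when half of the probes estimate each parameter."""
    xi = 1.0 / (1.0 - eps)
    p = 0.5  # two equally likely outcomes per measurement
    per_shot = p * (xi**2 + (-xi) ** 2)
    # Each parameter only sees half of the probes, so the variance
    # per probe used in the whole experiment is doubled.
    return 2.0 * per_shot, 2.0 * per_shot


def two_copy_outcome_probabilities(eps):
    """Outcome probabilities of the two-copy POVM at zero rotation angle."""
    p_first_three = (4.0 + eps * (eps - 2.0)) / 12.0
    p_fourth = eps * (2.0 - eps) / 4.0
    return np.array([p_first_three] * 3 + [p_fourth])


eps = 0.5
print(sum(four_outcome_variances(eps)), 4.0 / (1.0 - eps) ** 2)
print(sum(split_measurement_variances(eps)))
print(two_copy_outcome_probabilities(eps).sum())
```

Both single-copy strategies give the same total variance, and the two-copy outcome probabilities sum to one for any decoherence strength, which is a useful sanity check on the reconstructed expressions.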
Supplementary Note 3. Optimised three- and four-copy projective measurements surpassing the preceding Nagaoka bound
In this Supplementary Note we present the details of the three- and four-copy measurements mentioned in the main text, which surpass the two- and three-copy Nagaoka bounds respectively. A three-copy POVM and estimator function which surpass the two-copy Nagaoka bound were found numerically for $\epsilon = 0.5$. For this particular decoherence strength, the Holevo bound on the variance when measuring $\theta_x$ and $\theta_y$, per qubit, is 12 rad$^2$. The single-, two- and three-copy Nagaoka bounds on the variance are 16 rad$^2$, 13 rad$^2$ and 12.716 rad$^2$ respectively. Our specific three-copy measurement is theoretically able to attain a variance of 12.719 rad$^2$, almost the same as the three-copy Nagaoka bound. We use the following 8 POVM elements to obtain this variance.
These POVM elements, combined with suitable estimator coefficients, give rise to a variance of 12.719 rad$^2$. There is an extremely small difference of 0.003 rad$^2$ between the variance of this measurement and the three-copy Nagaoka bound, but this is not due to numerical error. Indeed, we were able to find a 10-outcome POVM with a variance equal to the three-copy Nagaoka bound (up to numerical error $\approx 10^{-6}$). However, in the main text, the near-optimal 8-outcome POVM was implemented, because this does not require any ancilla qubits, simplifying the experimental realisation.
In the main text we also presented simulations based on a projective four-copy measurement, which theoretically surpasses the three-copy Nagaoka bound. The details of this POVM are found in our publicly available repository [8]. The four-copy Nagaoka bound on the variance is 12.368 rad$^2$. Our four-copy projective measurement attains a variance of 12.508 rad$^2$, which is below the three-copy Nagaoka bound, but does not saturate the four-copy Nagaoka bound. Due to computational constraints, we only searched for projective four-copy measurements, so saturating the four-copy Nagaoka bound may still be possible with a more general four-copy POVM.

Single- and two-copy POVM implementation
In this Supplementary Note we describe how to convert the optimal POVMs into experimentally realisable quantum circuits. The quantum processors used in our experiments make measurements in the z basis. Therefore, in order to implement the POVMs it is necessary to diagonalise them in this basis. For example, for the two-qubit POVM, we aim to find a unitary matrix $U$ such that $U\,(|\psi_1\rangle\ |\psi_2\rangle\ |\psi_3\rangle\ |\psi_4\rangle)$ is diagonal in the computational basis, where $(|\psi_1\rangle\ |\psi_2\rangle\ |\psi_3\rangle\ |\psi_4\rangle)$ is a 4 × 4 matrix. Implementing this unitary matrix and measuring in the z basis is equivalent to implementing the desired POVM. We start with the single-qubit POVM for estimating $\theta_x$, Eq. (16). We write its two elements as projectors onto states $|\psi_{x1}\rangle$ and $|\psi_{x2}\rangle$, so that we wish to diagonalise $(|\psi_{x1}\rangle\ |\psi_{x2}\rangle)$. It is easy to verify that this is diagonalised by a unitary matrix $U_x$. In Fig. 1 of the main text, single-qubit unitary matrices are shown as green boxes in the circuit diagram. We use the definition in Eq. (31) for the most general single-qubit gate $U_{1q}(\theta, \phi, \lambda)$, characterised by three parameters, $\theta$, $\phi$ and $\lambda$. With this definition, up to a global phase factor, we can write $U_x = U_{1q}(3\pi/2, 0, 3\pi/2)$. Following the same approach, to implement the POVM for estimating $\theta_y$, Eq. (17), we require a unitary matrix $U_y$, which can be implemented as $U_y = U_{1q}(3\pi/2, 0, 0) = R_y(3\pi/2)$. Optically these unitary matrices are implemented through a motorised quarter-wave plate (QWP) followed by a half-wave plate (HWP) and then another QWP. The HWP and QWP set to a specific angle $\theta$ each implement a fixed unitary transformation, denoted HWP($\theta$) and QWP($\theta$). To implement $U_x$, we use the following angles: QWP(−π/4)HWP(−3π/4)QWP(π/2), which is equivalent to $U_x$ up to a global phase shift. To implement $U_y$ we use QWP(0.788913939)HWP(-1.174581359)QWP(0.003515755). This is not an exact implementation, but the numerical error in the implementation of $U_y$ is significantly smaller than the expected MSE, hence can be ignored.
For the two-copy POVM, Eq. (20), note that $\langle\psi_1|$, $\langle\psi_2|$, $\langle\psi_3|$ and $\langle\psi_4|$ are orthonormal, hence the matrix with these bras as its rows is unitary and diagonalises the POVM. Any two-qubit unitary matrix can be implemented by three CNOT gates, four arbitrary single-qubit unitary matrices, Eq. (31), and three single-qubit rotations [9]. This is the circuit shown in Fig. 1 (f) in the main text. The rotation parameters needed to implement this unitary matrix to an accuracy of 1 part in 10 million are given in Tables S1 and S2.
TABLE S1. Parameters required to implement the arbitrary single qubit unitary matrices for the optimal two-copy circuit in
TABLE S2. Parameters required to implement the single qubit rotations for the optimal two-copy circuit in Fig. 1 (f) from the main text.
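The diagonalisation argument for the single-qubit measurements can be checked numerically. The sketch below assumes that the general single-qubit gate follows the widely used (θ, φ, λ) parametrisation written out in `u_1q`; this convention is an assumption, chosen because it reproduces the stated identity between the gate at (3π/2, 0, 0) and a y rotation by 3π/2, and the function names are hypothetical. The pairing of each unitary with a Pauli observable follows the earlier statement that the single-copy POVM amounts to measuring the two Pauli matrices.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def u_1q(theta, phi, lam):
    """General single-qubit gate (assumed parametrisation)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array(
        [
            [c, -np.exp(1j * lam) * s],
            [np.exp(1j * phi) * s, np.exp(1j * (phi + lam)) * c],
        ]
    )


def r_y(theta):
    """Rotation about the y axis, as in Eq. (1)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def in_measurement_frame(u, observable):
    """Observable seen by a z-basis measurement performed after applying u."""
    return u @ observable @ u.conj().T


u_x = u_1q(3 * np.pi / 2, 0.0, 3 * np.pi / 2)
u_y = u_1q(3 * np.pi / 2, 0.0, 0.0)

# Both unitaries should map the relevant Pauli observable onto sigma_z,
# so that a z-basis readout realises the desired projective measurement.
print(np.allclose(in_measurement_frame(u_x, SY), SZ))
print(np.allclose(in_measurement_frame(u_y, SX), SZ))
print(np.allclose(u_y, r_y(3 * np.pi / 2)))
```

Under this convention the first unitary turns a measurement along y into a z-basis readout and the second does the same for a measurement along x, which is what the circuit implementation requires.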

State preparation
The final step before running these optimal measurement circuits is to prepare the correct probe state. For a given decoherence strength $\epsilon$ and rotation angles $\theta_x$ and $\theta_y$ we need to prepare three different states per qubit with the appropriate probabilities. The state $|\psi_\theta\rangle = R_y(\theta_y) R_x(\theta_x) |0\rangle$ is prepared with probability $1 - \epsilon$ and the states $|0\rangle$ and $|1\rangle$ are each prepared with probability $\epsilon/2$ for each qubit. For implementing collective measurements, each two-qubit state is a Kronecker product of two states and is prepared with a probability equal to the product of the probabilities for each individual state. For example, the state $|\psi_\theta\rangle \otimes |0\rangle$ is prepared with probability $(1 - \epsilon)\epsilon/2$.
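The preparation procedure described above can be sketched as an explicit ensemble of pure states. The code below is an illustration rather than the experimental control software: the function names, the argument `eps` for the decoherence strength and the seeded sampler are all hypothetical. It also confirms that averaging over the ensemble reproduces a depolarised probe state, and that the two-qubit ensemble averages to two independent copies of it.

```python
import itertools

import numpy as np

KET0 = np.array([1, 0], dtype=complex)
KET1 = np.array([0, 1], dtype=complex)


def r_x(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])


def r_y(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def single_qubit_ensemble(eps, theta_x, theta_y):
    """(probability, state) pairs prepared for one qubit."""
    psi = r_y(theta_y) @ r_x(theta_x) @ KET0
    return [(1 - eps, psi), (eps / 2, KET0), (eps / 2, KET1)]


def two_qubit_ensemble(eps, theta_x, theta_y):
    """Product states with product probabilities for two probe copies."""
    single = single_qubit_ensemble(eps, theta_x, theta_y)
    return [
        (pa * pb, np.kron(a, b))
        for (pa, a), (pb, b) in itertools.product(single, repeat=2)
    ]


def ensemble_average(ensemble):
    """Density matrix obtained by averaging over the prepared states."""
    return sum(p * np.outer(v, v.conj()) for p, v in ensemble)


def sample_preparations(ensemble, shots, seed=0):
    """Indices of the states to prepare in each shot."""
    rng = np.random.default_rng(seed)
    probs = [p for p, _ in ensemble]
    return rng.choice(len(ensemble), size=shots, p=probs)


pairs = two_qubit_ensemble(0.5, 0.1, -0.2)
print(len(pairs), sum(p for p, _ in pairs))
print(sample_preparations(pairs, shots=10))
```

The second entry of `pairs` corresponds to the example in the text, the rotated state on the first qubit and the computational zero state on the second.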
In principle, there should be no error in the value of $\epsilon$ used in our experiments, as the exact probability with which we prepare each state is known. However, in practice, due to non-zero state preparation error, there will be some error in the value of $\epsilon$ used. We now estimate the error in $\epsilon$, $\sigma_\epsilon$, given a certain state preparation error. We ignore readout error when calculating $\sigma_\epsilon$ as this does not affect the actual state we prepare. For simplicity, we will consider the case when $\theta_x = \theta_y = 0$, so that the state we wish to prepare is $(1 - \epsilon/2)\,|0\rangle\langle 0| + (\epsilon/2)\,|1\rangle\langle 1|$. Using $N$ qubits for the complete experiment, the true value of $\epsilon$ is determined by $N_{0,\mathrm{prep}}$, the number of $|0\rangle$ states which are actually prepared. We assume that the state preparation error is symmetric for preparation of both the $|0\rangle$ and $|1\rangle$ state, which we denote $p_p$. This is the probability of initialising the wrong qubit state. We can then write $\epsilon$ in terms of the number of $|0\rangle$ and $|1\rangle$ states which would ideally be prepared and the probability of incorrect initialisation, with $\epsilon_i$ denoting the value which would be prepared in an ideal experiment. Finally, assuming that the $|0\rangle$ and $|1\rangle$ initialisation errors are independent, we obtain the variance of $\epsilon$, $\sigma_\epsilon^2$. As we average our results over 400 repetitions of the same experiment, $\sigma_\epsilon^2$ is decreased by a further factor of 400. Using the calibration data for our devices, we find that $\sigma_\epsilon$ is of the order $10^{-4}$ or smaller for all devices, hence horizontal error bars on Fig. 2 (e) in the main text would be negligible and we do not include them.
The three-copy POVM is constructed such that the states $|\psi^{3c}_i\rangle$ are orthonormal. Therefore, they define the unitary matrix to be implemented experimentally. Using the technique presented in Ref. [10], we are able to convert this unitary matrix to a three-qubit quantum circuit, shown in Fig. S1. This is an extensive circuit, consisting of 43 CNOT gates, 20 single-qubit unitary matrices, 6 Hadamard gates and 28 single-qubit rotations.
When implemented on various superconducting processors, the results of this measurement did not reach the theoretical limits. This can possibly be attributed to the gate error rates of the different processors or crosstalk between qubits. The quantum circuit corresponding to our four-copy POVM can be found in our publicly available repository [8]. The decomposition of the four-copy measurement which we use contains 115 CNOT gates, hence unsurprisingly the four-copy POVM cannot reach the theoretical limits with even small amounts of noise. In Fig. 2 (f) in the main text we simulated implementing these measurements on noisy quantum computers. This simulation used the IBM Q QISKIT package noise models and noise was modelled as depolarising noise for different gate error rates. The results of this simulation suggest that with realistic future noise levels, quantum processors may be able to implement three-copy measurements with a precision approaching the theoretical limits. Four-copy measurements, on the other hand, will require considerably lower gate error rates to reach the theoretical limits. Although this discussion comes with the caveat that modelling noise in such a complex system is unlikely to be overly accurate, we do observe good qualitative agreement between simulation and experiment for the three-copy measurement.
The results presented for the noisy simulations in Fig. 2 (f) in the main text use a slightly different error mitigation method than that used for the experimental data. This is done to avoid erroneously predicting small variances from our noisy simulations. In Fig. S2 we plot the predicted values of $\theta_x$ using our four-copy measurement as a function of the input $\theta$, simulated for two different gate error rates. Qualitatively, it is clear that as the noise increases the estimator predicts values closer to 0. Hence, we would expect such an estimator to perform worse with increasing gate error rate. Quantitatively however, without any error mitigation, the MSE actually decreases with increasing gate error rate for $\theta = 0$. With gate error rates of $1 \times 10^{-3}$, the MSE is 11.89 rad$^2$; for gate error rates of $5 \times 10^{-3}$, the MSE is 11.71 rad$^2$. The fact that higher gate error rates give a lower MSE indicates something is not correct, but even more alarming is that both of these MSEs are below what is allowed by the Holevo bound. The reason for this is that the high gate error rate biases the estimator. For an input angle of $\theta = 0$, an estimator which predicts $\hat{\theta} = 0$ every time will have an MSE of 0 rad$^2$. Hence, for our noisy simulations, the calibration model we use corrects for both the gradient of the estimator and the offset, $\hat{\theta}_x = m_x \hat{\theta}_{\mathrm{noisy},x} + c_x$. As we will stress in Supplementary Note 5, a calibration model of this form can actually bias the estimator itself. Hence, we only use this form of calibration for our noisy simulation results. This is more a reflection of the limitations of our simulations than anything else. It is likely that a more comprehensive noise model is needed to completely capture the noise of a real quantum processor.
The results of our three-and four-copy collective measurements show the trade-off between what is gained by implementing a theoretically better measurement versus what is lost by the increased experimental complexity of such a measurement. However, this work may be viewed as one small step towards implementing collective measurements on a large number of copies of the probe state simultaneously. For comparison, we will now briefly discuss alternative approaches to implementing collective measurements. All previous approaches have been restricted to implementing collective measurements on two copies of the probe state [11][12][13][14][15] and have relied on optical systems. Owing to the way that two copies of the quantum state were created in these approaches, it is difficult to extend to measurements on more than two copies of the quantum state simultaneously. Experiments using three degrees of freedom of a single photon may offer a way around this problem [16]. Another possible way of implementing collective measurements on more than two copies of a quantum state simultaneously is through quantum walks. Quantum walks were originally proposed and demonstrated for implementing POVMs on single qubit states [17][18][19] and it was a very similar technique which has enabled some of the recent demonstrations of two-copy collective measurements [13][14][15]. The theory of quantum walks as a measurement tool has been extended to POVMs on qudit states [20]. This, combined with recent advances in optical quantum state engineering [21,22], may some day allow optical approaches to collectively measure more than two copies of the quantum state simultaneously. It is likely that the continued development of collective measurements on multiple platforms will be useful.

Supplementary Note 5. Effect of error mitigation on the bias of an estimator
When calculating the various bounds in Supplementary Note 1 it was assumed that the rotations to be estimated are small, meaning the estimators are unbiased exactly at $\theta = 0$. It is not guaranteed that the estimators will remain unbiased away from $\theta = 0$. Fig. S3 shows the predicted angles, $\hat{\theta}$, for a range of different input angles, $\theta$. Evidently, the estimator remains approximately unbiased for a large range of $\theta$. It is important for any estimator to be unbiased to ensure a fair comparison is being made. Provided we have sufficiently many probe states available, it is always possible to operate in the region where the estimator is unbiased. This can be done by taking a small sample of the probe states, $\sqrt{N} \ll N$, where $N$ is the number of available probe states, to obtain a rough estimate of $\theta$. The measurement apparatus can then be adjusted to operate in the unbiased region by taking into account this rough estimate.
In this work error mitigation was used to allow us to observe quantum-enhanced metrology. An important requirement on any error mitigation technique used is that it does not introduce any bias into the estimator. We primarily focused on Clifford data regression error mitigation [23], which essentially amounts to producing a model for the system being interrogated. In the main text we used a model of the form $\hat{\theta}_{x(y)} = \hat{\theta}_{\mathrm{noisy},x(y)} + c_{x(y)}$, where $c_{x(y)}$ is a constant which accounts for the offset of the true estimate from the noisy estimate. However, the authors who introduced this form of error mitigation originally proposed a linear model [23] of the form $\hat{\theta}_{x(y)} = m_{x(y)} \hat{\theta}_{\mathrm{noisy},x(y)} + c_{x(y)}$, (39) where the extra term $m_{x(y)}$ is another constant, determining the slope of the model. Given that the linear model proposed is motivated by depolarising noise, it seems a natural choice. While this may be true for many other applications of quantum processors, it is not true for quantum metrology. For quantum metrology $\hat{\theta}_{\mathrm{noisy},x(y)}$ will be some distribution of estimated angles with a certain variance. If we were to multiply this distribution by some constant $m_{x(y)}$ which is less than 1, we would artificially reduce the variance of the estimator by a factor of $m_{x(y)}^2$. It then becomes possible to have variances which appear smaller than the minimum allowed by quantum mechanics. The effect of naively applying error mitigation based on Eq. (39) to quantum metrology is shown in Fig. S4. In this example the fitted gradient depends on the value of $\epsilon$: for larger $\epsilon$, $m$ is smaller. As Fig. S4 shows, with this error mitigation model, we apparently surpass the two-copy Nagaoka bound in some regions, whereas with an unbiased estimator this is not possible. Thus, it is clear that error mitigation cannot be naively applied to quantum metrology.
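The variance argument above can be illustrated with synthetic numbers. The sketch below uses made-up Gaussian estimates (the spread, sample size, slope and offset are arbitrary illustrative choices, not experimental values) to show that an offset-only correction leaves the variance unchanged, whereas a linear correction with slope below one shrinks it by the square of the slope.

```python
import numpy as np


def offset_model(noisy_estimates, c):
    """Offset-only calibration: shifts the estimates, leaves the spread alone."""
    return noisy_estimates + c


def linear_model(noisy_estimates, m, c):
    """Linear calibration: rescales the estimates as well as shifting them."""
    return m * noisy_estimates + c


rng = np.random.default_rng(seed=1)
# Synthetic noisy estimates of an angle whose true value is zero
noisy = rng.normal(loc=0.0, scale=4.0, size=10_000)

var_raw = np.var(noisy)
var_offset = np.var(offset_model(noisy, c=0.05))
var_linear = np.var(linear_model(noisy, m=0.5, c=0.05))

print(var_offset / var_raw)  # ratio for the offset-only model
print(var_linear / var_raw)  # ratio for the linear model
```

Because the rescaling acts on the estimates themselves, the apparent reduction in variance says nothing about the information actually extracted from the probes, which is why a slope below one can make a measurement appear to beat a bound it cannot physically surpass.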
Other error mitigation techniques were investigated, including zero noise extrapolation [24][25][26] and quantum readout-error mitigation [27,28], however they were not found to be as effective as Clifford data regression. Specifically, zero noise extrapolation introduces a large overhead, which is not ideal for quantum metrology, and quantum readout-error mitigation offered only marginal improvements in the MSE. Further study is needed to fully understand what error mitigation protocols are best for quantum metrology. Indeed, recent error mitigation proposals, specifically designed for quantum metrology, may prove more effective [29].
Supplementary Note 6. Further three-copy measurement results
Fig. 2 in the main text shows the results of our three-copy measurement implemented on the Rigetti Aspen-9 and F-IBM QS1 processors. Although these measurements did not reach the theoretical limit, the MSE for the F-IBM QS1 processor is within a factor of 2 of this limit. This perhaps suggests that with minor improvements in gate error rates, the theoretical limits on three-copy measurements may be approached. However, as we now show, this is not necessarily true. We estimated a range of angles using the three-copy measurement on several different quantum processors. The results of this are shown in Fig. S5, where it is evident that the three-copy measurements are effectively useless at distinguishing different angles. For all devices tested, there was no meaningful correlation between the input angle and the estimated angle. This is in stark contrast to the single-copy and two-copy measurements, shown in Figs. 1 and 2 in the main text. The data-set measured for the Rigetti Aspen-9 device differs slightly from the IBM Q devices, due to different device accessibility. Further theoretical and experimental studies will be required to fully understand the utility of three-copy collective measurements for metrology.

Supplementary Note 7. Problem where collective measurements on many copies of the probe state are necessary
For the problem considered in this work, two-copy measurements are able to achieve almost the same precision as the Holevo bound. Performing collective measurements on three or more copies of the probe state offers only marginal improvements in the precision at the cost of greater experimental complexity. Thus, it is natural to wonder if it is truly necessary to implement collective measurements on more than two copies of the quantum state. In this Supplementary Note we provide an example which clearly shows that collective measurements on many copies of the quantum state are necessary.
We examine a similar problem to the main text, estimating qubit rotations subject to the amplitude damping channel, using the probe state $|1\rangle$. After the rotations and amplitude damping, the probe state and its derivatives depend on $p$, the amplitude damping strength. Matrices optimising the Holevo and Nagaoka bounds can then be found, and by direct substitution it can be verified that these matrices satisfy the unbiased conditions, Eq. (5). From these matrices we obtain the corresponding Holevo bound and the single-copy Nagaoka bound. Considering measurements on two copies of the probe state simultaneously, a further set of matrices optimises the two-copy Nagaoka bound. In the resulting two-copy Nagaoka bound we include a factor of two to account for the resources used. The inverse of the single-copy Nagaoka, two-copy Nagaoka and the Holevo bounds are shown in Fig. S6 (a). Although the two-copy Nagaoka bound is closer to the Holevo bound, there remains a considerable gap between these two bounds. The difference between both Nagaoka bounds and the Holevo bound is shown in Fig. S6 (b). For this example, it is evident that measurements on more than two copies of the probe state will be required if the ultimate limits in quantum metrology are to be attained. This is also a physically relevant channel, as the amplitude damping channel can be used to model the decay of an atom from its excited state to its ground state. We can expect that many other tasks in quantum information will require collective measurements on many copies of the probe state.

[Figure caption fragment: ... there is no meaningful correlation between the true and predicted values, rendering them effectively useless as estimators. All data points are based on at least 17,000 shots and all error bars correspond to one standard deviation obtained through bootstrapping.]
[Figure caption fragment: ... can only be narrowed by collective measurements on more than two copies of the probe state.]
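To make the amplitude damping model concrete, the sketch below builds the damped probe state numerically and obtains its derivatives at zero rotation angle by central finite differences. It rests on stated assumptions: the textbook Kraus operators for amplitude damping with the second basis state as the excited level, the same ordering of rotations as for the depolarised probe (x rotation first, then y rotation), and hypothetical function names.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
KET1 = np.array([0, 1], dtype=complex)


def r_x(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])


def r_y(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def amplitude_damping(rho, p):
    """Amplitude damping with strength p (standard Kraus form, assumed)."""
    k0 = np.array([[1, 0], [0, np.sqrt(1 - p)]], dtype=complex)
    k1 = np.array([[0, np.sqrt(p)], [0, 0]], dtype=complex)
    return k0 @ rho @ k0.conj().T + k1 @ rho @ k1.conj().T


def damped_probe(theta_x, theta_y, p):
    """Rotate the excited state, then apply the damping channel."""
    psi = r_y(theta_y) @ r_x(theta_x) @ KET1
    return amplitude_damping(np.outer(psi, psi.conj()), p)


def probe_derivatives(p, h=1e-5):
    """Central finite-difference derivatives at theta_x = theta_y = 0."""
    d_x = (damped_probe(h, 0.0, p) - damped_probe(-h, 0.0, p)) / (2 * h)
    d_y = (damped_probe(0.0, h, p) - damped_probe(0.0, -h, p)) / (2 * h)
    return d_x, d_y


p = 0.3
rho0 = damped_probe(0.0, 0.0, p)
d_x, d_y = probe_derivatives(p)
print(np.round(rho0.real, 6))
print(np.round(d_x, 6))
print(np.round(d_y, 6))
```

Under these assumptions the unrotated state is diagonal with populations p and 1 − p, and both derivatives are purely off-diagonal with magnitude scaled by the square root of 1 − p, so the signal vanishes as the damping strength approaches one.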
Supplementary Note 8. Computing Lu and Wang's metrological bound based on Heisenberg's uncertainty principle
Recently Lu and Wang derived a trade-off relation between measurement variances when estimating two parameters based on uncertainty relations [30]. As in the main text, we shall refer to this bound as the LW uncertainty relation. We now compute the LW uncertainty relation for the problem considered in the main text. To do so we first compute the symmetric logarithmic derivative (SLD) quantum Fisher information matrix $F$, defined through the SLD operators $L_j$, which satisfy $(L_j \rho + \rho L_j)/2 = \partial_j \rho$. The matrix $Q$ is then easily evaluated. Based on this, the $c_{jk}$ terms (Eq. (7) of Ref. [30]) can be computed as $c_{11} = c_{22} = 0$ and $c_{12} = c_{21} = 1$. These terms then allow the metrological bound of Eq. (8) of Ref. [30] to be derived. The LW uncertainty relation provides a trade-off relation between the variances which can be attained for estimating the two rotation angles. The minimum total variance is achieved when $v_x = v_y = 2/(1-\epsilon)^2$, which coincides with the single-copy Nagaoka bound. Therefore, our two-copy collective measurement, which surpasses the single-copy Nagaoka bound, can also surpass the LW uncertainty relation. To the best of our knowledge, this is the first time that a "universally valid" uncertainty principle has been surpassed, albeit indirectly, i.e. through measurement variances as opposed to directly probing the observables.
The LW uncertainty relation is based on previously derived measurement uncertainty relations which explicitly assume that only separable measurements are used. These previous bounds are based on the usual notion of uncertainty relations for operators, whereas Lu and Wang were the first to map this to quantum parameter estimation. It is possible to calculate a two-copy version of the LW uncertainty relation, however this is unsatisfying as it gives a bound which cannot be reached even with collective measurements on infinitely many copies of the probe state. In Supplementary Note 11 we present one possible way to modify the LW uncertainty relation so that it accounts for collective measurements.
Supplementary Note 9. POVMs violating the LW uncertainty relation with unbalanced variances
Fig. 3 in the main text shows single-copy measurements which verify the LW relation when restricted to separable measurements for a range of $v_x$ and $v_y$ values. For $v_x = v_y$, this is achieved using the single-copy POVMs presented in Supplementary Note 2, with an equal number of qubits used for estimating $\theta_x$ and $\theta_y$. For $v_x \neq v_y$ the same POVM is used, but now a different number of qubits are used for estimating $\theta_x$ and $\theta_y$. The total number of qubits used in each experiment remains fixed. By assigning more qubits to estimating $\theta_x$, we can reduce $v_x$ at the expense of increasing $v_y$ and vice versa.
For the measurements violating the LW uncertainty relation, the two-copy measurement in Supplementary Note 2 is sufficient for the point where $v_x = v_y$. However, for the points where $v_x \neq v_y$, a new POVM is required. The purple line in Fig. 3 is obtained from the weighted Nagaoka bound, i.e. the two-copy Nagaoka bound computed with non-identity weight matrices. Changing the weight matrix corresponds to a transformation of the parameters being estimated [31]. Hence, after computing the weighted Nagaoka bound, we transform from the weighted variances $w_x v_x$ and $w_y v_y$ to the variances in the parameters of interest $v_x$ and $v_y$. Each different weight matrix produces a line in the $v_x$-$v_y$ plane. The purple curve is then the envelope of all these lines. The line corresponding to the weighted Holevo bound is calculated similarly.
Similarly, when finding measurements where $v_x \neq v_y$ which violate the LW uncertainty relation, we find a measurement which minimises $w_x v_x + w_y v_y$. For $w_x = 1.4$, $w_y = 0.6$, we use a POVM which gives rise to the data point in Fig. 3 with reduced variance in estimating $\theta_x$. Similarly, for $w_x = 0.6$, $w_y = 1.4$, the necessary POVM is $\{\Pi^{w2}\}$, with correspondingly modified estimator coefficients. This measurement gives rise to the data point in Fig. 3 with reduced variance in estimating $\theta_y$.

Supplementary Note 10. Measurement surpassing universally valid uncertainty relations for operators
Surpassing Lu and Wang's bound should be equivalent to surpassing the measurement uncertainty relations for operators on which it is based. We now show this explicitly. The LW uncertainty relation is based on an uncertainty relation which was the culmination of work from Ozawa and Branciard [32][33][34][35]. These works provided trade-off relations for measuring two Hermitian operators $A$ and $B$. When $A$ and $B$ do not commute, they cannot be jointly measured and so instead we measure a pair of commuting observables $\mathcal{A}$ and $\mathcal{B}$ which approximate the ideal measurement. The approximate observables are measured on an extended Hilbert space of the quantum state $\rho$ combined with an ancilla state $\eta$. The measurement errors for the ideal observables $A$ and $B$, $\varepsilon_A$ and $\varepsilon_B$, are defined with respect to these approximate observables, and an uncertainty relation involving $\varepsilon_A$, $\varepsilon_B$, the standard deviations $\sigma_A$ and $\sigma_B$, and a quantity $D_{AB}$ is then claimed to hold. In order to map our measurements to this operator approach, we need the ideal operators $A$ and $B$. For a parameter estimation problem these are always given by the SLD operators, $A = L_x$ and $B = L_y$. For the problem considered in this work, substituting the SLD operators gives $\sigma_A = \sigma_B = 1 - \epsilon$. We can also evaluate $D_{AB}$ as $(1 - \epsilon)^2$. The uncertainty relation for this problem therefore becomes Eq. (56). At this point we can proceed in one of two ways to show that this relation is violated. From a metrological perspective, Lu and Wang define the "regret of the Fisher information" (hereafter abbreviated to regret) as the difference between the classical Fisher information (CFI) of a specified measurement and the SLD quantum Fisher information $F$. Then an equivalence is drawn between the regret and the measurement errors for the ideal observables, $R_{xx} = \varepsilon_A^2$ and $R_{yy} = \varepsilon_B^2$. Therefore, by computing the CFI for each of our measurements we can evaluate the regret and in turn the uncertainty relation, Eq. (56).
Given a POVM $\{\Pi_m\}$ and a density matrix which depends on the parameters of interest, $\rho(\theta_x, \theta_y)$, the CFI can be computed as

$$F_{jk} = \sum_m p_m \, \partial_j \log p_m \, \partial_k \log p_m, \qquad p_m = \mathrm{Tr}[\rho \Pi_m],$$

where log refers to the natural logarithm. Using the single- and two-copy measurements specified previously, Eqs. (13) and (20) respectively, we can evaluate the CFI for our two measurements. Note that the two-copy CFI has been scaled by a factor of two to account for the fact that twice as many resources are being used. Evaluating the regret and substituting into Eq. (56) shows that the uncertainty relation is saturated for the single-copy measurement and violated for the two-copy measurement.

Equivalently, we can give the exact forms for the approximate observables $\mathcal{A}$ and $\mathcal{B}$. For this we require the measurement channel $\Phi : S(\mathcal{H}_S) \to S(\mathcal{H}_R)$ introduced by Lu and Wang, which maps from the set of all density matrices on a Hilbert space $\mathcal{H}_S$ to density matrices on an alternative Hilbert space $\mathcal{H}_R$ which acts as a register of all the measurement outcomes. This channel is defined as

$$\Phi(\rho) = \sum_m \mathrm{Tr}[\rho \Pi_m] \, |m\rangle\langle m|,$$

where $|m\rangle$ are states forming an orthonormal basis in $\mathcal{H}_R$. We denote the new SLD operators for the density matrix $\Phi(\rho)$ by $\tilde{L}_x$ and $\tilde{L}_y$. The approximate observables can then be defined as $\mathcal{A} = U^\dagger (I_S \otimes \tilde{L}_x \otimes I_R) U$ and $\mathcal{B} = U^\dagger (I_S \otimes \tilde{L}_y \otimes I_R) U$, where $I_S$ and $I_R$ are the identity matrices on the system and register Hilbert spaces respectively. $U$ is a unitary matrix which satisfies Eq. (62) for all density matrices $\rho$, where $\eta$ is any state in the register Hilbert space and $\mathrm{Tr}_{1,3}$ denotes the partial trace over the first and third systems.
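As a minimal sketch of the two routes described above, the code below computes the CFI directly from the outcome probabilities and again as the SLD Fisher information of the register state $\Phi(\rho)$. It is not the measurement of Eq. (13): the four-outcome POVM $\{(I \pm \sigma_x)/4, (I \pm \sigma_y)/4\}$ is an illustrative stand-in, the decoherence model $\rho \to (1-\epsilon)\rho + \epsilon I/2$ is an assumption, and the channel is assumed to take the form $\Phi(\rho) = \sum_m \mathrm{Tr}[\rho\Pi_m]\,|m\rangle\langle m|$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)

def probe_and_derivatives(eps):
    """State and derivatives at theta = 0 (assumed depolarising-type noise)."""
    psi = np.array([1, 0], dtype=complex)
    dpsi = [-0.5j * SX @ psi, -0.5j * SY @ psi]
    rho = (1 - eps) * np.outer(psi, psi.conj()) + eps * I2 / 2
    drho = [(1 - eps) * (np.outer(d, psi.conj()) + np.outer(psi, d.conj()))
            for d in dpsi]
    return rho, drho

def cfi(rho, drho, povm):
    """F_jk = sum_m (d_j p_m)(d_k p_m) / p_m, requires all p_m > 0."""
    p = np.array([np.trace(rho @ E).real for E in povm])
    dp = np.array([[np.trace(d @ E).real for E in povm] for d in drho])
    return (dp / p) @ dp.T

def measurement_channel(op, povm):
    """Assumed form of the channel: op -> sum_m Tr[op Pi_m] |m><m|."""
    return np.diag([np.trace(op @ E) for E in povm])

# Illustrative POVM (hypothetical stand-in, not Eq. (13) of this work).
povm = [(I2 + SX) / 4, (I2 - SX) / 4, (I2 + SY) / 4, (I2 - SY) / 4]

eps = 0.3
rho, drho = probe_and_derivatives(eps)
F = cfi(rho, drho, povm)

# Register state and its SLDs; Phi(rho) is diagonal, so the SLD is too.
phi_rho = measurement_channel(rho, povm)
phi_drho = [measurement_channel(d, povm) for d in drho]
L_reg = [np.diag(np.diag(d).real / np.diag(phi_rho).real) for d in phi_drho]
F_register = np.array([[np.trace(phi_rho @ Lj @ Lk).real for Lk in L_reg]
                       for Lj in L_reg])

# SLD QFI per parameter, using sigma_A = sigma_B = 1 - eps quoted in the text
# (an SLD has zero mean, so its variance equals the QFI).
qfi = (1 - eps) ** 2
regret = qfi - np.diag(F)
print(F, regret.sum())
```

For this stand-in POVM the two computations of the Fisher information agree, each diagonal entry equals $(1-\epsilon)^2/2$, and the summed regret equals $(1-\epsilon)^2$, the value of $D_{AB}$ quoted above.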
We will now present unitary matrices which satisfy Eq. (62) for both the single-copy and two-copy measurements, allowing the approximate observables $\mathcal{A}$ and $\mathcal{B}$ to be obtained. For the single-copy measurement we first use the Naimark extension [7] to convert the measurement in Eq. (13) to projectors onto extended states $|\psi_{j,E}\rangle$. Using this projection we define the unitary matrix $U_\Pi$. We next define the component matrices of the construction, in which $I_d$ is the $d$-dimensional identity matrix. The necessary single-copy unitary matrix is $U_{1q} = U_{\mathrm{swap}} U_4 U_3 U_2 U_1$, where $U_{\mathrm{swap}}$ swaps modes 2 and 3 followed by modes 1 and 2.
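The Naimark step can be illustrated generically. The sketch below builds one possible dilation of a rank-one qubit POVM into a projective measurement on a four-dimensional space. It is not the specific extension used for Eq. (13): the POVM is a hypothetical stand-in, and `naimark_unitary` is an illustrative helper name.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)

# Illustrative rank-one POVM (hypothetical stand-in, not Eq. (13)).
povm = [(I2 + SX) / 4, (I2 - SX) / 4, (I2 + SY) / 4, (I2 - SY) / 4]

def naimark_unitary(povm):
    """Return an n x n unitary whose first d columns form the isometry V.

    Each rank-one element is written as Pi_m = |phi_m><phi_m|, and row m of
    V is <phi_m|, so that V^dagger V = sum_m Pi_m = I.
    """
    rows = []
    for E in povm:
        lam, vecs = np.linalg.eigh(E)        # eigenvalues in ascending order
        phi = np.sqrt(lam[-1]) * vecs[:, -1]
        rows.append(phi.conj())
    V = np.array(rows)
    # Complete the orthonormal columns of V to a full orthonormal basis.
    Q, _ = np.linalg.qr(V, mode="complete")
    return np.hstack([V, Q[:, V.shape[1]:]])

U = naimark_unitary(povm)
n = U.shape[0]

# Projectors on the extended space: P_m = U^dagger |m><m| U.
basis = np.eye(n, dtype=complex)
projectors = [U.conj().T @ np.outer(basis[m], basis[m]) @ U for m in range(n)]

# Embed a test qubit state in the top-left block of the extended space.
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]], dtype=complex)
rho_ext = np.zeros((n, n), dtype=complex)
rho_ext[:2, :2] = rho

probs_direct = np.array([np.trace(rho @ E).real for E in povm])
probs_dilated = np.array([np.trace(rho_ext @ P).real for P in projectors])
print(probs_direct, probs_dilated)
```

The projectors reproduce the POVM statistics on the embedded state, which is the property the extension has to provide before the unitary construction of this note can be applied.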
The two-copy measurement presented in Eq. (20) is already a projective measurement, hence we do not need to invoke Naimark's theorem. The necessary unitary matrix for the two-copy measurement is very similar to the single-copy unitary matrix; however, in the definition of $U_\Pi$ we need to replace $|\psi_{j,E}\rangle$ with $|\psi_j\rangle$ from Eq. (20). The unitary matrix $U_{\mathrm{swap}}$ now needs to swap modes 2 and 3, followed by modes 1 and 2, followed by modes 3 and 4 and finally modes 2 and 3. For the single-copy and two-copy measurements the total ancilla systems are $|x_+\rangle\langle x_+| \otimes |z_+\rangle\langle z_+| \otimes |z_+\rangle\langle z_+| \otimes |z_+\rangle\langle z_+|$ and $|x_+\rangle\langle x_+| \otimes |z_+\rangle\langle z_+| \otimes |z_+\rangle\langle z_+|$ respectively, where $|z_+\rangle = (1, 0)^T$ and $|x_+\rangle = (1, 1)^T/\sqrt{2}$. By inferring the approximate observables $\mathcal{A}$ and $\mathcal{B}$ we verify that the separable measurement saturates the uncertainty relation whereas the two-copy measurement violates it.

As $\mathcal{A}$ and $\mathcal{B}$ commute, the measurement of one does not disturb any subsequent measurement of the other. Hence, our collective measurement violating the LW uncertainty relation can be mapped to a violation of error-disturbance type uncertainty relations [32]. We note here that, for the two-copy measurement, we scale the errors $\varepsilon_A$ and $\varepsilon_B$ by a factor of two, because they are effectively estimating the optimal operator twice. This is the same rescaling as for the two-copy CFI. It is necessary because otherwise the same measurement repeated side by side on independent copies of the system would give a measurement error for the complete system which is at least a factor of two greater than the measurement error for the individual system.
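The factor-of-two rescaling can be motivated with a small numerical check: a separable measurement repeated side by side on two independent copies has exactly twice the single-copy CFI, so dividing by two gives a per-copy quantity. The sketch below assumes the decoherence model $\rho \to (1-\epsilon)\rho + \epsilon I/2$ and uses a hypothetical stand-in POVM rather than the measurements of Eqs. (13) and (20).

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)

def probe_and_derivatives(eps):
    """State and derivatives at theta = 0 (assumed depolarising-type noise)."""
    psi = np.array([1, 0], dtype=complex)
    dpsi = [-0.5j * SX @ psi, -0.5j * SY @ psi]
    rho = (1 - eps) * np.outer(psi, psi.conj()) + eps * I2 / 2
    drho = [(1 - eps) * (np.outer(d, psi.conj()) + np.outer(psi, d.conj()))
            for d in dpsi]
    return rho, drho

def cfi(rho, drho, povm):
    """F_jk = sum_m (d_j p_m)(d_k p_m) / p_m, requires all p_m > 0."""
    p = np.array([np.trace(rho @ E).real for E in povm])
    dp = np.array([[np.trace(d @ E).real for E in povm] for d in drho])
    return (dp / p) @ dp.T

# Illustrative single-copy POVM (hypothetical stand-in).
povm = [(I2 + SX) / 4, (I2 - SX) / 4, (I2 + SY) / 4, (I2 - SY) / 4]

eps = 0.3
rho, drho = probe_and_derivatives(eps)

# Two independent copies and the same measurement applied to each copy.
rho2 = np.kron(rho, rho)
drho2 = [np.kron(d, rho) + np.kron(rho, d) for d in drho]   # product rule
povm2 = [np.kron(E, G) for E, G in product(povm, povm)]

F_single = cfi(rho, drho, povm)
F_two_copy = cfi(rho2, drho2, povm2)
F_per_copy = F_two_copy / 2      # rescaled to account for doubled resources
print(F_single, F_per_copy)
```

A genuinely collective two-copy measurement is compared against the single-copy one on the same per-copy footing, which is why the same factor of two appears for both the CFI and the errors.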
It has been known for some time that the original formulation of the uncertainty principle was not a tight bound, and indeed Heisenberg's uncertainty principle has been violated experimentally [36][37][38]. However, prior to this work, it had not been demonstrated, theoretically or experimentally, that the universally valid uncertainty relations which succeeded Heisenberg's uncertainty principle could be violated. The metrological bound derived by Lu and Wang [30] built on several previous uncertainty relations [32][33][34][35]. There is nothing incorrect in any of these works; however, they all rely on a particular assumption from Ozawa's paper. In Ref. [33] Ozawa states "We assume that any joint measurements are carried out on single systems". It is therefore built into the definition of all of these uncertainty relations that only separable measurements are considered. As collective measurements offer no advantage over separable measurements for pure states, we can expect the above uncertainty relations to hold for pure states.
Supplementary Note 11. Adjusting Lu and Wang's bound to allow for collective measurements

FIG. S7. Schematic for estimating SLD operators. As the SLD operators for this problem do not commute, it is necessary to measure an approximate version of these operators on an extended Hilbert space. $\rho$ is the input state and $|\eta\rangle$ is any ancilla state. The unitary matrices to be implemented are those that satisfy Eq. (62). The two-copy measurement is able to estimate the two-copy SLD operators better than what is allowed by the uncertainty relation, Eq. (56), after accounting for a factor of two rescaling. $\tilde{L}_{x(y)}$ and $\tilde{L}_{x(y),2}$ are the approximate observables measured in the single- and two-copy schemes respectively.

Eq. (19) of the supplemental material of Lu and Wang's paper, Ref. [30], reads
where we have replaced $C_{jk}^2$ with $D_{jk}^2$, which strengthens the inequality. This bound holds when the regret is the difference between the SLD Fisher information and a separable measurement precision, $R_{jj} = \mathcal{F}_{jj} - F^{\mathrm{sep}}_{jj}$. However, when collective measurements are considered, the regret can be reduced. By examining how much the regret can decrease when allowing for collective measurements, the LW uncertainty relation can be altered to account for collective measurements. We denote the CFI for the optimal collective measurement, i.e. a collective measurement on infinitely many copies of the probe state, as $F^{\mathrm{col}}_{jj}$. The regret $R_{jj}$ can then be reduced by a factor $S_{jj}$, and Eq. (67) can be modified accordingly to account for collective measurements, giving Eq. (69). Unfortunately, $S_{jj}$ is not easily computed as there is no known way to find $F^{\mathrm{col}}_{jj}$. Nevertheless, there are still situations where Eq. (69) will be useful. For symmetric problems we have that $F^{\mathrm{col}}_{jj} = F^{\mathrm{col}}_{kk} = 2/H$, where $H$ is the Holevo bound. As the Holevo bound can be computed efficiently, $\min(S_{jj}, S_{kk})$ can be computed efficiently in this case.

Supplementary Note 12. Probability simplex
In order to extract information about $\theta$, varying $\theta$ must change the probability of the measurement outcomes obtained. The different possible probability distributions form a space known as a probability simplex [39]. In Fig. S8 we plot the probability simplex generated by varying $\theta_x$ and $\theta_y$ in the region 0 to $2\pi$, as a geometrical interpretation of our two-copy measurement. As we can only plot a 3D simplex, we combine two of the outcome probabilities into one axis. The three axes of our simplex are $p_1$, $p_2$ and $p_3 + p_4$, where $p_i$ is the probability of obtaining the $i$th measurement outcome from Eq. (20). The probability simplex is shown for different values of the decoherence strength $\epsilon$. When $\epsilon$ increases, the area occupied by the simplex decreases, meaning neighbouring states are harder to distinguish. As $\epsilon \to 1$ the simplex shrinks to the point where all four measurement outcomes are equally likely for all values of $\theta$. At this point it is impossible to discern any information about $\theta$ and so the variance goes to infinity.
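The shrinking of the simplex with increasing decoherence can be reproduced qualitatively in a few lines. The sketch below is not the two-copy measurement of Eq. (20): it uses a hypothetical four-outcome single-copy POVM, and it assumes the rotation order $R_y(\theta_y) R_x(\theta_x)$, the standard form of $R_x$, and the decoherence model $\rho \to (1-\epsilon)\rho + \epsilon I/2$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)

# Illustrative POVM (hypothetical stand-in, not Eq. (20)).
povm = [(I2 + SX) / 4, (I2 - SX) / 4, (I2 + SY) / 4, (I2 - SY) / 4]

def rot_x(t):
    return np.array([[np.cos(t / 2), -1j * np.sin(t / 2)],
                     [-1j * np.sin(t / 2), np.cos(t / 2)]])

def rot_y(t):
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2), np.cos(t / 2)]], dtype=complex)

def outcome_probabilities(theta_x, theta_y, eps):
    """Outcome distribution for the rotated, decohered probe state."""
    psi = rot_y(theta_y) @ rot_x(theta_x) @ np.array([1, 0], dtype=complex)
    rho = (1 - eps) * np.outer(psi, psi.conj()) + eps * I2 / 2
    return np.array([np.trace(rho @ E).real for E in povm])

def simplex_points(eps, n=41):
    """Sample the distributions reached for theta_x, theta_y in [0, 2 pi]."""
    thetas = np.linspace(0, 2 * np.pi, n)
    return np.array([outcome_probabilities(tx, ty, eps)
                     for tx in thetas for ty in thetas])

def to_3d(points):
    """Combine the last two outcomes into one axis: (p1, p2, p3 + p4)."""
    return np.column_stack([points[:, 0], points[:, 1],
                            points[:, 2] + points[:, 3]])

def spread(points):
    """Largest distance of any sampled point from the uniform distribution."""
    uniform = np.full(points.shape[1], 1 / points.shape[1])
    return np.max(np.linalg.norm(points - uniform, axis=1))

spreads = {eps: spread(simplex_points(eps)) for eps in (0.0, 0.5, 1.0)}
coords = to_3d(simplex_points(0.5))
print(spreads)
```

In this model the sampled region contracts linearly in $1 - \epsilon$ towards the uniform distribution, so at $\epsilon = 1$ every $\theta$ produces the same outcome statistics and no information about $\theta$ remains.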