Error-mitigated fermionic classical shadows on noisy quantum devices

Efficiently estimating the expectation values of fermionic Hamiltonians, including $k$-particle reduced density matrices ($k$-RDMs) of an n-mode fermionic state, is crucial for quantum simulations of a wealth of physical systems from the fields of many-body physics, chemistry, and materials. Yet, conventional methods of quantum state tomography are too costly in terms of their resource requirements. Classical shadow (CS) algorithms based on quantum data have been proposed as a solution to address this task by substantially reducing the number of copies of quantum states required. However, the implementation of these algorithms faces a significant challenge due to the inherent noise in near-term quantum devices, leading to inaccuracies in gate operations. To address this challenge, we propose an error-mitigated classical shadow algorithm for fermionic systems. For n-qubit quantum systems, our algorithm, which employs the easily prepared initial state $|0^n\rangle\!\langle 0^n|$ assumed to be noiseless, provably efficiently estimates all elements of $k$-RDMs with $O(kn^k\ln n)$ scaled copies of quantum states and $O(\sqrt{n}\ln^2 n)$ scaled calibration measurements. It does so even in the presence of gate or measurement noise such as depolarizing, amplitude damping, or $X$-rotation noise with at most a constant noise strength. Furthermore, our algorithm exhibits scaling comparable to previous CS algorithms for fermionic systems with respect to the number of quantum state copies, while also demonstrating enhanced resilience to noise. We numerically demonstrate the performance of our algorithm in the presence of these noise sources, as well as its performance under Gaussian unitary noise. Our results underscore the potential utility of implementing our algorithm on near-term quantum devices.


I. INTRODUCTION
Assessing the properties of interacting fermionic systems constitutes one of the core tasks of modern physics, a task that has a wealth of applications in quantum chemistry [1,2], condensed matter physics [3], and materials science [4].Notions of quantum simulation offer an alternative route to studying this important class of systems [5][6][7].In analog simulation, one prepares the system of interest under highly controlled conditions.However, any such effort makes sense only if one has sufficiently powerful readout techniques available that allow one to estimate properties.In fact, the read-out step constitutes a core bottleneck in many schemes for quantum simulation.
Fortunately, for natural fermionic systems, one often does not need to learn the full unknown quantum state; trying to do so regardless would be highly impractical, as the resources required for a full tomographic recovery would scale exponentially with the size of the system.Instead, what is commonly needed are the so-called k-particle reduced density matrices, abbreviated as k-RDMs.These are expectation values of polynomials of fermionic operators of the 2k-th degree.Naturally, the expectation value of any interaction fermionic Hamiltonian can be estimated using 2-RDMs only [8,9].Indeed, the adaptive variational quantum algorithm (VQE) [10] also utilizes up to 4-RDMs to simulate many-body interactions in the ground and excited state [11,12].That is to say, meaningful methods of read-out often focus on estimating such fermionic reduced density matrices.
On the highest level, there are several approaches that can be pursued when dealing with fermionic operators.One of those-and the one followed here-is to treat the fermionic system basically as a collection of spins.Then given spin Hamiltonians {H i } m i=1 and an unknown quantum state ρ, where m = O (poly(n)), the classical shadow (CS) algorithm or its variants [13][14][15][16][17][18][19][20][21][22][23] in qubit systems are among the most promising ways to calculate the expectations Tr (ρH i ), with the representation of the Hamiltonians {H i } m i=1 in the Pauli basis, which invokes a fermion-to-spin mapping such as the Jordan-Wigner [24,25] or Bravyi-Kitaev encodings [26,27].We define the classical shadow channel as M, which involves operating the unitary channel U uniformly randomly sampled from the Clifford group before measurements in the Z-basis measurements and classical postprocessing operations on the measurement outcomes.By performing the inverse of the classical shadow channel M −1 on the resulting state after performing M on the initial state ρ, we obtain the classical shadow representation ρ of the quantum state ρ, allowing for the calculation of the expected values of observables {H i } m i=1 with respect to ρ using classical methods.While the classical shadow algorithm requires exponentially many copies even for some local interacting fermions due to the inefficient representation in the qubit system, recently, several classical shadow algorithms for fermionic systems without encoding of the Hamiltonians have been proposed [28][29][30].Zhao, Rubin, and Miyake [28] utilize the generalized CS method [13] for fermionic systems, and proposed an algorithm that requires O n k k 3/2 (log n)/ε 2 copies for the unknown quantum states to output all the elements of a k-RDM.Low [29] proves that all elements of the k-RDM can be estimated with n k (1 − η−k n ) k 1+n 1+n−k /ε 2 number of copies of the quantum state, where η is the number of particles and n is the number of modes.These fermionic shadow estimation methods (along with the generic classical shadow formalism) do not account for noise in the system, which is an inevitability in real physical systems.
Since we are still in the noisy intermediate-scale quantum (NISQ) era [31][32][33], current quantum simulators are heavily affected by noise; hence, any characterization technique needs to be robust for these simulators to be useful.For qubit systems, robust shadow estimation was developed [34,35] where Chen et al [34] use techniques from randomized benchmarking to mitigate the effect of gate-independent time-stationary Markovian noise channels on the procedure.
Utilizing the robust shadow estimation scheme and taking inspiration from the fermionic shadow estimation of Zhao et al [28,30], we present an error-mitigated shadow estimation scheme for fermionic systems and demonstrate its feasibility for realistic noise channels.
We sample our unitaries U Q from the matchgate group [36], a natural choice for our protocol as there is a one-to-one correspondence between two-qubit matchgates and free-fermionic evolution [37,38].We therefore design the classical postprocessing operations by leveraging the irreducible representation of the matchgate group.We successfully introduce an unbiased estimator M for the noisy classical shadow channel M, where we require an additional calibration protocol to generate the estimator M with the assumption that the computational basis state |0⟩⟨0| can be prepared noiselessly.Additionally, we demonstrate the efficacy of our protocol under conditions of constant noise strength by evaluating its performance across various common noise channels: depolarizing noise, amplitude damping, X-rotation, and Gaussian unitary noise.The number of samples required for the estimation process of our protocol is on the same order as the noise-free matchgate classical shadow scheme [28,30].
We determine the effectiveness of our protocol with the above noise models by calculating the expectations of all elements of the k-particle reduced density matrix (k-RDM) when the noise strength is constant.The number of samples required for estimation, in this case, is O kn k ln(n/δ e )/ε 2 e and for calibration is We have extended the analysis of our error-mitigated fermionic shadow channel estimation to more general physical quantities inspired by the fermionic shadow analysis of Wan et al. [30], with more details in Appendices E and I.We list distinct classical shadow approaches in both noiseless and noisy qubit and fermionic systems in Table I.Our error-mitigated fermionic classical shadow technique constitutes an extension of the work by Chen et al. [34], accommodating scenarios where the gate-set lacks (1) 3-design properties [39] and (2) the applicability of the randomized benchmarking scheme developed by Helsen et al. [40].
We tested the accuracy and efficacy of our protocol by performing numerical experiments to estimate Tr (ρ γ S ) (where , where γ S is the product of |S| Majorana operators and plays a crucial role in computing k-RDMs) on a noisy quantum device subjected to various types of gate noise such as depolarizing, amplitude damping, X-rotation, and Gaussian unitaries.Our numerical investigations confirm the potential of our methods in real-world experimental scenarios.
This work is structured as follows.In Section II, we introduce the basic notations and definitions used throughout the paper.Section III presents our proposed learning process for the noisy fermionic channel.We describe the mitigation algorithm and provide an error analysis in Section IV.We then present numerical results to support our algorithm in Section V. Finally, we conclude the paper in Section VI.

II. PRELIMINARIES
Here we give the basic notations and concepts that will be used throughout this work.
Basic notations.The channel representation of a unitary U is denoted in calligraphic font as U = U (•)U † .The symbols X, Y , and Z denote the Pauli X, Y , and Z operators respectively.The operator R X (θ) = exp −i θ 2 X denotes the rotation operator around the x-axis.A Z-basis measurement is one that is performed with respect to the basis of eigenstates of the Pauli-Z operator.We utilize I to represent the identity operator on the full system.The set of linear operators on a vector space H is denoted as L(H).
Superoperator.We denote the superoperator representation of a linear operator O ∈ L(H) as |O⟩⟩ := O/ Tr (OO † ) and the scaled Hilbert-Schmidt inner product between linear operators as The action of a channel E ∈ L(L(H)) operating on a linear operator O ∈ L(H) can hence be written as E|O⟩⟩ = E(O)/ Tr (OO † ).The channel representation of a measurement with respect to the computational basis can be represented as X = x∈{0,1} n |x⟩⟩⟨⟨x|.We denote the unitary channel corresponding to the unitary operator U as Majorana operator.The Majorana operators γ j for 1 ≤ j ≤ 2n describes the fermionic system with γ j = b (j+1)/2 + b † (j+1)/2 for odd j and γ j = −i(b j/2 − b † j/2 ) for even j, where b j and b † j are the annihilation and creation operators, respectively, associated with the j-th mode.Let γ S be the product of the Majorana operators indexed by the subset S, denoted as γ S = γ l1 • • • γ l |S| for |S| > 0 and γ ∅ = I, where S = l 1 , . . ., l |S| and l 1 < l 2 < . . .< l |S| .It can be shown that γ S forms the complete orthogonal basis for L(H) for S ⊆ [2n].Let Γ k := {γ S | |S| = k} be the subspace of γ S with cardinality k.We denote the even subspace as Γ even := l Γ 2l .Also, we denote P k as the projector onto the subspace Γ k , i.e.
where we have used the notation that for a set A and an integer k, A k = {T ⊆ A : |T | = k} denotes the set of subsets of A with cardinality k.
Gaussian unitaries.Matchgates are in a one-to-one correspondence with the fermionic Gaussian unitaries and can serve as a qubit representation for these unitaries.We denote M n as the matchgate group, and write its elements U Q ∈ M n in terms of rotation matrices Q belonging to the orthogonal group Orth(2n) (see Appendix A for details) [37,38,41,42].Following Wan et al.'s study [30], which demonstrated that the continuous matchgate group M n and the discrete subgroup M n ∩ Cl n (where Cl n represents the Clifford group) deliver equivalent performances for fermionic classical shadows, our findings remain applicable to both continuous and discrete matchgate circuits.Since |S| ) det(Q| SS ′ )γ S ′ .k-particle reduced density matrices (k-RDM).We denote a k-RDM as k D, which can be obtained by tracing out all but k particles.Here we denote it as a tensor with 2k indices, k D j1,...,j k ;l1,..., where j i and l i are in [n] for i ∈ [k].The fermionic system can be equivalently described in the Majorana basis, in which case a tensor can be rewritten as the linear combinations of Tr (ργ S ), and |S| ≤ 2k.Hence all n 2k elements of the k-RDM can be obtained by calculating Tr (ργ S ), for the scaling of O n k different S with |S| ≤ 2k [43].
Pfaffian function.The Pfaffian of a matrix Q ∈ R 2n×2n is defined as which can be calculated in O n 3 time [44].
Ideal fermionic shadow (Wan et al. [30]) Given an unknown quantum state ρ, the classical shadow method first applies a unitary U Q uniformly randomly sampled from matchgate group M n , followed by measuring the generated state in the computational basis.With the measurement result |x⟩⟩, we can generate the classical representation ρ = M −1 U † Q |x⟩⟩ for the unknown quantum state ρ, where the channel M describing the overall process is defined as Noise assumptions.In this work, we assume that the noise is gate-independent, time-independent, and Markovian (a common assumption in randomized benchmarking (RB) abbreviated as the GTM noise assumption [45]) and that the preparation noise for the easily prepared state |0⟩⟨0| is negligible.For the convenience of calculation, we utilize the lefthand side noisy representation for a noisy fermionic unitary U Q := ΛU Q .Here we define the average fidelity in Γ 2k subspace for noise channel Λ as where With some calculations, we have B k = 1 for the noise-free model where noise channel Λ equals the identity.In the following, we give the analysis of the simplified result for B k for several common noise models in the qubit system and fermionic system when k > 1. See more details for the analysis in Appendix C.
(3) The X-rotation noise with the channel representation where R X (θ) = exp(−i n l=1 θ l X l /2), where the noise strengths θ l are some real numbers.By some calculations, we have Noise that is a Gaussian unitary [46,47], where we assume that the noise has no coherence with the environment.This noise channel is denoted as where U Q is a Gaussian unitary.By selecting the noise model to be Λ g , we get k ) det(Q| D(S),D(S ′ ) ).Note that for the noise models defined in (1-3), the average fidelity B k ∈ [0, 1] is close to one when the noise strengths are close to zero.
For comparison, it is worth noting that the average noise fidelity defined in Chen et al. [34], F Λ = 1 2 n b ⟨⟨b|Λ|b⟩⟩ is not equivalent as B k for the same noise model.For instance, for depolarizing noise Λ d , we have

III. LEARNING OF THE NOISY FERMIONIC CHANNEL
This section presents an unbiased estimation approach for the noisy representation of the fermionic shadow channel, which utilizes a protocol similar to the matchgate benchmarking protocol [48].According to representation theory [49] (see details in Appendix A), the noisy fermionic shadow channel can be represented as M = n k=0 f 2k P 2k .Since Tr H M(ρ) = Tr ρ M(H) , with the pre-knowledge of f 2k we can calculate Tr H M(ρ) for any observable H in the even subspace.To learn the 2(n + 1) coefficients, we begin with the easily prepared state ρ 0 = |0⟩⟨0| and apply a noisy unitary channel U with U sampled from the matchgate group.We then perform a Z-basis measurement X with measurement outcomes x ∈ {0, 1} n , followed by classically operating the unitary channel U † on |x⟩⟨x|.The generated state has expected value n k=0 f 2k P 2k (ρ 0 ), and f 2k is obtained by projecting the final state to the P 2k subspace with some classical post-processing.We illustrate the learning process of the noisy channel in Fig. 1 (a).The following theorem provides an unbiased estimation of the noisy fermionic classical shadow.
Theorem 1.The noisy fermionic shadow channel can be represented as M = k f 2k P 2k , where P 2k is defined in Eq. (1), and f2k = 2 n ⟨⟨0|P 2k U † Q |x⟩⟩/ n k is an unbiased estimator of f 2k ∈ R, where |x⟩⟩ is the measurement outcome from the noisy shadow protocol obtained by starting from the input state |0⟩⟩ and applying a noisy quantum circuit U Q followed by a Z-basis measurement, where U Q is uniformly randomly picked from the matchgate group.
The representation of the noisy fermionic channel, denoted by M = k f 2k P 2k , where f 2k ∈ C, can be obtained by the irreducible representation of the Gaussian unitary.A detailed proof of this theorem is provided in Appendix D. We claim that the coefficients f2k can be efficiently calculated with the following lemma. where This lemma can be obtained by Proposition 1 in Ref. [30].For the completeness of this paper, we also give the proof of this lemma in Appendix D. The coefficients can be calculated with the polynomial interpolation method in polynomial time.
With Theorem 1, we can give an unbiased estimation M for M.
By the definition of f2k , and the twirling properties of E U ⊗2 Q , the expectation value for the estimation f2k can be formulated as We postpone the details of this proof to Appendix D. It implies that in the noiseless scenario, M degenerates into M as defined in Equation ( 4).Combined with the definition of B k in Eq. ( 5), f 2k is close to 2n 2k −1 n k if the average noise fidelity in Γ 2k subspace is close to one.Recall that B k is a constant in the depolarizing, amplitude-damping, and X-rotation noises with a constant noise strength, which implies that these noises can be efficiently mitigated with our algorithm.
Alternatively, we have a counterexample in Appendix C that illustrates the limitations of our mitigation algorithm.Specifically, if the noise follows a Gaussian unitary channel U Q where Q is a signed permutation matrix (associated with a discrete Gaussian unitary), then B k may become zero, and hence f 2k = 0, rendering our estimation approach unsuitable.
Recall that our goal is to estimate {Tr (ρH i )} m i=1 using a noisy quantum device and polynomial classical cost, where ρ is an n-qubit quantum state and H i is an observable in the even subspace {Γ 2k } n k=0 .Here we visualize the estimation process with the guarantee in Theorem 1.We uniformly randomly sample a matchgate U Q from the matchgate group and apply it to the quantum state ρ, and then measure in the Z-basis to get outcomes x.We define the estimator Since E M = M, it follows that v is an unbiased estimator of Tr (ρH), i.e.E[v] = Tr (ρH).Note that the estimator defined in Eq. ( 12) is not always efficient for all states ρ and observables H.Here we claim that with this estimator, we can efficiently calculate substantial physical quantities such as the expectation value of k-RDM, which not only serves the variational quantum algorithm (VQE) of a fermionic system with up to k particle interactions [50,51], but also provide supports to the calculations of derivatives of the energy [52,53] and multipole moments [54].It is also an indispensable resource for the error mitigation technique [12,55].It also serves to calculate the overlap between a Gaussian state and any quantum state, and the inner product between a Slater determinant , where UQ is uniformly randomly sampled from the matchgate group.The estimation f2k for f 2k is obtained by projecting M to P 2k subspace with input state ρ0 = |0⟩⟨0|.(b) The shadow estimation ρ for input state ρ with the noisy quantum circuit and classical post-processing with the approximated noisy classical shadow channel in (a), where U Q ′ is uniformly randomly sampled from the matchgate group.and any pure state inspired by the fermionic shadow analysis of Wan et al. [30].We postpone the details to Appendix E.
Note that all elements of k-RDMs can be derived through Tr (ργ S ) for a total of O n k sets S with |S| = 2k.In an expansion of this concept, we now focus on evaluating Tr (ρ γ S ) for O n k different S with |S| = 2k.To calculate the expectation value Tr (ρ γ S ), we set the input quantum state to be ρ and the observable to be H = γ S in the estimation formula of Eq. ( 12), which can then be simplified to where Here, A| S refers to the submatrix obtained by taking the columns and rows of the matrix A that are indexed by S. The simplified quantity can be calculated in polynomial time since fc and the Pfaffian function can be calculated efficiently.We give a detailed proof for the simplification process in Appendix E.
Based on the previous analysis, we can obtain an unbiased estimation of certain quantities with a single sampling.Nevertheless, in order to obtain an accurate estimation, it is necessary to limit the number of samples for the input state.In the following section, we will provide the appropriate number of samplings for both the calibration process (to estimate f 2k for k ranging from 0 to n) and the estimation process.

IV. MITIGATION ALGORITHM AND ERROR ANALYSIS
Let M := n k=0 f2k P 2k be the estimator for the noisy shadow channel M. Lemma 1 implies that calculating f2k can be done in polynomial time.Using the estimated noisy channel, we can now obtain an unbiased estimate for {Tr (ρH j )} m j=1 , where ρ represents certain quantum states and H j denotes certain observables.Algorithm 1 demonstrates the method for mitigated unbiased estimation.Prepare state ρ0 = |0 n ⟩, uniformly sample Q ∈ Orth(2n) or Perm(2n), implement the associated noisy Gaussian unitary UQ on ρ0, and measure in the Z-basis with outcomes x; , Nc, Kc ∀k ∈ [n]; Incorporating the MedianOfMeans sub-procedure, as explained in Ref. [13], guarantees that the number of quantum state copies needed relies on the logarithm of the number of observables.We included the MedianOfMeans subprocedure in Appendix F to ensure the completeness and consistency of this paper.Our error analysis will involve selecting appropriate values for the number of calibrations and estimation samplings N c , N e , K c , and K e to estimate the coefficients f2k associated with the noisy channel.
Let v := Tr M −1 U † (|x⟩⟨x|)H be an estimation of Tr (ρH) for some observables H in even subspaces and quantum states ρ, where x follows the distribution Tr |x⟩⟨x| U(ρ) for x ∈ {0, 1} n , then we have where ε e := v − Tr M −1 M(ρ) is the estimation error and ε c := Tr M −1 M (ρ) H − Tr (ρH) is the calibration error.Therefore, by determining the necessary number of samples N e and K e to achieve the desired level of estimation error ε e (as well as N c and K c to account for the calibration error ε c ), we can obtain an estimation v with an overall error of ε = ε e + ε c , using N e K e copies of the input state ρ.
Theorem 2. Let ρ be an unknown quantum state and be a set of observables in the even subspace Γ even .Consider Algorithm 1 with the number of estimation samplings , and , and the number of calibration samplings with error ε e + ε c and success probability 1 − δ e − δ c , under the assumption that We observe that the sampling for estimation we obtained is consistent with Wan et al. [30] in the absence of noise.In the following, we will provide an analysis of the necessary number of measurements to compute ⟨ k D⟩ using Algorithm 1.To calculate the representation of each element ⟨ k D⟩ j1,...,j k ;l1,...,l k , where j i , l i are in the range and the number of calibration samplings the estimation error can be bounded to ε e + ε c .The equations for R e and R c can be simplified to R e = O kn k ln(n/δe) for the general noises with constant average fidelity B k in subspace Γ 2k for any k ∈ {0, . . ., n}.We give more details for the calculations in Appendix H.However, some types of noise channels, such as certain Gaussian unitary channels present in the related 2n × 2n matrix Q, cannot be mitigated with our mitigation algorithm.In particular, there exists a signed permutation matrix Q for which f 2k = 0, resulting in complete loss of projection for the observable onto the subspace Γ 2k .As a result, it is impossible to calculate Tr (ρ γ S ) for any set S containing 2k elements.We anticipate that the noise in the quantum device will differ significantly from the U Q which belongs to the intersection of the matchgate and Clifford groups.

V. NUMERICAL RESULTS
Here we give the numerical results for the mitigated shadow estimation in the fermionic systems.Since the elements of a k-RDM can be expressed in the form Tr (ργ S ), we give the numerical results of the errors of the estimators for the expectation value of local fermionic observables Tr (ρ γ S ).
The estimator v for Tr (ρ γ S ) can be represented as in Eq. ( 13).Here we choose the number of qubits n = 4 and S = {1, 2}, and γ S = U † Q γ S U Q and Q is uniformly randomly chosen from Perm(2n).The quantum state ρ is a uniformly randomly generated 4-qubit pure state.As shown in Fig. 2, we depict the errors of classical shadow estimators [30] and our mitigated Algorithm 1, with the changes of noise strength (Fig. 2(a)) and the changes of the number of samples (Fig. 2(b)).Here we numerically test the estimations with respect to depolarizing, amplitude damping, X-rotation, and Gaussian unitaries.
The calibration samples for Algorithm 1 are N c = 4, 000 and K c = 20 for all noise models.The number of samples for the classical shadow method is set as N e = 4, 000, K e = 10, and for Algorithm 1 is set as N e = 4, 000/(1 − p noise ) and K e = 10, where p noise is the noise strength varying for different noise settings: (1) Depolarizing noise , where ρ is any quantum state and p is the depolarizing noise strength.In Fig. 2 (a), p varies from 0.05 to 0.3 (p = 0.05j for x-axis equals j where j ∈ [6]), and in Fig. 2 (b), p = 0.2.
(2) Amplitude damping noise Λ a with representation where u̸ =v E † uv E uv .Note that Eq. ( 17) is a generalization of Eq. ( 6), which connects to Eq. ( 6) by setting p uv = pu for any u, v ∈ [n].Here we uniformly randomly choose p uv in 2 −2n [0, 1/6] • j, and labeled it as j in the x-axis of Fig. 2 where j ∈ [6], and choose the generated damping errors for j = 3 case in Fig. 2 (a) as the damping errors for Fig. 2 (b).Here we specially choose the noise strength to be j/6.
(3) X-rotation noise Λ r defined in Eq. ( 7) with noise parameters θ j = π 2 8−j for j ∈ [6], and the noise strength is chosen as 1 − cos θ in Fig. 2 (a), and Fig. 2 (b) depicts the errors in the X-rotation noise with noise parameter θ 6 .The x-axis of the mitigation results of X-rotation noise in Fig. 2 (a) denotes the label for noise parameters cos θ j for j range from 1 to 6.
From Fig. 2, we see that with the increase of the noise strength, the classical shadow method with depolarizing, amplitude damping, and X-rotation noise all gradually diverge to the expected value Tr (ρ γ S ), while our error-mitigated estimation protocol in Alg. 1 gives an expected value that is close to the noiseless value.We also depict the estimated error with the increase in the number of samples in Fig. 2 (b).The numerical results show the estimation results with Alg. 1 converges to the expected value Tr (ρ γ S ), while the convergent value for the classical shadow method is far from the expected value with depolarizing, amplitude damping, and X-rotation noises.In Fig. 2 (b) the number of samples ranges from ⌊900 + 100 exp(j)⌋ for j ∈ {0, . . ., 5}.
(4) We present the resilience of our error-mitigated algorithm by numerically testing the performance of our algorithm under the Gaussian unitary noise channel U Q , where Q is a signed permutation matrix.We compared the errors of the estimation for Tr (ρ γ S ) with our algorithm and the fermionic CS algorithm, and visualize the results in Fig. 3.We choose the Gaussian unitary noise channel to be a signed permutation matrix with coefficient f 1 nonzero, for several randomly chosen signed permutation matrices in Fig. 3 (a).The number of calibration samplings N c and K c are the same as Fig. 2, and the number of estimation samplings N e = 4000, K e = 10 for CS algorithm, and N e = 8000, K e = 10 for Alg. 1. Fig. 3 (b) choose the same noise channel with the fifth noise channel in Fig. 3 (a).From the figure, we see that without mitigation, the error is enormous with the CS algorithm.

VI. CONCLUSION AND DISCUSSION
We present an error-mitigated classical shadow algorithm for noisy fermionic systems, thereby extending matchgate classical shadows for noiseless systems [28,30].With our method, the calibration process requires a number of copies of the classical state |0⟩⟨0| that scales logarithmically with the number of qubits.Assuming a constant average noise fidelity for the noise channel, our algorithm requires the same order of estimation copies as the matchgate classical shadow without error mitigation [30].Our algorithm is applicable for efficiently calculating all the elements of a given k-RDM.
To provide a clearer demonstration of the average fidelity of common noises, we consider depolarizing, amplitude damping, and X-rotation noises.The average fidelity of depolarizing and amplitude damping noises are given by (1 − p) and (1 − u pu ) respectively, where p, pu are the noise parameters.For X-rotation noise, the average fidelity lies between [min θ cos k θ, max θ cos k θ] where θ is the rotated angles.To evaluate the effectiveness of our algorithm in mitigating these noises, we compare its performance with the matchgate CS algorithm to calculate the expectations of γ S , where |S| = 2k, which is crucial for calculating k-RDMs.Our numerical results show good agreement with the theory, validating the effectiveness of our algorithm.
While our algorithm demonstrates good performance in the presence of common types of noise in near-term quantum devices, further investigations are required to explore its potential limitations and improvements.Some of the open questions that can be addressed in future research include: (1) Is it possible to extend our algorithm to handle other types of noise, such as coherent errors and environmental noise [56], or more generally, noise that does not satisfy the GTM assumption?If so, how would these different types of noise impact the performance of our algorithm?
(2) The number of gates required by the matchgate circuit is O(n 2 ) [57].As a result, the accumulation of noise significantly increases the error mitigation threshold [58,59], which raises the intriguing question of whether it is feasible to provide an error-mitigated clas-sical shadow using a shallower circuit.This may be compared with Bertoni et al. [60], who propose a shallower classical shadow approach for qubit systems.
(3) In addition, we have included in the Appendices the analysis and numerical results regarding the overlap between a Gaussian state and any quantum states, as well as the inner product between a Slater determinant and any pure state.This prompts the question of whether our algorithm can be utilized to calculate other physical, chemical, or material properties beyond the scope of this paper.
Exploring these questions would enhance our comprehension of the potential and limitations of our algorithm, and could potentially pave the way for advancements in the estimation of fermionic Hamiltonian expectation values with near-term quantum devices.
Gaussian state.The Gaussian state is defined as The covariance matrix for a Gaussian state is a 2n × 2n matrix with the (k, l)-th element being −iTr ([γ k , γ l ]ρ g ).The computational basis |x⟩⟨x| is a special Gaussian state with the representation and the covariance matrix representation C |x⟩ is equal to The covariance matrix for a general Gaussian state defined in Eq. (B1) is Slater determinant.The τ -fermion Slater determinant |ϕ τ ⟩ is defined as , and bk = n l=1 U kl b l for some unitary U ∈ C n×n .

Appendix C: Analysis for the average fidelity of the noise channel
Recall that the fermionic shadow scheme relies on a twirl of the full space that divides into 2n + 1 even subspaces Γ 2k for k ∈ {0, ..., n}.In this section, we will investigate the range of the average fidelity parameters B k under different noise scenarios, where k ≥ 1.To start, we will prove that in (0) noise-free case, the average noise fidelities B k equals one for any k.
The noise channels under investigation, which are also discussed in the main text, represent typical noise models applicable to various architectures, containing (1) depolarizing noise, a channel that gives the probability of returning the state itself or the maximally mixed state; (2) amplitude-damping noise, a model of physical processes such as spontaneous emission; (3) X-rotation noise, which describes the rotation offset around the x-axis; and (4) Gaussian unitary noise, which is common in continuous variable systems and has the form of a normal distribution [63,64].
Since Λ is a completely positive and trace-preserving map, we can write in its Kraus decomposition such that l E † l E l = I.In the following, we analyze the range of B k for the noise-free model and different common noise models.
(0) By the definition in Eq. ( 5), for a noise-free model where Λ implements the identity transformation I, (1) For the depolarizing noise with channel representation Λ d (A) = (1 − p)A + pTr (A) I 2 n for any linear operator A, where p ∈ [0, 1] is the error probability, we have (2) The amplitude-damping noise can be denoted as the following equation with Kraus decomposition representation where E uv = √ p uv |v⟩⟨u|, and E 0 = I − u,v∈{0,1} n u̸ =v E uv E † uv , and the probabilities satisfy v∈{0,1} n \{u} p uv ≤ 1.
Here we only analyze the bound of B k for the case that all of p uv are the same for v ∈ {0, 1} n \ {u}, and let it be pu since usually p uv are close to each other for a given v. Then the probability that an error happens with output |u⟩ equals v p vu = v pv ∈ [0, 1].Typically, the probability of this error is nearly negligible and is bounded by a constant in an actual quantum device.We leave the analysis of the general amplitude-damping noise to interested readers.Note that u,v∈{0,1} n u̸ =v With the definitions of B k and Λ a we have if k ̸ = 0, and B k = 1 if k = 0. Eq. (C10) holds by the result in Eq. (C8).Since usually the noise strength x px in the device is close to zero, B k is close to one.
(4) For the noise with the representation being a Gaussian unitary channel U Q , we have Below, we provide an illustration where selecting Q from a signed permutation matrix demonstrates that B k has the potential to become zero.Note that Q is a signed permutation and hence det(Q| D(S ′ ),D(S) ) ̸ = 0 iff Q| D(S ′ ),D(S) maps D(S ′ ) to D(S).Let the vector permutation representation of it be σ Q , where is a permutation for some D(P ′ ) where size(P ′ ) = size(P ) =: r.Consequently, B k = 0 if r < k < n.For instance, take σ Q = (3, 4, 1, 5, 6, 8, 7, 2) with n = 4.This leads to B k = 0 when k ∈ {2, 3}, since r = 1 for both of these cases.

Appendix D: The properties of noisy classical fermionic channel
Here we give proofs associated with the properties of the noisy classical fermionic channel.
Proof of Theorem 1.The noisy channel M can be represented as since P 2k forms n + 1 irreducible representations for the unitary channel U Q (See details for the irreducible representations of a group in Appendix A).Then we have By using Lemma 4, we can obtain ⟨⟨0|P 2k |0⟩⟩ = 2 −n n k .Combining this result with the previous findings leads us Below, we present a proof of Lemma 1.
Proof of Lemma 1.
Hence, by Lemma 7 (It can also be obtained by Proposition 1 in Ref. [30]), we have Hence f2k is the coefficient of x k in the polynomial . The polynomial in Eq. (D6) can be calculated in O(r 4 ) time since it has degree at most r, and the Pfaffian function of a 2n × 2n matrix can be calculated in O(n 3 ) time.It can be further optimized to O(r 3 ) time by Appendix D of Ref. [30], where r is the rank of the covariance matrix.
The variance can be bounded for the estimation with the bound of f 2k and E f 2 2k .In the following, we give proofs associated with the calculation of f2k .Since E ⟨⟨0|P 2k U † Q |x⟩⟩ = f 2k ⟨⟨0|P 2k |0⟩⟩, we give the closed form for ⟨⟨0|P 2k |0⟩⟩ with the following lemma. .Hence, by substituting it into Eq.(D7), we obtain where As a generalization of Proposition 2 of Chen et al. [34], we give the bounds for f 2k and E f 2 2k , with prior knowledge of the average noise fidelity B k in the following two lemmas.Lemma 5.
Proof.Since f 2k = E f2k , to find the value of f 2k , we can evaluate the expansion of f2k as follows, (2) where |R (2)
For the noise-free model, We can use a similar technique to show that E f 2 2k is also bounded, as demonstrated in the following lemma.
In Appendix F, we will give the upper bounds for the number of samples R e , R c to bound estimation and calibration errors ε e , ε c respectively, based on the results of the above two lemmas.
Let v be the estimation result obtained by computing We can simplify the estimate of the expectation value of the Majorana operator γ S as follows: where Wan et al. [30] introduce techniques to determine the overlap between a Gaussian state and any quantum state, along with a τ -slater determinant and any pure state.Notably, the distinction between the noisy channel and the original CS channel boils down to the varying coefficients f 2k .To ensure the integrity of our paper, we outline the equations used to calculate these quantities below.One can find detailed information about the deviation in Ref. [30].
(1) Calculate the overlap between an n-qubit Gaussian state with density matrix ρ g and any n-qubit quantum state with density matrix ρ, denoted as Tr (ρρ g ).By letting H = ρ g , the estimator can be calculated efficiently since Tr Hence v can be estimated in polynomial time with the polynomial interpolation method.

Appendix F: Error analysis for estimation process
To limit the estimation error, we first provide the MedianOfMeans method [13], as shown in Algorithm 2. To give a better bound, the number of repetitions is usually set as K = O(Var (v) /ε 2 ) and N = O(ln δ −1 ) in the algorithm, where v is the estimation in a single round.Lemma 8 (Theorem S1 in Ref. [13]).Let S = {v 1 , . . ., vR } be R = N K identical and independent samples from the same distribution with N = 34Var (v) /ε 2 for any v ∈ S, and K = 2 ln(2δ −1 ) for some precision parameters ε, δ ∈ [0, 1], then the estimation v in Algorithm 2 satisfies The required number of samples for accurate estimation can be upper bounded by the variance of estimation v(j) i in a single round of sampling (as shown in Step 14 of Algorithm 1) by utilizing MedianOfMeans method [13].We give the variance of a single estimation v in the following lemma. where Proof.By the definition of v, we have where With the definition of R k1k2k3 in Eq. (A13) and the definition of M, we have where Eq. (F9) holds since f 2k f2k ≤ 1 + ε c for any 0 ≤ k ≤ n from Lemma 10, and Eq.(F10) holds by Lemma 5. Furthermore, where l i = k i /2 for i ∈ [3].By combining all the elements, we arrive at It's worth noting that in the absence of noise, the variance bound is equivalent to the variance described in Ref. [30], where B k = 1 for any k.Using Chebyshev's inequality, we can ensure that the error ε of v is limited to Var(v) with a high degree of certainty.To assess the effectiveness of this bound, we provide a specific bound on the variance for estimating various physical quantities: (0) Calculate Tr (ρ γ S ), where |S| = 2k.If we choose the observable H = γ S , then the variance can be bounded by This can be further simplified to O n k for noises with constant average fidelity in Γ 2k subspace. ( where  .The number of samples in all of these applications is polynomial to the number of qubits n.We give more detailed calculations for the required number of samples in Appendix H.

Appendix G: Error analysis for calibration process
As an extension of the calibration bound introduced in Ref. [34], the subsequent lemma illustrates the minimum calibration error resulting from the estimation of f 2k .Lemma 10.Consider M as the estimated channel for M, where M = n k=0 f2k P 2k .Let ρ represent a quantum state, and let the observable H belong to the subspace Γ even , then Proof.For the quantum state in the even subspace ∪ i Γ 2i , we have With this lemma, we can further give the required number of samplings to bound the error resulting from the estimation of the noisy channel M.
for any i ∈ [m] with success probability 1 − δ c by running the calibration process in Algorithm 1, where H 0 is the noiseless term of the observable H i .
By Lemma 10 and error bound of MedianOfMeans, we can get By substituting the results of E f 2 2k , |f 2k | in Lemma 5 and Lemma 6 into Eq.(G10), and some tedious calculations, we can obtain Theorem 4.
Proof of Theorem 4. By the MedianOfMeans method, we have ) for any error ϵ f k and failure probability δ c .With Lemma 10, let Since k ∈ {0, n} will give us the trivial values of R c = 68 (1+εc) 2 ln(2m/δc) where k ∈ {0, n}.In the following, we assume k ∈ [n − 1], and utilize the technique in concrete mathematics [66] to simplify this bound.By the Stirling formula, with the assumption that n ̸ = 0, we have for some constant c, where l ∈ [n]\ {k, n − k}.Let h(z) = c l √ n z 2 l 2 −ln−z , then h(z) is monotonically increasing for z ∈ (0, l(n − l)).Note that if we let z be k(n − k) in the function h, then we get the function g(l, k).Hence for 0 < k < n 2 , the function g(l, k) is monotonically increasing with respect to k, while for n 2 < k < n−1, the function is monotonically decreasing with respect to k.For simplification, here we assume k is even, the result also holds for odd k with similar calculations.Let k = n 2 we have where γ ≈ 0.5772 is the Euler-Mascheroni constant, and c ′ is a constant.
In contrast, when the input l is either 0, k, or n − k, it is straightforward to verify that g(l, k) = ( 2n 2k ) ( n k ) 2 and it equals

FIG. 1 .
FIG. 1. Schematic diagram of the error-mitigated matchgate classical shadow algorithm.(a) Protocol to learn the noisy classical shadow channel, where M= U † Q • X • Λ • UQ(•) = n k=0 f 2k P 2k (•), where UQ is uniformly randomly sampled from the matchgate group.The estimation f2k for f 2k is obtained by projecting M to P 2k subspace with input state ρ0 = |0⟩⟨0|.(b) The shadow estimation ρ for input state ρ with the noisy quantum circuit and classical post-processing with the approximated noisy classical shadow channel in (a), where U Q ′ is uniformly randomly sampled from the matchgate group.

FIG. 2 .
FIG. 2. The estimation errors for the expectation values of Majorana operators change as the increases of (a) the noise strength and (b) the number of samples for a fixed noise parameter.The error ε of the estimation is obtained by repeating the procedure R = 10 rounds, and estimated as ε =

FIG. 3 .
FIG. 3. The estimation errors for the expectation values of Majorana operators change as (a) randomly selected Gaussian unitary noises and (b) the number of samples increases for a fixed noise parameter.The error ε of the estimation is obtained by repeating the procedure R = 10 rounds.

Lemma 9 .
For any observable H and an unknown quantum state ρ, the estimation v generated in Step 14 of Algorithm 1 for quantity Tr H M −1 M(ρ) satisfies Var (v) ≤(1 + ε c )

2 e. 2 e
n ), by running Algorithm 1, we can bound the estimation error vi − Tr H i M −1 M(ρ) ≤ ε e for any i ∈ [m] with success probability 1 − δ e .Proof.The theorem can be obtained from Lemma 8 with the number of copies for ρ being R e = K e N e , where K e = 2 ln(2mδ −1 e ), and N e = 34Var(v) ε With the bound on the variance in Lemma 9, the value of |f 2k | provided in Lemma 5, and the union bound, we see that with the number of estimation samplings R e = N e K e (F17) = 68 ln(2m/δ e )Var (v) ε ε c ) 2 ln(2m/δ e )

Theorem 4 .
For any given unknown quantum state ρ and observables {H i } m i=1 , with the number of calibration samplings R c = O B max √ n ln n ln(m/δ c ) where B max = max k |B k | and B min = min k |B k |. we can bound the error resulting in the estimation of channel M to

( 2 ) 2 , 2 e 1 FIG. 5 .
FIG. 5.The estimation errors for the inner product of the τ -Slater determinant and a pure state change as (a) the increase of the noise strength and (b) the number of samples increases for a fixed noise parameter.The error ε of the estimation is obtained by repeating the procedure R = 4 rounds.
) Calculate Tr (ρρ g ), where ρ g is a Gaussian state.By choosing the observable H = ρ g , the variance is bounded to max = max k |B k | and B min = min k |B k |.It can be further simplified to O ( We give the details for the simplification of variance in Appendix H.With this lemma and MedianOfMeans method, we can obtain Theorem 3. √ n) if the average fidelities of noise B k are constants for any k ∈ {0, ..., n}.(2) Calculate ⟨ψ|ϕ τ ⟩, where |ϕ τ ⟩ is a τ -fermionic Slater determinant.Let the initial input state of the quantum device be ρ = (|0⟩+|ψ⟩)(⟨0|+⟨ψ|)2, and observable H = |ϕ τ ⟩ ⟨0|, the variance is bounded to O e if the average fidelities of noise B k are constants for any k ∈ {0, . . ., n}.
F19) where vi − Tr H i M −1 M(ρ) ≤ ε e for any i ∈ [m], with success probability 1 − δ e .From this theorem, we can give the estimation of the number of samplings to approximate k-RDM, the overlap of m number of Gaussian state and any quantum state, and m number of τ -fermionic Slater determinant.The number of estimation samplings equals (1) R e = O k ln(n/δe) k n k to estimate k-RDMs; (2) R e = O e to estimate ⟨ψ|ϕ τj ⟩ m j=1