Introduction

It is widely believed that we are now in the noisy intermediate-scale quantum (NISQ) era1, where quantum computers with 50–100 qubits are available while noise in quantum gates severely limits the quantum circuits that can be executed reliably. It is thus important to make the best use of today’s NISQ devices when designing practical applications. One promising scheme for near-term quantum applications is the family of variational quantum algorithms (VQA)2, which have been applied to many tasks including the preparation of Hamiltonian ground and excited states3,4, quantum state distance estimation5,6, and quantum data compression7,8,9. These algorithms evaluate and optimize loss functions that depend on the parameters of parameterized quantum circuits (PQC). They are regarded as well suited to NISQ devices because they combine quantum computers with classical computers. We refer readers to refs. 10,11 for a detailed review of VQA.

Quantum entanglement12, the most nonclassical manifestation of quantum mechanics, has been identified as an invaluable resource enabling a tremendous number of tasks, ranging from quantum information processing13,14, quantum cryptography15,16, and quantum algorithms17,18 to quantum communication19 and measurement-based quantum computing20,21. As such, the ability to manipulate quantum entanglement12,22 is a cornerstone of real applications of quantum technologies. A number of theoretical and experimental methods have been proposed in the past 20 years for entanglement detection and quantification12,23,24. For example, entanglement can be detected via entanglement witnesses25,26, Bell’s inequalities27, the realignment criterion28,29, the range criterion30, and the majorization criterion31. These methods commonly assume that prior information about the target state is known. A direct way to obtain such information is to perform quantum state tomography and reconstruct the density matrix32,33. However, tomography becomes unrealistic because the number of required measurement settings scales exponentially with the size of the system. In short, although many methods have been proposed for detecting and quantifying quantum entanglement, they are not specifically designed for near-term quantum devices and thus are not directly applicable in most cases, which makes reliable detection and quantification of entanglement on near-term quantum devices a vital challenge. Recently, a number of works have aimed to overcome this challenge34,35,36,37. The core idea of these approaches is to perform measurements in randomly sampled bases, leading to ensembles of measurement outcomes whose statistical correlations provide a fingerprint of the system’s entanglement.

In this paper, we combine VQA and the quasi-probability decomposition technique38,39,40,41,42,43,44 to propose the variational entanglement detection (VED) and variational logarithmic negativity estimation (VLNE) algorithms, contributing new approaches to detect and quantify quantum entanglement on near-term quantum devices, respectively. VED uses criteria based on positive maps as a bridge and works as follows. Given an unknown target bipartite quantum state, it first decomposes the chosen positive map into a linear combination of NISQ-implementable quantum operations. Then, it variationally estimates the minimal eigenvalue of the final state, which is obtained by executing these quantum operations on the target state and averaging the output states. Two methods are proposed to compute the average: the first averages the output states according to the quasi-probability distribution, and the second estimates the average by sampling and is therefore probabilistic. Finally, it asserts that the target state is entangled if the optimized minimal eigenvalue is negative. Following the idea of VED, VLNE variationally computes the well-known log-negativity entanglement measure, building on a linear decomposition of the transpose map into Pauli terms and the recently proposed trace distance estimation algorithm. Experimental and numerical results support the validity of the proposed entanglement detection and quantification methods.

Results

Quantum entanglement detection

In this section, we integrate variational quantum algorithms with the quasi-probability decomposition technique38,39,40,41,42,43,44 to propose a bipartite entanglement detection framework specifically designed for near-term quantum devices, using the positive map criterion as a bridge. For simplicity, we assume A and B are two n-qubit quantum systems throughout this section. However, we remark that the proposed framework applies directly to bipartite systems whose subsystems have different dimensions.

Let Δ be a discrete set of quantum operations that are implementable in the near-term quantum devices. For example, one may choose Δ to be the set of implementable operations introduced in42,43. Alternatively, one may set Δ to be the set of Pauli channels induced by Pauli operators from the Pauli set [see Supplementary Note 1]. For a positive (but not completely positive) and trace-preserving map \({{{{\mathcal{N}}}}}_{B\to B}\), we assume that it can be decomposed w.r.t. Δ as

$${{{\mathcal{N}}}}(\cdot )=\mathop{\sum}\limits_{{{{\mathcal{O}}}}\in {{\Delta }}}{r}_{{{{\mathcal{O}}}}}{{{\mathcal{O}}}}(\cdot ),\,{r}_{{{{\mathcal{O}}}}}\in {\mathbb{R}}.$$
(1)

Note that such a decomposition always exists if Δ contains a universal basis42. The trace-preserving condition imposes \({\sum }_{{{{\mathcal{O}}}}}{r}_{{{{\mathcal{O}}}}}=1\). We emphasize that there must exist negative coefficients \({r}_{{{{\mathcal{O}}}}}\) since otherwise, \({{{\mathcal{N}}}}\) is completely positive. Given a bipartite quantum state ρAB, we have

$${\sigma }_{AB}:= {{{{\mathcal{N}}}}}_{B\to B}({\rho }_{AB})=\mathop{\sum}\limits_{{{{\mathcal{O}}}}\in {{\Delta }}}{r}_{{{{\mathcal{O}}}}}{{{{\mathcal{O}}}}}_{B\to B}({\rho }_{AB}).$$
(2)

To see if ρAB can be detected by \({{{\mathcal{N}}}}\), i.e., if ρAB is entangled from \({{{\mathcal{N}}}}\)’s perspective, we need to check if the output state σAB has a negative eigenvalue or not. Denote by \({\lambda }_{\min }({\sigma }_{AB})\) the smallest eigenvalue of σAB. By the positive map criterion, if ρAB is separable, then it must hold that \({\lambda }_{\min }({\sigma }_{AB})\ge 0\). Equivalently, if \({\lambda }_{\min }({\sigma }_{AB}) \,< \, 0\), we safely conclude that ρAB is entangled and it can be detected by the positive map \({{{{\mathcal{N}}}}}_{B\to B}\). This highlights the importance of computing or estimating \({\lambda }_{\min }({\sigma }_{AB})\) in entanglement detection.

Deterministic detection

Because \({{{\mathcal{N}}}}\) is not completely positive, it does not represent a physically implementable quantum operation, so σAB cannot be obtained directly by applying \({{{\mathcal{N}}}}\) to ρAB. Fortunately, the decomposition in Eq. (2) gives us an effective way to simulate the role of \({{{\mathcal{N}}}}\) and reconstruct σAB as a weighted average of a set of output states, each obtained using quantum circuits implementable on near-term devices. This decomposition technique, combined with the variational quantum algorithm, enables a general framework that estimates \({\lambda }_{\min }({\sigma }_{AB})\), whose value can witness the entanglement of the input state ρAB. We call this framework variational entanglement detection (VED). The core idea is to use the linear decomposition in Eq. (2) of the target state σAB, and the framework goes as follows. First of all, it holds that (see Supplementary Note 1)

$${\lambda }_{\min }({\sigma }_{AB})=\mathop{\min }\limits_{{\left|\psi \right\rangle }_{AB}}\left\langle \psi \right|{\sigma }_{AB}\left|\psi \right\rangle$$
(3)
$$=\mathop{\min }\limits_{{\left|\psi \right\rangle }_{AB}}\mathop{\sum}\limits_{{{{\mathcal{O}}}}\in {{\Delta }}}{r}_{{{{\mathcal{O}}}}}\left\langle \psi \right|{{{\mathcal{O}}}}({\rho }_{AB})\left|\psi \right\rangle ,$$
(4)

where the minimization ranges over all pure bipartite quantum states \({\left|\psi \right\rangle }_{AB}\) in AB. We use a variational quantum circuit with parameters α to prepare the test state \(\left|\psi \right\rangle\). More precisely, we choose a parameterized quantum circuit ansatz that generates a unitary U(α) and prepares the test state via \(\left|\psi \right\rangle =U({{{\boldsymbol{\alpha }}}}){\left|0\right\rangle }^{\otimes 2n}\). Each inner product \(\left\langle \psi \right|{{{\mathcal{O}}}}(\rho )\left|\psi \right\rangle\) in Eq. (4) can be estimated via the canonical Swap Test subroutine45, as both U(α) and \({{{\mathcal{O}}}}\) can be implemented on near-term devices. However, this subroutine costs a total of 4n + 1 qubits and requires a 4n-qubit SWAP gate, which is resource-consuming when n becomes large. Here we exploit the special structure of the overlap \(\left\langle \psi \right|{{{\mathcal{O}}}}(\rho )\left|\psi \right\rangle\) and propose a qubit-efficient estimation procedure that uses 2n qubits and avoids the expensive SWAP gate. First of all, notice that

$$\left\langle \psi ({{{\boldsymbol{\alpha }}}})\right|{{{\mathcal{O}}}}({\rho }_{AB})\left|\psi ({{{\boldsymbol{\alpha }}}})\right\rangle$$
(5)
$$=\left\langle {0}^{2n}\right|{U}_{{{{\boldsymbol{\alpha }}}}}^{{\dagger} }{{{\mathcal{O}}}}({\rho }_{AB}){U}_{{{{\boldsymbol{\alpha }}}}}\left|{0}^{2n}\right\rangle$$
(6)
$$={{{\rm{Tr}}}}\left[{U}_{{{{\boldsymbol{\alpha }}}}}^{{\dagger} }{{{\mathcal{O}}}}({\rho }_{AB}){U}_{{{{\boldsymbol{\alpha }}}}}\left|{0}^{2n}\right\rangle \left\langle {0}^{2n}\right|\right] ,$$
(7)

where the second equality follows from the cyclic property of the trace function. Since each \({{{\mathcal{O}}}}\) is implementable on near-term quantum devices, we may use ρAB as input to the quantum circuit implementing \({{{\mathcal{O}}}}\), and estimate the overlap \(\left\langle \psi \right|{{{\mathcal{O}}}}({\rho }_{AB})\left|\psi \right\rangle\) using the quantum circuit illustrated in Fig. 1. The overlap is obtained by counting the relative frequency of the measurement outcome \({0}^{2n}\). Then, we repeat the estimation procedure \(| {{\Delta }}|\) times, where \(| {{\Delta }}|\) is the size of Δ, to obtain the overlaps for the different \({{{\mathcal{O}}}}\) in Eq. (4). With these data in hand, we compute the following loss function:

$$L({{{\boldsymbol{\alpha }}}}):= \mathop{\sum}\limits_{{{{\mathcal{O}}}}\in {{\Delta }}}{r}_{{{{\mathcal{O}}}}}\left\langle \psi ({{{\boldsymbol{\alpha }}}})\right|{{{\mathcal{O}}}}({\rho }_{AB})\left|\psi ({{{\boldsymbol{\alpha }}}})\right\rangle .$$
(8)
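The overlap estimate entering Eq. (8) can be illustrated with a small classical simulation. The following is a minimal NumPy sketch, not the implementation used in this work: the helper names (`random_unitary`, `random_density_matrix`, `zero_outcome_probability`, `estimate_overlap`) are hypothetical, the random matrix `rho_out` merely stands in for \({{{\mathcal{O}}}}({\rho }_{AB})\), and finite-shot measurement is modeled by a binomial draw. It checks that the probability of the all-zero outcome after applying U† equals the overlap in Eq. (7).

```python
import numpy as np

def random_unitary(dim, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(g)
    return q

def random_density_matrix(dim, rng):
    """Random positive semidefinite matrix with unit trace."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho)

def zero_outcome_probability(rho_out, u):
    """Probability of outcome 0...0 after applying U^dagger to rho_out (Eq. 7)."""
    rotated = u.conj().T @ rho_out @ u
    return float(np.real(rotated[0, 0]))

def estimate_overlap(rho_out, u, shots, rng):
    """Finite-shot estimate: relative frequency of the all-zero outcome."""
    p0 = min(max(zero_outcome_probability(rho_out, u), 0.0), 1.0)
    return rng.binomial(shots, p0) / shots

rng = np.random.default_rng(7)
u = random_unitary(4, rng)               # stands in for U(alpha) on 2n = 2 qubits
rho_out = random_density_matrix(4, rng)  # stands in for O(rho_AB)
psi = u[:, 0]                            # |psi> = U|00>
exact = float(np.real(psi.conj() @ rho_out @ psi))
```

Here `exact` is the overlap computed directly from the state vector, while `zero_outcome_probability` computes it the way the circuit in Fig. 1 does; the two agree up to floating-point error.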

We remark that this loss function is a global cost since it requires measuring the probability of the all-zero outcome, which may lead to barren plateaus46. We provide detailed discussions and potential solutions in the section “Resource cost and barren plateaus”. In particular, it would be interesting to adapt the technique of ref. 46 to define a local version with better scalability and trainability. Finally, we use gradient-based optimization methods such as SGD47 and Adam48 to minimize the loss function L(α) by varying the parameters α; the resulting value determines whether the input state ρAB is detected as entangled. More precisely, if L(α) is negative, we conclude that ρAB is entangled, since by the positive map criterion, separable states cannot yield a negative spectrum.

Fig. 1: Circuit estimating the overlap \(\left\langle \psi \right|{{{\mathcal{O}}}}({\rho }_{AB})\left|\psi \right\rangle\) in Equation (7).

This simplified quantum circuit estimates the overlap \(\left\langle \psi \right|{{{\mathcal{O}}}}({\rho }_{AB})\left|\psi \right\rangle\) in Eq. (7) for a given implementable operation \({{{\mathcal{O}}}}\), where \(\left|\psi \right\rangle := {U}_{{{{\boldsymbol{\alpha }}}}}{\left|0\right\rangle }^{\otimes 2n}\) is the parameterized input state.

Taking into account the noise in NISQ devices, we may introduce a tolerance threshold δ > 0 so that L(α) < −δ implies the input state is entangled. This threshold can be set using prior knowledge of the noise characteristics of the device. Moreover, for the purpose of entanglement detection it is unnecessary to fully minimize L(α), since the condition L(α) < 0 is sufficient to assert that the input state is entangled. Based on this observation, we can terminate the optimization early to save optimization cost. It has been heuristically observed that the loss function achieves lower values with noise-free training than with noisy training49,50,51; the intuition is that hardware noise flattens the cost landscape and hence reduces gradient magnitudes52. Based on this observation and the fact that, under the positive map criterion, separable states yield only non-negative eigenvalues, we conclude that the optimized loss function for separable states will always be non-negative, and thus our algorithm will not lead to false-positive results.

The detailed VED framework is summarized in Algorithm 1 and illustrated in Fig. 2. We name it the deterministic VED to distinguish it from the probabilistic framework described in the next section.

Fig. 2: The VED framework for detecting entanglement on near-term quantum devices.

The difference between the deterministic VED in Algorithm 1 and the probabilistic VED in Algorithm 2 lies in how the decomposed quantum operations are sampled.

Algorithm 1

Deterministic VED

1: Input: 2n-qubit quantum state ρAB, decomposition in Eq. (1) of the positive map \({{{\mathcal{N}}}}\), parameterized quantum circuit U(α) with initial parameters α, and tolerance δ;

2: Initialize L(α) = 0;

3: for all \({{{\mathcal{O}}}}\in {{\Delta }}\) such that \({r}_{{{{\mathcal{O}}}}}\,\ne\, 0\) do

4: Apply Uα to \({\left|0\right\rangle }^{\otimes 2n}\) and obtain test state \(\left|\psi \right\rangle ={U}_{{{{\boldsymbol{\alpha }}}}}{\left|0\right\rangle }^{\otimes 2n}\);

5: Input ρAB and compute the overlap \({c}_{{{{\mathcal{O}}}}}:= \left\langle \psi \right|{{{\mathcal{O}}}}({\rho }_{AB})\left|\psi \right\rangle\) using the quantum circuit in Fig. 1;

6: Update the loss function \(L({{{\boldsymbol{\alpha }}}})=L({{{\boldsymbol{\alpha }}}})+{r}_{{{{\mathcal{O}}}}}{c}_{{{{\mathcal{O}}}}}\), where \({r}_{{{{\mathcal{O}}}}}\) is given by the decomposition in Eq. (1);

7: end for

8: Perform optimization methods to minimize L(α); terminate the optimization if the error tolerance is satisfied: L(α) < −δ;

9: Output “Entangled” if the optimized L(α) < −δ.

Probabilistic detection

In Algorithm 1, we have used a brute-force approach, where we iterate over the set of implementable operations Δ, to estimate the loss function L(α). Alternatively, L(α) can be estimated probabilistically by sampling, by virtue of the quasi-probability decomposition in Eq. (1). This method is beneficial when the number of decomposed operations in Eq. (1) with non-zero coefficients is large while the sampling cost is relatively low. We now describe the sampling method in detail. First of all, notice that the decomposition in Eq. (1) induces a quasi-probability distribution \({\{{r}_{{{{\mathcal{O}}}}}\}}_{{{{\mathcal{O}}}}\in {{\Delta }}}\) over Δ. From this quasi-probability distribution, we can construct a probability distribution \({\{{p}_{{{{\mathcal{O}}}}}\}}_{{{{\mathcal{O}}}}\in {{\Delta }}}\) using the canonical technique, i.e.,

$${p}_{{{{\mathcal{O}}}}}:= \frac{| {r}_{{{{\mathcal{O}}}}}| }{\gamma },\quad \gamma := \mathop{\sum}\limits_{{{{\mathcal{O}}}}\in {{\Delta }}}| {r}_{{{{\mathcal{O}}}}}| .$$
(9)

Substituting Eq. (9) into Eq. (8) yields

$$L({{{\boldsymbol{\alpha }}}})=\gamma \mathop{\sum}\limits_{{{{\mathcal{O}}}}\in {{\Delta }}}{{{\rm{sgn}}}}({r}_{{{{\mathcal{O}}}}}){p}_{{{{\mathcal{O}}}}}\left\langle \psi ({{{\boldsymbol{\alpha }}}})\right|{{{\mathcal{O}}}}({\rho }_{AB})\left|\psi ({{{\boldsymbol{\alpha }}}})\right\rangle$$
(10)
$$={{\mathbb{E}}}_{{{{\mathcal{O}}}}}\left[\gamma {{{\rm{sgn}}}}({r}_{{{{\mathcal{O}}}}})\left\langle \psi ({{{\boldsymbol{\alpha }}}})\right|{{{\mathcal{O}}}}({\rho }_{AB})\left|\psi ({{{\boldsymbol{\alpha }}}})\right\rangle \right],$$
(11)

where \({\mathbb{E}}(X)\) denotes the expectation of the random variable X, and the expectation in Eq. (11) is evaluated w.r.t. the probability distribution \({\{{p}_{{{{\mathcal{O}}}}}\}}_{{{{\mathcal{O}}}}\in {{\Delta }}}\). Based on Eq. (11), we propose Algorithm 2, which can be viewed as a probabilistic version of Algorithm 1. In particular, Algorithm 2 replaces the brute-force approach (Steps 3–7) in Algorithm 1 with the sampling approach.

We now analyze Algorithm 2 in more depth. First, we remark that the \(L^{\prime} ({{{\boldsymbol{\alpha }}}})\) obtained in step 10 of Algorithm 2 is an unbiased estimator of the true value L(α) due to Eq. (11). Second, since each overlap lies in [0, 1] and hence \(| {L}^{(m)}| \le \gamma\), we can apply Hoeffding's inequality53 to ensure that \(M=2{\gamma }^{2}\log (2/\varepsilon )/{\delta }^{2}\) samples estimate the true value L(α) within error δ with success probability no less than 1−ε, i.e.,

$$p\left(| L^{\prime} ({{{\boldsymbol{\alpha }}}})-L({{{\boldsymbol{\alpha }}}})| \le \delta \right)\ge 1-\varepsilon .$$
(12)

This confirms the validity of the sampling procedure (steps 4–9) of Algorithm 2. We call γ the sampling cost since it determines M, the number of samples required to achieve the desired precision. Finally, we examine the success probability of the algorithm, given the success probability condition in Eq. (12) of the sampling procedure. Assume the optimization procedure repeats K times. The overall success probability of Algorithm 2 is no less than 1−Kε, as a direct corollary of Eq. (12) and the union bound. That is to say, if Algorithm 2 outputs “Entangled”, ρAB is entangled with probability at least 1−Kε.
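To make the sample-complexity statement concrete, the short sketch below evaluates the Hoeffding sample count M and the union-bound guarantee 1−Kε. It is only an illustration: the function names `num_samples` and `overall_success_lower_bound` are hypothetical, the logarithm is assumed to be the natural logarithm, and M is rounded up to an integer.

```python
import math

def num_samples(gamma, delta, eps):
    """M = 2 * gamma^2 * log(2 / eps) / delta^2, rounded up (natural log assumed)."""
    return math.ceil(2 * gamma ** 2 * math.log(2 / eps) / delta ** 2)

def overall_success_lower_bound(eps, k):
    """Union bound over K optimization iterations: at least 1 - K * eps."""
    return max(0.0, 1.0 - k * eps)

# Illustrative values only: gamma = 2 corresponds to the qubit transpose map.
m = num_samples(gamma=2.0, delta=0.1, eps=0.05)
bound = overall_success_lower_bound(eps=0.05, k=10)
```

Because M grows quadratically in γ, doubling the sampling cost quadruples the number of samples needed for the same δ and ε.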

To summarize, we have proposed two variational entanglement detection methods. Algorithm 1 is deterministic in the sense that whenever it outputs “Entangled”, one can safely assert that ρAB is entangled. On the other hand, Algorithm 2 is probabilistic in the sense that even if it outputs “Entangled”, one can only declare that ρAB is entangled with a certain success probability. Nevertheless, when the number of decomposed operations in Eq. (1) with non-zero coefficients is large while the sampling cost γ is relatively low, the latter method may be beneficial. In this case, one can reduce the number of iterations via sampling and thus save computational resources. Algorithm 2 sacrifices precision for efficiency in entanglement detection.

Algorithm 2

Probabilistic VED

1: Input: 2n-qubit quantum state ρAB, decomposition in Eq. (1) of the positive map \({{{\mathcal{N}}}}\), parameterized quantum circuit U(α) with initial parameters α, error tolerance δ, and fail probability ε.

2: Initialize \(L^{\prime} ({{{\boldsymbol{\alpha }}}})=0\);

3: Compute γ defined in Eq. (9) and set \(M=2{\gamma }^{2}\log (2/\varepsilon )/{\delta }^{2}\);

4: for all m = 1, …, M do

5: Apply Uα to \({\left|0\right\rangle }^{\otimes 2n}\) and obtain test state \(\left|\psi \right\rangle ={U}_{{{{\boldsymbol{\alpha }}}}}{\left|0\right\rangle }^{\otimes 2n}\);

6: Sample a quantum operation \({{{{\mathcal{O}}}}}^{(m)}\) from Δ according to the probability distribution \({\{{p}_{{{{\mathcal{O}}}}}\}}_{{{{\mathcal{O}}}}\in {{\Delta }}}\) in Eq. (9); let \({r}^{(m)}\) be the coefficient of \({{{{\mathcal{O}}}}}^{(m)}\) in Eq. (1);

7: Input ρAB and compute the overlap \({c}^{(m)}:= \left\langle \psi \right|{{{{\mathcal{O}}}}}^{(m)}(\rho )\left|\psi \right\rangle\) using the quantum circuit in Fig. 1;

8: Compute \({L}^{(m)}=\gamma {{\mathrm{sgn}}}\,({r}^{(m)}){c}^{(m)}\);

9: end for

10: Compute the loss function \(L^{\prime} ({{{\boldsymbol{\alpha }}}})=\frac{1}{M}\mathop{\sum }\nolimits_{m = 1}^{M}{L}^{(m)}\);

11: Perform optimization methods to minimize \(L^{\prime} ({{{\boldsymbol{\alpha }}}})\); terminate the optimization if the error tolerance is satisfied: \(L^{\prime} ({{{\boldsymbol{\alpha }}}}) < -\delta\);

12: Output “Entangled” if the optimized \(L^{\prime} ({{{\boldsymbol{\alpha }}}}) < -\delta\).
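The sampling loop of Algorithm 2 (steps 3–10) can be sketched for a fixed test state as follows. This is a minimal classical simulation under stated assumptions: the function names are hypothetical, the overlaps are computed exactly rather than from measurement statistics, the logarithm in M is taken as natural, and the decomposition used is that of the qubit transpose map in Eq. (18) acting on system B.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1, -1]).astype(complex)

COEFFS = np.array([0.5, 0.5, -0.5, 0.5])        # r_O for I, X, Y, Z on B
OPS = [np.kron(I2, p) for p in (I2, X, Y, Z)]   # Pauli channels on qubit B

def overlaps(rho, psi):
    """c_O = <psi| O(rho) |psi> for every operation in the decomposition."""
    return np.array([np.real(psi.conj() @ op @ rho @ op.conj().T @ psi)
                     for op in OPS])

def exact_loss(rho, psi):
    """Deterministic value L = sum_O r_O c_O, Eq. (8)."""
    return float(COEFFS @ overlaps(rho, psi))

def sampled_loss(rho, psi, delta, eps, rng):
    """Quasi-probability estimate L' of Eq. (11) from M sampled operations."""
    gamma = np.abs(COEFFS).sum()                 # sampling cost, Eq. (9)
    probs = np.abs(COEFFS) / gamma
    m_total = int(np.ceil(2 * gamma ** 2 * np.log(2 / eps) / delta ** 2))
    c = overlaps(rho, psi)                       # exact here; measured on hardware
    idx = rng.choice(len(OPS), size=m_total, p=probs)
    samples = gamma * np.sign(COEFFS[idx]) * c[idx]   # L^(m), step 8
    return float(samples.mean()), m_total

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)  # fixed test state
estimate, m_total = sampled_loss(rho_bell, singlet, delta=0.1, eps=0.01,
                                 rng=np.random.default_rng(0))
```

For this input the exact loss is −1/2, and the sampled estimate is expected to fall within δ of it with probability at least 1−ε.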

Prominent positive maps

In the section “Quantum entanglement detection” we have outlined the general deterministic and probabilistic VED frameworks for detecting entanglement via the positive map criterion. In this section, we elaborate on three prominent positive maps, namely the transpose map54, the reduction map55, and the enhanced reduction map56,57, to illustrate how the deterministic VED framework works. We choose the set of NISQ-implementable quantum operations Δ to be the set of Pauli channels induced by Pauli operators from the Pauli set \({{{{\boldsymbol{P}}}}}_{n}\), i.e.,

$${{\Delta }}:= \left\{{{{\mathcal{P}}}}\,| \,{{{\mathcal{P}}}}(\cdot )=P(\cdot ){P}^{{\dagger} },P\in {{{{\boldsymbol{P}}}}}_{n}\right\}.$$
(13)

For each of the three positive maps under consideration, we first decompose it w.r.t. Δ as in Eq. (1) and then adopt the variational framework summarized in Algorithm 1 to carry out entanglement detection. However, we note that not all positive maps can be decomposed w.r.t. the set of Pauli channels.

We make the following remarks on the three criteria under consideration. First, the reduction criterion is strictly weaker than both the transpose criterion and the enhanced reduction criterion, in the sense that every state detected by the reduction criterion is also detected by the latter two criteria. Second, there is no inclusion relation between the transpose criterion and the enhanced reduction criterion. That is, there are states that can be detected by one but not by the other. As such, given an unknown state, one may execute VED twice, once with the PPT criterion and once with the enhanced reduction criterion. The state is necessarily entangled if at least one of these two runs outputs “Entangled”. We also show by example how VED works in qutrit systems in Supplementary Note 2, utilizing the Choi map58,59.

PPT criterion

A necessary condition for separability is the positive partial transpose (PPT) criterion54, which we briefly review as follows. Let ρAB be a bipartite quantum state. We can express it as

$${\rho }_{AB}=\mathop{\sum}\limits_{ijkl}{\alpha }_{ijkl}\left|i\right\rangle \,{\left\langle j\right|}_{A}\otimes \left|k\right\rangle \,{\left\langle l\right|}_{B},$$
(14)

where \({\{\left|i\right\rangle \}}_{i}\) and \({\{\left|k\right\rangle \}}_{k}\) are the computational bases of A and B, respectively. Its partial transpose with respect to system B is defined as

$${\rho }_{AB}^{{T}_{B}}:= ({{{{\rm{id}}}}}_{A}\otimes {T}_{B})({\rho }_{AB})$$
(15)
$$=\mathop{\sum}\limits_{ijkl}{\alpha }_{ijkl}\left|i\right\rangle \,\left\langle j\right|\otimes {(\left|k\right\rangle \left\langle l\right|)}^{{\mathrm {T}}}$$
(16)
$$=\mathop{\sum}\limits_{ijkl}{\alpha }_{ijkl}\left|i\right\rangle \,\left\langle j\right|\otimes \left|l\right\rangle \,\left\langle k\right|,$$
(17)

where TB denotes the transpose map on system B. The PPT criterion says that if ρAB is separable, then \({\rho }_{AB}^{{T}_{B}}\ge 0\). Contrapositively, a negative eigenvalue of \({\rho }_{AB}^{{T}_{B}}\) witnesses the entanglement of ρAB. Moreover, the PPT criterion is not only necessary but also sufficient for separability in the 2 × 2 and 2 × 3 cases25,60,61.
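As a concrete illustration of the criterion, the following sketch computes the partial transpose of Eq. (17) directly on a density matrix and inspects its smallest eigenvalue. It is a classical reference computation, not part of the variational procedure, and the function names are hypothetical.

```python
import numpy as np

def partial_transpose_b(rho, dim_a, dim_b):
    """Partial transpose on B: |i><j| x |k><l|  ->  |i><j| x |l><k|."""
    # After reshaping, the axes are ordered (i, k, j, l).
    r = rho.reshape(dim_a, dim_b, dim_a, dim_b)
    # Swapping the two B axes (k and l) implements the transpose on B.
    return r.transpose(0, 3, 2, 1).reshape(dim_a * dim_b, dim_a * dim_b)

def min_eigenvalue(op):
    """Smallest eigenvalue of a Hermitian operator."""
    return float(np.linalg.eigvalsh(op)[0])

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())           # maximally entangled state
rho_prod = np.diag([1.0, 0.0, 0.0, 0.0])         # product state |00><00|

lam_bell = min_eigenvalue(partial_transpose_b(rho_bell, 2, 2))
lam_prod = min_eigenvalue(partial_transpose_b(rho_prod, 2, 2))
```

The Bell state yields a smallest eigenvalue of −1/2, so it is detected, while the product state yields a non-negative spectrum.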

We begin with the case of a two-qubit bipartite quantum state. Notice that the qubit transpose map admits the following decomposition w.r.t. the set Δ specified in Eq. (13):

$$T(\rho )=\frac{\rho +X\rho X-Y\rho Y+Z\rho Z}{2},$$
(18)

where X, Y, Z are the Pauli matrices. The validity of this decomposition can be checked by direct calculation. Substituting Eq. (18) into Eq. (17), we obtain

$${\rho }_{AB}^{{T}_{B}}:= ({{{{\rm{id}}}}}_{A}\otimes {T}_{B})({\rho }_{AB})$$
(19)
$$=\frac{1}{2}(\rho +{X}_{B}\rho {X}_{B}-{Y}_{B}\rho {Y}_{B}+{Z}_{B}\rho {Z}_{B}),$$
(20)

where the quantum operation XBρABXB should be understood as \(({I}_{A}\otimes {X}_{B}){\rho }_{AB}({I}_{A}\otimes {X}_{B})\), and similarly for YBρYB and ZBρZB. Inserting the decomposition in Eq. (20) into Algorithm 1, we can apply the proposed VED to implement the PPT criterion in the qubit case.
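The decompositions in Eqs. (18) and (20) can be checked numerically. The sketch below, with hypothetical function names, compares the Pauli-channel expressions against a direct transpose on random matrices; since both sides are linear, agreement on arbitrary matrices implies agreement on all states.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1, -1]).astype(complex)

def transpose_via_paulis(rho):
    """Eq. (18): qubit transpose as a signed mixture of Pauli conjugations."""
    return (rho + X @ rho @ X - Y @ rho @ Y + Z @ rho @ Z) / 2

def partial_transpose_via_paulis(rho_ab):
    """Eq. (20): the same decomposition applied to qubit B only."""
    xb, yb, zb = (np.kron(I2, p) for p in (X, Y, Z))
    return (rho_ab + xb @ rho_ab @ xb - yb @ rho_ab @ yb + zb @ rho_ab @ zb) / 2

def partial_transpose_reference(rho_ab):
    """Direct partial transpose on B by swapping the B indices."""
    return rho_ab.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)

rng = np.random.default_rng(1)
single = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
double = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

ok_single = np.allclose(transpose_via_paulis(single), single.T)
ok_double = np.allclose(partial_transpose_via_paulis(double),
                        partial_transpose_reference(double))
```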

We now show that the above detection method can be generalized to the multi-qubit bipartite case. Let B ≡ B1B2⋯Bn be a composite system with n qubits, i.e., Bi represents the ith qubit system. A key observation is that the transpose operation satisfies the tensor product property: transposing the composite system B is equivalent to transposing the local qubit systems Bi individually. More precisely,

$${T}_{{{{\boldsymbol{B}}}}}=\mathop{\bigotimes }\limits_{i=1}^{n}{T}_{{B}_{i}},$$
(21)

where \({T}_{{B}_{i}}\) is the transpose operation on the ith qubit. Equations (21) and (18) together give TB as a linear combination of \({4}^{n}\) Pauli channels in total. Using this decomposition, we may apply VED (Algorithm 1 or Algorithm 2) to implement the multi-qubit PPT criterion deterministically or probabilistically.
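The multi-qubit decomposition implied by Eqs. (18) and (21) can be generated by multiplying the single-qubit coefficients. The sketch below is illustrative only (function names are hypothetical): it builds the coefficient table, applies the resulting signed mixture of Pauli channels, and lets one verify that it reproduces the transpose.

```python
import itertools
import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]]),
    "Z": np.diag([1, -1]).astype(complex),
}
# Single-qubit coefficients from Eq. (18).
T_COEFF = {"I": 0.5, "X": 0.5, "Y": -0.5, "Z": 0.5}

def transpose_decomposition(n):
    """Return {pauli_string: coefficient} for the n-qubit transpose map."""
    return {
        "".join(labels): float(np.prod([T_COEFF[ch] for ch in labels]))
        for labels in itertools.product("IXYZ", repeat=n)
    }

def pauli_matrix(label):
    """Tensor product of single-qubit Paulis named by the string label."""
    mat = np.eye(1, dtype=complex)
    for ch in label:
        mat = np.kron(mat, PAULIS[ch])
    return mat

def apply_decomposition(decomp, rho):
    """Evaluate sum_P r_P * P rho P^dagger."""
    out = np.zeros_like(rho, dtype=complex)
    for label, r in decomp.items():
        p = pauli_matrix(label)
        out += r * (p @ rho @ p.conj().T)
    return out

decomp = transpose_decomposition(2)
gamma = sum(abs(r) for r in decomp.values())     # sampling cost of Eq. (9)
rng = np.random.default_rng(2)
m = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
matches_transpose = np.allclose(apply_decomposition(decomp, m), m.T)
```

For n = 2 the table has 16 entries whose coefficients sum to 1, and the sampling cost computed this way equals 2^n = 4.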

Reduction criterion

In this section, we first review the reduction criterion55 and then propose a variational algorithm implementing this criterion within the VED framework. Let

$${{{{\mathcal{R}}}}}_{B\to B}({X}_{B}):= {{{\rm{Tr}}}}[{X}_{B}]{I}_{B}-{X}_{B},$$
(22)

which is known as the reduction map. The reduction criterion says that if a bipartite quantum state ρAB is separable, then it must hold that

$${\sigma }_{AB}:= \left({{{{\rm{id}}}}}_{A}\otimes {{{{\mathcal{R}}}}}_{B\to B}\right)({\rho }_{AB})\ge 0.$$
(23)

Equivalently, if σAB has negative eigenvalues, then ρAB is entangled. It is based on this observation that our variational algorithm works.

To apply the framework in the section “Quantum entanglement detection”, we have to first decompose \({{{{\mathcal{R}}}}}_{B\to B}\) into a linear combination of Pauli channels. Indeed, we can do so since

$${{{{\mathcal{R}}}}}_{B\to B}({\rho }_{B}):= {{{\rm{Tr}}}}[{\rho }_{B}]{I}_{B}-{\rho }_{B}$$
(24)
$$=\frac{1}{{2}^{n}}\mathop{\sum }\limits_{P\in {{{{\boldsymbol{P}}}}}_{n}}{{{\mathcal{P}}}}({\rho }_{B})-{\rho }_{B}$$
(25)
$$=\frac{1-{2}^{n}}{{2}^{n}}{\rho }_{B}+\frac{1}{{2}^{n}}\mathop{\sum }\limits_{P\ne {I}^{\otimes n}}{{{\mathcal{P}}}}({\rho }_{B}),$$
(26)

where the second equality follows from the twirling property of Pauli channels (ref. 62, Exercise 4.7.3). Using this decomposition, we can call Algorithm 1 or Algorithm 2 to implement the reduction criterion. In particular, in the qubit case where n = 1, the reduction map is of the form

$${{{{\mathcal{R}}}}}_{B\to B}(\rho )=\frac{-\rho +X\rho X+Y\rho Y+Z\rho Z}{2}.$$
(27)
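The Pauli decomposition of the reduction map in Eq. (26) can be checked numerically against the definition in Eq. (22). The following sketch uses hypothetical function names and a random two-qubit state; it also evaluates the sampling cost γ of Eq. (9) for this decomposition.

```python
import itertools
import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]]),
    "Z": np.diag([1, -1]).astype(complex),
}

def pauli_matrix(label):
    """Tensor product of single-qubit Paulis named by the string label."""
    mat = np.eye(1, dtype=complex)
    for ch in label:
        mat = np.kron(mat, PAULIS[ch])
    return mat

def reduction_decomposition(n):
    """Coefficients of Eq. (26): {pauli_string: r_P} for n qubits."""
    dim = 2 ** n
    decomp = {"".join(s): 1 / dim for s in itertools.product("IXYZ", repeat=n)}
    decomp["I" * n] = (1 - dim) / dim            # identity term
    return decomp

def apply_decomposition(decomp, rho):
    """Evaluate sum_P r_P * P rho P^dagger."""
    out = np.zeros_like(rho, dtype=complex)
    for label, r in decomp.items():
        p = pauli_matrix(label)
        out += r * (p @ rho @ p.conj().T)
    return out

def sampling_cost(decomp):
    """gamma = sum of |r_P|, Eq. (9)."""
    return sum(abs(r) for r in decomp.values())

rng = np.random.default_rng(3)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = g @ g.conj().T
rho /= np.trace(rho)                             # random two-qubit state
direct = np.trace(rho) * np.eye(4) - rho         # Eq. (22)
via_paulis = apply_decomposition(reduction_decomposition(2), rho)
```

For n = 2 the sampling cost evaluates to 4.5, which is close to 2^n.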

As one might expect, deterministic VED using the reduction criterion is not efficient in the many-qubit case since it has to compute exponentially many overlaps: for a 2n-qubit bipartite quantum state, one has to compute about \({4}^{n}\) overlaps. The probabilistic VED using the reduction criterion is slightly better than the deterministic one since the sampling cost satisfies \(\gamma \approx {2}^{n}\). The quasi-probability sampling method does not show notable advantages over the deterministic method for the positive maps used in our work. However, the probabilistic VED provides a different perspective on reducing the cost via quasi-probability sampling, and it may be useful for entanglement detection via other positive maps. In the section “VED based on reduction criterion without decomposition”, we also propose another entanglement detection method (cf. Algorithm 3) with lower measurement cost, which exploits the simple structure of the reduction map in Eq. (22), and we showcase the efficiency of Algorithm 3 compared to the regular VED using typical entangled states.

Enhanced reduction criterion

In this section, we consider an enhanced version of the reduction map56,57 for bipartite quantum states. This enhanced criterion is based on an elementary positive map which operates on state spaces with even dimension. It is known that the enhanced reduction criterion detects many bound entangled states (entangled states that nevertheless satisfy the PPT criterion). As before, we first review this enhanced reduction criterion and then show how to combine it with the VED framework proposed in the section “Quantum entanglement detection” to detect entanglement.

Define the following anti-symmetric unitary in an n-qubit Hilbert space:

$${U}_{a}={{{\rm{antidiag}}}}(1,-1,1,-1,\ldots \,,1,-1),$$
(28)

where antidiag means anti-diagonal. For example, when n = 2, the corresponding anti-symmetric unitary has the form

$${U}_{a}=\left[\begin{array}{cccc}0&0&0&1\\ 0&0&-1&0\\ 0&1&0&0\\ -1&0&0&0\end{array}\right]=X\otimes iY.$$
(29)

Indeed, one can check that the n-qubit Ua can be decomposed w.r.t. the Pauli set as \({U}_{a}=X\otimes \cdots \otimes X\otimes iY\), where there are n−1 X operators in the tensor product. Based on Ua, we define the following map56

$${{{{\mathcal{K}}}}}_{B\to B}({\rho }_{B}):= {{{{\mathcal{R}}}}}_{B\to B}({\rho }_{B})-{U}_{a}{T}_{B}(\rho ){U}_{a}^{{\dagger} },$$
(30)

where \({{{{\mathcal{R}}}}}_{B\to B}\) is the reduction map defined in Eq. (22) and TB is the transpose map on system B. This map has been shown to be positive but not completely positive56. Moreover, this map improves on the reduction criterion and can detect bound entangled states that cannot be detected by the PPT criterion. Substituting the Pauli decomposition in Eq. (26) of \({{{\mathcal{R}}}}\) and the Pauli decomposition of TB given by Eqs. (18) and (21) into Eq. (30) and regrouping the Pauli terms, we obtain a Pauli decomposition of \({{{\mathcal{K}}}}\) with a total of \({4}^{n}\) Pauli terms. Using this decomposition, we can use VED (Algorithm 1 or Algorithm 2) to implement the enhanced reduction criterion.
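The structure of Ua in Eq. (28) and the map in Eq. (30) can be examined numerically. The sketch below is illustrative, with hypothetical function names: it builds Ua both as an anti-diagonal matrix and as a tensor product of Pauli operators, and evaluates the enhanced reduction map directly on a density matrix rather than through its Pauli decomposition.

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)
IY = np.array([[0, 1], [-1, 0]], dtype=complex)   # the matrix i*Y

def antisymmetric_unitary(n):
    """U_a = antidiag(1, -1, ..., 1, -1) on n qubits, Eq. (28)."""
    signs = np.array([(-1) ** k for k in range(2 ** n)], dtype=complex)
    # fliplr moves the diagonal entry of row i to column 2^n - 1 - i.
    return np.fliplr(np.diag(signs))

def pauli_form(n):
    """X x ... x X x iY with n - 1 factors of X."""
    return reduce(np.kron, [X] * (n - 1) + [IY])

def enhanced_reduction(rho):
    """K(rho) = Tr[rho] I - rho - U_a rho^T U_a^dagger, Eq. (30)."""
    dim = rho.shape[0]
    n = dim.bit_length() - 1                      # dim = 2^n
    u = antisymmetric_unitary(n)
    reduction = np.trace(rho) * np.eye(dim) - rho
    return reduction - u @ rho.T @ u.conj().T

rng = np.random.default_rng(4)
v = rng.normal(size=4) + 1j * rng.normal(size=4)
v /= np.linalg.norm(v)
pure = np.outer(v, v.conj())                      # random two-qubit pure state
lam_min = float(np.linalg.eigvalsh(enhanced_reduction(pure))[0])
```

On a single system the output is positive semidefinite, as expected for a positive map; negative eigenvalues can only arise when the map acts on one half of an entangled state.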

Resource cost and barren plateaus

Note that another way to detect and quantify the entanglement of a state ρAB is to obtain its density matrix via quantum state tomography63. Full density matrix reconstruction of an unknown (nA + nB)-qubit state costs, in the worst case, exponentially many copies of the state64,65; e.g., \(\widetilde{{{\Omega }}}({4}^{{n}_{A}+{n}_{B}})\) measurement results are necessary to reconstruct a matrix close to ρ in terms of trace distance65. Using the learned density matrix, we can either numerically apply a positive map to it or compute the fidelity between ρAB and any entangled target state. However, such methods are resource-demanding compared to the VED framework.

To use the decomposed maps for entanglement detection on the state ρAB, we can apply them either to subsystem A or to subsystem B, which requires \({{{\rm{poly}}}}(D){4}^{\min \{{n}_{A},{n}_{B}\}}\) measurement results with circuit depth D in the optimization loops. Under the assumption that the PQC can be trained well, our method is considerably cheaper than state tomography. The barren plateau phenomenon might weaken this advantage in terms of measurement cost. However, for large-scale quantum systems, our method could still work as a proof of concept, whereas computing the minimum eigenvalue from a tomographically reconstructed density matrix is extremely difficult. In particular, tomography-based methods need vast memory to store and process the density matrix on a classical computer, which becomes prohibitively resource-demanding as the number of qubits increases. The VED framework, on the other hand, does not require such classical memory or post-processing. We elaborate on the barren plateau problem below.

As our current methods use a global cost function and a hardware-efficient ansatz, it is possible that VED exhibits a barren plateau, resulting in a gradient that is exponentially suppressed with respect to the problem dimension66. Under this circumstance, there exists a vast flat area on the loss landscape. This phenomenon is known as the barren plateau (BP) and is independent of the optimizer used, meaning that a gradient-free optimizer would not help in mitigating it67. Furthermore, noise and entanglement can also induce BP52,68.

To mitigate BP, one can adopt the following strategies: variable structure ansatzes69,70,71, layerwise learning72, meta-learning73, and parameter initialization and parameter correlation strategies74,75. Substantial evidence shows that a local cost function could help mitigate BP by keeping circuits trainable up to a shallow depth of \({{{\mathcal{O}}}}(\log (n))\)46. Moreover, we can suppress the hardware noise using various error mitigation techniques (see, e.g., refs. 41,42,76,77,78,79), further improving the optimization results. By exploiting the above methods and the continuing progress on barren plateaus and error mitigation, we believe that the effectiveness and practicability of our VED framework could be further improved.

VED based on reduction criterion without decomposition

In the section “Prominent positive maps” we have shown how VED uses the reduction criterion to detect entanglement: it decomposes the reduction map \({{{\mathcal{R}}}}\) into a linear combination of Pauli channels and then variationally estimates the minimal eigenvalue of the averaged output state.

Here we propose another variational entanglement detection algorithm for the reduction criterion, motivated by the simple structure of the reduction map. The intuition behind this protocol is as follows. We know that ρAB is entangled if \({{{{\mathcal{R}}}}}_{B\to B}({\rho }_{AB})\) is not positive semidefinite. Using the variational characterization of a Hermitian operator’s minimum eigenvalue [see Supplementary Note 1], this means that

$$\mathop{\min }\limits_{\left|{\psi }_{AB}\right\rangle }\left\langle \psi \right|{{{{\mathcal{R}}}}}_{B\to B}({\rho }_{AB})\left|\psi \right\rangle$$
(31)
$$=\mathop{\min }\limits_{\left|{\psi }_{AB}\right\rangle }\left\langle \psi \right|({I}_{A}\otimes {\rho }_{B}-{\rho }_{AB})\left|\psi \right\rangle$$
(32)
$$=\mathop{\min }\limits_{\left|{\psi }_{AB}\right\rangle }\left\{{{{\rm{Tr}}}}[{\psi }_{B}{\rho }_{B}]-{{{\rm{Tr}}}}[{\psi }_{AB}{\rho }_{AB}]\right\} \,< \,0,$$
(33)

where the minimization ranges over all pure bipartite quantum states \(\left|{\psi }_{AB}\right\rangle\) in system AB, \({\psi }_{AB}\equiv \left|\psi \right\rangle \,{\left\langle \psi \right|}_{AB}\) and \({\psi }_{B}:= {{{{\rm{Tr}}}}}_{A}{\psi }_{AB}\). From Equation (33), one can see that it suffices to compute the difference of two overlaps and then variationally estimate the minimal eigenvalue. The crucial point is that the number of overlaps is independent of the dimension of the n-qubit system B. This detection method could save a large amount of computing resources when n becomes large. The improved VED based on the reduction criterion is summarized in Algorithm 3.

Algorithm 3

Improved VED based on reduction criterion

1: Input: 2n-qubit quantum state ρAB, parameterized quantum circuit U(α) with initial parameters α, and tolerance δ;

2: Apply U(α) to \({\left|00\right\rangle }_{AB}\) on system AB and obtain the test state \({\left|\psi \right\rangle }_{AB}=U({{{\boldsymbol{\alpha }}}}){\left|00\right\rangle }_{AB}\);

3: Compute the overlap between state ψB and ρB on subsystem B using the Swap Test and obtain \({c}_{1}={{{\rm{Tr}}}}[{\psi }_{B}{\rho }_{B}]\);

4: Apply U(α) to \({\left|00\right\rangle }_{AB}\) on system AB and obtain the test state \({\left|\psi \right\rangle }_{AB}=U({{{\boldsymbol{\alpha }}}}){\left|00\right\rangle }_{AB}\);

5: Compute the overlap between state ψAB and ρAB using the Swap Test and obtain \({c}_{2}={{{\rm{Tr}}}}[{\psi }_{AB}{\rho }_{AB}]\);

6: Compute the loss function L(α) = c1 − c2;

7: Perform optimization methods to minimize L(α); terminate the optimization if the error tolerance is satisfied: L(α) < −δ.

8: Output “Entangled” if the optimized L(α) < −δ.
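The loss in Algorithm 3 can be checked classically on a small example. The following is a minimal NumPy sketch, not the implementation used in our simulations: it evaluates the two overlaps exactly instead of estimating them with the Swap Test, it follows Eq. (32) as written, and the helper names `partial_trace_A` and `reduction_loss` are hypothetical. The two-qubit maximally entangled state is used as the input.

```python
import numpy as np

def partial_trace_A(rho_ab, dA, dB):
    """Trace out subsystem A of a (dA*dB) x (dA*dB) matrix, leaving B."""
    return np.einsum('ijik->jk', rho_ab.reshape(dA, dB, dA, dB))

def reduction_loss(psi, rho_ab, dA, dB):
    """L = Tr[psi_B rho_B] - Tr[psi_AB rho_AB] for a pure test state |psi>."""
    psi_ab = np.outer(psi, psi.conj())
    psi_b = partial_trace_A(psi_ab, dA, dB)
    rho_b = partial_trace_A(rho_ab, dA, dB)
    c1 = np.trace(psi_b @ rho_b).real           # overlap on subsystem B
    c2 = np.real(psi.conj() @ rho_ab @ psi)     # overlap on the full system AB
    return c1 - c2

# Input state: two-qubit maximally entangled state (|00> + |11>)/sqrt(2).
dA = dB = 2
phi = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho = np.outer(phi, phi.conj())

# Reference: smallest eigenvalue of I_A (x) rho_B - rho_AB, as in Eq. (32).
reduced = np.kron(np.eye(dA), partial_trace_A(rho, dA, dB)) - rho
evals, evecs = np.linalg.eigh(reduced)
lam_min = evals[0]

# The loss evaluated at the minimizing eigenvector reproduces lam_min.
loss_at_opt = reduction_loss(evecs[:, 0], rho, dA, dB)
print(lam_min, loss_at_opt)
```

For this input the smallest eigenvalue is negative, so any optimizer that drives L(α) below −δ would report “Entangled”. On hardware, c1 and c2 would carry shot noise that this sketch ignores.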

To showcase the advantage of the improved VED, we use the reduction map’s simple structure exploited by this algorithm to estimate \(\left\langle 0\right|{{{{\mathcal{R}}}}}_{B\to B}({\rho }_{AB})\left|0\right\rangle\) for \({\rho }_{AB}=\left|W\right\rangle \,\left\langle W\right|\) being the four-qubit generalized W state, where

$$\left|W\right\rangle =\frac{1}{2}(\left|1000\right\rangle +\left|0100\right\rangle +\left|0010\right\rangle +\left|0001\right\rangle ),$$
(34)

and both local systems have two qubits. The simulation results are summarized as box plots and compared to the estimation obtained by the Pauli channel decomposition of the reduction map in Fig. 3. When taking the same number of measurement shots, the method used by Algorithm 3 gives estimation results more concentrated in a range close to the ideal value than the Pauli channel decomposition method adopted by the standard VED. Thus, the improved VED can achieve the desired accuracy using fewer measurement shots, which means that fewer copies of the input quantum state are required.
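As a quick consistency check, the estimated quantity can be worked out by hand. The calculation below follows Eq. (32) as written and reads \(\left|0\right\rangle\) as \(\left|0000\right\rangle\), which is our assumption about the notation; \(\left|{{\Psi }}\right\rangle := (\left|10\right\rangle +\left|01\right\rangle )/\sqrt{2}\) is introduced here only for this check. Writing \(\left|W\right\rangle =({\left|{{\Psi }}\right\rangle }_{A}{\left|00\right\rangle }_{B}+{\left|00\right\rangle }_{A}{\left|{{\Psi }}\right\rangle }_{B})/\sqrt{2}\), we obtain

$${\rho }_{B}={{{{\rm{Tr}}}}}_{A}\left|W\right\rangle \,\left\langle W\right|=\frac{1}{2}\left|00\right\rangle \,\left\langle 00\right|+\frac{1}{2}\left|{{\Psi }}\right\rangle \,\left\langle {{\Psi }}\right|,$$

$$\left\langle 0000\right|({I}_{A}\otimes {\rho }_{B}-{\rho }_{AB})\left|0000\right\rangle =\left\langle 00\right|{\rho }_{B}\left|00\right\rangle -| \left\langle 0000| W\right\rangle {| }^{2}=\frac{1}{2}-0=\frac{1}{2}.$$

Under these readings, both estimation methods target the value 1/2, and only the two overlaps on the right-hand side need to be measured in the improved scheme.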

Fig. 3: Comparison between VED and improved VED on estimate accuracy.

The overlap \(\left\langle 0\right|{{{{\mathcal{R}}}}}_{B\to B}({\rho }_{AB})\left|0\right\rangle\) is estimated using the reduction map's Pauli decomposition (Equation (26) used in VED) and its simple structure (Eq. (33) in the improved VED), where ρAB is the four-qubit generalized W state. The methods are simulated under five different numbers of measurement shots, and for each given number of shots, they are, respectively, repeated 50 times. The distributions of both methods' estimations are summarized in box plots, where each box extends from the first quartile to the third quartile of the corresponding data, with a line at the median. The whiskers extend to the most extreme data points within 1.5 times the interquartile range. The red horizontal line gives the theoretical value for reference. The improved VED (cf. Algorithm 3) achieves a better performance.

We remark that this idea can also be adopted to improve the efficiency of VED using the enhanced reduction criterion.

Quantum entanglement quantification

One of the most well-known entanglement measures is the logarithmic negativity80,81, which has various applications in quantum information theory. For a bipartite state ρAB, its logarithmic negativity is defined as

$${E}_{{{{\rm{N}}}}}({\rho }_{AB}):= \log \parallel {\rho }_{AB}^{{T}_{B}}{\parallel }_{1}.$$
(35)

Based on the recently developed near-term quantum algorithm for trace distance estimation6 and the fact that EN is defined via the transpose map TB, we introduce a variational quantum algorithm to estimate EN using an ancillary qubit system R. According to ref. 6, Corollary 3, it holds that

$$\parallel {\rho }_{AB}^{{T}_{B}}{\parallel }_{1}=2\mathop{\max }\limits_{U}{{{\rm{Tr}}}}\left|0\right\rangle \,{\left\langle 0\right|}_{R}{Q}_{R}-{{{\rm{Tr}}}}{\rho }_{AB}^{{T}_{B}}$$
(36)
$$=2\mathop{\max }\limits_{U}{{{\rm{Tr}}}}\left|0\right\rangle \,{\left\langle 0\right|}_{R}{Q}_{R}-1,$$
(37)

where \({Q}_{R}={{{{\rm{Tr}}}}}_{AB}{Q}_{ABR}\), \({Q}_{ABR}=U({\rho }_{AB}^{{T}_{B}}\otimes \left|0\right\rangle \,{\left\langle 0\right|}_{R}){U}^{{\dagger} }\), and the maximization ranges over all unitaries on the composite system ABR. Note that the second equality follows from the fact that TB is trace-preserving. Following the idea of VED, we may decompose the transpose map TB appearing in the operator QABR (correspondingly, QR) into a linear combination of Pauli terms via Eqs. (20) and (21), compute the overlaps in Eq. (37) one by one, and then variationally estimate the maximal value. For illustrative purposes, we give Algorithm 4, the variational logarithmic negativity estimation (VLNE), as an example of estimating the logarithmic negativity of a two-qubit quantum state ρAB. However, we emphasize that the method outlined in Algorithm 4 can be easily generalized to quantify multi-qubit bipartite entanglement, as the transpose operation satisfies the tensor product property in Eq. (21). Moreover, Algorithm 4 can be modified to use the sampling technique to estimate the average state, following the idea illustrated in Algorithm 2.

Algorithm 4

Variational logarithmic negativity estimation

1: Input: a 2-qubit quantum state ρAB and parameterized circuit UABR(α) with initial parameters α;

2: Apply UABR(α), respectively, to

$${\rho }_{AB}\otimes \left|0\right\rangle \,{\left\langle 0\right|}_{R},$$
(38)
$$({I}_{A}\otimes {X}_{B}){\rho }_{AB}({I}_{A}\otimes {X}_{B})\otimes \left|0\right\rangle \,{\left\langle 0\right|}_{R},$$
(39)
$$({I}_{A}\otimes {Y}_{B}){\rho }_{AB}({I}_{A}\otimes {Y}_{B})\otimes \left|0\right\rangle \,{\left\langle 0\right|}_{R},$$
(40)
$$({I}_{A}\otimes {Z}_{B}){\rho }_{AB}({I}_{A}\otimes {Z}_{B})\otimes \left|0\right\rangle \,{\left\langle 0\right|}_{R},$$
(41)

and obtain the states σ(0), σ(1), σ(2), σ(3), respectively.

3: Obtain \({o}_{j}={{{\rm{Tr}}}}[{\sigma }_{R}^{(j)}\left|0\right\rangle \,{\left\langle 0\right|}_{R}]\) for j = 0, 1, 2, 3 by measurements on system R.

4: Compute the loss function \({{{{\mathcal{L}}}}}_{1}:= -({o}_{0}+{o}_{1}-{o}_{2}+{o}_{3})/2\).

5: Perform optimization methods to minimize \({{{{\mathcal{L}}}}}_{1}({{{\boldsymbol{\alpha }}}})\);

6: Compute \(\beta =2| {{{{\mathcal{L}}}}}_{1}| -1\) as the estimated trace norm of \({\rho }_{AB}^{{T}_{B}}\);

7: Output \(\log \beta\) as the estimated logarithmic negativity.
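The signs in the loss of step 4 come from the Pauli decomposition of the single-qubit transpose. The NumPy sketch below is a classical check under stated assumptions rather than the variational procedure itself: it replaces the optimization over UABR(α) by an exact eigendecomposition, it takes the logarithm to base 2 (the base is our assumption), and the function names are hypothetical. It uses the fact, which follows from Eq. (36), that the maximum equals the sum of the positive eigenvalues of the partially transposed state.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1, -1]).astype(complex)

def partial_transpose_B_pauli(rho):
    """rho^{T_B} of a two-qubit state via (rho + X rho X - Y rho Y + Z rho Z)/2 on B."""
    terms = []
    for sign, P in [(1, I2), (1, X), (-1, Y), (1, Z)]:
        K = np.kron(I2, P)                      # Pauli acts on subsystem B only
        terms.append(sign * K @ rho @ K.conj().T)
    return sum(terms) / 2

def partial_transpose_B_direct(rho):
    """Reference implementation: swap the two B indices of the reshaped matrix."""
    return rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)

def log_negativity(rho):
    """log2 of the trace norm of rho^{T_B}, computed as beta = 2 * max - 1."""
    evals = np.linalg.eigvalsh(partial_transpose_B_pauli(rho))
    max_value = evals[evals > 0].sum()          # ideal value of the maximization
    beta = 2 * max_value - 1
    return np.log2(beta)

phi = np.array([1, 0, 0, 1]) / np.sqrt(2)
bell = np.outer(phi, phi.conj())
print(log_negativity(bell))
```

In Algorithm 4 the quantity `max_value` is what the optimized overlaps o0 + o1 − o2 + o3, divided by 2, approximate; the sketch only confirms that the decomposition and the final formula are mutually consistent.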

One may also evaluate the entanglement measures82,83,84 based on the sandwiched Rényi relative entropy85,86 of order 1/2, making use of the recently proposed variational quantum algorithm estimating the fidelity between two quantum states6.

Experiments on IBMQ

In this section, we discuss how to apply the VED framework to detect the two-qubit maximally entangled state \(\left|{{\Phi }}\right\rangle := (\left|00\right\rangle +\left|11\right\rangle )/\sqrt{2}\) on IBM-Q superconducting quantum hardware accessible to the public. The specific quantum device used is ibmq-santiago (5 qubits) with a quantum volume of 32. The positive map adopted here for detection is the qubit reduction map \({{{{\mathcal{R}}}}}_{B\to B}\) defined in Eq. (22). After implementing the reduction map decomposed into 4 Pauli terms as in Eq. (27), we use a parameterized quantum circuit U(α) to prepare four identical test states \({\psi }_{AB}({{{\boldsymbol{\alpha }}}})=U({{{\boldsymbol{\alpha }}}})\left|00\right\rangle \,{\left\langle 00\right|}_{AB}{U}^{{\dagger} }({{{\boldsymbol{\alpha }}}})\) and compute the loss function defined in Eq. (8). The PQC used is depicted in Fig. 4 with three randomly initialized parameters α = (α1, α2, α3). During the optimization procedure, we apply the gradient descent algorithm to guide the learning process, where the analytical gradient is calculated via the following parameter-shift rule87:

$$\frac{\partial {{{\mathcal{L}}}}({{{\boldsymbol{\alpha }}}})}{\partial {\alpha }_{j}}:= \frac{1}{2}\left[{{{\mathcal{L}}}}\left({\alpha }_{j}+\frac{\pi }{2}\right)-{{{\mathcal{L}}}}\left({\alpha }_{j}-\frac{\pi }{2}\right)\right].$$
(42)
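The rule in Eq. (42) can be illustrated on a toy loss. The sketch below is only an illustration under our own assumptions: the loss is the expectation of Z on Ry(α)|0⟩, which equals cos α, rather than the VED loss of Eq. (8), and the function names are hypothetical. The learning rate 0.5 matches the value quoted for the experiments.

```python
import numpy as np

def loss(alpha):
    """Toy loss: <Z> measured on Ry(alpha)|0>, which equals cos(alpha)."""
    state = np.array([np.cos(alpha / 2), np.sin(alpha / 2)])
    Z = np.diag([1.0, -1.0])
    return float(state @ Z @ state)

def parameter_shift_grad(f, alpha):
    """Gradient from two shifted loss evaluations, as in Eq. (42)."""
    return 0.5 * (f(alpha + np.pi / 2) - f(alpha - np.pi / 2))

# The shifted evaluations reproduce the analytic derivative -sin(alpha).
print(parameter_shift_grad(loss, 0.7), -np.sin(0.7))

# Plain gradient descent with learning rate 0.5.
alpha = 0.7
for _ in range(50):
    alpha -= 0.5 * parameter_shift_grad(loss, alpha)
print(alpha, loss(alpha))
```

On hardware each call to the loss is itself a finite-shot estimate, so the two shifted evaluations yield a noisy gradient.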
Fig. 4: Circuit for experiments on IBMQ.

This parameterized two-qubit quantum circuit U(α) is used for preparing the test state ψAB(α) on the ibmq-santiago hardware. The parameters α are randomly initialized as (α1, α2, α3) = (3.2292, 4.8579, 5.4691).

Due to the finite sampling restriction for measurements, the optimization procedure essentially falls into the regime of Stochastic Gradient Descent (SGD)47. The optimized loss value converges to \({{{{\mathcal{L}}}}}_{\min }\approx -0.43\). The gap between the experimental data and the simulation result \({\lambda }_{\min }=-0.5\) is due to various hardware noise sources on the ibmq-santiago processor. One can further adopt error mitigation methods10 to improve the result. This result supports the validity of our VED framework. Note that if we adopt the termination setup in Algorithm 1, far fewer optimization iterations (4–5 rounds are sufficient) are required to obtain the detection result. As mentioned in ref. 88, the communication bottleneck between the IBM-Q hardware and the classical optimizer prevents us from efficiently conducting experiments without a dedicated reservation. This leads to a 9-minute waiting time on average for each circuit evaluation from the IBM-Q cloud service. We also implement the probabilistic detection Algorithm 2 for comparison. In the experiment, we set the backend to the Aer simulator, which imitates the behavior of ibmq-santiago, and choose ε = 0.01 and δ = 0.1. The experimental results reveal that the probabilistic VED achieves the same detection precision level as the deterministic VED. For comparison, we conduct numerical simulations on the Baidu Quantum Leaf platform89 and obtain similar results. We summarize the experimental and numerical results in Fig. 5.

Fig. 5: Estimated minimum eigenvalue \({\lambda }_{\min }\) by VED using the reduction criterion on the Bell state \(\left|{{\Phi }}\right\rangle\).

The red square curve records the results of Algorithm 1 on ibmq-santiago with shots = 8192 for each circuit evaluation. The cyan diamond curve records the results of Algorithm 2 by choosing δ = 0.1 and ε = 0.01. The blue circle curve records the numerical results on the Baidu Quantum Leaf platform89. Learning rate in the gradient descent algorithm is set to be LR = 0.5.

Numerical simulations for entanglement detection

In this section, we carry out numerical simulations that apply VED to a variety of bipartite quantum states of interest, in order to investigate the performance of VED and of the entanglement quantification algorithm it motivates. All simulations, including optimization loops, are conducted using the Paddle Quantum90 toolkit on the PaddlePaddle Deep Learning Platform91.

Isotropic states

The n-qubit isotropic state family (n is even and each local system has n/2 qubits) is defined as (ref. 92, Eq. (32))

$${\rho }_{{{{\boldsymbol{A}}}}{{{\boldsymbol{B}}}}}^{\,{{\mbox{iso}}}\,}(p):= p{{{\Phi }}}_{{{{\boldsymbol{AB}}}}}+(1-p)\frac{{I}_{{{{\boldsymbol{AB}}}}}}{{2}^{n}},$$
(43)

where \(p\in [0,1]\) is a parameter, ΦAB is the n-qubit maximally entangled state, and IAB is the identity operator on AB, in which A ≡ A1⋯An/2 and B ≡ B1⋯Bn/2. The qubit systems \({\{{A}_{i}\}}_{i}\) are held by Alice, while the qubit systems \({\{{B}_{i}\}}_{i}\) are held by Bob. Intuitively, the isotropic state is a convex combination of the maximally entangled state ΦAB and the maximally mixed state \({I}_{{{{\boldsymbol{AB}}}}}/{2}^{n}\). It has been shown that \({\rho }_{{{{\boldsymbol{AB}}}}}^{\,{{\mbox{iso}}}\,}(p)\) is separable (w.r.t. the A: B cut) if and only if \(p\le 1/({2}^{n/2}+1)\)92.
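The separability threshold can be reproduced numerically for the PPT criterion. The following NumPy sketch builds the isotropic state of Eq. (43) and computes the smallest eigenvalue of its partial transpose by exact diagonalization; it is a classical reference calculation, not Algorithm 1, and the function names are hypothetical. Subsystem A is assumed to occupy the first n/2 qubits in the tensor ordering.

```python
import numpy as np

def isotropic_state(n, p):
    """n-qubit isotropic state p*Phi + (1-p)*I/2^n, with local dimension d = 2^(n/2)."""
    d = 2 ** (n // 2)
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)   # sum_i |i>_A |i>_B / sqrt(d)
    Phi = np.outer(phi, phi)
    return p * Phi + (1 - p) * np.eye(d * d) / d**2

def min_eig_partial_transpose(rho, d):
    """Smallest eigenvalue of rho^{T_B} for a d x d bipartite state."""
    rho_tb = rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
    return np.linalg.eigvalsh(rho_tb)[0]

n, d = 4, 4
threshold = 1 / (2 ** (n // 2) + 1)               # 1/5 for four qubits
for p in (0.1, threshold, 0.3):
    print(p, min_eig_partial_transpose(isotropic_state(n, p), d))
```

The smallest eigenvalue changes sign exactly at the threshold, which is the behavior the minimized loss is expected to trace for the PPT map.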

We numerically carry out Algorithm 1 together with the three prominent positive maps—the PPT criterion, the reduction criterion, and the enhanced reduction criterion—introduced in the section “Prominent positive maps”, using four-qubit isotropic states as inputs. The minimized loss values of these three maps obtained by our simulations on the isotropic states are represented by different markers in Fig. 6. As can be seen from this figure, VED can successfully identify the range of p for which the corresponding isotropic state can be detected by each positive map. The markers representing results from simulations fall on the lines that give the minimums of the loss function L(α), verifying the validity and viability of our VED framework. Note that for detecting entanglement in four-qubit isotropic states, all three maps are both necessary and sufficient. However, this phenomenon is not universal for all four-qubit states, as we shall see in the experiment using Breuer states.

Fig. 6: Numerical results on the four-qubit isotropic states defined in Eq. (43).

Each line depicts the smallest eigenvalue of every isotropic state with parameter p ∈ [0, 1] under the corresponding map. This line of the smallest eigenvalues is a lower bound of the loss function L(α). Each marker depicts the minimized loss value obtained by simulations (sim) of Algorithm 1 on a chosen isotropic state, aligning with the theoretical (the) line.

To explore whether the barren plateau phenomenon arises in our proposed VED algorithms, we carry out extensive numerical simulations on isotropic states, varying the total number of qubits from 2 to 10 and fixing the noise parameter p to 0.7. In Table 1, we compare the average loss achieved by VED using the reduction criterion to the theoretical minimum, assuming that the number of shots does not scale with the number of qubits when estimating each overlap induced by the decomposition. As the number of qubits increases, no large discrepancy appears between the value estimated by VED and the theoretical value. The scaling behavior of the optimized loss suggests that the VED algorithm is resilient to the barren plateau phenomenon when detecting bipartite states with a moderate number of qubits.

Table 1 Estimated \({\lambda }_{\min }\) by VED using the reduction criterion on isotropic states with up to 10 qubits and the noise parameter p = 0.7.

Breuer states

As we have mentioned in the section “Prominent positive maps”, there are states that can be detected by the enhanced reduction criterion yet cannot be detected by the PPT criterion. In this section, we use the proposed VED framework to numerically consolidate this statement. The four-qubit Breuer state family is defined as (ref. 56, Eq. (7))

$${\rho }_{{{{\boldsymbol{A}}}}{{{\boldsymbol{B}}}}}^{\,{{\rm{Breuer}}}\,}(\lambda ):= \left(\begin{array}{cccc}\frac{1-\lambda }{3}&0&0&0\\ 0&\frac{1+2\lambda }{6}&\frac{1-4\lambda }{6}&0\\ 0&\frac{1-4\lambda }{6}&\frac{1+2\lambda }{6}&0\\ 0&0&0&\frac{1-\lambda }{3}\end{array}\right),$$
(44)

where λ ∈ [0, 1] is a parameter, A ≡ A1A2, and B ≡ B1B2. The qubit systems A1 and A2 are held by Alice while the qubit systems B1 and B2 are held by Bob. It has been shown that ρBreuer is separable (w.r.t. the A: B cut) if and only if λ = 0, and that its entanglement for λ > 0 can be detected by the enhanced reduction criterion56. On the other hand, it has positive partial transpose if and only if λ ≤ 1/6 (ref. 56), witnessing the power of the enhanced reduction criterion.

Following the same procedure as for the isotropic states, we carry out Algorithm 1 on the three criteria using four-qubit Breuer states as inputs. The minimized loss values obtained by our simulations on selected Breuer states are represented in Fig. 7 by markers, which again align with the theoretical lines. From the numerical results, we can see that while the enhanced reduction criterion is still necessary and sufficient for entanglement detection in the four-qubit Breuer states, neither the reduction criterion nor the PPT criterion can detect all entangled states in the Breuer state family, attesting to the advantage of the enhanced reduction criterion in this case.

Fig. 7: Numerical results on the four-qubit Breuer states defined in Eq. (44).

Each line depicts the smallest eigenvalue of every Breuer state with parameter λ ∈ [0, 1] under the corresponding map. This line of the smallest eigenvalues is a lower bound of the loss function L(α). Each marker depicts the minimized loss value obtained by simulations (sim) of Algorithm 1 on a chosen Breuer state, aligning with the theoretical (the) line.

Numerical simulations for logarithmic negativity estimation

For simulations of variational entanglement quantification with logarithmic negativity, we adopt the hardware-efficient ansatz used for trace distance estimation in ref. 6, where the circuit depth is 4. The simulations are carried out on two-qubit isotropic states, which are defined as

$${\rho }_{AB}^{\,{{\rm{iso}}}\,}(p):= p{{{\Phi }}}_{AB}+(1-p)\frac{{I}_{AB}}{4},$$
(45)

where ΦAB is the two-qubit maximally entangled state. As shown in Fig. 8, the logarithmic negativity of a two-qubit isotropic state is positive if and only if its parameter p > 1/3, which matches the range of p where the corresponding isotropic states are entangled. The estimated logarithmic negativities by our method, which are represented by markers in Fig. 8, agree with the precisely calculated values given by the blue line.
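The threshold p > 1/3 can be verified in closed form. The partial transpose of the two-qubit maximally entangled state is half the swap operator, with eigenvalue +1/2 (threefold) and −1/2 (once), so the partial transpose of the state in Eq. (45) has eigenvalue (1 + p)/4 (threefold) and (1 − 3p)/4 (once). Hence

$$\parallel {({\rho }_{AB}^{\,{{\rm{iso}}}\,}(p))}^{{T}_{B}}{\parallel }_{1}=\frac{3(1+p)}{4}+\frac{| 1-3p| }{4}=\left\{\begin{array}{ll}1,&p\le 1/3,\\ \frac{1+3p}{2},&p \,> \,1/3,\end{array}\right.$$

$${E}_{{{{\rm{N}}}}}({\rho }_{AB}^{\,{{\rm{iso}}}\,}(p))=\log \max \left\{1,\frac{1+3p}{2}\right\}.$$

This expression vanishes for p ≤ 1/3 and increases monotonically to log 2 at p = 1, whatever base is chosen for the logarithm.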

Fig. 8: Numerical results on the two-qubit isotropic states.

The blue line represents the precisely calculated logarithmic negativity of isotropic states with parameter p ∈ [0, 1]. The yellow markers depict the estimated logarithmic negativities by simulations of Algorithm 4 on selected isotropic states.

Discussion

In this paper, we combined two techniques that find crucial applications on NISQ devices, variational quantum algorithms and the quasi-probability decomposition method, to propose the variational entanglement detection (VED) and variational logarithmic negativity estimation (VLNE) frameworks, contributing feasible solutions to detect and quantify entanglement on near-term devices. VED is built upon the positive map criterion and works as follows. First, it decomposes a chosen positive map into a linear combination of NISQ-implementable quantum operations. Then, it variationally estimates the minimal eigenvalue of the output state of the positive map acting on the target bipartite state. Two methods are proposed to generate the output state: the first averages the output states according to the quasi-probability distribution; the second estimates the average via the sampling technique. Finally, it asserts that the target state is entangled if the optimized minimal eigenvalue is negative, as guaranteed by the positive map criterion. We elaborated on three well-known positive maps to illustrate how the VED framework is applied. Following the idea of VED, VLNE variationally computes the log-negativity entanglement measure, relying on a linear decomposition of the transpose map into Pauli terms and the recently proposed trace distance estimation algorithm. Experimental and numerical results on various bipartite states of interest have validated the proposed entanglement detection and quantification methods.

We expect that the VED framework can be upgraded to detect more entangled states. A crucial step towards this aim is to explore what kinds of positive maps can be decomposed into a linear combination of Pauli channels. In the section “Quantum entanglement quantification” we showed by example how to variationally compute the log-negativity entanglement measure. It would be meaningful to design quantum algorithms to estimate other distance-based entanglement measures (see, e.g., refs. 93,94,95).

Methods

Entanglement detection via positive maps

Let ρAB be a bipartite quantum state in the composite system AB. By definition ρAB is separable if it can be decomposed into a convex combination of tensor products of states describing local systems as96

$${\rho }_{AB}=\mathop{\sum}\limits_{x}{p}_{x}\left|{\psi }_{x}\right\rangle \,{\left\langle {\psi }_{x}\right|}_{A}\otimes \left|{\phi }_{x}\right\rangle \,{\left\langle {\phi }_{x}\right|}_{B},$$
(46)

where px ≥ 0, ∑xpx = 1, and \({\{\left|{\psi }_{x}\right\rangle \}}_{x}\) and \({\{\left|{\phi }_{x}\right\rangle \}}_{x}\) are two sets of pure states in systems A and B, respectively. Otherwise, ρAB is entangled. Given the definition, it is natural to ask whether a given unknown bipartite quantum state is separable or entangled, known as the separability problem. This problem has been shown to be NP-hard97,98. There are many separability criteria that have been proposed to determine the separability or entanglement of bipartite quantum states as necessary conditions12,23.

One of the most celebrated criteria for distinguishing separable states from entangled states is the positive map criterion. The core of the positive map criterion is that one subjects a subsystem of ρAB to a positive (but not completely positive) map \({{{{\mathcal{N}}}}}_{B\to B}\) that preserves the positivity of inputs. If ρAB is a product state, i.e., it is of the form ρA ⊗ ρB, the resulting operator \({\rho }_{A}\otimes {{{\mathcal{N}}}}({\rho }_{B})\) is still positive. Consequently, due to linearity, an arbitrary separable state is mapped to some positive operator by this map. On the other hand, if ρAB is entangled, the output operator \({{{{\mathcal{N}}}}}_{B\to B}({\rho }_{AB})\) may no longer be positive; the transpose map is a prominent example54. That is to say, a negative eigenvalue in the spectrum of the output operator indicates entanglement of the input state. Mathematically, the positive map criterion states that a bipartite quantum state ρAB is separable if and only if for arbitrary system C and arbitrary positive (but not completely positive) map \({{{{\mathcal{N}}}}}_{B\to C}\), it holds that \({{{{\mathcal{N}}}}}_{B\to C}({\rho }_{AB})\ge 0\)25.

Despite its proven efficiency in entanglement detection, the positive map criterion is not directly applicable in practice, especially on recent NISQ devices. This is an immediate consequence of the fact that generically positive but not completely positive maps do not represent physically implementable quantum operations99 and thus cannot be realized in near-term quantum devices. In this work, we showed how to overcome this obstacle and employ the positive map criterion to detect entanglement on NISQ devices.

Ansatz design

For entanglement detection, we adopt the circuit ansatz shown in Fig. 9 to prepare the test state \({\left|\psi \right\rangle }_{AB}\). It consists of parameterized single-qubit gates U3(θ, ϕ, φ) = Rz(ϕ)Ry(θ)Rz(φ) and circular layers of CNOT gates. Note that this ansatz can be easily generalized to the multi-qubit case.

Fig. 9: Three-qubit parameterized ansatz U(α) used for VED.

The quantum circuit within the dotted block is repeated twice.
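For concreteness, an ansatz of this type can be assembled as a dense unitary in NumPy. The sketch below is a hypothetical reading of the circuit, since the exact gate placement inside the repeated block is only given in the figure: each block is assumed to apply one U3 gate per qubit followed by CNOTs from qubit q to qubit (q + 1) mod n, with qubit 0 as the most significant bit. All function names are ours.

```python
import numpy as np
from functools import reduce

def rz(t):
    return np.diag([np.exp(-1j * t / 2), np.exp(1j * t / 2)])

def ry(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def u3(theta, phi, varphi):
    """U3(theta, phi, varphi) = Rz(phi) Ry(theta) Rz(varphi)."""
    return rz(phi) @ ry(theta) @ rz(varphi)

def cnot(n, control, target):
    """Dense n-qubit CNOT as a permutation matrix (qubit 0 is the leftmost bit)."""
    dim = 2 ** n
    U = np.zeros((dim, dim))
    for basis in range(dim):
        bits = [(basis >> (n - 1 - q)) & 1 for q in range(n)]
        if bits[control]:
            bits[target] ^= 1
        out = sum(b << (n - 1 - q) for q, b in enumerate(bits))
        U[out, basis] = 1
    return U

def ansatz(params, n):
    """params has shape (blocks, n, 3); each block is a U3 layer plus a CNOT ring."""
    U = np.eye(2 ** n, dtype=complex)
    for layer in params:
        rot = reduce(np.kron, [u3(*angles) for angles in layer])
        ent = np.eye(2 ** n)
        for q in range(n):
            ent = cnot(n, q, (q + 1) % n) @ ent   # assumed circular CNOT order
        U = ent @ rot @ U
    return U

rng = np.random.default_rng(1)
params = rng.uniform(0, 2 * np.pi, size=(2, 3, 3))   # block repeated twice, 3 qubits
psi = ansatz(params, 3)[:, 0]                          # test state U(alpha)|000>
print(np.linalg.norm(psi))
```

Changing the first dimension of `params` changes the number of repeated blocks, and changing `n` gives the multi-qubit generalization under the same assumed layout.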