Unbounded number of channel uses may be required to detect quantum capacity

Transmitting data reliably over noisy communication channels is one of the most important applications of information theory, and is well understood for channels modelled by classical physics. However, when quantum effects are involved, we do not know how to compute channel capacities. This is because the formula for the quantum capacity involves maximizing the coherent information over an unbounded number of channel uses. In fact, entanglement across channel uses can even increase the coherent information from zero to non-zero. Here we study the number of channel uses necessary to detect positive coherent information. In all previously known examples, two channel uses already sufficed. It might be that a finite number of channel uses is always sufficient. We show that this is not the case: for any number of uses, there are channels for which the coherent information is zero, but which nonetheless have capacity. The transmission of quantum information through channels is a fundamental step for future quantum communication technologies. Cubitt et al. now show that there exist channels whose potential for transmitting quantum information requires an unbounded number of uses to be detected.

Transmitting data reliably over noisy communication channels is one of the most important applications of information theory, and well understood when the channel is accurately modelled by classical physics. However, when quantum effects are involved, we do not know how to compute channel capacities. The capacity to transmit quantum information is essential to quantum cryptography and computing, but the formula involves maximising the coherent information over arbitrarily many channel uses [1][2][3]. This is because entanglement across channel uses can increase the coherent information [7], even from zero to non-zero [8]! However, in all known examples, at least to detect whether the capacity is non-zero, two channel uses already suffice [8,24]. Maybe a finite number of channel uses is always sufficient? Here, we show this is emphatically not the case: for any n, there are channels for which the coherent information is zero for n uses, but which nonetheless have capacity. This may be a first indication that the quantum capacity is uncomputable.
In the classical case, not only can we exactly characterise the maximum rate of communication over any channel (its capacity), we also have practical error-correcting codes that attain this theoretical limit. It is instructive to review why the capacity of classical channels is a solved problem. Even though optimal communication over a discrete, memoryless classical channel involves encoding the information across many uses of the channel, Shannon showed that a channel's capacity is given mathematically by optimising an entropic quantity (the mutual information) over a single use of the channel. This follows immediately from the fact that mutual information is additive. Thanks to this, the capacity of any classical channel can be computed efficiently.
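To make the single-use optimisation concrete, the following is a minimal sketch of the Blahut-Arimoto iteration, a standard method for maximising the mutual information of a discrete memoryless channel. It is not taken from the text above; the function name `blahut_arimoto`, the transition-matrix layout `W[x][y]`, and the tolerance parameters are illustrative assumptions.

```python
import math


def blahut_arimoto(W, tol=1e-9, max_iter=10_000):
    """Estimate the capacity (in bits) of a discrete memoryless channel.

    W[x][y] is the probability of output y given input x (each row sums to 1).
    The name, argument layout and defaults are assumptions made for this sketch.
    """
    n_in, n_out = len(W), len(W[0])
    p = [1.0 / n_in] * n_in  # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution induced by the current input distribution.
        q = [sum(p[x] * W[x][y] for x in range(n_in)) for y in range(n_out)]
        # c[x] = 2^{D(W(.|x) || q)}, the relative entropy of row x against q.
        c = []
        for x in range(n_in):
            d = sum(W[x][y] * math.log2(W[x][y] / q[y])
                    for y in range(n_out) if W[x][y] > 0)
            c.append(2.0 ** d)
        total = sum(p[x] * c[x] for x in range(n_in))
        # Standard lower and upper bounds on the capacity at this iterate.
        lower, upper = math.log2(total), math.log2(max(c))
        if upper - lower < tol:
            break
        # Re-weight inputs towards those that are currently more informative.
        p = [p[x] * c[x] / total for x in range(n_in)]
    return lower
```

For a binary symmetric channel the uniform input is already optimal, so the loop stops at once; for asymmetric channels the input distribution is adjusted over several iterations. No comparable single-use procedure is known for the quantum capacity, which is the point of the contrast drawn here.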
It is for this reason that additivity questions for quantum channel capacities took on such importance, and why the major recent breakthroughs proving that additivity is violated [5,8] had such an impact. It has been known for some time [1][2][3] that the quantum capacity is given by a regularised expression, namely the optimisation of an entropic quantity (the coherent information) in the limit of arbitrarily many uses of the channel:

$$Q(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n}\, Q^{(n)}(\mathcal{N}), \qquad Q^{(n)}(\mathcal{N}) := \max_{\rho^{(n)}} I_{\mathrm{coh}}\big(\mathcal{N}^{\otimes n}, \rho^{(n)}\big).$$

(Here, $Q^{(n)}(\mathcal{N})$ is the coherent information $I_{\mathrm{coh}}$ maximised over a joint input $\rho^{(n)}$ for n uses of the channel $\mathcal{N}$.) However, the regularisation renders computing the quantum capacity infeasible; it involves an optimisation over an infinite parameter space.
Were the coherent information additive (i.e. $Q^{(n)}(\mathcal{N}) = n\,Q^{(1)}(\mathcal{N})$), the regularisation could be removed and the quantum capacity could be computed as easily as the capacity of classical channels. However, this is not the case. The first explicit examples of this superadditivity phenomenon were given by DiVincenzo et al. [7], and extended by Smith et al. [9]. For these examples (where $\mathcal{N}$ is a particular depolarising channel) it was shown (numerically) that $0 \le Q^{(1)}(\mathcal{N}) < \frac{1}{n} Q^{(n)}(\mathcal{N})$ for small values of $n \le 33$.
While the classical capacity of quantum channels also involves a regularised formula [5], we at least know precisely in which cases it is zero: simply for those channels whose output is completely independent of the input. The set of zero-quantum-capacity channels is much richer. Indeed, we do not even have a complete characterisation of which channels have zero quantum capacity. To date, we know of only two kinds of channels with zero quantum capacity: antidegradable channels [10,11] and entanglement-binding channels [12]. The former have the property that the environment can reproduce the output, thus $Q = 0$ by the no-cloning theorem [13]. The latter can only distribute PPT entanglement, which cannot be distilled by local operations and classical communication [14], which again implies $Q = 0$.
This has dramatic consequences. It is possible to take two of the quantum channels above, $\mathcal{N}_1$ antidegradable and $\mathcal{N}_2$ entanglement-binding, which individually have no capacity whatsoever, yet when used together can transmit quantum information reliably ($Q(\mathcal{N}_1 \otimes \mathcal{N}_2) > 0$). This superactivation phenomenon was discovered recently by Smith and Yard [8]. They also used their examples to construct a single channel $\mathcal{N}$ exhibiting an extreme form of superadditivity of the coherent information, where $0 = Q^{(1)}(\mathcal{N}) < Q^{(2)}(\mathcal{N})$. (In their construction, having two uses of $\mathcal{N}$ effectively enables one use of $\mathcal{N}_1$ and one of $\mathcal{N}_2$.) Even stronger superactivation phenomena have been shown in the context of zero-error communication over quantum channels [15][16][17][18][19].
On the one hand, additivity violation means regularisation is required in formulas for computing capacities. On the other hand, it also means that entanglement can protect information from noise (the coherent information is additive for unentangled input states).
But just how bad can this additivity violation be? One might hope that, at least in determining whether the quantum capacity is non-zero, one need only consider a finite number of uses of a channel. Indeed, since the Smith and Yard construction relies on combining the only two known types of zero-capacity channels, one might dare to hope that even two uses suffice. (Similarly, for the classical capacity of quantum channels the only known method for constructing examples of additivity violation [4,5] cannot give a violation for more than two uses of a channel, and there is some evidence that this may be more than just a limitation of the proof techniques [6].) Were this indeed the case, additivity violation would be reduced to something relatively benign: entangling the inputs across more than two uses of the channel would give no advantage. One would then be able to compute the quantum capacity by optimising the coherent information over two uses of the channel, which is not substantially more difficult than the optimisation over a single channel use.
In this paper, we show for the first time that this is not the case: additivity violation is as bad as it could possibly be. We prove that, for any n, one can construct a channel $\mathcal{N}$ for which the coherent information of n uses is zero ($Q^{(n)}(\mathcal{N}) = 0$), yet for a larger number of uses the coherent information is strictly positive, implying that the channel has non-zero quantum capacity ($Q(\mathcal{N}) > 0$). This is also the first proof that there can be a gap between $Q^{(n)}(\mathcal{N})$ and the quantum capacity for an arbitrarily large number n of uses of the channel. Our result implies that, in general, one must consider an arbitrarily large number of uses of the channel just to decide whether the channel has any quantum capacity at all! Perhaps the earliest indication that determining the quantum capacity may be a difficult problem comes from the work of Watrous [20], who showed that an arbitrarily large number of copies of a bipartite quantum state can be required for entanglement distillation assisted by two-way classical communication. Our result can be regarded as the counterpart of [20] for the quantum capacity (which is mathematically equivalent to entanglement distillation assisted by one-way communication). However, the proof ideas and techniques of [20] require two-way communication, thus they do not apply to the usual capacity setting. Our result is instead based on the ideas of Smith and Yard, in particular the intuition provided by Oppenheim's commentary thereon [25].
This intuition comes from a class of bipartite quantum states called pbits (private bits) [21]:

$$\rho^{aAbB} = \tfrac{1}{2}\left( |\phi^+\rangle\langle\phi^+|_{ab} \otimes \sigma_+^{AB} + |\phi^-\rangle\langle\phi^-|_{ab} \otimes \sigma_-^{AB} \right),$$

together with the standard equivalences between quantum capacity (sending entanglement over a channel) and distilling entanglement from the Choi-Jamiołkowski state associated with the channel. Here, $|\phi^\pm\rangle$ are Bell states, and $\sigma_\pm$ are hiding states [22]. The latter are orthogonal (globally perfectly distinguishable), but cannot be distinguished using local operations and classical communication (LOCC).
If $\rho^{aAbB}$ is shared between Alice (who holds aA) and Bob (who holds bB), then they share at least one ebit of entanglement due to the Bell states. But this entanglement is inaccessible to them unless they can determine which of the two Bell states they share. This they could do if only they could determine which hiding state they have. But $\sigma_\pm$ cannot be distinguished by LOCC, preventing them from extracting the entanglement from $\rho^{aAbB}$. The ab part of the system is usually called the "key", and AB the "shield" (as it decouples the systems ab from any external system).

Now imagine they have access to a quantum erasure channel $\mathcal{E}_{1/2}$, which with probability 1/2 transmits its input perfectly, and with probability 1/2 completely erases it. It is well known that such a channel cannot be used to transmit any entanglement. However, if they also share $\rho^{aAbB}$, Alice can use the erasure channel to send her part A of the shield to Bob. If the erasure channel transmits, Bob now holds the entire AB system and can now distinguish $\sigma_\pm$. Thus, with probability 1/2, Alice and Bob can now extract the entanglement from $\rho^{aAbB}$.
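The claim that an erasure channel with erasure probability 1/2 transmits no entanglement can be illustrated numerically. The sketch below uses the standard closed form for the quantum capacity of an erasure channel, $\max(0, 1-2p)\log_2 d$; this formula is textbook background rather than something derived in this paper, and the function names are illustrative.

```python
import math


def erasure_coherent_information(p, dim=2):
    """Coherent information (bits) of one erasure-channel use on a maximally
    entangled input: +log2(dim) with probability 1 - p (input transmitted),
    -log2(dim) with probability p (input erased)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("erasure probability must lie in [0, 1]")
    return (1.0 - 2.0 * p) * math.log2(dim)


def erasure_quantum_capacity(p, dim=2):
    """Standard closed form for the quantum capacity of an erasure channel."""
    return max(0.0, erasure_coherent_information(p, dim))
```

At p = 1/2 the two outcomes cancel exactly, which is why the erasure channel alone is useless here, while a transmitted shield share can still unlock the entanglement already present in the pbit.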
Instead of supplying Alice and Bob with the state $\rho^{aAbB}$ and an erasure channel, we instead supply them with a switched channel. This has an auxiliary classical input that controls whether the channel acts as $\mathcal{E}_{1/2}$ or $\Gamma$, where $\Gamma$ is the channel with Choi-Jamiołkowski state $\rho^{aAbB}$. The above argument then implies that no quantum information can be sent over a single use of the channel, but it can be sent using two uses, by switching one to $\mathcal{E}_{1/2}$ and the other to $\Gamma$. This is the intuition behind the Smith and Yard construction [25]. However, because it is constructed out of two very particular types of quantum channels, this idea does not seem to extend to larger numbers of uses. Nonetheless, the intuition behind our result is based on a refinement of these ideas, which we now sketch.
We want to achieve two seemingly contradictory goals: (1) To prevent Alice from sending any quantum information to Bob over n uses of the channel. (2) To permit this when Alice has access to some larger number of uses $N > n$. We can achieve (1) by increasing the erasure probability of the erasure channel to something much closer to 1, and also adding noise to the $\Gamma$ channel; the noise then swamps any entanglement. The problem is that this seems to also render (2) impossible. If the channel is so noisy that it destroys all entanglement sent through it, then (by definition) no amount of coding over multiple uses of the channel can succeed in transmitting quantum information.
However, note that the information that Alice needs to send to Bob in order to extract entanglement from the pbit $\rho^{aAbB}$ is essentially classical. Bob just needs to know one classical bit of information to distinguish the two hiding states. This suggests that classical error correction might help Alice send this information to Bob, even when the channel is very noisy. The intuition behind our proof is that a simple classical repetition code suffices. Instead of the pbit $\rho^{aAbB}$, we use a pbit $\tfrac{1}{2}\big( |\phi^+\rangle\langle\phi^+|_{ab} \otimes (\sigma_+^{AB})^{\otimes N} + |\phi^-\rangle\langle\phi^-|_{ab} \otimes (\sigma_-^{AB})^{\otimes N} \big)$ that contains N copies of the shield. For Bob to distinguish the hiding states, it suffices for one copy to make it through the erasure channel. Alice now tries to send all of the copies of the shield through many uses of the erasure channel. However high the erasure probability, the probability that at least one will get through can be made arbitrarily high for sufficiently many attempts.
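The repetition-code argument reduces to simple arithmetic: if each attempt is erased independently with probability p, at least one of N shield copies arrives with probability $1 - p^N$. The following sketch (function names are illustrative, not from the paper) computes how many copies are needed to reach a given success probability.

```python
def success_probability(p, copies):
    """Probability that at least one of `copies` independent transmissions
    survives an erasure channel with erasure probability p."""
    return 1.0 - p ** copies


def min_copies(p, target):
    """Smallest number of copies for which at least one survives with
    probability >= target. Both arguments must lie in [0, 1)."""
    if not (0.0 <= p < 1.0 and 0.0 <= target < 1.0):
        raise ValueError("p and target must lie in [0, 1)")
    copies = 1
    while success_probability(p, copies) < target:
        copies += 1
    return copies
```

For example, with erasure probability 0.5 four copies already give a success probability of 0.9375, and even at erasure probability 0.9 seven copies push it above one half. The number of copies grows as p approaches 1, but it is always finite.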
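We now give a more precise description of our construction. The erasure channel with erasure probability p is

$$\mathcal{E}_p^{A\to FB} := (1-p)\,|0\rangle\langle 0|_F \otimes \mathcal{I}^{A\to B} + p\,|1\rangle\langle 1|_F \otimes \mu_B,$$

where $\mathcal{I}^{A\to B}$ is the identity channel from A to B, F is the erasure flag, and $\mu_B$ is the maximally mixed state on B (the output whenever the input is erased). The channel $\Gamma^{\tilde{A}\to\tilde{B}}$ belongs to the class of PPT entanglement-binding channels whose Choi state is an approximate pbit [21]. We show that $\Gamma$ can be constructed with $A := A_1 \ldots A_N$ and $B := B_1 \ldots B_N$ consisting of N parts, such that even if Bob only receives part $A_i$ of Alice's shield for any i, they obtain approximately one ebit of one-way distillable entanglement. Let $\tilde{\Gamma}_\kappa^{\tilde{A}\to F\tilde{B}} := \mathcal{E}_\kappa^{\tilde{B}\to F\tilde{B}} \circ \Gamma^{\tilde{A}\to\tilde{B}}$ be a noisy version of the channel $\Gamma$. Our construction uses channels of the form

$$\mathcal{M}^{S\tilde{A}\to SF\tilde{B}} := P_0^{S\to S} \otimes \mathcal{E}_p^{\tilde{A}\to F\tilde{B}} + P_1^{S\to S} \otimes \tilde{\Gamma}_\kappa^{\tilde{A}\to F\tilde{B}}. \qquad (1)$$

Here $P_i^{S\to S}$ projects onto the i-th computational basis vector of the qubit system S, which thereby acts as a classical switch allowing Alice to choose whether the channel acts as $\mathcal{E}_p$ or $\tilde{\Gamma}_\kappa$ on the main input $\tilde{A}$. S is retained in the output, which lets Bob learn which choice was made.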
Making the above intuition rigorous for this channel is non-trivial. First, we must prove that the coherent information of n uses of the channel is strictly zero, for any input to the channel (not just the input states from the above intuition). To this end, we cannot just directly use a pbit with N-copy shield of the form given above, as it would have distillable entanglement. Fortunately, we find that an approximate pbit construction from [21] can be adapted for the role. But then we must take this approximation into account in the proof that the channel does have capacity. This requires a careful analysis of the various parameters of our channel to show that both of the desired properties can hold simultaneously, which requires a somewhat delicate argument. The technical arguments are described in the Methods section.
One natural question (which we leave open) is whether we can obtain a stronger form of our result with a constant upper bound on the channel dimension. It would also be interesting to see if one can obtain a result analogous to ours for the private capacity of quantum channels. Finally, our result gives a first indication that the quantum capacity of a channel might well be an uncomputable quantity; uncomputability of the quantum capacity would necessarily imply the behaviour we have shown here.

METHODS
We state and outline the proof of our main result: for any number of uses, there exists a channel with positive capacity but zero coherent information. Formally, we prove the following:

Theorem. Let $\mathcal{M}$ be the channel defined in Eq. (1). For any positive integer n, if $\kappa \in (0, 1/2)$ and $p \in [(1 + \kappa^n)^{-1/n}, 1)$ then we can choose N and $\Gamma$ such that: (1) $Q^{(n)}(\mathcal{M}) = 0$, and (2) $Q(\mathcal{M}) > 0$.

The proof is divided in two parts. We first prove that, given n and $\kappa$, for any $\Gamma$ with zero capacity there is a range of p that makes the coherent information of $\mathcal{M}^{\otimes n}$ zero. In the second part we prove that there exists $\Gamma$ with zero capacity such that $\mathcal{M}$ has positive capacity.
For the first part we can simplify the analysis of $\mathcal{M}^{\otimes n}$ by showing that it is optimal to make a definite choice (i.e. a computational basis state input) for each of the n switch registers. For each possible setting of the n switches, the coherent information is a convex combination of the coherent information for three cases, weighted by their probabilities: (a) every channel erases, (b) all of the $\mathcal{E}_p$ erase but not all $\tilde{\Gamma}_\kappa$ erase, (c) at least one of the $\mathcal{E}_p$ does not erase (and therefore acts as the identity channel). The coherent information for cases (b) and (c) can be upper bounded respectively by zero and $H(R)$, where R is a system that purifies the input. For (a) it is bounded above by $-H(R)$. Weighting by the probabilities, we find that the total coherent information is upper-bounded by $\big(1 - (1 + \kappa^n)p^n\big) H(R)$. This bound is non-positive exactly when $p \ge (1 + \kappa^n)^{-1/n}$, which allows us to conclude that for any n and $\kappa$ we can find p such that the coherent information of n uses of the channel is zero.
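The weighting argument can be checked numerically. The sketch below assumes a particular reading of the three cases: for a switch setting with k uses of the erasure channel and n - k uses of the noisy channel, case (a) has probability $p^k \kappa^{n-k}$ and case (c) has probability $1 - p^k$, so the coefficient of $H(R)$ is at most $1 - p^k - p^k \kappa^{n-k}$. That per-setting expression and the function names are assumptions of this sketch; only the uniform bound and the threshold on p are stated in the text.

```python
def erasure_threshold(n, kappa):
    """Smallest erasure probability p for which the stated upper bound on the
    coherent information of n channel uses is non-positive."""
    return (1.0 + kappa ** n) ** (-1.0 / n)


def uniform_coefficient(n, kappa, p):
    """Coefficient of H(R) in the bound (1 - (1 + kappa^n) p^n) H(R)."""
    return 1.0 - (1.0 + kappa ** n) * p ** n


def setting_coefficient(n, k, kappa, p):
    """Coefficient of H(R) for one switch setting with k erasure channels and
    n - k noisy channels (an assumed reading of cases (a)-(c))."""
    all_erase = p ** k * kappa ** (n - k)   # case (a): contributes -H(R)
    some_identity = 1.0 - p ** k            # case (c): contributes at most +H(R)
    return some_identity - all_erase
```

Under this reading every switch setting is dominated by the uniform coefficient, because $p^k \ge p^n$ and $\kappa^{n-k} \ge \kappa^n$ for $0 \le k \le n$, and the uniform coefficient vanishes at the threshold. Note that the threshold approaches 1 as n grows, which is why the erasure probability must be pushed close to 1 to block many channel uses.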
To prove the second part, we show that for fixed $\kappa$, p we can find a $\Gamma$ with an N-copy shield such that the coherent information of $N + 1$ uses of the channel $\mathcal{M}$ is positive for some $N + 1 > n$. We number the channel uses $0, \ldots, N$ and label the systems involved in the i-th use of the channel with superscript i. Consider the following input. The switch registers are set to choose $\tilde{\Gamma}_\kappa$ for use 0 and $\mathcal{E}_p$ for the remaining uses $1, \ldots, N$. For each $i \in \{1, \ldots, N\}$, we maximally entangle subsystem $A^0_i$ of $\tilde{A}^0$ (which is acted on by $\tilde{\Gamma}_\kappa$) with subsystem $A^i_1$ of $\tilde{A}^i$ (acted on by an erasure channel). We also maximally entangle subsystem $a^0$ of $\tilde{A}^0$ with a purifying reference system a which is retained by Alice. The remaining input subsystems are set to an arbitrary pure state. The resulting coherent information is a convex combination of cases where (a) $\tilde{\Gamma}_\kappa$ erases, (b) $\tilde{\Gamma}_\kappa$ does not erase but all the $\mathcal{E}_p$ erase, and (c) $\tilde{\Gamma}_\kappa$ and at least one $\mathcal{E}_p$ do not erase. Case (a) contributes coherent information $-1$ weighted by its probability $\kappa$. Case (b) contributes approximately zero coherent information (due to a standard property of pbits).
In case (c), after channel use 0, Alice and Bob share the Choi state of $\Gamma$ on systems $a b^0 A^1_1 B^0_1 \ldots A^N_1 B^0_N$, and after the N uses of $\mathcal{E}_p$ at least one of $A^1_1, \ldots, A^N_1$ reaches Bob unerased. They then share a state with approximately one ebit of one-way distillable entanglement (coherent information $+1$). This contribution is weighted by the probability $(1 - \kappa)(1 - p^N)$. We show that for $p \in (0, 1)$, $\kappa \in (0, 1/2)$, we can find a $\Gamma$ with large enough N for which the overall coherent information is positive, proving that $Q(\mathcal{M}) > 0$. Further mathematical details are given in the Supplementary Information.
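Combining the three contributions gives a rough estimate of how large N must be. The sketch below uses the idealised total $-\kappa + (1-\kappa)(1-p^N)$, treating the "approximately zero" and "approximately one ebit" contributions as exact; this deliberately ignores the approximation errors that the full proof has to control, and the function names are illustrative.

```python
def idealised_coherent_information(p, kappa, copies):
    """Idealised total: case (a) gives -1 with probability kappa, case (b) gives
    0, case (c) gives +1 with probability (1 - kappa)(1 - p^copies)."""
    return -kappa + (1.0 - kappa) * (1.0 - p ** copies)


def min_shield_copies(p, kappa, max_copies=10 ** 6):
    """Smallest N for which the idealised total is strictly positive."""
    if not (0.0 < p < 1.0 and 0.0 < kappa < 0.5):
        raise ValueError("need 0 < p < 1 and 0 < kappa < 1/2")
    for copies in range(1, max_copies + 1):
        if idealised_coherent_information(p, kappa, copies) > 0.0:
            return copies
    raise RuntimeError("no suitable N found below max_copies")
```

The requirement $\kappa < 1/2$ is visible here: as N grows the total tends to $1 - 2\kappa$, which is positive only in that range. For instance, p = 0.9 and $\kappa$ = 0.4 need N = 11 in this idealised model, since $0.9^{10} > 1/3 > 0.9^{11}$.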

Preliminaries
In the following, each system Q is associated to a Hilbert space of finite dimension $\dim(Q)$, and the Hilbert space has an orthonormal computational basis $\{|i\rangle_Q : i \in \{0, \ldots, \dim(Q) - 1\}\}$. For any system Q, let $\mu_Q := \mathbb{1}_Q / \dim(Q)$ denote its maximally mixed state. Let A and B be two systems of equal dimension, $\mathcal{I}^{A\to B}$ denote the identity channel between them, and F be a binary erasure flag. The total erasure channel $\mathcal{E}_1^{A\to FB}$ maps any input state to $|1\rangle\langle 1|_F \otimes \mu_B$, while $\mathcal{E}_p^{A\to FB} := (1-p)\,|0\rangle\langle 0|_F \otimes \mathcal{I}^{A\to B} + p\,\mathcal{E}_1^{A\to FB}$ denotes the erasure channel with erasure probability p. For any number of uses of $\mathcal{E}_1$ and any input state $\rho$ we have [...] For any register F, a flagged channel is of the form [...] For any flagged channel we have [...] which follows easily from [...] For any $i \in \{0, \ldots, \dim(S) - 1\}$, let [...] where each $\mathcal{N}_i$ is a quantum channel. The register S acts as a classical switch allowing the sender to choose between different channels $\mathcal{N}_i$ to be applied on the "main input" A to produce a state of the "main output" B. We will need the following simple lemma regarding switched channels:

Lemma 1. For any switched channel, [...] where $0 \le i \le \dim(S) - 1$.
Proof. To see this, note that any purification $\rho^{SAR}$ of $\rho^{SA}$ can be written in the form [...] Here $p_i$ is the probability that the switch is set to i, and $|\rho_i\rangle^{AR}$ is a purification of the channel input state $\rho_i^A$ conditioned on that setting. Conversely, given probabilities $p_i$ and states $\rho_i^A$ for each switch value, we can always find $|\rho\rangle^{SAR}$ satisfying (S5). Given this, we see that [...] where $\rho_i^{AR} := |\rho_i\rangle\langle\rho_i|^{AR}$. From (S2) it follows that [...] which completes the proof.
We will also require some basic facts about pbits ("private bits") [21], which we gather here. Given a bipartite system ab with $\dim a = \dim b = 2$ and a bipartite system AB with $\dim A = \dim B$, a perfect pbit with key ab and shield AB is a state $\gamma^{abAB}$ of the form

$$\gamma^{abAB} := U^{abAB}\big(\phi^{ab} \otimes \sigma^{AB}\big)(U^{abAB})^\dagger,$$

where $\phi^{ab}$ is the projector onto $|\phi\rangle_{ab} := \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)_{ab}$, $\sigma^{AB}$ is some mixed state, and

$$U^{abAB} := \sum_{i,j=0}^{1} |ij\rangle\langle ij|_{ab} \otimes U_{ij}^{AB}$$

is a twisting unitary controlled by the key ab and acting on the shield AB as some unitary $U_{ij}^{AB}$. Note that due to the form of $\phi^{ab}$ and $U^{abAB}$, we have [...] Let us define $U^{bAB} := \sum_{j=0}^{1} |j\rangle\langle j|_b \otimes U_{jj}^{AB}$. If Bob has access to b and the whole shield AB then he can apply the unitary operation $(U^\dagger)^{bAB}$ to these systems, yielding a 2-qubit maximally entangled state on ab. Therefore,

$$I(a\rangle bAB)_{\gamma^{abAB}} = I(a\rangle bAB)_{\phi^{ab}\otimes\sigma^{AB}} = 1. \qquad \text{(S13)}$$

On the other hand, if we throw away the shield systems AB, we are left with a state $\gamma^{ab}$ that can be converted into a perfectly random shared classical bit by locally measuring systems a and b in the standard basis. The coherent information of a shared random bit is zero, so from (S17) we get $I(a\rangle b)_{\gamma^{ab}} \ge 0$ and thus [...]

Channel construction
We will now describe the input and output systems of our channel $\mathcal{M}$. Let a and b be two-dimensional systems (qubits). We call ab the "key". Let $A_{i,j,k}$ and $B_{i,j,k}$ be d-dimensional systems for all $i \in [N]$, $j \in [r]$, $k \in [m]$, where $[n] := \{1, \ldots, n\}$. We define composite systems $A_i := \{A_{i,j,k} : j \in [r], k \in [m]\}$ and $A := \{A_i : i \in [N]\}$ for Alice, and similar systems $B_i$ and B for Bob. We call AB the "shield" and call $A_i$ "Alice's i-th share of the shield". Let F be a qubit called "the erasure flag". Let $\tilde{A} := aA$, and $\tilde{B} := bB$.
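Our construction is a switched channel

$$\mathcal{M}^{S\tilde{A}\to SF\tilde{B}} := P_0^{S\to S} \otimes \mathcal{E}_p^{\tilde{A}\to F\tilde{B}} + P_1^{S\to S} \otimes \tilde{\Gamma}_\kappa^{\tilde{A}\to F\tilde{B}}.$$

It depends on parameters $N, r, m \in \mathbb{N}$ and $p, \kappa, q \in [0, 1]$, where q is an implicit parameter of $\tilde{\Gamma}_\kappa$. We define $\tilde{\Gamma}_\kappa^{\tilde{A}\to F\tilde{B}}$ to be the composite channel

$$\tilde{\Gamma}_\kappa^{\tilde{A}\to F\tilde{B}} := \mathcal{E}_\kappa^{\tilde{B}\to F\tilde{B}} \circ \Gamma^{\tilde{A}\to\tilde{B}}.$$

A useful fact regarding compositions is that

$$I_{\mathrm{coh}}(\mathcal{N}_2 \circ \mathcal{N}_1, \rho) \le I_{\mathrm{coh}}(\mathcal{N}_1, \rho),$$

which is just the quantum data processing inequality for coherent information [25].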
We define $\Gamma^{\tilde{A}\to\tilde{B}}$ by giving its Choi state, which depends on the parameters N, r, m, and q. Defining the composite systems $C_{i,j,k} := A_{i,j,k} B_{i,j,k}$ and C [...], and [...] are the Eggeling-Werner data hiding states [22]. Here µ [...] are the states proportional to the projectors onto the symmetric and anti-symmetric subspaces respectively.
In Eq. (139) of [21], a state $\rho_{\mathrm{rec}}^{(p,d,k)}$ is defined. Apart from p, d and k, it also implicitly depends on a parameter m, so we will denote it by $\rho_{\mathrm{rec}}^{(p,d,k;m)}$. Our $\zeta^{abAB}$ is precisely $\rho_{\mathrm{rec}}^{(q,d,rN;m)}$. From Sections X-A (in particular Lemma 5) and X-B of [21] we see that $\rho_{\mathrm{rec}}^{(q,d,rN;m)}$ is PPT if

$$0 < q \le \frac{1}{3} \quad \text{and} \quad \frac{1-q}{q} \ge \left(\frac{d}{d-1}\right)^{rN}. \qquad \text{(S19)}$$

Since a channel is PPT-binding iff its Choi matrix is PPT, the same conditions suffice for $\Gamma$ to be PPT-binding. This condition is key to our subsequent analysis.
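Condition (S19) is easy to evaluate for concrete parameters. The sketch below is a direct transcription of the two inequalities as read here; the function names are illustrative, and it checks only this sufficient condition, not the PPT property of an actual density matrix.

```python
def satisfies_ppt_condition(q, d, r, N):
    """True if 0 < q <= 1/3 and (1 - q)/q >= (d/(d-1))^(r*N), for d >= 2."""
    if d < 2:
        raise ValueError("local shield dimension d must be at least 2")
    if not 0.0 < q <= 1.0 / 3.0:
        return False
    return (1.0 - q) / q >= (d / (d - 1)) ** (r * N)


def max_admissible_q(d, r, N):
    """Largest q allowed by both inequalities: min(1/3, 1/(1 + (d/(d-1))^(rN)))."""
    ratio = (d / (d - 1)) ** (r * N)
    return min(1.0 / 3.0, 1.0 / (1.0 + ratio))
```

The right-hand side grows exponentially in rN, so for fixed d a larger number of shield copies forces a smaller q; increasing d slows this growth. With q = 0.25, for example, d = 2 and rN = 1 satisfy the condition, d = 2 and rN = 2 do not, and d = 10 still satisfies it at rN = 10.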
Proof. First note that $\rho^{abA_1B_1}$ is simply $\rho_{\mathrm{rec}}^{(q,d,r;m)}$. Adopting the notation of [21], let $\|A_{0011}\|_1$ be the norm of the upper right block of the matrix $\rho_{\mathrm{rec}}^{(q,d,r;m)}$ expanded in the computational basis of the key system ab. In Proposition 4 of [21], it is shown that if $1/2 - \|A_{0011}\|_1 < \epsilon < 1/8$ then $\tau \le \delta(\epsilon)$ for some function $\delta(\epsilon)$. The function $\delta$ is given in Eq. (70) of [21]