Fundamental limits of repeaterless quantum communications

Quantum communications promises reliable transmission of quantum information, efficient distribution of entanglement and generation of completely secure keys. For all these tasks, we need to determine the optimal point-to-point rates that are achievable by two remote parties at the ends of a quantum channel, without restrictions on their local operations and classical communication, which can be unlimited and two-way. These two-way assisted capacities represent the ultimate rates that are reachable without quantum repeaters. Here, by constructing an upper bound based on the relative entropy of entanglement and devising a dimension-independent technique dubbed ‘teleportation stretching', we establish these capacities for many fundamental channels, namely bosonic lossy channels, quantum-limited amplifiers, dephasing and erasure channels in arbitrary dimension. In particular, we exactly determine the fundamental rate-loss tradeoff affecting any protocol of quantum key distribution. Our findings set the limits of point-to-point quantum communications and provide precise and general benchmarks for quantum repeaters.

Bounds are plotted in terms of distance (km) assuming the standard loss rate of 0.2 dB/km. Note that Φ(η) remains the tighter upper bound even if we constrain the input energy down to one mean photon. This is true everywhere, except for short distances (where the energy constraint is not so interesting since we can efficiently use highly-modulated CV-QKD).

(1) Suppose that Alice and Bob implement a QC protocol for transmitting qubits from system A to system b by means of channel $\mathcal{E}$ (red curvy line). In the upper LO, Alice applies a suitable quantum error correcting code (QECC) $\Lambda^{m\to n}_{\rm enc}$ to encode an m-qubit logical state $|\phi^{(m)}\rangle$ into an n-qubit codeword which is sent through $\mathcal{E}^{\otimes n}$. In the lower LO, Bob applies a decoding operation $\Lambda^{n\to m}_{\rm dec}$, so that $\Lambda^{n\to m}_{\rm dec} \circ \mathcal{E}^{\otimes n} \circ \Lambda^{m\to n}_{\rm enc}$ tends to the identity, and the n-use output state $\rho^n_b$ approximates $|\phi^{(m)}\rangle\langle\phi^{(m)}|$. In the general case, we assume that the previous LOs are assisted by unlimited two-way CCs between Alice and Bob. By optimizing over all QECCs and in the limit of infinite channel uses, one defines the two-way quantum capacity $Q_2(\mathcal{E})$. (2) Notice that Alice can use the QECC to send part of m ebits (see the Bell state Φ in the grey box), so that Alice and Bob share an output state $\rho^n_{ab}$ which approximates $\Phi^{\otimes m}$. Assuming an asymptotic and optimal QECC, each ebit is reliably shared at the quantum capacity rate $Q_2(\mathcal{E})$. (3) Finally, assume that the channel $\mathcal{E}$ can be described by teleportation over the resource state σ. Any entanglement distribution strategy through channel $\mathcal{E}$ can therefore be seen as a specific protocol of entanglement distillation applied to the copies of σ. This observation leads to $Q_2(\mathcal{E}) \le D_2(\sigma)$.

(b) Different re-organization of the quantum operations in teleportation stretching. When we apply teleportation stretching to a QC protocol, we directly reduce the output state as follows: $\rho_b(\mathcal{E}^{\otimes n}) = \Lambda(\sigma^{\otimes n})$, for a trace-preserving LOCC $\Lambda$ which is not connected with ED, but collapses the preparation $|\phi^{(m)}\rangle\langle\phi^{(m)}|$, the encoding/decoding maps, and all the teleportation operations. This is not asymptotic but done for any n.
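As a quick numerical illustration of the distance axis mentioned in the first caption above, the following sketch converts a fibre length into a transmissivity η using the quoted loss rate of 0.2 dB/km. It then evaluates $-\log_2(1-\eta)$, which is the secret-key capacity of the pure-loss channel established in the main paper; that formula is not derived in this section and is used here as an assumption. The function names are hypothetical.

```python
import math

def transmissivity(distance_km, loss_db_per_km=0.2):
    """Transmissivity eta of a fibre of the given length (loss quoted in dB/km)."""
    return 10.0 ** (-loss_db_per_km * distance_km / 10.0)

def lossy_channel_bound(eta):
    """Assumed formula: -log2(1 - eta) bits per channel use (pure-loss channel).

    log1p keeps the result accurate when eta is very small (long distances).
    """
    if not 0.0 <= eta < 1.0:
        raise ValueError("eta must lie in [0, 1)")
    return -math.log1p(-eta) / math.log(2.0)

if __name__ == "__main__":
    for km in (50, 100, 200, 400):
        eta = transmissivity(km)
        print(f"{km:4d} km  eta = {eta:.3e}  bound = {lossy_channel_bound(eta):.3e}")
```

At long distances η is small and the bound behaves as roughly 1.44 η bits per use, which is the linear rate-loss scaling referred to in the abstract.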

Truncation of infinite-dimensional Hilbert spaces
In the following it will be useful to use truncation tools which enable us to connect continuous-variable (CV) and discrete-variable (DV) states. Consider m bosonic modes with Hilbert space $\mathcal{H}^{\otimes m}$ and space of density operators $\mathcal{D}(\mathcal{H}^{\otimes m})$. Then, consider the energy operator $\hat{H} = \sum_{i=1}^{m} \hat{N}_i$ (with $\hat{N}_i$ being the number operator of mode i) and the following compact set of energy-constrained states [1]
$$\mathcal{D}_E(\mathcal{H}^{\otimes m}) := \{\rho \in \mathcal{D}(\mathcal{H}^{\otimes m}) \mid \mathrm{Tr}(\rho\hat{H}) \le E\}. \quad (1)$$
It is easy to show that every such state is essentially supported on a finite-dimensional Hilbert space.
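Lemma 1: Consider an energy-constrained m-mode bosonic state $\rho \in \mathcal{D}_E(\mathcal{H}^{\otimes m})$. There exists a finite-dimensional projector $P_d$ which projects this state onto a d-dimensional support of the m-mode Hilbert space with probability
$$\mathrm{Tr}(\rho P_d) \ge 1 - \gamma, \qquad \gamma := \frac{E}{\sqrt[m]{d} - 1}. \quad (2)$$
Correspondingly, the trace distance between the original state ρ and the d-dimensional truncated state
$$\delta := \frac{P_d\,\rho\,P_d}{\mathrm{Tr}(\rho P_d)} \quad (3)$$
satisfies the inequality
$$D(\rho, \delta) \le \sqrt{\gamma}. \quad (4)$$
Proof: Let $|h_n\rangle$, with n = 0, 1, 2, ..., denote the eigenstates of $\hat{H}$, ordered so that the corresponding eigenvalues $h_n$ are non-decreasing. Note that n + 1 counts the dimension of the truncated Hilbert space and we have $h_n \le n$, because of the degeneracy of the eigenvalues. Since $h_n$ is the total number of photons in all m modes, each mode can have at most dimension $(h_n + 1)$, so that we may write the upper bound $n \le n + 1 \le (h_n + 1)^m$ or, equivalently,
$$h_n \ge \sqrt[m]{n} - 1. \quad (6)$$
Then, we proceed as in Refs. [1,2]. Denote by $P_n := |h_n\rangle\langle h_n|$ the eigenprojector associated with $|h_n\rangle$. For dimension d, we consider the truncation projector
$$P_d := \sum_{n=0}^{d-1} P_n. \quad (7)$$
Therefore, for all $|\psi\rangle \in \mathcal{H}$, we may write
$$\langle\psi|\hat{H}|\psi\rangle \ge h_d\,\langle\psi|(I - P_d)|\psi\rangle.$$
This implies that, for all $\rho \in \mathcal{D}_E(\mathcal{H})$, we have
$$\mathrm{Tr}(\rho P_d) \ge 1 - \frac{E}{h_d}.$$
According to Eq. (6), we may write $h_d \ge \sqrt[m]{d} - 1$, so that
$$\mathrm{Tr}(\rho P_d) \ge 1 - \frac{E}{\sqrt[m]{d} - 1},$$
which proves Eq. (2). The proof of Eq. (4) is a simple modification of the one given by ref. [2].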
Note that we may derive a similar result in terms of a truncation channel, i.e., by means of a completely positive trace-preserving (CPTP) map.
Lemma 2: Consider an energy-constrained m-mode bosonic state ρ ∈ D E (H ⊗m ). There exists a truncation channel T d which maps the state ρ into a truncated stateρ defined over a d-dimensional support of the m-mode Hilbert space, such that where γ is defined in Eq. (2).
Proof: For any multimode energy-constrained bosonic state ρ ∈ D E (H ⊗m ), we may define the following (non-local) truncation mapρ where Π 0 := P d and Π 1 := I − P d , while for any projected state σ we have either the identity channel E 0 (σ) = σ or the collapsing map E 1 (σ) = ρ 0 , where ρ 0 is an arbitrary fixed state within the d-dimensional support. Setting p := Tr(ρP d ), we may writeρ where δ is defined in Eq. (3). Note that S 0 := ρ − ρ 0 ≤ 2. Then, by exploiting the convexity of the trace norm, we may write where we have also used p ≤ 1 and Lemma 1.
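The truncation bounds can be checked numerically on a simple example. The sketch below takes a single-mode thermal state (m = 1, energy E equal to its mean photon number), truncates it to the first d Fock levels, and compares the projection probability and the trace distance with the bounds of Lemma 1. It assumes that these bounds read $\mathrm{Tr}(\rho P_d) \ge 1-\gamma$ and $D(\rho,\delta) \le \sqrt{\gamma}$ with $\gamma = E/(\sqrt[m]{d}-1)$, which is our reading of Eqs. (2) and (4); the function names are hypothetical.

```python
import math

def gamma(energy, d, m=1):
    """Assumed form of the truncation parameter: E / (d**(1/m) - 1), for d >= 2."""
    return energy / (d ** (1.0 / m) - 1.0)

def truncate_thermal(nbar, d):
    """Truncate a thermal state with nbar mean photons to Fock levels 0..d-1.

    Returns (p, D): the projection probability Tr(rho P_d) and the trace
    distance between rho and the renormalised truncated state. Both states
    are diagonal in the Fock basis, so D is half the l1 distance.
    """
    q = nbar / (1.0 + nbar)
    probs = [(1.0 - q) * q ** n for n in range(d)]   # populations kept
    p = sum(probs)
    tail = q ** d                                    # population cut away
    inside = sum(abs(pn - pn / p) for pn in probs)   # effect of renormalising
    return p, 0.5 * (inside + tail)

if __name__ == "__main__":
    nbar = 1.0
    for d in (4, 16, 64):
        p, dist = truncate_thermal(nbar, d)
        g = gamma(nbar, d)
        print(f"d={d:3d}  p={p:.6f} (bound {1 - g:.4f})  "
              f"D={dist:.2e} (bound {math.sqrt(g):.4f})")
```

For a thermal state the actual truncation error decays geometrically in d, so the bounds, which hold for every energy-constrained state, are far from tight in this example.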

Local CV-DV mappings
It is easy to modify the previous truncation tools to make them bipartite and local, i.e., based on LOs assisted by (generally two-way) CCs. Suppose that Alice and Bob share a CV bipartite state ρ ab , where Alice's local system a contains m a modes and Bob's local system b contains m b modes. Then, we may analyze how this state is transformed by a truncation channel which is based on LOCC. In fact, we may state the following.
Lemma 3: Consider an energy-constrained bosonic state ρ ab ∈ D E (H ⊗m a ⊗ H ⊗m b ) where Alice (Bob) has m a (m b ) modes. There is an LOCC truncation channel T ⊗ d (local with respect to the bipartition m a + m b ) which maps ρ ab into a truncated stateρ ab defined over a d × d-dimensional support and such that The implementation of such truncation channel needs two bits of CC between Alice and Bob.
Proof: Assuming the bipartition of modes m = m A + m B , let us write the energy operator asĤ =Ĥ A +Ĥ B , wherê It is clear that, given an arbitrary |h n , we may always decompose it as |h n = |h A k ⊗ |h B l for some pair of labels k and l. For this reason, any set of d states {|h n } for the m modes can certainly be represented by a tensor product of d × d states suitably chosen within the local sets {|h A k } and {|h B l }. As a consequence, the support of a d-dimensional projector as in Eq. (7) is always contained in the support of a local d × d projector for some suitable choice of {|h A k } and {|h B l }. This implies that there always exists a local projector P ⊗ d for which we may write where we have also used Lemma 1. Set p := Tr(ρ ab P d ) and p := Tr(ρ ab P ⊗ d ), so that we have truncated states Because of the wider support of P ⊗ d , it is easy to check that where we have used Lemma 1 in the last inequality. In order to construct the LOCC truncation channel, let us consider the local POVM The parties apply these projections and then they communicate their outcomes to each other, employing one bit of classical information for each one-way CC. If both parties project onto the local d-dimensional support then they apply an identity channel; if one of them projects outside this local support, they both apply a damping channel which maps any input into a fixed state within the support (which can always be chosen as the vacuum state). More precisely, we define the LOCC truncation channel where Π ij is the local POVM defined above and where channel E * a(b) provides an m a -(m b -) mode vacuum state for any input. It is clear that the result is a truncated stateρ ab := T ⊗ d (ρ ab ) where each set of modes a and b is supported in a d-dimensional Hilbert space. In particular, we havẽ Using the convexity of the trace norm, we get which concludes the proof.
Finally, note that LOCC channels from DVs to CVs can be constructed by using hybrid quantum teleportation [3]. For instance, a polarisation qubit $\alpha|{\uparrow}\rangle_a + \beta|{\downarrow}\rangle_a$ can be teleported onto a single-rail qubit, which is the bosonic subspace spanned by the vacuum $|0\rangle_b$ and the single-photon state $|1\rangle_b$. It is sufficient to build a hyper-entangled Bell state $|{\uparrow}\rangle_{a'}|1\rangle_b + |{\downarrow}\rangle_{a'}|0\rangle_b$ and apply a discrete-variable Bell detection on qubits $a$ and $a'$. This teleports $a$ onto the bosonic mode $b$, up to Pauli operators (suitably re-written in terms of the ladder operators) that can be undone from the output state. Such a procedure can be readily extended to teleport qudits into bosonic modes in an LOCC fashion.
Supplementary Note 2. LOWER BOUND AT ANY DIMENSION

Coherent and reverse coherent information of a quantum channel
Consider a quantum channel $\mathcal{E}$ applied to some input state $\rho_A$ of system A. Let us introduce the purification $|\psi\rangle_{RA}$ of $\rho_A$ by means of an auxiliary system R. We can therefore consider the output $\rho_{RB} = \mathcal{I} \otimes \mathcal{E}(|\psi\rangle_{RA}\langle\psi|)$. By definition, the coherent information is [4,5]
$$I(A\rangle B)_{\rho_{RB}} = S(\rho_B) - S(\rho_{RB}),$$
where $\rho_B := \mathrm{Tr}_R(\rho_{RB})$ and $S(\rho) := -\mathrm{Tr}(\rho \log_2 \rho)$ is the von Neumann entropy. Similarly, the reverse coherent information is given by [6,7]
$$I(A\langle B)_{\rho_{RB}} = S(\rho_R) - S(\rho_{RB}),$$
where $\rho_R := \mathrm{Tr}_B(\rho_{RB})$. When the input state $\rho_A$ is a maximally-mixed state, its purification is a maximally-entangled state $\Phi_{RA}$, so that $\rho_{RB}$ is the Choi matrix of the channel, i.e., $\rho_{\mathcal{E}}$. We then define the coherent information of the channel as
$$I_C(\mathcal{E}) := I(A\rangle B)_{\rho_{\mathcal{E}}}.$$
Similarly, its reverse coherent information is
$$I_{RC}(\mathcal{E}) := I(A\langle B)_{\rho_{\mathcal{E}}}.$$
Note that for unital channels, i.e., channels preserving the identity $\mathcal{E}(I) = I$, we have $I_C(\mathcal{E}) = I_{RC}(\mathcal{E})$. This is just a consequence of the fact that the reduced states $\rho_A$ and $\rho_R$ of a maximally entangled state $\Phi_{RA}$ are both equal to the maximally-mixed state $I/d$, where d is the dimension of the Hilbert space (including the limit for d → +∞). If the channel is unital, the reduced state $\rho_B = \mathcal{E}(\rho_A)$ is also maximally mixed. As a result, $S(\rho_B) = S(\rho_A) = S(\rho_R)$ and we may write
$$I_C(\mathcal{E}) = I_{RC}(\mathcal{E}) = S(\rho_R) - S(\rho_{\mathcal{E}}).$$
In the specific case of discrete-variable systems (d < +∞), we have $S(\rho_R) = \log_2 d$ and therefore
$$I_C(\mathcal{E}) = I_{RC}(\mathcal{E}) = \log_2 d - S(\rho_{\mathcal{E}}).$$
In particular, for unital qubit channels (d = 2), one has
$$I_C(\mathcal{E}) = I_{RC}(\mathcal{E}) = 1 - S(\rho_{\mathcal{E}}).$$
The latter two formulas will be exploited to compute the coherent information of discrete-variable channels. The coherent information is an achievable rate for forward one-way entanglement distillation. Similarly, the reverse coherent information is an achievable rate for backward one-way entanglement distillation (i.e., assisted by a single and final CC from Bob to Alice). In fact, thanks to the hashing inequality [8], we may write
$$\max\{I_C(\mathcal{E}), I_{RC}(\mathcal{E})\} \le D_1(\rho_{\mathcal{E}}),$$
where $D_1(\rho_{\mathcal{E}})$ is the entanglement distillable from the Choi matrix via one-way CCs.
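The unital-qubit case can be made concrete with Pauli channels. The sketch below assumes the reduction $I_C = I_{RC} = \log_2 d - S(\rho_{\mathcal{E}})$ with d = 2, which follows from $S(\rho_R) = \log_2 d$ as stated above, and uses the standard fact that the Choi matrix of a Pauli channel is Bell-diagonal with eigenvalues equal to the Pauli probabilities. The function names are hypothetical and the code is an illustration, not the paper's own derivation.

```python
import math

def entropy_bits(probs):
    """Shannon entropy (bits) of a probability vector, with 0*log(0) = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def pauli_coherent_information(p_i, p_x, p_y, p_z):
    """Coherent (= reverse coherent) information of a qubit Pauli channel.

    The channel applies I, X, Y, Z with the given probabilities. Its Choi
    matrix has these probabilities as eigenvalues, so S(rho_E) is their
    Shannon entropy and I_C = 1 - S(rho_E).
    """
    probs = (p_i, p_x, p_y, p_z)
    if any(p < 0.0 for p in probs) or not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must be non-negative and sum to 1")
    return 1.0 - entropy_bits(probs)

if __name__ == "__main__":
    # Dephasing channel: Z applied with probability p.
    for p in (0.0, 0.1, 0.5):
        print(f"dephasing p={p}: I_C = {pauli_coherent_information(1 - p, 0, 0, p):.4f}")
    # Depolarizing channel: X, Y, Z each applied with probability p/3.
    p = 0.1
    print(f"depolarizing p={p}: I_C = "
          f"{pauli_coherent_information(1 - p, p / 3, p / 3, p / 3):.4f}")
```

A negative value simply means that this particular lower bound is not useful for the channel in question.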

Hashing inequality in infinite dimension
The hashing inequality is known to be valid for finite-dimensional quantum systems. It is easy to extend this inequality to energy-constrained bosonic states by exploiting the continuity of the (reverse) coherent information in the limit of infinite dimension. Consider the state $\rho_{AB}$ of two bosonic modes, each mode having $\le \bar{n}$ mean photons. Then, we may apply a projector $P_d$ generating a d-dimensional truncated state $\delta_{AB}$ such that (see Lemma 1) According to ref. [9, Lemma 17], the trace-distance condition $D(\rho, \delta) \le \sqrt{\gamma} < 1/6$ implies that the coherent where $H_2$ is the binary Shannon entropy For any $\bar{n}$, the limit d → +∞ implies that γ → 0 and therefore An equivalent result holds for the reverse coherent information $I(A\langle B) = -S(B|A)$. Thus, for any $\bar{n}$, the coherent and reverse coherent information are continuous in the limit of infinite dimension. This means that the hashing inequality [8] is extended to bosonic systems with constrained energy. In other words, $I(A\rangle B)_\rho$ ($I(A\langle B)_\rho$) represents an achievable rate for the distillable entanglement of the energy-bounded bosonic state ρ via forward (backward) CCs.

Extension to energy-unbounded Choi matrices of bosonic Gaussian channels
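For bosonic systems, the ideal EPR state Φ is defined as the limit of two-mode squeezed vacuum (TMSV) states $\Phi^\mu$, where $\mu = \bar{n} + 1/2$ is sent to infinity (here $\bar{n}$ is the mean photon number in each mode) [10]. Thus, the Choi matrix of a Gaussian channel is defined as the asymptotic operator
$$\rho_{\mathcal{E}} := \lim_{\mu} \rho^\mu_{\mathcal{E}}, \qquad \rho^\mu_{\mathcal{E}} := \mathcal{I} \otimes \mathcal{E}(\Phi^\mu).$$
Correspondingly, the computation of the (reverse) coherent information of the channel is performed as a limit, i.e., we have
$$I_C(\mathcal{E}) = \lim_{\mu} I(A\rangle B)_{\rho^\mu_{\mathcal{E}}}, \qquad I_{RC}(\mathcal{E}) = \lim_{\mu} I(A\langle B)_{\rho^\mu_{\mathcal{E}}}.$$
As we will see afterwards in the technical derivations of Supplementary Note 4, for bosonic Gaussian channels the functionals $I(A\rangle B)_{\rho^\mu_{\mathcal{E}}}$ and $I(A\langle B)_{\rho^\mu_{\mathcal{E}}}$ are continuous, monotonic and bounded in µ. Therefore, the previous limits are finite and we can continuously extend the hashing inequality of Eq. (34) to the asymptotic Choi matrix $\rho_{\mathcal{E}}$ of a Gaussian channel, for which we may set $D_1(\rho_{\mathcal{E}}) := \lim_{\mu} D_1(\rho^\mu_{\mathcal{E}})$.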
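The limiting procedure can be illustrated on the pure-loss channel with transmissivity η. The sketch below rests on standard assumptions that are not derived in this section: a TMSV state with $\bar{n}$ mean photons per mode has thermal reductions, the environment of a pure-loss channel starts in the vacuum, and the global purity gives $S(\rho_{RB})$ as the entropy of the environment output, which is thermal with $(1-\eta)\bar{n}$ mean photons. Under these assumptions the reverse coherent information is $g(\bar{n}) - g((1-\eta)\bar{n})$ and the coherent information is $g(\eta\bar{n}) - g((1-\eta)\bar{n})$, where $g(x) = (x+1)\log_2(x+1) - x\log_2 x$. All names are hypothetical.

```python
import math

def g(x):
    """Entropy (bits) of a single-mode thermal state with x mean photons."""
    if x < 0.0:
        raise ValueError("mean photon number must be non-negative")
    if x == 0.0:
        return 0.0
    return (x + 1.0) * math.log2(x + 1.0) - x * math.log2(x)

def reverse_coherent_information_loss(eta, nbar):
    """I(A<B) for a TMSV input through a pure-loss channel: S(R) - S(RB)."""
    return g(nbar) - g((1.0 - eta) * nbar)

def coherent_information_loss(eta, nbar):
    """I(A>B) for the same state: S(B) - S(RB)."""
    return g(eta * nbar) - g((1.0 - eta) * nbar)

if __name__ == "__main__":
    eta = 0.5
    print(f"limit for reverse coherent information: {-math.log2(1 - eta):.6f}")
    for nbar in (1, 10, 100, 1e4):
        print(f"nbar={nbar:>8}: I_RC={reverse_coherent_information_loss(eta, nbar):.6f}  "
              f"I_C={coherent_information_loss(eta, nbar):.6f}")
```

In this example the reverse coherent information increases monotonically with the energy and saturates at $-\log_2(1-\eta)$, which is the kind of continuous, monotonic and bounded behaviour invoked in the text.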

Supplementary Note 3. UPPER BOUND AT ANY DIMENSION
We provide alternate proofs of the weak converse theorem (Theorem 1 in the main paper). The first proof relies on an exponential growth for the total dimension of the private state [11][12][13] (which is justified by well-known arguments [14,15]). The second proof relies on an exponential growth for the total energy. Finally, the third proof does not have any of the previous assumptions; in particular, it only depends on the "key part" of the private state. The first and third proofs are first given for DV channels and then extended to CV channels by means of truncation arguments (see Supplementary Note 1 for full details). The second proof simultaneously applies to both DV and CV channels, by means of embedding arguments. Besides truncation and embedding, the other main ingredients are basic properties of the trace norm and the relative entropy of entanglement (REE) [16], the "asymptotic continuity" of the REE [17,18], and the REE upper bound for the distillable key of a quantum state [11,12].
First proof of the weak converse theorem
Let us start by assuming that the output state $\rho^n_{ab}$ in Alice and Bob's registers has total finite dimension $d_{ab}$. Given $\rho^n_{ab}$ and $\phi^n$ such that $\|\rho^n_{ab} - \phi^n\| \le \varepsilon \le 1/3$, we may write the Fannes-type inequality [17] where $f(\varepsilon) := 4\varepsilon - 2\varepsilon \log_2 \varepsilon$. This result is also known as asymptotic continuity of the REE. An alternate version states that ρ n where $H_2$ is the binary Shannon entropy. Note that the total dimension $d_{ab}$ of the output state may always be considered to be greater than or equal to the dimension $d_P$ of the private state. The latter involves two key systems (with total dimension $d_K^2$) and a shield system (with total dimension $d_S$), so that $d_P = d_K^2 d_S$. The logarithm of the dimension $d_K$ determines the key rate, while the extra dimension $d_S$ is needed to shield the key and can be assumed to grow exponentially in n (see the next subsection "Private states and size of the shield system" for full details on this secondary technical issue).
According to ref. [11], we may write where K(φ n ) is the distillable key of φ n . Therefore, from Eq. (42), we find For some sufficiently high α ≥ 2, let us set Then the previous inequality becomes Asymptotically in n, we therefore get For ε → 0, we derive whose optimization over adaptive protocols leads to the following weak converse bound for the key generation capacity When ρ n ab is a CV bosonic state, we may consider an LOCC truncation channel T ⊗ which maps the state into a DV stateρ n ab = T ⊗ (ρ n ab ) supported in a subspace with cut-off α, so that the effective dimension is 2 αnR ε n as in Eq. (45). This CV-to-DV mapping is large enough to leave the private state φ n invariant, i.e., φ n = T ⊗ (φ n ). Because ρ n ab − φ n ≤ ρ n ab − φ n ≤ ε, we can then repeat the previous derivation and write Eq. (49) forρ n ab . Then, we introduce the upper-bound E R (ρ n ab ) ≤ E R (ρ n ab ), which derives from the monotonicity of the REE under trace-preserving LOCCs (such as T ⊗ ). For clarity, this derivation can be broken down into the following steps where (1) we use the optimal separable stateσ opt s which is the closest toρ n ab in terms of relative entropy; (2) we introduce the non-optimal separable state σ s = T ⊗ (σ opt s ), where σ opt s is the separable state closest to ρ n ab (because T ⊗ is a LOCC, it preserves the separability of input states); and (3) we exploit the fact that the relative entropy cannot increase under trace-preserving LOCCs, which holds in arbitrary dimension [16,19]. Thus, we may write Eq. (49) where E R (ρ n ab ) is directly computed on the bosonic state ρ n ab .
Private states and size of the shield system Let us discuss here the secondary technical detail related with the size of the shield system which appears in the definition of a private state. Consider a finite dimensional system of dimension d K and basis {|i } dK−1 i=0 . A private state between Alice and Bob can be written in the form [11,12] where AB is the total key system in the maximally entangled state while A B is the shield system in a state χ A B protecting the key from eavesdropping. In Eq. (51), the unitary U is a controlled-unitary known as "twisting unitary" which takes the form [12] with U ij A B being arbitrary unitary operators. One can prove that a dilation of a private state into an environment E (owned by Eve) must take the form [12] with χ A B = Tr E (χ A B E ). Note that one can also equivalently write By making local measurements on the key system AB and tracing out the shield system A B , Alice and Bob retrieve the ideal classical-classical-quantum (ccq) state [12] with τ E arbitrary. More precisely, one shows [12] that τ i E = τ E for any i in Eq. (56). The shared randomness in the final classical AB system provides log 2 d K secret-key bits. Thus, the dimension d K of each key system defines the number of secret-key bits (i.e., the rate of the protocol), while the dimension d S of the shield system can in principle be arbitrary. The total dimension of the private state is d P = d 2 K d S . In a key distillation protocol, where Alice and Bob start from n shared copies ρ ⊗n AB and apply LOCCs to approximate a private state, the size of the shield d S grows with the number of classical bits exchanged in their CCs. In fact, Eve may store all these bits in her local register and a private state can be approximated by the parties only if the dimension of Eve's register is smaller than the dimension of the shield system. This is implied by Eq. (56) as explained in ref. [12, Section III]. 
Now we may ask: Is the shield size $d_S$ super-exponential in n? The answer is no for DV systems. This was originally proven in ref. [14] and also discussed in ref. [15]. This result also holds for key distribution through memoryless channels at any dimension (finite or infinite). Let us remark that, although the proof may appear involved, it is actually a trivial modification of the one in ref. [14, Appendix A (arXiv v3 version)]. It is based on the fact that one can always design an approximate protocol where key distribution through n uses of a finite- or infinite-dimensional channel is broken down into m identical and independent $n_0$-long sub-protocols. These sub-protocols provide m copies, which are truncated and measured, and whose shields may be discarded. The effective increase of the shield size will then come from one-way key distillation of these output copies, which has an exponential contribution in m < n. In the following, we report this adaptation (with all details) only for the sake of completeness.
Lemma 4 (Shield size. Trivially adapted from ref. [14]): Consider n uses of an adaptive key generation protocol through a quantum channel at any dimension (finite or infinite). Without loss of generality, we can assume that the effective dimension of the shield system $d_S$ grows in such a way that $\liminf_n (d_S/c^n)$ is a constant for some c ≥ 1.
Proof : Let us represent Alice's and Bob's local registers as a = AA and b = BB , where A and B are the local key systems, while A and B are the local shield systems. Denote Eve's register by E. Even if all these systems are infinite-dimensional for bosonic channels, the key systems A and B of the private state φ n 0 AA BB E have finitedimensional support, as we can see from Eq. (54). Then consider an arbitrary adaptive key generation protocol P n key , with key rate R and communication cost that is not necessarily linear in the number n of channel uses (this cost may even be singular, i.e., involving an infinite number of classical bits per channel use). For any ε > 0, there is a sufficiently large integer n 0 such that its output ρ n0 AA BB E satisfies where φ n 0 AA BB E is a (dilated) private state with l n 0 secret bits such that Assume that the restricted n 0 -adaptive protocol is repeated m times, so that the total number of channel uses can be written as n = n 0 m. Correspondingly, Alice and Bob's output state will be equal to the tensor product (ρ n0 AA BB ) ⊗m with ρ n 0 AA BB = Tr E (ρ n 0 AA BB E ). Now, assume that the parties measure their key systems in the computational basis |i A ⊗ |j B while truncating any outcome outside the finite-dimensional 2 ln 0 × 2 ln 0 support S key of the private state's key system. They then discard their shield systems. This means that they apply the LOCC channel where Π ij := Π A i ⊗ Π B j projects onto the local computational bases, while the conditional channel E ij is where E e is a map replacing any input with an erasure state |e orthogonal to the support S key . (Note that the contribution of the extra dimension of |e to the output state is completely negligible since 2 ln 0 is very large).
The action of L ⊗ on the (dilated) output state ρ n0 AA BB E is such that we achieve a truncated ccq stateρ n0 ABE := (L ⊗ ⊗ I E )(ρ n 0 AA BB E ), where the key systems A and B are classical and finite-dimensional. Similarly, the action on the (dilated) private state provides the ideal ccq target state τ n 0 ABE := (L ⊗ ⊗ I E )(φ n 0 AA BB E ) which corresponds to Eq. (56) with log 2 d K = l n0 . Also note that this classicalization and truncation step just needs two bits of CC to be implemented: These bits are needed to identify those instances where the measurement of the other party falls outside the support (this classical overhead is clearly linear in the number m of blocks).
The whole procedure is a trivial modification of the one in ref. [14, Appendix A (arXiv v3 version)]. Here we implement it in a coherent way and we include CV systems, for which $\mathcal{L}^{\otimes}$ measures the key systems within the finite-dimensional support of the private state, while collapsing any contribution from the remaining part of the infinite-dimensional Hilbert space. It is easy to check that $\mathcal{L}^{\otimes}$ can also be implemented in two subsequent steps: first a truncation channel into a $(2^{l_{n_0}} + 1) \times (2^{l_{n_0}} + 1)$ subspace and then a measurement channel in the computational bases.
Using the monotonicity of the trace distance under channels (at any dimension), from Eq. (57) we may write Let us consider the reduced stateρ n 0 AB = Tr E (ρ n 0 ABE ). Given many copies (ρ n 0 AB ) ⊗m , Alice and Bob may apply one-way key distillation at an achievable rate (secret bits per block) given by the Devetak-Winter (DW) rate [8] where I(:) is the quantum mutual information (equal to the classical mutual information on classical systems A and B), and S(|) is the conditional von Neumann entropy, with all quantities being computed on the extended output ρ :=ρ n0 ABE . Note that the DW rates are achievable rates at any dimension (finite or infinite).
Let us set τ := τ n 0 ABE . We then compute the difference where H 2 is the binary Shannon entropy. In Eq. (63), we have used the Alicki-Fannes' inequality for the conditional quantum entropy [20] which is valid for any ||ρ − τ || ≤ ε < 1 as in Eq. (61). Note that Eq. (63) only contains the dimension of Alice's Hilbert space H A (truncated in each sub-protocol), while the Hilbert spaces of Bob and Eve do not have any restriction of their dimensionality. Because R DW τ = l n0 and dim H A = 2 l n 0 , we may then write exactly as in ref. [14, Appendix A (arXiv v3 version)]. Therefore, by dividing the latter equation by n 0 , one gets the average rate (per channel use)R It is now important to note that Alice and Bob can achieve the average DW rateR using an amount of one-way CC which is linear in the block number m < n. In fact, the communication cost (bits per block) associated with the one-way key distillation of Alice and Bob's copies (ρ n0 AB ) ⊗m is equal to the conditional (Shannon) entropy S(A|B) between the two classical finite-dimensional systems A and B [8]. This overhead is bounded by log 2 dim H A,B = l n 0 classical bits per block, so that it scales at most linearly as ml n 0 . Therefore, by decreasing ε, we get a sequence of protocols whose classical communication scales linearly in m while their rates approach R according to Eq. (65). Correspondingly, the size of the shield grows at most exponentially in m.
Let us take a closer look at the dynamics of the shield. Within each block, the shield size may increase super-exponentially (even to infinity) but then this size collapses to zero at the end of each block (after $n_0$ uses) once the parties have generated their finite-dimensional cc-state $\tilde{\rho}^{n_0}_{AB}$. For this reason, there is no surviving contribution to the shield coming from the m sub-protocols. The only contribution to the shield size is the (exponential) one coming from the protocol of one-way key distillation on the output finite-dimensional copies. Thus, at values of $n = n_0 m$ for integer m, the dimension $d_S$ scales as an exponential function, i.e., it is bounded by $c^n$ for some c ≥ 1. If we look at the shield dynamics for every n, then we may always replace the limit by an inferior limit, i.e., we may always say that $\liminf_n (d_S/c^n)$ is a constant (with the infimum reached by the sequence of points $n = n_0 m$).

Second proof of the weak converse theorem
This second proof simultaneously applies to DV and CV systems, and relies on the physical assumption that the energy of the output state grows at most exponentially in the number of channel uses. Consider bosonic modes, since any DV system can be unitarily embedded into a CV system (an operation which does not change the trace distance). In general, we assume $m_a$ modes at Alice's side and $m_b$ modes at Bob's side (recall that the parties' local registers may be composed of a countable set of quantum systems). Assume that the output state $\rho^n_{ab}$ and the target state $\phi^n_{ab}$ have mean photon numbers bounded by $E_n$, where we may set $E_n \le 2^{cn}$ for some constant c. Let us apply an LOCC truncation channel $T^{\otimes}_d$, local with respect to Alice and Bob's bipartition of modes $m_a + m_b$, which truncates Alice's and Bob's local Hilbert spaces to finite dimension $d = E_n^4$ (other choices are possible). This means that the truncated states $\rho^{n,d}_{ab}$ : Because $\|\rho^n_{ab} - \phi^n_{ab}\| \le \varepsilon$, we can apply the triangle inequality and find Now the asymptotic continuity of the REE [18] leads to where we use the fact that the total dimension of the truncated states is $d_{ab} = d^2 = E_n^8$.
In Eq. (68) we may replace $E_R(\rho^{n,d}_{ab}) \le E_R(\rho^n_{ab})$ due to the fact that the REE is monotonic under $T^{\otimes}_d$ and invariant under embedding local unitaries. We may also replace $\log_2 E_n \le cn$ and $E_R(\phi^{n,d}_{ab}) \ge K(\phi^{n,d}_{ab}) = nR^\varepsilon_n(E_n)$, where the energy-constrained key rate must satisfy $\lim_n R^\varepsilon_n(E_n) = \lim_n R^\varepsilon_n$, with $R^\varepsilon_n$ being the (finite) key rate associated with $\phi^n_{ab}$. Therefore, we may write Dividing by n and taking the limit for n → +∞, we get Finally, by taking the limit of ε → 0, we find which gives the final result $K(\mathcal{E}) \le E_R(\mathcal{E})$ by optimizing over all adaptive protocols.
Third proof of the weak converse theorem Let us now give a final proof which is completely independent from the dimensionality of the shield system in the private state. We start from the DV case and then we prove the CV case by resorting to truncation arguments. After n uses of a DV quantum channel E, an adaptive key-generation protocol has an output ρ n ab = ρ ab (E ⊗n ) such that where φ n ab is a private state. Let us write the local registers as a = AA and b = BB , with AB being the key part (with dimension d K × d K ) and A B being the shield. By definition of private state, we have where U is a twisting unitary, χ A B is a state of the shield, and Φ n AB is a Bell state with log d K = nR ε n secret bits. Let us "untwist" the output state ρ n ab = ρ n ABA B and then take the partial trace over the shield system A B . This means to consider Trace norm is non-decreasing under partial trace and invariant under unitaries, so that Eq. (72) implies Following ref. [12], let us consider the set T of bipartite states σ AB which are defined by σ AB = W(σ ABA B ) where σ ABA B is an arbitrary separable state (with respect to Alice and Bob's bipartition AA and BB ). One may define the relative entropy distance from this set as Because the set T is compact, convex and contains the maximally mixed state [12], this distance is asymptotically continuous, i.e., the condition where d is the total dimension of the Hilbert space and H 2 is the binary Shannon entropy. By applying this property to Eq. (75) with d AB = d 2 K , we then get Now we exploit two observations. The first is that as shown in ref. [12,Lemma 7]. Then, we also have In fact, this is proven by the following chain of (in)equalities (81) where (1) we use some arbitrary state σ AB ∈ T , (2) we use Eq. 
(74), (3) we use the fact that the relative entropy is monotonic under partial trace and invariant under unitaries, and finally (4) we may always choose the separable state σ ABA B to be the one which is the closest to ρ n ABA B in relative entropy (so that it defines its REE). Using Eqs. (79) and (80) into Eq. (78), we find Because log 2 d K = nR ε n , this leads to so that, for large n, we may write By taking the limit of ε → 0, we then find Finally, by optimizing over all adaptive protocols L, we establish the weak converse bound for the two-way keygeneration capacity of the channel We now consider the CV case, i.e., a bosonic channel E. In this case, the private state φ n bits. This Bell state may equivalently be thought to be embedded into a CV system where it is supported within a d K × d K subspace of the infinite-dimensional Hilbert space. The shield state χ A B can be an arbitrary CV state and the twisting U is an arbitrary control-unitary as in Eq. (53) but where the target unitaries U ij A B are defined on a CV state. Let us apply an LOCC truncation channel T ⊗ d K to the key systems A and B, so that their Hilbert spaces are truncated to finite dimension is an identity channel acting on the shield systems. By using the monotonicity of the trace norm under channels, we may write As before, let us define a channel W which untwists and partial-traces the states as in Eq. (74). This channel provides the d K × d K statesρ n AB = W(ρ n,dK ABA B ) and Φ n AB = W(φ n ABA B ), for which we may write (using monotonicity) Consider now the set T of states defined by σ AB = W(σ ABA B ), where σ ABA B is an arbitrary separable state (with respect to the bipartition AA and BB ) where the key-part AB has dimension d K × d K while the shield-part A B is infinite-dimensional. The set T is compact, convex and contains the maximally mixed state. Thus, the relative entropy distance E T R (ρ) = inf σ∈T S(ρ||σ) is asymptotically continuous. 
This means that we may write Now we derive where (1) follows the derivation given in Eq. (81), and (2) comes from the monotonicity of the REE under T ⊗ dK ⊗I A B . By replacing Eqs. (79) and (90) into Eq (89), we find Eq. (82) where ρ n ab is a now a CV state. The remainder of the proof is the same as before. Let us consider n bosonic modes with quadrature operatorsx = (q 1 , . . . ,q n ,p 1 , . . . ,p n ) T and canonical commutation relations [21] [x, with I being the n × n identity matrix. An arbitrary multimode Gaussian state ρ(u, V ), with mean value u and covariance matrix (CM) V , can be written as [22] where the Gibbs matrix G is specified by The CM of a Gaussian state can be decomposed by using Williamson's theorem [10]. This provides the symplectic spectrum {ν 1 , . . . , ν n } which must satisfy the uncertainty principle ν k ≥ 1/2. Similarly, we may write ν k =n k + 1/2 wheren k are thermal numbers, i.e., mean number of photons in each mode. The von Neumann entropy of a Gaussian state can be easily computed as where The most typical Gaussian state of two modes A and B is a two-mode squeezed thermal state. This has zero-mean and CM of the form with arbitrary a, b ≥ 1/2 and c satisfying the condition These bona-fide conditions can be checked using the tools in Refs. [23,24] adapted to our different notation. For a CM as in Eq. (96), separability corresponds to Thus, at any fixed a and b, the maximally-correlated but still separable Gaussian state is given by imposing the boundary condition c = c sep . It is easy to check that this state contains the maximum correlations among the separable states, e.g., as quantified by its (unrestricted, generally non-Gaussian) quantum discord [25]. For c sep < c ≤ c max in Eq. (96), the Gaussian state is entangled. 
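The bona-fide and separability conditions above can be illustrated numerically. The following is a minimal sketch, assuming a two-mode CM in the normal form with diagonal blocks aI and bI and off-diagonal block cZ, Z = diag(1, -1), in the convention where the vacuum noise is 1/2. The closed forms used for s(·), c_sep and c_max, as well as all function names, are assumptions of this illustration and not a definitive implementation.

```python
import math

def s(nu):
    """Entropy (bits) of a mode with symplectic eigenvalue nu >= 1/2 (assumed form of s)."""
    if nu <= 0.5:
        return 0.0
    return (nu + 0.5) * math.log2(nu + 0.5) - (nu - 0.5) * math.log2(nu - 0.5)

def symplectic_spectrum(a, b, c):
    """Symplectic eigenvalues of the CM [[a*I, c*Z], [c*Z, b*I]]."""
    root = math.sqrt((a + b) ** 2 - 4 * c ** 2)
    return (root + abs(a - b)) / 2, (root - abs(a - b)) / 2

def pt_min_eigenvalue(a, b, c):
    """Smallest symplectic eigenvalue of the partially transposed CM (Z -> I off-diagonal)."""
    return ((a + b) - math.sqrt((a - b) ** 2 + 4 * c ** 2)) / 2

def c_sep(a, b):
    """Assumed separability boundary for the correlation parameter c."""
    return math.sqrt((a - 0.5) * (b - 0.5))

def c_max(a, b):
    """Assumed maximum value of c compatible with the uncertainty principle."""
    return math.sqrt(min((a - 0.5) * (b + 0.5), (a + 0.5) * (b - 0.5)))

def entropy(a, b, c):
    """Von Neumann entropy (bits) as the sum of s over the symplectic spectrum."""
    return sum(s(nu) for nu in symplectic_spectrum(a, b, c))

def is_separable(a, b, c, tol=1e-12):
    """PPT test: separable iff the partially transposed state is still physical."""
    return pt_min_eigenvalue(a, b, c) >= 0.5 - tol

# A TMSV state (a = b = mu, c = c_max) is pure and entangled;
# the boundary state c = c_sep is mixed and just separable.
mu = 2.0
print(entropy(mu, mu, c_max(mu, mu)), is_separable(mu, mu, c_max(mu, mu)))
print(entropy(mu, mu, c_sep(mu, mu)), is_separable(mu, mu, c_sep(mu, mu)))
```

In this sketch the partial transposition is implemented analytically, which is only valid for the normal form assumed above; a generic CM would require diagonalising the symplectic form numerically.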
A specific case is the TMSV state Φ µ with CM of the form As already discussed, for µ → ∞, this state describes the asymptotic CV EPR state Φ, realizing the ideal EPR A Gaussian channel is a CPTP map which transforms Gaussian states into Gaussian states. Single-mode Gaussian channels can be greatly simplified by means of input-output unitaries. In fact, these can always be put in canonical form [10] whose general action on input quadraturesx = (q,p) T is given bŷ where T and N are diagonal matrices, E is an environmental mode withn mean photons, and z is a classical Gaussian variable, with zero mean and CM ξI where ξ ≥ 0. All Gaussian channels are teleportation-covariant and, therefore, Choi-stretchable (with an asymptotic Choi matrix). Teleportation-covariance is given by the fact that any displacement of the inputx →x + d k is mapped into a displacement T d k on the output. Depending on the specific canonical form we have different expressions in Eq. (100). We have: • The thermal-loss channel E loss (η,n) with transmissivity 0 ≤ η ≤ 1 andn thermal photons. This is described bŷ Forn = 0, the channel E loss (η) := E loss (η, 0) is called pure-loss channel or just "lossy channel".
• The amplifier channel E amp (η,n) with gain η > 1 andn thermal photons (in the main text we use the letter g for the gain). This corresponds to the transformation • The additive-noise Gaussian channel E add (ξ), which simply corresponds tô • Finally, there are other secondary forms. One is the conjugate of the amplifier, which is described byx Then, other pathological forms [10]: The A 2 -form, which is a 'half' depolarising channel and corresponds tox → (q, 0) T +x E ; and the

Coherent and reverse coherent information of a Gaussian channel
Here we discuss the computation of the (reverse) coherent information for the most important single-mode Gaussian channels, i.e., the thermal-loss channel, the amplifier channel and the additive-noise Gaussian channel. Compactly, their action on input quadratures is given byx where η ≥ 0 is the transmission (or gain), E is the environmental mode in a thermal state withn mean photons, and z is a classical Gaussian variable with CM ξI ≥ 0. The Choi matrix ρ E of this Gaussian channel E = E(η,n, ξ) is defined as an asymptotic limit. At the input we consider a sequence of TMSV states Φ µ with CM as in Eq. (99). Then, at the output, we get a sequence of finite-energy Gaussian states Let us consider the symplectic eigenvalues of the output CM in Eq. (106), which are given by [10] Using the formula of the von Neumann entropy for Gaussian states and the definitions of the coherent information I C and reverse coherent information I RC , we may write where function s(·) is given in Eq. (95). It is easy to see that these quantities are continuous and increasing in µ, for any fixed values of η,n and ξ. For instance, for the lossy channel (0 ≤ η ≤ 1,n = ξ = 0), we simply have Thus, the limit for µ → +∞ in the expressions of Eq. (108) is regular and finite. The asymptotic values represent the coherent and reverse coherent information of the considered Gaussian channels, i.e., we have as already defined in Eqs. (39) and (40). Correspondingly, the hashing inequality can be safely extended to the limit, i.e., from we may write For the thermal-loss channel, the best lower bound is the reverse coherent information, given by [7] where h(·) is the entropic function defined in Eq. (95). In particular, for a lossy channel (n = 0), one has For the amplifier channel, the best lower bound is given by the coherent information, which is equal to [7,26] and becomes for the quantum-limited amplifier (n = 0). 
The coherent information and reverse coherent information of the additivenoise Gaussian channel coincide. We have [26] Due to the hashing inequality, the quantities I C (E) and I RC (E) are achievable rates for one-way entanglement distillation. Therefore, they also represent achievable rates for key generation, just because an ebit is a particular type of secret bit. In particular, ref. [7] proved that I RC (E) is an achievable lower bound for quantum key distribution (QKD) through a Gaussian channel without the need of preliminary entanglement distillation. In fact, I RC (E) can be computed as the asymptotic key rate of a coherent protocol where: (i) Alice prepares TMSV states Φ µ AA sending A to Bob; (ii) Bob heterodynes each output mode B and sends final CCs back to Alice; (iii) Alice measures all her modes A by means of an optimal coherent detection that reaches the Holevo bound.
The achievable rate of this coherent protocol is given by a Devetak-Winter rate $R_{\mathrm{DW}}$ [8]. Because Eve holds the entire purification of Alice and Bob's Gaussian output state $\rho^{\mu}_{\mathcal{E}}$ and Bob's detections are rank-1 measurements, this rate is equal to the reverse coherent information [7], $R_{\mathrm{DW}} = I(A\langle B)_{\rho^{\mu}_{\mathcal{E}}}$, computed on Alice and Bob's output. Then, by taking the limit of $\mu \to +\infty$, one obtains $K(\mathcal{E}) \ge I_{\mathrm{RC}}(\mathcal{E})$.

How to compute the entanglement flux of a Gaussian channel
Here we discuss how to compute the entanglement flux of a single-mode Gaussian channel (in canonical form). We provide the general recipe and then go into the details of the specific channels in the next subsections. The entanglement flux of a Gaussian channel $\mathcal{E}$ satisfies

$$\Phi(\mathcal{E}) \le \liminf_{\mu \to +\infty} S\left(\rho^{\mu}_{\mathcal{E}} \,\middle\|\, \tilde{\sigma}^{\mu}_{s}\right), \tag{118}$$

where $\rho^{\mu}_{\mathcal{E}}$ is a sequence of quasi-Choi matrices as defined in Eq. (105) with CMs as in Eq. (106), while $\tilde{\sigma}^{\mu}_{s}$ is a suitable sequence of separable Gaussian states.
For any µ, we choose a separable Gaussian stateσ µ s with CMṼ µ (η,n, ξ) as in Eq. (106) but with the replacement for the off-diagonal term. At fixed marginals µ and β, this is the most-correlated separable Gaussian state that we can build according to Eqs. (96) and (98); it has maximum (non-Gaussian) discord [25] and minimizes the relative entropy S(ρ µ E ||σ µ s ) as long as ρ µ E is an entangled state. In the specific case where the channel E is entanglement-breaking, then ρ µ E becomes separable and we can trivially pickσ µ s = ρ µ E , which gives S(ρ µ E ||σ µ s ) = 0. In general, we are left with the analytical calculation of the relative entropy S(ρ µ E ||σ µ s ) between two Gaussian states. This can be done in terms of their statistical moments according to our formula for the REE between two arbitrary multimode Gaussian states, which is given in the "Methods" section of our paper. For S(ρ µ E ||σ µ s ) we find regular expressions with a well-defined limit, so that we can put lim inf µ = lim µ in Eq. (118). We provide full algebraic details below for the various Gaussian channels.

Entanglement flux of a thermal-loss channel
Consider a thermal-loss channel E loss (η,n) with transmissivity 0 ≤ η ≤ 1 and thermal numbern, so that thermal noise has variance ω =n + 1/2. Forn ≥ η(1 − η) −1 , this channel is entanglement-breaking and we have Φ(η,n) = 0. Forn < η(1−η) −1 we compute the relative entropy S µ := S (ρ µ E ||σ µ s ) from the CMs V µ (η ≤ 1,n, 0) andṼ µ (η ≤ 1,n, 0) of the zero-mean Gaussian states ρ µ E andσ µ s . Using our formula for the relative entropy between Gaussian states, we get where S 1 is the contribution of the von Neumann entropy, while the other two terms come from the entropic functional Σ(V µ ,Ṽ µ , 0) (see Methods for its definition). Term ∆ is analytical but too cumbersome to be reported here. By expanding for large µ, we may write and Taking the limit S ∞ = lim inf µ S µ = lim µ S µ , we get As a result, by replacing in Eq. (118), we find that the entanglement flux of a thermal-loss channel E loss (η,n) satisfies The thermal bound in Eq. (124) is clearly tighter than previous bounds based on the squashed entanglement, such as the "Takeoka-Guha-Wilde" (TGW) thermal bound [27] and its improved version [28]. However, Φ loss does not generally coincide with the achievable lower-bound [7] given by the reverse coherent information of the channel [see Eq. (113)]. Thus, the generic two-way capacity of the thermal-loss channel satisfies the sandwich relation It is easy to check that, for a lossy channel (n = 0), the bounds Eq. (126) coincide, therefore establishing Relation with quantum discord The result of Eq. (127) sets the fundamental limit for secret-key generation, entanglement distribution and quantum communication in bosonic lossy channels. For high loss it provides the fundamental rate-loss scaling of 1.44η bits per channel use. This also coincides with the maximum discord that can be distributed to the parties in a single use of the channel. 
In fact, we may write the reverse coherent information of a (bosonic) channel E as [29] where D(B|A) is the quantum discord of the (asymptotic) Gaussian Choi matrix ρ E loss [25]. In particular, this discord can be computed as Gaussian discord [31,32].
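The sandwich relation for the thermal-loss channel, and its collapse onto a single value for the pure-loss channel, can be checked numerically. The sketch below assumes the closed forms Φ(η, n̄) = −log2[(1 − η)η^n̄] − h(n̄) for the upper bound and −log2(1 − η) − h(n̄) for the reverse coherent information, with h(x) = (x + 1)log2(x + 1) − x log2 x. These expressions and the helper names are assumptions of this illustration; the distance conversion uses the loss rate of 0.2 dB/km quoted for the plots.

```python
import math

def h(x):
    """Assumed entropic function h(x) = (x+1)log2(x+1) - x log2(x), with h(0) = 0."""
    if x <= 0:
        return 0.0
    return (x + 1) * math.log2(x + 1) - x * math.log2(x)

def flux_thermal_loss(eta, nbar=0.0):
    """Assumed upper bound; zero in the entanglement-breaking regime nbar >= eta/(1-eta)."""
    if not 0 <= eta < 1:
        raise ValueError("transmissivity must satisfy 0 <= eta < 1")
    if nbar >= eta / (1 - eta):
        return 0.0
    return -math.log2((1 - eta) * eta ** nbar) - h(nbar)

def rci_thermal_loss(eta, nbar=0.0):
    """Assumed lower bound (reverse coherent information), clipped at zero."""
    if not 0 <= eta < 1:
        raise ValueError("transmissivity must satisfy 0 <= eta < 1")
    return max(0.0, -math.log2(1 - eta) - h(nbar))

def transmissivity(distance_km, loss_db_per_km=0.2):
    """Fibre transmissivity at a given distance for a fixed loss rate in dB/km."""
    return 10 ** (-loss_db_per_km * distance_km / 10)

for d in (50, 100, 200):
    eta = transmissivity(d)
    # For nbar = 0 the two bounds coincide and approach 1.44 * eta at high loss.
    print(d, eta, rci_thermal_loss(eta), flux_thermal_loss(eta), eta / math.log(2))
```

For n̄ > 0 the two functions return different values, which is the gap left open by the sandwich relation.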

Full calculation details for the lossy channel
For the sake of completeness, we provide the specific details of the computation of the relative entropy S µ for the specific case of a lossy channel. After some algebra, we achieve where and We now insert the expression of ∆ in Eq. (129) and we take the limit for µ → +∞. This limit is defined (i.e., lim inf µ = lim µ ) and we get We can show this limit step-by-step. First note that, for large ν, we have Thus, in the limit of µ → +∞, the first two terms in the RHS of Eq. (129) become Then, it is easy to show that, for µ → +∞, we have In conclusion, by using Eqs. (136), (137) and (138) into Eq. (129), we obtain the final result in Eq. (134).

Entanglement flux of a quantum amplifier
Consider an amplifier channel E amp (η,n) with gain η > 1 and thermal numbern, so that thermal noise has variance ω =n + 1/2. Forn ≥ (η − 1) −1 this channel is entanglement breaking and therefore Φ(η,n) = 0. Forn < (η − 1) −1 we compute the relative entropy S µ := S (ρ µ E ||σ µ s ) from the CMs V µ (η > 1,n, 0) andṼ µ (η > 1,n, 0) of the zero-mean Gaussian states ρ µ E andσ µ s . Up to terms O(µ −1 ), we get For large µ we therefore obtain Thus we find In general, Φ amp (η,n) does not coincide with the best known lower bound which is given by the coherent information of the channel in Eq. (115). Thus, the two-way capacity of a quantum amplifier channel satisfies It is easy to check that, for the quantum-limited amplifier (n = 0), the previous upper and lower bounds coincide, thus determining its two-way capacity Thus, C amp (η) turns out to coincide with the unassisted quantum capacity Q amp (η) [26,33]. The result of Eq. (143) sets the fundamental limit for key generation, entanglement distribution and quantum communication with amplifiers. A trivial consequence is that infinite amplification is useless for communication since C amp (∞) → 0. For an amplifier with typical gain 2, the maximum achievable rate for quantum communication is just 1 qubit per use.
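The statements on the amplifier can be reproduced with a short numerical sketch. It assumes the closed forms log2[g^(n̄+1)/(g − 1)] − h(n̄) for the flux (valid below the entanglement-breaking threshold) and log2[g/(g − 1)] − h(n̄) for the coherent information, where g denotes the gain (written η in this note). Both formulas and all names are assumptions of this illustration.

```python
import math

def h(x):
    """Assumed entropic function h(x) = (x+1)log2(x+1) - x log2(x), with h(0) = 0."""
    if x <= 0:
        return 0.0
    return (x + 1) * math.log2(x + 1) - x * math.log2(x)

def flux_amplifier(gain, nbar=0.0):
    """Assumed upper bound; zero when nbar >= 1/(gain - 1) (entanglement breaking)."""
    if gain <= 1:
        raise ValueError("gain must be larger than 1")
    if nbar >= 1 / (gain - 1):
        return 0.0
    # Written with logarithms to avoid overflow of gain**(nbar + 1) at large gain.
    return (nbar + 1) * math.log2(gain) - math.log2(gain - 1) - h(nbar)

def ci_amplifier(gain, nbar=0.0):
    """Assumed lower bound (coherent information), clipped at zero."""
    if gain <= 1:
        raise ValueError("gain must be larger than 1")
    return max(0.0, math.log2(gain / (gain - 1)) - h(nbar))

# Quantum-limited amplifier (nbar = 0): bounds coincide, 1 qubit per use at gain 2,
# and the capacity vanishes for very large gain.
for g in (1.1, 2.0, 10.0, 1e6):
    print(g, ci_amplifier(g), flux_amplifier(g))
```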

Entanglement flux of an additive-noise Gaussian channel
Consider an additive-noise Gaussian channel E add (ξ) with noise variance ξ ≥ 0. For ξ ≥ 1 this channel is entanglement breaking and therefore we have Φ(ξ) = 0. For ξ < 1 we compute the relative entropy S µ := S (ρ µ E ||σ µ s ) from the CMs V µ (1, 0, ξ) andṼ µ (1, 0, ξ) of the zero-mean Gaussian states ρ µ E andσ µ s . Discarding terms O(µ −1 ), we get which leads to Thus we find The best lower bound is its coherent information I C (ξ) of Eq. (117), so that the two-way capacity satisfies It is interesting to note how quantum communication rapidly degrades when we compose quantum channels. For instance, a quantum-limited amplifier with gain 2 can transmit Q 2 = 1 qubit per use from Alice to Bob. This is the same amount which can be transmitted from Bob to Charlie, through a lossy channel with transmissivity 1/2. By using Bob as a quantum repeater, Alice can therefore transmit at least 1 qubit per use to Charlie. If we remove Bob and we compose the two channels, we obtain an additive-noise Gaussian channel with variance ξ = 1/2, which is limited to Q 2 0.278 qubits per use.
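The composition example can be verified with a few lines of code. The sketch below assumes the closed forms (ξ − 1)/ln 2 − log2 ξ for the flux and −1/ln 2 − log2 ξ for the coherent information of the additive-noise channel, and it assumes that a quantum-limited amplifier of gain g followed by a pure-loss channel of transmissivity 1/g adds classical noise of variance (1 − 1/g) in each quadrature (vacuum noise 1/2). Formulas and names are assumptions of this illustration.

```python
import math

def flux_additive_noise(xi):
    """Assumed upper bound; zero for xi >= 1 (entanglement breaking)."""
    if xi <= 0:
        return math.inf
    if xi >= 1:
        return 0.0
    return (xi - 1) / math.log(2) - math.log2(xi)

def ci_additive_noise(xi):
    """Assumed lower bound (coherent information), clipped at zero."""
    if xi <= 0:
        return math.inf
    return max(0.0, -1 / math.log(2) - math.log2(xi))

def composed_noise_variance(gain):
    """Noise added by a quantum-limited amplifier (gain g) followed by loss eta = 1/g."""
    eta = 1 / gain
    amplifier_noise = eta * (gain - 1) * 0.5   # amplifier vacuum noise, attenuated by the loss
    loss_noise = (1 - eta) * 0.5               # vacuum noise injected by the lossy channel
    return amplifier_noise + loss_noise

xi = composed_noise_variance(2.0)
print(xi, flux_additive_noise(xi))   # variance 1/2 and the corresponding upper bound
```

With Bob acting as a repeater each link supports 1 qubit per use, whereas the direct composition is capped by the value printed above.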

Secondary canonical forms
For the conjugate of the amplifier it is easy to check that this channel is always entanglement-breaking, so that it has zero flux and, therefore, zero two-way capacity C = 0. The A 2 -form [10], which is a 'half' depolarising channel, is also an entanglement-breaking channel, so that Φ = C = 0. Finally, for the "pathological" B 1 -form [10], we find the trivial bound Φ = +∞.

Supplementary Note 5. TECHNICAL DERIVATIONS FOR DISCRETE-VARIABLE CHANNELS
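Given a discrete-variable channel $\mathcal{E}$ in dimension $d$, we can easily derive its Choi matrix $\rho_{\mathcal{E}} = \mathcal{I} \otimes \mathcal{E}(\Phi)$ from the maximally-entangled state

$$|\Phi\rangle = \frac{1}{\sqrt{d}} \sum_{i=0}^{d-1} |i\rangle |i\rangle,$$

where $\{|0\rangle, \ldots, |i\rangle, \ldots, |d-1\rangle\}$ is the computational basis of the qudit. We write the spectral decomposition

$$\rho_{\mathcal{E}} = \sum_{k} p_k |\varphi_k\rangle\langle\varphi_k|,$$

where $\mathbf{p} = \{p_k\}$ are the eigenvalues of the Choi matrix. The von Neumann entropy is simply equal to the Shannon entropy of the previous eigenvalues, i.e.,

$$S(\rho_{\mathcal{E}}) = H(\mathbf{p}) := -\sum_{k} p_k \log_2 p_k.$$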
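From the Choi matrix we may compute the coherent and reverse coherent information of the channel. In particular, for unital channels, these quantities coincide and are given by the simple formula in Eq. (32), i.e.,

$$I_{\mathrm{C}}(\mathcal{E}) = I_{\mathrm{RC}}(\mathcal{E}) = \log_2 d - S(\rho_{\mathcal{E}}). \tag{151}$$

To compute the entanglement flux of the channel (upper bound), recall that we have

$$\Phi(\mathcal{E}) \le S(\rho_{\mathcal{E}} \,\|\, \tilde{\sigma}_s)$$

for some suitable separable state $\tilde{\sigma}_s$. Let us write its spectral decomposition

$$\tilde{\sigma}_s = \sum_{k} s_k |\lambda_k\rangle\langle\lambda_k|,$$

where $|\lambda_k\rangle$ ($s_k$) are the orthogonal eigenstates (eigenvalues) of $\tilde{\sigma}_s$. We may then write

$$S(\rho_{\mathcal{E}} \,\|\, \tilde{\sigma}_s) = -S(\rho_{\mathcal{E}}) - \sum_{k} \langle\lambda_k|\rho_{\mathcal{E}}|\lambda_k\rangle \log_2 s_k. \tag{154}$$

The separable state $\tilde{\sigma}_s$ may be constructed by applying the channel $\mathcal{I} \otimes \mathcal{E}$ to the input separable state

$$\sigma_s = \frac{1}{d} \sum_{i=0}^{d-1} |ii\rangle\langle ii|,$$

so that we have the output

$$\tilde{\sigma}_s = \mathcal{I} \otimes \mathcal{E}(\sigma_s) = \frac{1}{d} \sum_{i=0}^{d-1} |i\rangle\langle i| \otimes \mathcal{E}(|i\rangle\langle i|). \tag{156}$$

This specific choice will be optimal in some cases and suboptimal in others.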

Erasure channel in arbitrary finite dimension
Consider a qudit in arbitrary dimension d with computational basis {|i } (results can be easily specified to the case of a qubit d = 2). The erasure channel replaces an incoming qudit state ρ with an orthogonal erasure state |e with some probability p. In other words, we have the action The simplicity of this channel relies in the fact that the input states either are perfectly transmitted or they are lost (while in other quantum channels, the input states are all transmitted into generally-different outputs). This feature allows one to apply simple reasonings such as those in ref. [34] which determined the Q 2 of this channel (more precisely, the Q 2 of the qubit erasure channel, but the extension to arbitrary d is trivial). It is easy to see that this channel is teleportation-covariant (and therefore Choi-stretchable). In fact, any input unitary U applied to the state ρ is mapped into an output augmented unitary U ⊕ I, i.e., we may write Let us write the Kraus decomposition of this channel where A := √ 1 − pI (with I being the d × d identity) and A i := √ p|e i|. We then compute its Choi matrix Note that Tr[Φ(I ⊗ |e e|)] = 0, so that Eq. (160) is the spectral decomposition of ρ E over two orthogonal subspaces, where Φ has eigenvalue 1 − p, and I ⊗ |e e| is degenerate with d eigenvalues equal to p/d. Therefore, it is easy to compute the von Neumann entropy, which is To compute the entanglement flux of the channel, we consider the separable stateσ s in Eq. (156), which here becomes We have now all the elements to be used in Eq. (154), which provides For the lower bound, one can easily check that the coherent and reverse coherent information of this channel are not sufficient to reach the upper bound, since we get where the extra term H 2 (p) is the binary Shannon entropy. Note that these quantities are achievable rates for one-way entanglement distribution but not necessarily the optimal rates. 
Indeed, it is easy to find a strategy based on one-way backward CCs which reaches $(1-p)\log_2 d$. This follows the same reasoning as ref. [34]. Alice sends halves of EPR states to Bob over a large number $n$ of channel uses. A fraction $1-p$ of them will be perfectly distributed. These good cases can be identified by Bob, who performs the dichotomic POVM $\{|e\rangle\langle e|, I - |e\rangle\langle e|\}$ on each received system and communicates to Alice which instances were perfectly transmitted. At that point Alice and Bob possess $n(1-p)$ EPR states with $\log_2 d$ ebits each. On average this gives a rate of $(1-p)\log_2 d$ ebits per channel use. Thus, one may write $D_2(\mathcal{E}_{\text{erase}}) \ge (1-p)\log_2 d$, whose combination with Eq. (163) provides $\mathcal{C}(\mathcal{E}_{\text{erase}}) = (1-p)\log_2 d$. Since the two-way quantum capacity of the erasure channel is already known [34], our novel result regards the determination of its secret key capacity $K(\mathcal{E}_{\text{erase}}) = (1-p)\log_2 d$. It is clear that, for qubits, we have $K(\mathcal{E}_{\text{erase}}) = 1 - p$.
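The gap between the reverse coherent information and the rate of the backward-CC strategy can be made concrete with a small calculation. This is a minimal sketch, assuming the Choi-matrix spectrum described above (one eigenvalue 1 − p and d eigenvalues p/d) and assuming that the reverse coherent information is obtained as log2 d − S(ρ_E), since the input marginal is maximally mixed. Function names are hypothetical.

```python
import math

def shannon(probs):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

def choi_spectrum(p, d):
    """Eigenvalues of the erasure-channel Choi matrix: 1 - p once, p/d repeated d times."""
    return [1 - p] + [p / d] * d

def reverse_coherent_information(p, d):
    """Assumed form log2(d) - S(rho_E), with S computed from the Choi spectrum."""
    return math.log2(d) - shannon(choi_spectrum(p, d))

def backward_cc_rate(p, d):
    """Ebits per use when Bob flags the non-erased instances to Alice."""
    return (1 - p) * math.log2(d)

p, d = 0.25, 4
gap = backward_cc_rate(p, d) - reverse_coherent_information(p, d)
print(backward_cc_rate(p, d), reverse_coherent_information(p, d), gap)
print(shannon([p, 1 - p]))   # the gap equals the binary entropy H2(p)
```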

Qubit Pauli channels
Consider a Pauli channel P acting on a qubit state ρ. The Kraus representation of this channel is where p := {p k } is a probability distribution and P k ∈ {I, X, Y, Z} are Pauli operators, with I the identity and It is easy to check that a Pauli channel is teleportation-covariant and, therefore, Choi-stretchable. Teleportation covariance simply comes from the fact that the Pauli operators (qubit teleportation unitaries) either commute or anticommute with the other Pauli operators (Kraus operators of the channel). For a Pauli channel we can also write the stronger condition [ρ P , P * k ⊗ P k ] = 0 for any k, i.e., its Choi matrix is invariant under twirling operations restricted to the generators of the Pauli group. In fact, the Choi matrix of a Pauli channel is Bell-diagonal, i.e., it has spectral decomposition where the eigenvalues p k are the channel probabilities, and the eigenvectors Φ k are the four Bell states It is clear that S(ρ P ) = H(p). Then, using the separable stateσ s as in Eq. (156), we derive the following upper bound for the entanglement flux of this channel Since a Pauli channel is unital, its (reverse) coherent information is just given by I (R)C (P) = 1 − H(p). Therefore, the two-way capacity of a Pauli channel with arbitrary distribution p := {p k } must satisfy Latter result can be made stronger by exploiting the fact that ρ P is Bell-diagonal. For any such a state we can compute the REE by using the formula of ref. [35]. In fact, let us set p max := max{p k }, then we may write Thus, we have the tighter upper bound In the following subsections, we specialize this result to depolarising and dephasing channels.

Qubit depolarising channel
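This is a Pauli channel with probability distribution $\mathbf{p} = \{1 - 3p/4, p/4, p/4, p/4\}$, so that we have

$$\mathcal{P}_{\text{depol}}(\rho) = (1-p)\rho + p\,\frac{I}{2}.$$

Let us set

$$\kappa(p) := 1 - H_2\!\left(\frac{3p}{4}\right).$$

Then, from Eq. (176), we derive the following bounds for the two-way capacity of the depolarising channel

$$\kappa(p) - \frac{3p}{4}\log_2 3 \le \mathcal{C}(\mathcal{P}_{\text{depol}}) \le \kappa(p)$$

for $p \le 2/3$, while $\mathcal{C}(\mathcal{P}_{\text{depol}}) = 0$ otherwise.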

Qubit dephasing channel
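This is a Pauli channel with probability distribution $\mathbf{p} = \{1-p, 0, 0, p\}$, so that we have

$$\mathcal{P}_{\text{deph}}(\rho) = (1-p)\rho + p Z \rho Z.$$

It is easy to see that $H(\mathbf{p}) = H_2(p_{\max}) = H_2(p)$, so that Eq. (176) leads to

$$\mathcal{C}(\mathcal{P}_{\text{deph}}) = 1 - H_2(p),$$

which also coincides with the unassisted quantum capacity of this channel $Q(\mathcal{P}_{\text{deph}})$ [36].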
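The bounds for qubit Pauli channels lend themselves to a direct numerical check. The sketch below assumes the lower bound 1 − H(p) and the tighter upper bound 1 − H2(p_max), taken to be zero when p_max < 1/2 (the standard expression for the REE of a Bell-diagonal state). It also assumes that the depolarising channel with parameter p has Pauli distribution {1 − 3p/4, p/4, p/4, p/4}. All names are illustrative.

```python
import math

def shannon(probs):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

def h2(x):
    """Binary Shannon entropy."""
    return shannon([x, 1 - x])

def pauli_bounds(probs):
    """Return (lower, upper) bounds on the two-way capacity of a qubit Pauli channel."""
    if len(probs) != 4 or abs(sum(probs) - 1) > 1e-9:
        raise ValueError("expected a probability distribution over {I, X, Y, Z}")
    lower = max(0.0, 1 - shannon(probs))
    p_max = max(probs)
    upper = 1 - h2(p_max) if p_max >= 0.5 else 0.0
    return lower, upper

def depolarising(p):
    """Assumed Pauli distribution of the depolarising channel."""
    return [1 - 3 * p / 4, p / 4, p / 4, p / 4]

def dephasing(p):
    """Pauli distribution of the dephasing channel (phase flip Z with probability p)."""
    return [1 - p, 0.0, 0.0, p]

print(pauli_bounds(dephasing(0.1)))      # lower and upper bounds coincide
print(pauli_bounds(depolarising(0.2)))   # a gap remains between the bounds
print(pauli_bounds(depolarising(0.7)))   # beyond p = 2/3 both bounds are zero
```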

Pauli channels in arbitrary finite dimension
Let us now consider Pauli channels P d in arbitrary dimension d ≥ 2. These qudit channels are also called "Weyl channels" and they have Kraus representation where p ab is a probability distribution for a, b ∈ Z d := {0, 1, . . . , d − 1}. Here X and Z are generalized Pauli operators whose action on the computational basis {|j } is given by where ⊕ is the modulo d addition and These operators satisfy the generalized commutation relation Not only for d = 2 (qubits) but also at any d ≥ 2 a Pauli channel is teleportation-covariant. The channel's Choi matrix ρ P d is Bell-diagonal with eigenvalues {p ab }, so that we may write its von Neumann entropy in terms of the Shannon entropy as follows Note that the Choi matrix can also be written as Then, let us consider a separable stateσ s which is constructed as in Eq. (156). This state can be re-written as By applying Eq. (154), we find where p a := d−1 b=0 p ab . Since the d-dimensional Pauli channel is unital, we may also write I (R)C (P d ) = log 2 d − H({p ab }), so that we derive the following bounds for its two-way capacity which generalizes Eq. (174) to arbitrary dimension d. In the following two subsections, we consider the specific cases of the depolarising and dephasing channels in arbitrary finite dimension d.

Depolarising channel in arbitrary finite dimension
Consider a depolarising channel acting on a qudit with dimension d ≥ 2. This channel can be written as where A = √ 1 − pI and A ij = p/d|i j|. Its Choi matrix is the isotropic state satisfying the twirling condition for any qudit unitary U . The REE of an isotropic state can be evaluated exactly by using the formula of ref. [37]. Thus we can exactly compute the entanglement flux of the d-dimensional depolarising channel. Let us set Then, we may write the following expression Because the depolarising channel is unital, we may use Eq. (151) to compute its (reverse) coherent information. We specifically find Thus, the two-way capacity of this channel must satisfy the bounds for p ≤ d/(d + 1), while zero otherwise.

Dephasing channel in arbitrary finite dimension
Consider a generalized dephasing channel affecting a qudit in arbitrary dimension d ≥ 2. This channel has Kraus representation [38,39] where Z is the generalized Pauli (phase-flip) operator defined in Eq. (184), and P i is the probability of i phase flips. The channel's Choi matrix is By diagonalizing, we find d non-zero eigenvalues P := {P 0 , . . . , P d−1 }, so that the Von Neumann entropy is given by The separable stateσ s in Eq. (156) turns out to be diagonal in the computational basis and takes the form Thus, using Eq. (154), we find Since this channel is unital, from Eq. (151) we have that its (reverse) coherent information is I (R)C (P d-deph ) = log 2 d − H(P), so that lower and upper bounds coincide. This means that this channel is distillable and its two-way capacity is equal to Amplitude damping channel The amplitude damping channel describes the process of energy dissipation through spontaneous emission in a two-level system. Its application to an input qubit state is defined by the Kraus representation where and p is the probability of damping. This channel is not teleportation-covariant. In fact, because we have The amplitude damping channel can be decomposed as where E DV→CV is an identity mapping from the original qubit (e.g. a spin) to a single-rail qubit, which is the subspace of a bosonic mode spanned by the vacuum and the single photon states; then, E η(p) is a lossy channel with transmissivity η(p) := 1 − p; finally, E CV→DV is an identity mapping from the single-rail qubit to the original qubit. Note that the two mappings can be performed via perfect hybrid teleportation and the middle lossy channel preserves the 2-dimensional effective Hilbert space of the system. Thanks to this decomposition, we can include E DV→CV in Alice's LOs and E CV→DV into Bob's LOs. The middle lossy channel E η(p) can therefore be stretched into its asymptotic Choi matrix ρ E η(p) . 
Overall, this means that the amplitude damping channel can be stretched into the asymptotic resource state σ = ρ E η(p) by means of an asymptotic simulation. By applying teleportation stretching, we therefore reduce the output of an adaptive protocol to the form where bothΛ and ρ E η(p) are intended as asymptotic limits. Thus, our reduction method provides the upper bound We can combine the latter result with the fact that we cannot exceed the logarithm of the dimension of the input Hilbert space (see this simple "dimensionality bound" in the main text, in the discussion just before Proposition 5). This leads to The best lower bound is given by optimizing the reverse coherent information over the input states ρ u = diag(1−u, u) for 0 ≤ u ≤ 1. In fact, we have [6] I RC (p) := max This is an achievable lower bound for entanglement distribution assisted by a final round of backward CCs. Note that this is strictly higher than the Q 1 = Q of the channel, which is given by [6] Thus, in total, we may write which is shown in Supplementary Fig. 1a. See the next section for the derivation of a tighter upper bound which is based on the squashed entanglement.
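The ordering of these quantities can be checked numerically. This is a minimal sketch, assuming the upper bound min{1, −log2 p} obtained from the lossy channel with transmissivity 1 − p, and assuming that for the diagonal inputs ρ_u the reverse coherent information reads H2(u) − H2(up) while the coherent information reads H2(u(1 − p)) − H2(up). The maximisation is a plain grid search, and all names are hypothetical.

```python
import math

def h2(x):
    """Binary Shannon entropy, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def _maximise(f, steps=2000):
    """Grid search of f over u in [0, 1]."""
    return max(f(k / steps) for k in range(steps + 1))

def reverse_coherent_information(p):
    """Assumed lower bound: max over u of H2(u) - H2(u p)."""
    return _maximise(lambda u: h2(u) - h2(u * p))

def one_way_quantum_capacity(p):
    """Assumed Q1 = Q: max over u of H2(u (1 - p)) - H2(u p)."""
    return _maximise(lambda u: h2(u * (1 - p)) - h2(u * p))

def stretching_upper_bound(p):
    """Assumed upper bound min{1, -log2 p} (dimensionality bound for p <= 1/2)."""
    return 1.0 if p <= 0.5 else -math.log2(p)

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, one_way_quantum_capacity(p),
          reverse_coherent_information(p), stretching_upper_bound(p))
```

The grid resolution only affects the accuracy of the two lower bounds; the upper bound is evaluated in closed form.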
Amplitude damping channel: Upper bound based on the squashed entanglement An alternative upper bound for the two-way capacity of a quantum channel is its squashed entanglement, i.e., we may write [40] C(E) ≤ E sq (E). (215) The squashed entanglement of an arbitrary channel E, from system A to system B, is defined as [40] E sq (E) : where ρ A is an arbitrary input state, and ω is the global output state with U E A→BC being an isometric extension of E and V C→EF being an arbitrary "squashing isometry". In Eq. (216), the terms in the brackets are conditional von Neumann entropies computed over ω BEF , i.e., Then note that the most general input state reads where γ ∈ [0, 1] is the population of the excited state |1 , while the off-diagonal term |c| ≤ (1 − γ)γ accounts for coherence. Thus, the maximization in Eq. (216) is mapped into a maximization over parameters γ and c. Let us compute the squashed entanglement of the amplitude damping channel E damp . Recall that its action is described by Eq. (205) with Kraus operators as in Eq. (206). In the computational basis {|00 , |01 , |10 , |11 }, the unitary dilation of E damp is therefore given by the following matrix so that we may write where C is an environmental qubit prepared in the fundamental state |0 . It is clear that Eq. (221) expresses the isometric extension of the channel, i.e., it corresponds to E damp (ρ A ) = Tr C [U damp A→BC (ρ A )]. As a squashing channel we consider another amplitude damping channel but with damping probability equal to 1/2, so that its unitary dilation is V = U 1/2 . In other words, we consider the squashing isometry V C→EF = [U damp C→EF ] p=1/2 (so that we are more precisely deriving an upper bound of the squashed entanglement of the channel). Let us derive the global output state ω BEF step-by-step.
The state of systems B and C at the output of the dilation U p is given by Now the system C is sent through the squashing amplitude damping channel with probability 1/2. At the output of the dilation U 1/2 we have the final output state We now proceed with the calculation of the entropies in Eq. (218), which are obtained from the eigenvalues of the reduced states ρ BE , ρ BF , ρ E and ρ F . We obtain with eigenvalues The eigenvalues of ρ BE and ρ BF are too complicated to be reported here but it is easy to check that, exactly as for λ 1,2 in previous Eq. (225), their dependence on c is just through the modulus |c|, so that we can choose c to be real without losing generality. Because c is real, we also have that the entropic functional (ρ) = S(B|E) ω + S(B|F ) ω computed over the input state ρ is exactly the same as that computed over the state ZρZ, with Z being the phase-flip Pauli operator. Using the latter observation, together with the concavity of the conditional quantum entropy, one simply has whereρ is diagonal. This means that we may reduce the maximization to diagonal input states (c = 0). As a result, we may just consider with eigenvalues and with eigenvalues From the previous eigenvalues, we compute the conditional quantum entropies in Eq. (218). Thus, we find that the squashed entanglement of the amplitude damping channel must satisfy the bound where H 2 is the binary Shannon entropy of Eq. (37). In particular, the function H 2 (ν 1 ) − H 2 (λ 1 ) is concave and symmetric in γ, so that the maximum is reached for γ = 1/2, which corresponds to a maximally mixed state at the input. This reduces Eq. 
(231) to the simple bound If we choose a squashing amplitude damping channel with generic probability of damping η and we repeat the calculation from the beginning we obtain the following bound for the squashed entanglement The minimum of the function inside the curly bracket is for η = 1/2, so our choice of a balanced amplitude damping channel as a squashing channel is now justified. Note that the sub-optimal choice η = 0 corresponds to use the identity as squashing channel; correspondingly, the right hand side of Eq. (233) becomes half of the entanglement-assisted classical capacity C A of the amplitude damping channel, i.e., In conclusion, combining the lower bound of Eq. (212) and the upper bound of Eq. (232), we find that the two-way capacity of the amplitude damping channel is within the sandwich This is shown in Supplementary Fig. 1b We consider the state of the art in high-rate QKD, by analyzing the maximum rates which are achievable by current practical protocols in CVs and DVs. We assume the optimal asymptotic case of infinitely long keys, so that finite-size effects are negligible. We also assume ideal parameters. For CVs this means: Unit detector efficiency, zero excess noise, large modulation and unit reconciliation efficiency. For DVs this means: Unit detector efficiencies, zero dark count rates, zero intrinsic error, unit error correction efficiency, and no other internal loss in the devices. Note that all the following results are already present in the literature or are easily derivable from those in the literature. They are given to the reader for the sake of completeness.

Continuous-variable protocols
• No-switching protocol [41]. This is the practical CV protocol with the highest secret key rate. It is based on coherent states and heterodyne detection. In reverse reconciliation (RR), its maximum secret key rate over a lossy channel with transmissivity η is equal to where s(·) is the entropic function given in Eq. (95). For high loss (η ≃ 0), it scales as η/(2 ln 2), which is 1/2 of the secret key capacity.
• Switching protocol [42,43]. This was the first practical CV protocol. It is based on coherent states and homodyne detection (with switching between the two quadratures). In RR, it reaches the rate $R_{\text{switch}} = -\frac{1}{2}\log_2(1-\eta)$, which is 1/2 of the secret key capacity. For high loss, it clearly scales as the previous protocol.
• CV measurement-device-independent (MDI) protocol [44,45]. This is based on coherent states sent to an untrusted relay implementing a CV Bell detection. The Alice-relay channel has transmissivity η_A and the Bob-relay channel has transmissivity η_B, so that the total Alice-Bob channel transmissivity is η = η_A η_B. In the symmetric configuration with the relay perfectly in the middle (η_A = η_B) [44,46], it has maximum rate In the asymmetric configuration (η_A ≠ η_B), it has maximum rate In particular, in the most asymmetric configuration, where the relay coincides with Alice (η_A = 1) [44,47], we recover the one-way rate of Eq. (236).
• CV two-way protocols [48]. In the first main variant, Bob sends coherent states to Alice, who randomly displaces their amplitudes before sending them back to Bob for heterodyne detection. In RR (Bob as encoder), this protocol has the maximum rate of Eq. (240). In the second main variant, the protocol runs as before except that Bob's measurement is homodyne detection (with switching between the quadratures). In RR, it has the maximum rate $R_{\text{2way-hom}}$ of Eq. (241) [49]. It is easy to check that both variants scale as $\eta/(4\ln 2)$ for high loss. Although two-way protocols have lower key rates than one-way protocols in a lossy channel, they are more robust when excess noise is present. In this case, one considers the "security threshold" of the protocol, defined as the maximum tolerable excess noise, above which the rate becomes negative. Two-way protocols have higher security thresholds than one-way protocols [48,49].

Discrete-variable protocols
Here we consider various DV protocols. As said before, we assume the optimal asymptotic case of infinitely long keys and also ideal parameters, which here means: Unit detector efficiencies, zero dark count rates, zero intrinsic error, unit error correction efficiency, and no other internal loss in the devices. Under these assumptions, we consider the ideal BB84 protocol with single photon sources [50], the BB84 with weak coherent pulses and decoy states [51,52], and DV-MDI-QKD [53,54].
Let us consider the BB84 protocol [50] assuming that Alice's source generates perfect single-photon pulses. The general formula of the key rate can be found in ref. [51]. It reduces to the expression of Eq. (242), where $H_2$ is the binary Shannon entropy. In Eq. (242), $\delta(Q) = f H_2(Q)$ is a function accounting for the leak of information from imperfect error correction, $f \ge 1$ is the efficiency of the classical error correction codes, Q is the total error rate (QBER), and $\bar{R}$ is the total detection rate after quantum communication (the raw key). Under ideal conditions of zero dark-count rates, unit efficiency detectors, perfect visibility, and perfect classical error correction (f = 1), one has Q = 0 and obtains the maximum rate $R_{\text{BB84-1ph}} = \eta/2$, which sets the maximum rate for the current DV protocols.

A realistic photon source is a device emitting attenuated coherent pulses. In this case, the performance of the protocol depends on an additional parameter, which is the intensity of the source. In the BB84 protocol with weak coherent pulses and decoy states [52], Alice randomly changes the intensity µ of the pulses, and publicly reveals their values during the final classical communication. In this way Eve cannot adapt her attacks during the quantum communication. The µ-dependent key rate of the protocol is given by Eq. (243) [51], where $Q_\mu$ is the µ-dependent QBER, and $Y^{\mu}_{n} = R^{\mu}_{n}/\bar{R}$ is the ratio between the µ-dependent detection rate $R^{\mu}_{n}$, associated with Alice sending n photons, and the total detection rate $\bar{R}$. Assuming ideal conditions, one finds $R_\mu = e^{-\mu}\eta\mu/2$. The optimal key rate is obtained by maximizing over the intensities, i.e., $R = \max_\mu R_\mu$. It is easy to check that the optimum is given by µ = 1 and the maximum key rate becomes $R_{\text{BB84-decoy}} = \eta/(2e)$.
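The optimization over the intensity can be checked numerically. The following is a minimal sketch, not part of the original analysis: it takes the ideal rate R_µ = e^{−µ}ηµ/2 quoted above and performs a simple grid search. The function names, the grid range and the value η = 0.01 are illustrative assumptions.

```python
import math

def decoy_rate(mu: float, eta: float) -> float:
    """Ideal decoy-state BB84 rate R_mu = exp(-mu) * eta * mu / 2 (bits per use)."""
    return math.exp(-mu) * eta * mu / 2.0

def optimal_intensity(eta: float, mu_max: float = 5.0, steps: int = 5000) -> float:
    """Grid search for the intensity mu in (0, mu_max] that maximizes R_mu."""
    grid = (mu_max * k / steps for k in range(1, steps + 1))
    return max(grid, key=lambda mu: decoy_rate(mu, eta))

eta = 0.01  # illustrative transmissivity (20 dB of loss)
mu_opt = optimal_intensity(eta)
print(f"optimal mu ~ {mu_opt:.3f}")
print(f"max rate   = {decoy_rate(mu_opt, eta):.6e}")
print(f"eta/(2e)   = {eta / (2 * math.e):.6e}")
```

Since η enters only as a multiplicative factor, the optimal intensity does not depend on the loss, which is why a single value µ = 1 can be quoted for all distances.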
Finally, consider DV-MDI-QKD. The general expression of the key rate is given by Eq. (244) [54], where $P_{11}$ is the joint probability that both emitters (with intensities $\mu_A$ and $\mu_B$) generate a single-photon pulse. The quantity $Y^{11}_{Z}$ gives the gain in the Z-basis (one assumes $Y^{11}_{X} = Y^{11}_{Z}$ for the X-basis), and $e^{11}_{Z}$ is the error rate in the Z-basis. Finally, the quantity $G_Z$ describes the gain and $Q_Z$ the QBER, both in the Z-basis. Under ideal conditions, the µ-dependent key rate takes the form of Eq. (245), where $\eta_A$ and $\eta_B$ are the transmissivities of Alice's and Bob's channels. It is easy to check that the maximum is taken for $\mu_A = \mu_B = 1$, providing $R_{\text{DV-MDI}} = \eta/(2e^2)$.
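To compare the ideal rates quoted in this note, one may tabulate them against distance. The sketch below is only illustrative: it uses the closed-form expressions stated in the text (the capacity K(η) = −log₂(1 − η), the switching protocol at half the capacity, η/2, η/(2e) and η/(2e²)) and the standard loss rate of 0.2 dB/km. The function names and the chosen distances are assumptions made for this example.

```python
import math

LOSS_DB_PER_KM = 0.2  # standard fibre loss rate assumed for the plots

def transmissivity(distance_km: float) -> float:
    """Channel transmissivity eta for a fibre of the given length."""
    return 10 ** (-LOSS_DB_PER_KM * distance_km / 10)

def secret_key_capacity(eta: float) -> float:
    """Repeaterless bound K(eta) = -log2(1 - eta), stable for small eta."""
    return -math.log1p(-eta) / math.log(2)

def ideal_rates(eta: float) -> dict:
    """Ideal asymptotic rates (bits per channel use) quoted in the text."""
    return {
        "capacity": secret_key_capacity(eta),
        "CV switching": 0.5 * secret_key_capacity(eta),
        "BB84 single-photon": eta / 2,
        "BB84 decoy": eta / (2 * math.e),
        "DV-MDI": eta / (2 * math.e ** 2),
    }

for d_km in (50, 100, 200):
    eta = transmissivity(d_km)
    row = ", ".join(f"{name}: {rate:.3e}" for name, rate in ideal_rates(eta).items())
    print(f"{d_km:>4} km (eta = {eta:.1e}) -> {row}")
```

At high loss every rate is linear in η, so the protocols differ only by constant prefactors relative to the capacity slope 1/ln 2 ≈ 1.44.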

Input energy constraints
It is important to remark that the two-way capacities that we computed for Gaussian channels are bounded quantities, which do not diverge even if the maximum is achieved in the limit of infinite input energy (excluding the case of a pathological canonical form). In fact, one may consider an alphabet of input states whose mean number of photons is capped at some finite value $\bar{N}$. This assumption automatically defines a hard-constrained two-way capacity $\mathcal{C}(\mathcal{E},\bar{N})$. For a bosonic Gaussian channel, $\mathcal{C}(\mathcal{E},\bar{N})$ is increasing in $\bar{N}$ but also upper-bounded by the entanglement flux of the channel $\Phi(\mathcal{E})$. (In fact, note that the whole procedure of teleportation stretching still applies if we enforce an input energy constraint for the adaptive protocols. For instance, the constraint can be realized by a pinching map which is then absorbed in Alice's LOs.) As a result, the asymptotic limit of the unconstrained capacity $\mathcal{C}(\mathcal{E}) := \lim_{\bar{N}} \mathcal{C}(\mathcal{E},\bar{N})$ is finite. This is clearly true for $Q_2(\mathcal{E})$, $D_2(\mathcal{E})$ and $K(\mathcal{E})$, but the situation would be different for the two-way classical capacity of the channel.
Another possibility is imposing a "soft constraint" on the input energy. This means fixing the average number of photons at the input to some finite value $\bar{m}$. In this case, it is interesting to see that our "unconstrained" upper bounds remain sufficiently tight even in the presence of such an energy constraint. The best way to show this is to consider our main result for the lossy channel with arbitrary transmissivity η, for which we have proven that $\mathcal{C}(\eta) = \Phi(\eta) = -\log_2(1-\eta)$. Even if we constrain the input to $\bar{m}$ mean photons, it is easy to show that: (1) the unconstrained bound Φ(η) is still very tight, since it is rapidly approached from below by the reverse coherent information computed at finite energy; (2) the unconstrained bound Φ(η) remains tighter than other constrained bounds based on the squashed entanglement, even when $\bar{m}$ is of the order of a few photons.
Let us start with point (1). From Eq. (109), we see that the reverse coherent information $I_{RC}(\bar{m},\eta)$ associated with a lossy channel and a TMSV state is given by Eq. (247), which is obtained by setting $\mu = \bar{m} + 1/2$ in Eq. (109) and using the h-function of Eq. (95). In Supplementary Fig. 2, we see that $I_{RC}(\bar{m},\eta)$ rapidly approaches the unconstrained upper bound Φ(η) already for $\bar{m} \simeq 1$-$5$ photons.

Let us now discuss point (2). We compare the unconstrained upper bound Φ(η) with the unconstrained TGW upper bound for the lossy channel [27]
$$K_{\text{TGW}}(\eta) = \log_2\frac{1+\eta}{1-\eta} \tag{248}$$
and its energy-constrained version $K_{\text{TGW}}(\eta,\bar{m})$ of Eq. (249). (Note that the latter was just a partial result [27] used to derive the bound in Eq. (248) for $\bar{m} \to +\infty$.) In Supplementary Fig. 3 we clearly see that Φ(η) not only is tighter than $K_{\text{TGW}}(\eta)$ but also outperforms the constrained version $K_{\text{TGW}}(\eta,\bar{m})$ for all input energies down to one mean photon. This is certainly true in the regime of intermediate-long distances (> 25 km), where DV-QKD protocols have ideal performances at one mean photon per channel use. At short distances (< 25 km), energy constraints do not have much practical value, since we can efficiently use highly-modulated CV-QKD, whose number of photons is high enough to approach the asymptotic infinite-energy behavior. In general, note that CV-QKD protocols with highly-modulated Gaussian states can be used at any distance. Their performance is not limited by the input energy, but critically depends on the efficiency of the output detection scheme and the quality of the data-processing (reconciliation efficiency).
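The gap between the two unconstrained bounds can be made concrete with a few numbers. The sketch below is illustrative only: it assumes Φ(η) = −log₂(1 − η) for the lossy channel and the unconstrained TGW bound of ref. [27] in the form log₂[(1 + η)/(1 − η)], and it does not attempt to reproduce the energy-constrained version. The function names and the sample distances are assumptions of this example.

```python
import math

def entanglement_flux(eta: float) -> float:
    """Phi(eta) = -log2(1 - eta) for the lossy channel."""
    return -math.log1p(-eta) / math.log(2)

def tgw_bound(eta: float) -> float:
    """Unconstrained TGW bound, assumed here as log2[(1 + eta) / (1 - eta)]."""
    return (math.log1p(eta) - math.log1p(-eta)) / math.log(2)

for d_km in (10, 50, 100, 200):
    eta = 10 ** (-0.02 * d_km)  # 0.2 dB/km
    phi, tgw = entanglement_flux(eta), tgw_bound(eta)
    print(f"{d_km:>4} km: Phi = {phi:.3e}, TGW = {tgw:.3e}, ratio = {tgw / phi:.3f}")
```

Under these assumptions the difference between the two bounds is exactly log₂(1 + η), so the ratio approaches 2 at long distances, where both bounds become linear in η.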

Cost of classical communication
It is important to discuss the cost associated with the CCs. In fact, in order to achieve its performance, an optimal protocol will need a certain number of classical bits per channel use. Furthermore, the physical transmission of these bits is ultimately restricted by the speed of light. It is therefore essential to consider these aspects in order to translate a capacity, which is expressed in terms of target-bits (e.g. secret bits) per channel use, into a practical throughput, which is expressed in terms of target-bits per second. Consider the case of a bosonic lossy channel which is the most important for quantum optical communications.
By definition, an adaptive protocol is assisted by unlimited and two-way CCs. This is a very general formulation but it has an issue for practical applications: an adaptive protocol, which may be optimal in terms of target-bits per channel use, may have zero throughput in terms of target-bits per second, just due to the fact that its implementation may require infinite rounds of feed-forward and feedback CCs in each channel use. The existence of such a protocol is not excluded by the TGW bounds [27] of Eqs. (248) and (249), which are non-tight and do not have control on the CCs. By contrast, this problem is completely solved by our bound.
In fact, for any distillable channel E (e.g., bosonic lossy channel, quantum-limited amplifier, dephasing or erasure channel), the generic two-way capacity $\mathcal{C}(\mathcal{E})$ is equal to $D_1(\rho_{\mathcal{E}})$, which is the entanglement distillable from the Choi matrix of the channel by means of one-way CCs (forward, from Alice to Bob, or backward, from Bob to Alice). This means that an optimal protocol achieving the capacity is non-adaptive and it does not involve infinite rounds of CCs, but just a single round of forward or backward CCs.
For the specific case of a bosonic lossy channel, with transmissivity η, we find that an optimal key-generation protocol, achieving the repeaterless bound $K(\eta) = -\log_2(1-\eta)$, can be implemented by using backward CCs. In fact, as already discussed in Supplementary Note 4, an optimal key-generation protocol is the following: Alice prepares TMSV states $\Phi^{\mu}_{AA'}$, sending $A'$ to Bob; Bob heterodynes each output mode, with outcome Y, and sends final CCs back to Alice; Alice measures all her modes A by means of an optimal coherent detection. Taking the limit for large µ, the key rate of the parties achieves the bound K(η).
Because this is a Devetak-Winter rate (in reverse reconciliation), the amount of CCs required by the protocol (bits per channel use) is equal to the conditional entropy of Eq. (250) [8], where S(Y) = H(Y) is the Shannon entropy of Bob's outcomes Y, while S(A) and S(A|Y) are the von Neumann entropies of Alice's reduced state $\rho_A$ and conditional state $\rho_{A|Y}$. These quantities are all easily computable for any finite value of µ. By taking the limit for large µ, we derive the asymptotic cost
$$\gamma_{CC}(\eta) = \frac{2\eta\log_2\pi + (2\eta-3)\log_2(3-2\eta) + 3\log_2 3}{2\eta} \le \log_2(3\pi e) \approx 4.68~\text{classical bits/use}, \tag{251}$$
where the latter bound is achieved for low transmissivities (long distances), i.e., $\gamma_{CC}(\eta \simeq 0) \simeq \log_2(3\pi e)$. According to Eq. (251), at any transmissivity η, Bob needs to send Alice no more than $\log_2(3\pi e)$ classical bits per channel use.

Consider a practical scenario where the rounds of the protocol are not infinite but still a very large number, e.g., $n = 10^9$, so that the performance of such a large block of data is close to the asymptotic one. The amount of classical bits to be transmitted is linear in n, and the total cost is no larger than $4.68 \times 10^9$ bits, i.e., less than 1 gigabyte per block. Assuming the existence of a broadband classical channel between Alice and Bob, the extra time associated with the transmission of this classical overhead can be made negligible (for instance, it may happen at the beginning of the second large block of quantum communication). Assuming that the procedures of error correction and privacy amplification are also sufficiently fast within the block, the final achievable throughput (secret-bits per second) will only depend on the capacity K(η) (secret-bits per use) multiplied by the clock of the system (uses per second). Clearly, this is a simplified reasoning which does not consider other technical issues.
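The classical overhead can be evaluated directly. The following sketch is hedged: it reads the expression for γ_CC(η) as a single fraction with denominator 2η, which is an assumption about the typeset formula (it is the reading for which the low-transmissivity limit equals log₂(3πe)). The function name, the sample transmissivities and the byte conversion are illustrative.

```python
import math

def cc_cost(eta: float) -> float:
    """Asymptotic classical-communication cost gamma_CC(eta), in bits per channel use.

    Assumes the fraction reading of the formula, with denominator 2 * eta.
    """
    numerator = (2 * eta * math.log2(math.pi)
                 + (2 * eta - 3) * math.log2(3 - 2 * eta)
                 + 3 * math.log2(3))
    return numerator / (2 * eta)

CC_BOUND = math.log2(3 * math.pi * math.e)  # about 4.68 bits per use

for eta in (1e-4, 1e-2, 0.5, 1.0):
    print(f"eta = {eta:<6}: gamma_CC = {cc_cost(eta):.4f} (bound {CC_BOUND:.4f})")

n_uses = 10 ** 9                      # block size used in the text
total_bytes = CC_BOUND * n_uses / 8   # worst-case classical overhead per block
print(f"overhead per block <= {total_bytes / 1e6:.0f} MB")
```

With this reading the cost decreases monotonically from about 4.68 bits/use at long distances to about 4.03 bits/use at η = 1, and a block of 10⁹ uses requires roughly 585 MB of backward classical communication at most.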

Supplementary Note 8. ADVANCES IN CHANNEL SIMULATION
The idea of channel simulation was originally introduced by Bennett-DiVincenzo-Smolin-Wootters (BDSW) [55] as a simple modification of the original teleportation protocol. Instead of performing standard teleportation by using a Bell state, one may consider an arbitrary mixed state as a resource. As a result, the effect of teleportation is not an identity map (transfer operator) but a noisy channel from the input to the output. BDSW introduced this teleportation-simulation argument to simulate DV channels that preserve the finite dimension d of the input Hilbert space H d , also known as the "tight" case [56]. Let us discuss the BDSW simulation in more detail.
Consider a mixed state σ of two qudits, A and B, both having dimension d, i.e., their joint Hilbert space is $\mathcal{H}_d^A \otimes \mathcal{H}_d^B$. The "teleportation channel" associated with the density operator $\sigma \in \mathcal{D}(\mathcal{H}_d^A \otimes \mathcal{H}_d^B)$ is the dimension-preserving quantum channel $T_\sigma : \mathcal{D}(\mathcal{H}_d) \to \mathcal{D}(\mathcal{H}_d)$, which is given by teleporting an input d-dimensional qudit by using the resource state σ. The procedure goes as follows. Alice measures qudit A and input qudit a in a Bell detection, whose outcome $k \in \{0, \ldots, d^2 - 1\}$ is associated with a qudit Pauli unitary $U_k$. This detection projects Bob's qudit B onto a k-dependent state. Once the outcome k is communicated to Bob, he applies the Pauli correction $U_k^{-1}$ to qudit B, thus retrieving the final state on the output qudit b. The average over all outcomes k defines the teleportation channel $T_\sigma$ from the states of a to those of b.
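The procedure above can be sketched numerically for qubits (d = 2). The code below is a minimal illustration and not the construction used in the paper: it builds the four Bell-detection elements from the Pauli operators, applies the correction $U_k^{-1}$ to Bob's unnormalized conditional state, and sums over outcomes. As a check, it teleports over the Choi matrix of a dephasing channel (a Pauli channel) and compares with the direct action of that channel. All function names, the dephasing probability p = 0.2 and the input state are assumptions of this example; plain nested lists are used to avoid external dependencies.

```python
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]  # qubit Pauli unitaries U_k
# Projector |Phi><Phi| on the Bell state (|00> + |11>)/sqrt(2)
PHI = [[0.5 if r in (0, 3) and c in (0, 3) else 0 for c in range(4)] for r in range(4)]

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def mul(*Ms):
    """Product of several matrices, taken left to right."""
    out = Ms[0]
    for M in Ms[1:]:
        out = [[sum(out[i][k] * M[k][j] for k in range(len(M)))
                for j in range(len(M[0]))] for i in range(len(out))]
    return out

def dag(A):
    """Conjugate transpose."""
    return [[complex(A[j][i]).conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def channel(kraus, rho):
    """Apply a channel given by Kraus operators (sized to match rho)."""
    out = mul(kraus[0], rho, dag(kraus[0]))
    for K in kraus[1:]:
        out = add(out, mul(K, rho, dag(K)))
    return out

def choi_matrix(kraus):
    """Choi matrix (I x E)(|Phi><Phi|): half of a Bell state sent through E."""
    return channel([kron(I2, K) for K in kraus], PHI)

def teleport(rho_in, sigma):
    """Teleportation channel T_sigma on rho_in; systems are ordered (a, A, B)."""
    joint = kron(rho_in, sigma)
    out = [[0, 0], [0, 0]]
    for U in PAULIS:
        UI = kron(U, I2)
        M = mul(dag(UI), PHI, UI)              # Bell-detection element on (a, A)
        W = mul(kron(M, I2), joint)
        bob = [[sum(W[2 * m + i][2 * m + j] for m in range(4)) for j in range(2)]
               for i in range(2)]              # partial trace over (a, A)
        out = add(out, mul(dag(U), bob, U))    # Pauli correction U_k^{-1}
    return out

p = 0.2  # illustrative dephasing probability
dephasing = [[[math.sqrt(1 - p), 0], [0, math.sqrt(1 - p)]],
             [[math.sqrt(p), 0], [0, -math.sqrt(p)]]]
rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]

direct = channel(dephasing, rho)
simulated = teleport(rho, choi_matrix(dephasing))
error = max(abs(direct[i][j] - simulated[i][j]) for i in range(2) for j in range(2))
print(f"max deviation between direct and teleported output: {error:.2e}")
```

Replacing the resource by the Bell projector itself (`teleport(rho, PHI)`) returns the input state, which corresponds to standard teleportation with an ideal resource.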
BDSW [55, Section V] also recognized that a Pauli channel E (there called "generalized depolarizing channel") can be simulated by teleporting over its Choi matrix $\rho_{\mathcal{E}}$, so that $\mathcal{E} = T_{\rho_{\mathcal{E}}}$. This particular case was later re-considered in ref. [57] as a property of mutual reproducibility between mixed states and quantum channels. In a few words, we may store a channel E into its Choi matrix $\rho_{\mathcal{E}}$ (by sending half of an EPR state), and then recover the channel back by performing teleportation over $\rho_{\mathcal{E}}$. At this point, a natural question to ask is the following:

Can we generate other DV channels (beyond Pauli) using the teleportation-simulation of BDSW [55, Section V]?
The answer is no. In fact, ref. [58] showed that the standard teleportation protocol (based on Bell detection and Pauli corrections) performed over an arbitrary d × d state σ can only simulate a quantum channel of a specific form, expressed in terms of the operators $M_{ab} := (U_{ab}\otimes I)^\dagger |\Phi\rangle\langle\Phi| (U_{ab}\otimes I)$, which are the POVM elements of the Bell detection (with $|\Phi\rangle$ being a d-dimensional Bell state), and of the Pauli operators $U_{ab}$. This is clearly a d-dimensional Pauli channel. The possibility to generate other DV channels relies on a stronger modification of the original teleportation protocol, where we allow for more general quantum operations [56,59] and also for the possibility of varying the dimension of the Hilbert space.
Recently, ref. [60] considered a generalization of the teleportation-simulation argument for DV channels, using tools from ref. [56] and taking important steps in the study of teleportation covariance (see also ref. [61]). Similarly, ref. [62] took the first steps in the simulation of single-mode Gaussian channels by using Gaussian resources and the standard CV teleportation protocol [63]. In our paper we provide the most general and rigorous formulation. In fact, we remove all the assumptions regarding the dimension of the quantum systems, which may also vary through the channel. Thus we may tele-simulate DV channels, CV channels and even hybrid channels, i.e., mappings between DVs and CVs. More generally, our simulation is not limited to teleportation-LOCCs (i.e., Bell detection and unitary corrections), but considers completely general LOCCs which may also be asymptotic, i.e., defined as suitable sequences. Furthermore, the simulating LOCCs may also include portions of the channels (i.e., we may decompose a channel $\mathcal{E}$ as $\mathcal{E}_2 \circ \tilde{\mathcal{E}} \circ \mathcal{E}_1$ and include $\mathcal{E}_1$ and $\mathcal{E}_2$ in the LOCCs). For all these reasons, we may simulate any quantum channel at any dimension. As discussed in the main text, the best case is when the simulation can be done directly on the channel's Choi matrix. To identify this case we introduce the criterion of teleportation-covariance at any dimension, finite or infinite.
Note that ours is the most general simulation to be used in quantum/private communication, which is a setting where two remote parties can only apply LOCCs. In this regard, it is different and more precise than the channel simulation realized by using a deterministic version [64] of a programmable quantum gate array (PQGA) [65,66]. This is also known as "quantum simulation" [67] and considers the simulation of "programmable channels" by means of joint operations. A programmable channel is defined as a (finite-dimensional) channel E that can be simulated as in Eq. (253), for a universal joint quantum operation Ω and some programme state $\sigma_{\mathcal{E}}$. This clearly fails to capture the LOCC structure which is essential for protocols of quantum/private communication. Furthermore, this type of simulation has not been developed into an asymptotic version (via CV teleportation), which is clearly needed for the representation of bosonic channels. Finally, the universal character of the operation Ω restricts the class of channels that can be simulated.

3. Any task. Maintaining the task and output of the original protocol is crucial, because the reduction can now be applied to any kind of adaptive protocol, not just quantum communication, but any other task, including key generation (considered in this paper) and parameter estimation/channel discrimination (considered in ref. [68]). This aspect is also important in order to extend the procedure to more complex scenarios, from two-way quantum communication to the presence of quantum repeaters in arbitrary network topologies [69].

4. Any channel and dimension. The BDSW reduction argument was given for the restricted class of Pauli channels. Teleportation stretching is formulated for any channel at any dimension (finite or infinite). This is non-trivial because it involves the use of asymptotic simulations for fundamental channels such as the bosonic Gaussian channels and the amplitude damping channel.
In general, we may write an output decomposition of the type $\lim_{\mu} \bar{\Lambda}_{\mu}(\sigma^{\mu\otimes n})$ for sequences of trace-preserving LOCCs $\bar{\Lambda}_{\mu}$ and resource states $\sigma^{\mu}$.
In the literature, we can also find another type of adaptive-to-block reduction, which is based on the use of a deterministic PQGA. It is known that a PQGA can simulate an arbitrary unitary or channel in a probabilistic way [65]. However, as discussed in Supplementary Note 8, one may also define a class of programmable channels for which the PQGA works deterministically: these are (finite-dimensional) channels E that can be simulated as in Eq. (253) for a universal generally-joint quantum operation Ω and a programme state $\sigma_{\mathcal{E}}$. It is easy to check that, in a protocol, this "quantum simulation" [67] leads to an output decomposition of the type $Q(\sigma_{\mathcal{E}}^{\otimes n})$, where Q is a joint quantum operation for Alice and Bob. Clearly this is not suitable for quantum/private communication, where the parties are restricted to LOCCs and, therefore, both the channel simulation and the adaptive-to-block reduction must maintain the LOCC structure of the original protocol. Furthermore, it lacks an asymptotic formulation, which is needed for bosonic channels, and also the flexibility to include portions of the channels in the simulating operations (these are elements introduced by our approach). It is worth mentioning that the quantum simulation plays a role in the simplification of adaptive protocols in quantum metrology and channel discrimination, where the parties are close (they are indeed the same entity) and may therefore apply joint unitaries and joint measurements. See refs. [68,70].

Supplementary Note 10. ADVANCES IN BOUNDING TWO-WAY CAPACITIES
By simulating Pauli channels, BDSW showed how to reduce a quantum communication protocol into an entanglement distillation protocol. By combining this argument with an opposite implication, they were able to show that, for a Pauli channel E, one may write $Q_1(\mathcal{E}) = D_1(\rho_{\mathcal{E}})$, which was implicitly extended to
$$Q_2(\mathcal{E}) = D_2(\rho_{\mathcal{E}}). \tag{256}$$
The latter result is not exploitable for computing the two-way quantum capacity $Q_2$ unless one identifies simple (and tight) upper bounds for $D_2$. Such elements were missing in 1996 but today we can exploit powerful tools. Using today's knowledge, the simplest approach is to combine Eq. (256) with the fact that $D_2(\rho_{\mathcal{E}}) \le K(\rho_{\mathcal{E}})$ (since an ebit is a particular type of secret-bit) and the REE upper bound on the distillable key of quantum states [11], so that $K(\rho_{\mathcal{E}}) \le E_R^{\infty}(\rho_{\mathcal{E}})$. All this leads us to write
$$Q_2(\mathcal{E}) = D_2(\rho_{\mathcal{E}}) \le K(\rho_{\mathcal{E}}) \le E_R^{\infty}(\rho_{\mathcal{E}}) \le E_R(\rho_{\mathcal{E}}). \tag{257}$$
Our work shows the bound of Eq. (257) for any finite-dimensional Choi-stretchable channel. In particular, we show that the single-letter REE bound of Eq. (257) is tight for dephasing and erasure channels.

The next non-trivial generalization is moving from quantum to private communication. In this regard, the notions of private capacities [71] and private states [11,12] became available well after 1996. Note that we may consider the secret-key capacity K, which is the number of secret bits which are distributed between the parties (via adaptive protocols), and the two-way private capacity $P_2$, which is the maximum rate at which classical messages can be securely encoded and transmitted [71]. Because of the unlimited two-way CCs and the one-time pad, we have $P_2 = K$. For a finite-dimensional Choi-stretchable channel E, it is easy to write the equivalence
$$P_2(\mathcal{E}) = K(\mathcal{E}) = K(\rho_{\mathcal{E}}). \tag{258}$$
The simplest way to show this is to apply teleportation stretching to reduce adaptive key-generation protocols, which leads to $K(\mathcal{E}) = K(\rho_{\mathcal{E}})$ as in Proposition 6 of our main text. An alternate way is to show $P_2(\mathcal{E}) = K(\rho_{\mathcal{E}})$ by means of a suitable extension of the BDSW reduction argument.
In fact, for a finite-dimensional Choi-stretchable channel, we may transform a protocol of private communication [71] through E into a protocol of key-distillation [11,12] over the Choi matrix $\rho_{\mathcal{E}}$, so that $P_2(\mathcal{E}) \le K(\rho_{\mathcal{E}})$. The latter bound is achievable by a protocol where Alice transmits part of Bell states, so that the parties distill a key from the output Choi matrices, which is then used to send the message via the one-time pad. Note that these extensions from quantum to private communication, and from entanglement to key distillation, were not available in 1996, which is why Eq. (258) can only be written today. At the same time, it is surprising that Eq. (258) was never written before our work, with many of the tools being available since 2005.

Now it is very important to observe that neither Eq. (257) nor Eq. (258) can be used to investigate the most important setting for quantum/private communication, which is the bosonic one. Furthermore, they fail to provide single-letter bounds for other DV channels which involve asymptotic simulations (e.g., amplitude damping). For these important reasons, it is necessary to develop a general theory which is dimension-independent and applicable to channels of any dimension, finite or infinite. This is the main content of our Theorem 5 in the main text. This states that, for any channel E stretchable into a resource state σ (even asymptotically), we may write
$$\mathcal{C}(\mathcal{E}) \le E_R^{\infty}(\sigma) \le E_R(\sigma), \tag{259}$$
where $\mathcal{C}(\mathcal{E})$ is any among the two-way capacities $Q_2(\mathcal{E}) = D_2(\mathcal{E}) \le P_2(\mathcal{E}) = K(\mathcal{E})$. In particular, for a Choi-stretchable channel ($\sigma = \rho_{\mathcal{E}}$), we have
$$\mathcal{C}(\mathcal{E}) \le E_R^{\infty}(\rho_{\mathcal{E}}) \le E_R(\rho_{\mathcal{E}}). \tag{260}$$
Recall that the proof of Eq. (259) relies on the following steps:
• First, the derivation of the REE bound $\mathcal{C}(\mathcal{E}) \le E_R(\mathcal{E})$ for any channel E at any dimension (weak converse theorem).
• Second, the adaptive-to-block reduction by teleportation stretching at any dimension, which decomposes the output of an arbitrary adaptive protocol into $\bar{\Lambda}(\sigma^{\otimes n})$ or a suitable asymptotic form.
Because the REE is a functional which is monotonic under trace-preserving LOCCs and subadditive over tensor products, we may then derive Eq. (259). It is clear that this procedure can be adapted to simplify any functional which is monotonic under LOCCs, which includes the Rains bound [72,73] and entanglement monotones.
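The chain of inequalities behind Eq. (259) can be sketched as follows. This is a schematic restatement for the non-asymptotic case, under the assumption that the weak converse bound is expressed as a supremum over adaptive protocols $\mathcal{L}$ of the regularized REE of the n-use output state $\rho^{n}_{ab}$; the symbol $E_R^{\star}$ is introduced here only for illustration.

```latex
\begin{align}
\mathcal{C}(\mathcal{E})
  &\le E_R^{\star}(\mathcal{E}) := \sup_{\mathcal{L}} \lim_{n} \frac{E_R(\rho^{n}_{ab})}{n}
  && \text{(weak converse)} \\
  &= \sup_{\mathcal{L}} \lim_{n} \frac{E_R\!\left[\bar{\Lambda}(\sigma^{\otimes n})\right]}{n}
  && \text{(teleportation stretching)} \\
  &\le \lim_{n} \frac{E_R(\sigma^{\otimes n})}{n} = E_R^{\infty}(\sigma)
  && \text{(monotonicity under trace-preserving LOCCs)} \\
  &\le E_R(\sigma)
  && \text{(subadditivity over tensor products)}
\end{align}
```

The supremum over protocols disappears in the third line because the protocol enters only through the LOCC $\bar{\Lambda}$, which is discarded by monotonicity. For asymptotic simulations the same steps would be applied at finite µ, with the limit in µ taken at the end.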
Single-hop networks (broadcast, multiple-access and interference channels)
Ref. [74] investigates the maximum rates for transmitting quantum information, distributing entanglement and secret keys in a single-hop multipoint network, with the assistance of unlimited two-way classical communication among all the parties. Ref. [74] first considers a sender directly communicating with an arbitrary number of receivers, the so-called quantum broadcast channel. In this case, it provides a simple analysis in the bosonic setting considering quantum broadcasting through a sequence of beamsplitters. This specific case has also been investigated in ref. [75], where the use of our method (REE+teleportation stretching) has led to the determination of the capacity region of the lossy broadcast channel. Then, ref. [74] also considers the multipoint setting where an arbitrary number of senders directly communicate with a single receiver, the so-called quantum multiple-access channel. Finally, ref. [74] studies the general case of a quantum interference channel where an arbitrary number of senders directly communicate with an arbitrary number of receivers. Upper bounds are formulated for quantum systems of arbitrary dimension, so that they can be applied to many different physical scenarios involving multipoint quantum and private communication.

Improving the lower bound for the thermal-loss channel
It remains an open problem to determine the two-way capacities of several channels, most notably that of the thermal-loss channel $\mathcal{E}_{\text{loss}}(\eta,\bar{n})$. Here we have shown lower and upper bounds in Eq. (126). Recently, ref. [76] has studied the specific case of the secret-key capacity $K(\eta,\bar{n})$ of this channel, investigating a region where the lower bound given by the reverse coherent information can be beaten. This is possible by resorting to a Gaussian QKD protocol based on trusted-noise detection. However, the improved lower bound is still far from closing the gap.

Improved upper bounds based on the squashed entanglement and secret-key capacity of the erasure channel
Note that the first version of our paper appeared on the arXiv in October 2015 [77]. It originally contained the main result for the bosonic lossy channels. The other results for DV and CV channels were given in a second paper, uploaded on the arXiv in mid December 2015 [78]. These two papers were later merged into a single contribution, which is the present manuscript.
In late November 2015, one month after our first arXiv version, another manuscript appeared on the arXiv by Goodenough et al. [28]. This is a very interesting paper that improves the upper bounds of ref. [27] based on the squashed entanglement. As is clear from our main text, these improved bounds are still larger than ours based on the REE. However, there are two notable exceptions: the amplitude damping channel and the erasure channel. For the amplitude damping channel, ref. [28] led us to improve our previous results and to find the tightest known upper bound based on the squashed entanglement, which is the one given in Eq. (232). Regarding the erasure channel, the REE and the squashed entanglement lead to the same upper bound, so that both methods are sufficient to determine the secret-key capacity of this channel.
In our main text, we acknowledge the independent derivation of ref. [28] for the secret-key capacity of the erasure channel. This is independent because of the completely different method. It is simultaneous because it has been achieved in a short time window between our first [77] and second [78] arXiv papers. Goodenough et al. [28] first wrote their upper bound for the erasure channel without making the crucial observation that it was tight. They then realized this important fact after seeing our updated results on the arXiv two weeks later [78], where we first explicitly claimed the secret-key capacity of the erasure channel. In a later update of their manuscript (arXiv version 2, April 2016) they then remarked the tightness and claimed to have found the capacity too. In agreement with these authors, we have therefore decided to credit each other for the independent derivation of the secret-key capacity of the erasure channel.

Simulation and stretching of bosonic channels
In March 2016, several months after our manuscript was available on the arXiv, an author uploaded a paper [79] discussing some mathematical aspects associated with our treatment of teleportation stretching with bosonic channels. Let us briefly give some background before clarifying that these mathematical aspects were already taken into account and addressed in our arXiv version 2 of December 2015 [80].
Teleportation stretching of bosonic channels involves the use of an asymptotic CV EPR state Φ, defined as the limit of TMSV states Φ µ . As a consequence, we have to consider the following steps: (i) We first perform an imperfect stretching of the protocol based on a finite-energy TMSV state Φ µ ; (ii) we compute the relevant functionals on the finite-energy decomposition of the output; and (iii) we take the infinite-energy limit µ → +∞ on the final result. This is actually a standard procedure in any calculus with a delta function, which is implicitly meant to be a limit of test functions. This is also why the Vaidman teleportation protocol [81] (based on an asymptotic delta-like CV EPR state) has to be implicitly replaced by the Braunstein-Kimble protocol [63], where the resource state is a TMSV state Φ µ and the infinite-energy limit is computed at the end on the fidelity.
Such a basic argument was already present in our earlier arXiv versions. Already in December 2015 [80] we stated that, for bosonic channels, one needs to relax the condition of infinite energy and replace the asymptotic CV EPR state Φ by a sequence of TMSV states $\Phi^{\mu}$, defining a sequence of Choi-approximating states $\rho^{\mu}_{\mathcal{E}} := \mathcal{I} \otimes \mathcal{E}(\Phi^{\mu})$. The latter states are then used to compute the relative entropy of entanglement before taking the limit for large µ; see Eq. (9) and corresponding text of ref. [80]. Therefore, our treatment of bosonic channels was already rigorous and correct well before ref. [79]. However, we have also realized that these non-trivial steps were too implicit. For this reason, we have decided to fully expand the specific treatment of bosonic channels in more recent arXiv versions of our manuscript. Furthermore, in order to be completely rigorous, we have also accounted for the fact that the CV Bell detection also needs to be approximated by a suitable limit of finite-energy measurements.

Shield system
In earlier arXiv versions of our manuscript, we proved our weak converse theorem by exploiting an (at most) exponential growth of the dimensionality of the shield system in the private state. This corresponds to the first proof in Supplementary Note 3. This assumption on the shield size is correct and fully justified by the argument of refs. [14,15] which may be applied to both DV and CV channels, as presented in Lemma 4 of Supplementary Note 3 for the sake of completeness. Despite the correctness of this approach, in later arXiv versions we have also provided two additional proofs, alternative but essentially equivalent to the first one (with exactly the same conclusions). Our second proof relies on an exponential increase of the mean number of photons in the private state, while our third proof is independent from the shield system. See Supplementary Note 3 for full details. It is clear that these proofs are all complete proofs which do not need further confirmation or validation by follow-up works.