Strategies for single-shot discrimination of process matrices

The topic of causality has recently gained traction in quantum information research. This work examines the problem of single-shot discrimination between process matrices, which are a universal method for defining a causal structure. We provide an exact expression for the optimal probability of correct distinction. In addition, we present an alternative way to achieve this expression by using the convex cone structure theory. We also express the discrimination task as a semidefinite program (SDP). Using this formulation, we have created an SDP that calculates the distance between process matrices, quantified in terms of the trace norm. As a valuable by-product, the program finds an optimal realization of the discrimination task. We also find two classes of process matrices which can be distinguished perfectly. Our main result, however, is a consideration of the discrimination task for process matrices corresponding to quantum combs. We study which strategy, adaptive or non-signalling, should be used during the discrimination task. We prove that no matter which strategy is chosen, the probability of distinguishing two process matrices that are quantum combs is the same.


Introduction
The topic of causality has remained a staple in quantum physics and quantum information theory in recent years. The idea of a causal influence in quantum physics is best illustrated by considering two characters, Alice and Bob, preparing experiments in two separate laboratories. Each of them receives a physical system and performs an operation on it. After that, they send their respective systems out of the laboratory. In a causally ordered framework, there are three possibilities: Bob cannot signal to Alice, which means the choice of Bob's action cannot influence the statistics Alice records (denoted by A ≺ B), Alice cannot signal to Bob (B ≺ A), or neither party can influence the other (A||B). A causally neutral formulation of quantum theory is described in terms of quantum combs [1].
One may wonder if Alice's and Bob's actions can influence each other. It might seem impossible, except in a world with closed time-like curves (CTCs) [2]. But the existence of CTCs implies some logical paradoxes, such as the grandfather paradox [3]. Possible solutions have been proposed in which quantum mechanics and CTCs can coexist and such paradoxes are avoided, but at the cost of modifying quantum theory into a nonlinear one [4]. A natural question arises: is it possible to keep the framework of linear quantum theory and still go beyond definite causal structures?
One such framework was proposed by Oreshkov, Costa and Brukner [5]. They introduced a new resource called a process matrix, a generalization of the notion of quantum strategy. In Section 8, we analyze an alternative way to achieve this expression using the convex cone structure theory. Concluding remarks are presented in the final Section 9. In Appendix A, we provide technical details about the convex cone structure.

Mathematical preliminaries
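Let us introduce the following notation. Consider two complex Euclidean spaces and denote them by $\mathcal{X}$ and $\mathcal{Y}$. By $L(\mathcal{X}, \mathcal{Y})$ we denote the collection of all linear mappings of the form $A : \mathcal{X} \to \mathcal{Y}$. As a shorthand, put $L(\mathcal{X}) := L(\mathcal{X}, \mathcal{X})$. By $\mathrm{Herm}(\mathcal{X})$ we denote the set of Hermitian operators, while the subset of $\mathrm{Herm}(\mathcal{X})$ consisting of positive semidefinite operators will be denoted by $\mathrm{Pos}(\mathcal{X})$. The set of quantum states, that is, positive semidefinite operators $\rho$ such that $\mathrm{tr}\,\rho = 1$, will be denoted by $\Omega(\mathcal{X})$. An operator $U \in L(\mathcal{X})$ is unitary if it satisfies the equation $U U^\dagger = U^\dagger U = \mathbb{1}_{\mathcal{X}}$. The notation $U(\mathcal{X})$ will be used to denote the set of all unitary operators.

We will also need linear mappings of the form $\Phi : L(\mathcal{X}) \to L(\mathcal{Y})$. The set of all such mappings is denoted $M(\mathcal{X}, \mathcal{Y})$. There exists a bijection between the set $M(\mathcal{X}, \mathcal{Y})$ and the set of operators $L(\mathcal{Y} \otimes \mathcal{X})$, known as the Choi [17] and Jamiołkowski [18] isomorphism. For a given linear mapping $\Phi_M : L(\mathcal{X}) \to L(\mathcal{Y})$, the corresponding Choi matrix $M \in L(\mathcal{Y} \otimes \mathcal{X})$ can be explicitly written as

$$M := \sum_{i,j=0}^{\dim(\mathcal{X})-1} \Phi_M(|i\rangle\langle j|) \otimes |i\rangle\langle j|. \qquad (1)$$

We will denote linear mappings by $\Phi_M, \Phi_N, \Phi_R$ etc., and the corresponding Choi matrices by plain symbols: $M, N, R$ etc. Let us consider a composition of mappings $\Phi_R = \Phi_N \circ \Phi_M$, where $\Phi_N : L(\mathcal{Z}) \to L(\mathcal{Y})$ and $\Phi_M : L(\mathcal{X}) \to L(\mathcal{Z})$, with Choi matrices $N \in L(\mathcal{Y} \otimes \mathcal{Z})$ and $M \in L(\mathcal{Z} \otimes \mathcal{X})$, respectively. Then, the Choi matrix of $\Phi_R$ is given by [19]

$$R = \mathrm{tr}_{\mathcal{Z}}\left[\left(\mathbb{1}_{\mathcal{Y}} \otimes M^{T_{\mathcal{Z}}}\right)\left(N \otimes \mathbb{1}_{\mathcal{X}}\right)\right], \qquad (2)$$

where $M^{T_{\mathcal{Z}}}$ denotes the partial transposition of $M$ on the subspace $\mathcal{Z}$. The above result can be expressed by introducing the notation of the link product of the operators $N$ and $M$ as

$$N * M := \mathrm{tr}_{\mathcal{Z}}\left[\left(\mathbb{1}_{\mathcal{Y}} \otimes M^{T_{\mathcal{Z}}}\right)\left(N \otimes \mathbb{1}_{\mathcal{X}}\right)\right]. \qquad (3)$$

Finally, we introduce a special subset of all mappings $\Phi$, called quantum channels, which are completely positive and trace preserving (CPTP). In other words, the first condition reads

$$(\Phi \otimes \mathcal{I}_{\mathcal{Z}})(X) \in \mathrm{Pos}(\mathcal{Y} \otimes \mathcal{Z}) \qquad (4)$$

for all $X \in \mathrm{Pos}(\mathcal{X} \otimes \mathcal{Z})$, where $\mathcal{I}_{\mathcal{Z}}$ is the identity channel acting on $L(\mathcal{Z})$ for any $\mathcal{Z}$, while the second condition reads

$$\mathrm{tr}(\Phi(X)) = \mathrm{tr}(X) \qquad (5)$$

for all $X \in L(\mathcal{X})$.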
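The Choi isomorphism and the link product can be checked numerically on a small example. The following is a minimal sketch, assuming the convention that the Choi matrix of a map is $\sum_{i,j} \Phi(|i\rangle\langle j|) \otimes |i\rangle\langle j|$ with the output space as the first tensor factor; the helper names `choi` and `link_product` and the two test channels (a Hadamard unitary channel and an amplitude damping channel) are illustrative choices, not taken from the paper.

```python
import numpy as np

def choi(phi, d_in):
    """Choi matrix sum_{i,j} phi(|i><j|) (x) |i><j| of a map phi acting on d_in x d_in matrices."""
    M = 0
    for i in range(d_in):
        for j in range(d_in):
            E = np.zeros((d_in, d_in), dtype=complex)
            E[i, j] = 1.0
            M = M + np.kron(phi(E), E)
    return M

def link_product(N, M, dY, dZ, dX):
    """tr_Z[(1_Y (x) M^{T_Z})(N (x) 1_X)] for N acting on Y (x) Z and M acting on Z (x) X."""
    # Partial transpose of M on the Z factor (swap the two Z indices).
    Mt = M.reshape(dZ, dX, dZ, dX).transpose(2, 1, 0, 3).reshape(dZ * dX, dZ * dX)
    big = np.kron(np.eye(dY), Mt) @ np.kron(N, np.eye(dX))
    # Trace out the middle (Z) factor of Y (x) Z (x) X.
    big = big.reshape(dY, dZ, dX, dY, dZ, dX)
    return np.einsum("azbczd->abcd", big).reshape(dY * dX, dY * dX)

# Hypothetical qubit test channels.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
g = 0.3
K0 = np.array([[1, 0], [0, np.sqrt(1 - g)]])
K1 = np.array([[0, np.sqrt(g)], [0, 0]])

def phi_M(X):
    return H @ X @ H.conj().T                          # unitary channel

def phi_N(X):
    return K0 @ X @ K0.conj().T + K1 @ X @ K1.conj().T  # amplitude damping

M = choi(phi_M, 2)
N = choi(phi_N, 2)
R = choi(lambda X: phi_N(phi_M(X)), 2)                 # Choi matrix of the composition

link_ok = np.allclose(link_product(N, M, 2, 2, 2), R)
# CPTP check on R: positive semidefinite, and tracing out the output gives the identity.
cp_ok = np.linalg.eigvalsh(R).min() > -1e-12
tp_ok = np.allclose(np.einsum("abad->bd", R.reshape(2, 2, 2, 2)), np.eye(2))
print(link_ok, cp_ok, tp_ok)
```

The link product is evaluated with explicit reshapes so that the order of the tensor factors stays visible; for larger systems a dedicated partial-trace routine would be preferable.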
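In this work we will consider a special class of quantum channels called non-signaling channels (or causal channels) [20,21]. We say that $\Phi_N : L(\mathcal{X}_I \otimes \mathcal{Y}_I) \to L(\mathcal{X}_O \otimes \mathcal{Y}_O)$ is a non-signaling channel if its Choi matrix $N$ satisfies the conditions

$$\mathrm{tr}_{\mathcal{X}_O} N = \frac{\mathbb{1}_{\mathcal{X}_I}}{\dim(\mathcal{X}_I)} \otimes \mathrm{tr}_{\mathcal{X}_I \mathcal{X}_O} N, \qquad \mathrm{tr}_{\mathcal{Y}_O} N = \frac{\mathbb{1}_{\mathcal{Y}_I}}{\dim(\mathcal{Y}_I)} \otimes \mathrm{tr}_{\mathcal{Y}_I \mathcal{Y}_O} N.$$

It can be shown [22] that each non-signaling channel is an affine combination of product channels. More precisely, the Choi matrix of any non-signaling channel $\Phi_N$ can be written as

$$N = \sum_i \lambda_i \, S_i \otimes T_i,$$

where $S_i$ and $T_i$ are Choi matrices of quantum channels and $\lambda_i \in \mathbb{R}$ satisfy $\sum_i \lambda_i = 1$. For the rest of this paper, by $\mathrm{NS}(\mathcal{X}_I \otimes \mathcal{X}_O \otimes \mathcal{Y}_I \otimes \mathcal{Y}_O)$ we will denote the set of Choi matrices of non-signaling channels.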
The most general quantum operations are represented by quantum instruments [23,24], that is, collections of completely positive (CP) maps $\{\Phi_{M_i}\}_i$ associated to all measurement outcomes, characterized by the property that $\sum_i \Phi_{M_i}$ is a quantum channel.
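We will also consider the concept of quantum network and tester [25]. We say that $\Phi_{R^{(N)}}$ is a deterministic quantum network (or quantum comb) if it is a concatenation of $N$ quantum channels and its Choi matrix $R^{(N)} \in L\left(\bigotimes_{i=0}^{2N-1} \mathcal{X}_i\right)$ fulfills the conditions

$$R^{(N)} \ge 0, \qquad \mathrm{tr}_{\mathcal{X}_{2k-1}} R^{(k)} = \mathbb{1}_{\mathcal{X}_{2k-2}} \otimes R^{(k-1)},$$

where $R^{(k-1)} \in L\left(\bigotimes_{i=0}^{2k-3} \mathcal{X}_i\right)$ is the Choi matrix of the reduced quantum comb formed by the concatenation of $k-1$ quantum channels, $k = 2, \ldots, N$. Recall that a probabilistic quantum network $\Phi_{S^{(N)}}$ is equivalent to a concatenation of $N$ completely positive, trace non-increasing linear maps. Then, the Choi operator $S^{(N)}$ of $\Phi_{S^{(N)}}$ satisfies $0 \le S^{(N)} \le R^{(N)}$, where $R^{(N)}$ is the Choi matrix of a quantum comb. Finally, we recall the definition of a quantum tester. A quantum tester is a collection of probabilistic quantum networks $\{R_i^{(N)}\}_i$ whose sum is a quantum comb, that is $\sum_i R_i^{(N)} = R^{(N)}$, and additionally $\dim(\mathcal{X}_0) = \dim(\mathcal{X}_{2N-1}) = 1$. We will also use the Moore-Penrose pseudo-inverse, denoted with a slight abuse of notation by $X^{-1} \in L(\mathcal{Y}, \mathcal{X})$ for an operator $X \in L(\mathcal{X}, \mathcal{Y})$. Moreover, we introduce the vectorization operation of $X$ defined by $|X\rangle\rangle = \sum_i (X|i\rangle) \otimes |i\rangle$.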

Process matrices
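This section introduces the formal definition of the process matrix with its characterization and intuition. Next, we present some classes of process matrices considered in this paper.

Let us define the operator ${}_{\mathcal{X}}Y$ as

$${}_{\mathcal{X}}Y = \frac{\mathbb{1}_{\mathcal{X}}}{\dim(\mathcal{X})} \otimes \mathrm{tr}_{\mathcal{X}} Y$$

for every $Y \in L(\mathcal{X} \otimes \mathcal{Z})$, where $\mathcal{Z}$ is an arbitrary complex Euclidean space. We will also need the following projection operator

$$L_V(W) = {}_{A_O}W + {}_{B_O}W - {}_{A_O B_O}W - {}_{B_I B_O}W + {}_{A_O B_I B_O}W - {}_{A_I A_O}W + {}_{A_I A_O B_O}W, \qquad (10)$$

where $W \in \mathrm{Herm}(A_I \otimes A_O \otimes B_I \otimes B_O)$.

Definition 1. We say that $W \in \mathrm{Herm}(A_I \otimes A_O \otimes B_I \otimes B_O)$ is a process matrix if it fulfills the following conditions

$$W \ge 0, \qquad W = L_V(W), \qquad \mathrm{tr}(W) = \dim(A_O) \cdot \dim(B_O),$$

where the projection operator $L_V$ is defined by Eq. (10).

The set of all process matrices will be denoted by $\mathbf{W}^{\mathrm{PROC}}$. In the upcoming considerations, it will be more convenient to work with the equivalent characterization of process matrices which can be found in [26].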
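Definition 2. We say that $W \in \mathrm{Herm}(A_I \otimes A_O \otimes B_I \otimes B_O)$ is a process matrix if it fulfills the following conditions

$$W \ge 0, \qquad {}_{A_I A_O}W = {}_{A_I A_O B_O}W, \qquad {}_{B_I B_O}W = {}_{A_O B_I B_O}W, \qquad W = {}_{A_O}W + {}_{B_O}W - {}_{A_O B_O}W, \qquad \mathrm{tr}(W) = \dim(A_O) \cdot \dim(B_O).$$

The concept of process matrix can be best illustrated by considering two characters, Alice and Bob, performing experiments in two separate laboratories. Each party acts in a local laboratory, which can be identified by an input space $A_I$ and an output space $A_O$ for Alice, and analogously $B_I$ and $B_O$ for Bob. In general, a label $i$, denoting Alice's measurement outcome, is associated with the CP map $\Phi_{M_i^A}$ obtained from the instrument $\{\Phi_{M_i^A}\}_i$. Analogously, Bob's measurement outcome $j$ is associated with the map $\Phi_{M_j^B}$ from the instrument $\{\Phi_{M_j^B}\}_j$. Finally, the joint probability for a pair of outcomes $i$ and $j$ can be expressed as

$$p_{ij} = \mathrm{tr}\left[W \left(M_i^A \otimes M_j^B\right)\right], \qquad (13)$$

where $W \in \mathbf{W}^{\mathrm{PROC}}$ is a process matrix that describes the causal structure outside of the laboratories. A valid process matrix is defined by the requirement that probabilities are well defined, that is, they must be non-negative and sum up to one. These requirements give us the conditions present in Definition 1 and Definition 2.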
In the general case, Alice's and Bob's strategies can be more complex than the product strategy $M_i^A \otimes M_j^B$ which defines the probability $p_{ij}$ given by Eq. (13). If their actions are somehow correlated, we can write the associated instrument in the form $\{\Phi_{N_{ij}^{AB}}\}_{ij}$. It was observed in [26] that this instrument describes a valid strategy, that is

$$\mathrm{tr}\Big(W \sum_{ij} N_{ij}^{AB}\Big) = 1 \qquad (14)$$

for all process matrices $W \in \mathbf{W}^{\mathrm{PROC}}$, if and only if

$$\sum_{ij} N_{ij}^{AB} \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O). \qquad (15)$$

In this paper, we will consider different classes of process matrices. Initially, we define the subset of process matrices known as free objects in the resource theory of causal connection [27]. Such process matrices are defined as follows.

Definition 3. We say that $W^{A||B} \in \mathbf{W}^{\mathrm{PROC}}$ is a free process matrix if it satisfies the following condition

$$W^{A||B} = \rho_{A_I B_I} \otimes \mathbb{1}_{A_O B_O}, \qquad (16)$$

where $\rho_{A_I B_I} \in \Omega(A_I \otimes B_I)$. The set of all process matrices of this form will be denoted by $\mathbf{W}^{A||B}$.

We often consider process matrices corresponding to quantum combs [19]. For example, a quantum comb A ≺ B (see Fig. 1) shows that Alice's and Bob's operations are performed in causal order. This means that Bob cannot signal to Alice and the choice of Bob's instrument cannot influence the statistics Alice records. Such process matrices are formally defined in the following way.

Definition 4. We say that $W^{A \prec B} \in \mathbf{W}^{\mathrm{PROC}}$ is a process matrix representing a quantum comb A ≺ B if it satisfies the following conditions

$$W^{A \prec B} = {}_{B_O}W^{A \prec B}, \qquad {}_{B_I B_O}W^{A \prec B} = {}_{A_O B_I B_O}W^{A \prec B}. \qquad (17)$$

The set of all process matrices of this form will be denoted by $\mathbf{W}^{A \prec B}$.

One can easily observe that the set $\mathbf{W}^{A||B}$ is the intersection of the sets $\mathbf{W}^{A \prec B}$ and $\mathbf{W}^{B \prec A}$. Finally, the sets $\mathbf{W}^{A \prec B}$ and $\mathbf{W}^{B \prec A}$ allow us to define their convex hull, whose elements are called causally separable process matrices.

Definition 5. We say that $W^{\mathrm{SEP}} \in \mathbf{W}^{\mathrm{PROC}}$ is a causally separable process matrix if it is of the form

$$W^{\mathrm{SEP}} = p\, W^{A \prec B} + (1 - p)\, W^{B \prec A}, \qquad (18)$$

where $W^{A \prec B} \in \mathbf{W}^{A \prec B}$, $W^{B \prec A} \in \mathbf{W}^{B \prec A}$, for some parameter $p \in [0, 1]$. The set of all causally separable process matrices will be denoted by $\mathbf{W}^{\mathrm{SEP}}$.
There are, however, process matrices that do not correspond to a causally separable process; such process matrices are known as causally non-separable (CNS). Examples of such matrices were provided in [5,28]. The set of all causally non-separable process matrices will be denoted by $\mathbf{W}^{\mathrm{CNS}}$. In Fig. 2 we present a schematic plot of the sets of process matrices.
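The membership conditions for the classes above are linear and easy to test numerically. The sketch below is illustrative only: it assumes qubit systems ordered as $(A_I, A_O, B_I, B_O)$, it assumes the standard conditions of the characterization in [26] (positivity, normalization $\mathrm{tr}\,W = \dim(A_O)\dim(B_O)$, and three linear constraints written with the "trace and replace" operator), and the helper names `reduce_op`, `is_process_matrix` and `is_comb_a_before_b` are hypothetical. It also checks the normalization $\mathrm{tr}(W N) = 1$ for a product of two local channels, with Choi matrices written in the input-then-output order of the laboratory spaces.

```python
import numpy as np
from functools import reduce

DIMS = (2, 2, 2, 2)            # assumed ordering (A_I, A_O, B_I, B_O), all qubits
AI, AO, BI, BO = 0, 1, 2, 3

def kron(*ops):
    return reduce(np.kron, ops)

def reduce_op(W, systems, dims=DIMS):
    """Trace out the listed subsystems and replace each by 1/dim (the 'trace and replace' operator)."""
    n = len(dims)
    T = W.reshape(dims + dims)
    for s in systems:
        d = dims[s]
        traced = np.trace(T, axis1=s, axis2=n + s)
        T = np.moveaxis(np.multiply.outer(traced, np.eye(d) / d), [-2, -1], [s, n + s])
    D = int(np.prod(dims))
    return T.reshape(D, D)

def is_process_matrix(W, tol=1e-9):
    psd = np.linalg.eigvalsh((W + W.conj().T) / 2).min() > -tol
    norm = np.isclose(np.trace(W).real, DIMS[AO] * DIMS[BO])
    c1 = np.allclose(reduce_op(W, [AI, AO]), reduce_op(W, [AI, AO, BO]))
    c2 = np.allclose(reduce_op(W, [BI, BO]), reduce_op(W, [AO, BI, BO]))
    c3 = np.allclose(W, reduce_op(W, [AO]) + reduce_op(W, [BO]) - reduce_op(W, [AO, BO]))
    return bool(psd and norm and c1 and c2 and c3)

def is_comb_a_before_b(W):
    return bool(is_process_matrix(W)
                and np.allclose(W, reduce_op(W, [BO]))
                and np.allclose(reduce_op(W, [BI, BO]), reduce_op(W, [AO, BI, BO])))

I2 = np.eye(2)
rho = np.diag([0.7, 0.3])
sigma = np.diag([0.4, 0.6])
vec = np.eye(2).reshape(4)                 # unnormalized maximally entangled vector
P = np.outer(vec, vec)                     # Choi matrix of the identity channel

W_free = kron(rho, I2, sigma, I2)          # a (product) free process matrix
W_comb = kron(rho, P, I2)                  # state at A_I, identity channel A_O -> B_I

# Product strategy: Alice applies the identity channel, Bob discards and prepares |0>.
M_A = P
M_B = kron(I2, np.diag([1.0, 0.0]))
total = np.trace(W_comb @ kron(M_A, M_B)).real

print(is_process_matrix(W_free), is_comb_a_before_b(W_free))
print(is_process_matrix(W_comb), is_comb_a_before_b(W_comb))
print(np.allclose(W_comb, reduce_op(W_comb, [AO])), total)
```

In this example the comb `W_comb` passes the A ≺ B test but is not invariant under the replacement on $A_O$, so it does not satisfy the corresponding condition for B ≺ A, whereas the free process matrix passes both.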

Discrimination task
This section presents the concept of discrimination between pairs of process matrices. It is worth emphasizing that the definition of a process matrix is a generalization of the concept of quantum states, channels, superchannels [29] and even generalized supermaps [30,31]. The task of discrimination between process matrices poses a natural extension of discrimination of quantum states [32], channels [33] or measurements [13]. The process matrix discrimination task can be described by the following scenario.

Let us consider two process matrices $W_0, W_1 \in \mathbf{W}^{\mathrm{PROC}}$. The classical description of the process matrices $W_0, W_1$ is assumed to be known to the participating parties. We know that one of the process matrices, $W_0$ or $W_1$, describes the actual correlation between Alice's and Bob's laboratories, but we do not know which one. Our aim is to determine, with the highest possible probability, which process matrix describes this correlation. For this purpose, we construct a discrimination strategy $S$. In the general approach, such a strategy $S$ is described by an instrument $S = \{S_0, S_1\}$. Due to the requirement given by Eq. (14), the instrument $S$ must fulfill the condition

$$S_0 + S_1 \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O).$$

Composing a process matrix $W$ with the discrimination strategy $S$ yields a classical label which takes the value zero or one. If the label zero occurs, we decide that the correlation is given by $W_0$. Otherwise, we decide on $W_1$. In this setting, the maximum success probability $p_{\mathrm{succ}}(W_0, W_1)$ of correct discrimination between two process matrices $W_0$ and $W_1$ can be expressed by

$$p_{\mathrm{succ}}(W_0, W_1) = \frac{1}{2} \max_{S = \{S_0, S_1\}} \left[\mathrm{tr}(W_0 S_0) + \mathrm{tr}(W_1 S_1)\right].$$

The following theorem provides the optimal probability of process matrix discrimination as a direct analogue of the Holevo-Helstrom theorem for quantum states and channels.

Theorem 1. Let $W_0, W_1 \in \mathbf{W}^{\mathrm{PROC}}$ be two process matrices. For every choice of discrimination strategy $S = \{S_0, S_1\}$, it holds that

$$\frac{1}{2}\mathrm{tr}(W_0 S_0) + \frac{1}{2}\mathrm{tr}(W_1 S_1) \le \frac{1}{2} + \frac{1}{4} \max_{N \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O)} \left\| \sqrt{N}(W_0 - W_1)\sqrt{N} \right\|_1, \qquad (20)$$

where $\mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O)$ is the set of Choi matrices of non-signaling channels. Moreover, there exists a discrimination strategy $S$ which saturates the inequality in Eq. (20).

Proof. Let us define the sets
and We prove the equality between sets A and B.
Let us fix Finally, it suffices to take and Moreover, from the Holevo-Helstrom theorem [33], there exists a projective binary measurement $Q = \{Q_0, Q_1\}$ such that the last inequality is saturated, which completes the proof.
Corollary 1. The maximum probability $p_{\mathrm{succ}}(W_0, W_1)$ of correct discrimination between two process matrices $W_0$ and $W_1$ is given by

$$p_{\mathrm{succ}}(W_0, W_1) = \frac{1}{2} + \frac{1}{4} \max_{N \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O)} \left\| \sqrt{N}(W_0 - W_1)\sqrt{N} \right\|_1. \qquad (26)$$

As a valuable by-product of Theorem 1, we obtain a realization of the process matrix discrimination scheme. The schematic representation of this setup is presented in Fig. 3. To distinguish the process matrices $W_0$ and $W_1$, Alice and Bob prepare a strategy realized by a quantum channel $\Phi_K$ with the Choi matrix $K$. It is worth noting that the quantum channel $\Phi_K$ is correctly defined due to the fact that $\mathrm{tr}_{X_{1,2,3,4}} K \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O)$. Afterwards, they perform the binary measurement $Q = \{Q_0, Q_1\}$ defined by Eq. (24). Next, they decide which process matrix was used during the calculation, assuming $W_0$ if the measurement label is 0. Otherwise, they assume $W_1$.
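Since the optimal probability is a maximum over non-signaling Choi matrices $N$, any fixed $N$ gives a lower bound $\frac{1}{2} + \frac{1}{4}\|\sqrt{N}(W_0 - W_1)\sqrt{N}\|_1$. The sketch below evaluates this bound for one hand-picked example. Everything specific in it is an assumption made for illustration: qubit systems ordered as $(A_I, A_O, B_I, B_O)$, two comb-like process matrices that differ only in the state sent to Alice, and an $N$ built from local "measure in the computational basis and re-prepare" channels (a product of local channels, hence non-signaling).

```python
import numpy as np
from functools import reduce

def kron(*ops):
    return reduce(np.kron, ops)

def psd_sqrt(N):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(N)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def trace_norm(X):
    """Trace norm of a Hermitian matrix (sum of absolute eigenvalues)."""
    return np.abs(np.linalg.eigvalsh((X + X.conj().T) / 2)).sum()

def success_lower_bound(W0, W1, N):
    s = psd_sqrt(N)
    return 0.5 + 0.25 * trace_norm(s @ (W0 - W1) @ s)

I2 = np.eye(2)
vec = np.eye(2).reshape(4)
P = np.outer(vec, vec)                       # Choi matrix of the identity channel A_O -> B_I

rho0 = np.diag([0.8, 0.2])
rho1 = np.diag([0.2, 0.8])
W0 = kron(rho0, P, I2)                       # state rho0 at A_I, identity channel, 1 on B_O
W1 = kron(rho1, P, I2)

# Local channel: measure in the computational basis and re-prepare the outcome.
C = sum(kron(np.outer(e, e), np.outer(e, e)) for e in np.eye(2))
N = kron(C, C)                               # product of Alice's and Bob's channels

norm0 = np.trace(W0 @ N).real                # should equal 1 for a valid pairing
bound = success_lower_bound(W0, W1, N)
helstrom_states = 0.5 + 0.25 * trace_norm(rho0 - rho1)
print(norm0, bound, helstrom_states)
```

For this particular choice the bound coincides with the Holevo-Helstrom value for the two input states; whether a given $N$ is optimal in general has to be decided by the maximization in Eq. (26).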

Discrimination between different classes of process matrices
This section presents some examples of discrimination between different classes of process matrices. We begin our consideration with the problem of discrimination between two free process matrices from $\mathbf{W}^{A||B}$. Next, we consider various cases of discrimination between process matrices representing quantum combs. First, we calculate the exact probability of correct discrimination between two process matrices that come from the same class $\mathbf{W}^{A \prec B}$. Next, we study the discrimination task assuming that one of the process matrices is of the form $W^{A \prec B}$ and the other one is of the form $W^{B \prec A}$. Finally, we construct a particular class of process matrices which can be perfectly distinguished.

Figure 3. A schematic representation of the setup for distinguishing between process matrices $W_0$ and $W_1$. The discrimination strategy is constructed by using the quantum channel $\Phi_K$ and the binary measurement $Q = \{Q_0, Q_1\}$ defined in the proof of Theorem 1.

5.1. Free process matrices. The following consideration confirms the intuition that the task of discrimination between free process matrices reduces to the problem of discrimination between quantum states.
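Let $W_0$ and $W_1$ be two free process matrices of the form $W_0 = \rho \otimes \mathbb{1}$ and $W_1 = \sigma \otimes \mathbb{1}$, where $\rho, \sigma \in \Omega(A_I \otimes B_I)$. From the definition of $p_{\mathrm{succ}}(W_0, W_1)$ we have

$$p_{\mathrm{succ}}(W_0, W_1) = \frac{1}{2} \max_{S = \{S_0, S_1\}} \left[\mathrm{tr}(W_0 S_0) + \mathrm{tr}(W_1 S_1)\right].$$

Let us observe that $\mathrm{tr}(W_0 S_0) = \mathrm{tr}\left(\rho\, \mathrm{tr}_{A_O B_O} S_0\right)$ and $\mathrm{tr}(W_1 S_1) = \mathrm{tr}\left(\sigma\, \mathrm{tr}_{A_O B_O} S_1\right)$. Since $S_0 + S_1$ is the Choi matrix of a non-signaling channel, $\{\mathrm{tr}_{A_O B_O} S_0, \mathrm{tr}_{A_O B_O} S_1\}$ is a binary measurement and therefore, from the Holevo-Helstrom theorem for quantum states, we have

$$p_{\mathrm{succ}}(W_0, W_1) \le \frac{1}{2} + \frac{1}{4}\|\rho - \sigma\|_1.$$

Now, assume that $E = \{E_0, E_1\}$ is the Holevo-Helstrom measurement (taking $E_0$ and $E_1$ as the projectors onto the positive and negative parts of $\rho - \sigma$, respectively). It suffices to take $S_0 = E_0 \otimes |0\rangle\langle 0|$ and $S_1 = E_1 \otimes |0\rangle\langle 0|$, because $S_0 + S_1 = \mathbb{1} \otimes |0\rangle\langle 0|$ is the Choi matrix of a non-signaling channel. Therefore, the upper bound is attained, which completes the consideration. This consideration is summarized in Corollary 2.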
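The Holevo-Helstrom step used above can be illustrated numerically: for two states $\rho$ and $\sigma$, the projector onto the positive part of $\rho - \sigma$ and its complement form a measurement that reaches $\frac{1}{2} + \frac{1}{4}\|\rho - \sigma\|_1$. The states below are arbitrary two-qubit product states chosen for the example, and the function names are illustrative.

```python
import numpy as np

def trace_norm(X):
    """Trace norm of a Hermitian matrix: the sum of absolute eigenvalues."""
    return np.abs(np.linalg.eigvalsh(X)).sum()

def helstrom_measurement(rho, sigma):
    """Projector onto the positive part of rho - sigma, and its complement."""
    vals, vecs = np.linalg.eigh(rho - sigma)
    pos = vecs[:, vals > 0]
    E0 = pos @ pos.conj().T
    E1 = np.eye(rho.shape[0]) - E0
    return E0, E1

# Hypothetical states on A_I (x) B_I (two qubits).
a = np.array([[0.9, 0.1], [0.1, 0.1]])
b = np.array([[0.5, -0.2], [-0.2, 0.5]])
rho = np.kron(a, b)
sigma = np.kron(b, a)

E0, E1 = helstrom_measurement(rho, sigma)
p_measured = 0.5 * np.trace(E0 @ rho).real + 0.5 * np.trace(E1 @ sigma).real
p_bound = 0.5 + 0.25 * trace_norm(rho - sigma)
print(p_measured, p_bound)
```

The two printed numbers agree because $\mathrm{tr}(\rho - \sigma) = 0$, so the positive eigenvalues of $\rho - \sigma$ sum to exactly half of its trace norm.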
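Corollary 2. Let $\rho, \sigma \in \Omega(A_I \otimes B_I)$ be quantum states and let $W_0, W_1 \in \mathbf{W}^{A||B}$ be two free process matrices of the form $W_0 = \rho \otimes \mathbb{1}$ and $W_1 = \sigma \otimes \mathbb{1}$. Then,

$$p_{\mathrm{succ}}(W_0, W_1) = \frac{1}{2} + \frac{1}{4}\|\rho - \sigma\|_1.$$

5.2. Process matrices representing quantum combs. Here, we will compare the probability of correct discrimination between two process matrices being quantum combs of the form $W_0^{A \prec B}, W_1^{A \prec B} \in \mathbf{W}^{A \prec B}$ by using a non-signalling strategy $S = \{S_0, S_1\}$ described by Eq. (54) or an adaptive strategy.

Before that, we will discuss the issue of adaptive strategy. The most general strategy of quantum operations discrimination is known as an adaptive strategy [19,34]. An adaptive strategy is realized by a quantum tester [1]. A schematic representation of this setup is presented in Fig. 4.

Let us consider a quantum tester $\{L_0, L_1\}$. The probability of correct discrimination between $W_0^{A \prec B}$ and $W_1^{A \prec B}$ by using an adaptive strategy is defined by

$$p_{\mathrm{adapt}}\left(W_0^{A \prec B}, W_1^{A \prec B}\right) := \frac{1}{2} \max_{\{L_0, L_1\}} \left[\mathrm{tr}\left(W_0^{A \prec B} L_0\right) + \mathrm{tr}\left(W_1^{A \prec B} L_1\right)\right]. \qquad (34)$$

It turns out that we do not need adaptation in order to obtain the optimal probability of distinction. This is stated formally in the following theorem.

Proof. For simplicity, we will omit the superscripts (A ≺ B and B ≺ A). The inequality $p_{\mathrm{succ}}(W_0, W_1) \le p_{\mathrm{adapt}}(W_0, W_1)$ is trivial, since the maximum in $p_{\mathrm{adapt}}$ is taken over a larger set.

Figure 4. A schematic representation of an adaptive strategy discriminating two process matrices $W_0, W_1 \in \mathbf{W}^{A \prec B}$ by using a quantum tester $\{L_0, L_1\}$.

To show $p_{\mathrm{succ}}(W_0, W_1) \ge p_{\mathrm{adapt}}(W_0, W_1)$, let us consider the quantum tester $\{L_0, L_1\}$ which maximizes Eq. (34), that means

$$p_{\mathrm{adapt}}(W_0, W_1) = \frac{1}{2}\left[\mathrm{tr}(W_0 L_0) + \mathrm{tr}(W_1 L_1)\right].$$

Hence, from the definition of $\mathbf{W}^{A \prec B}$ we have $W_k = {}_{B_O}W_k$ for $k = 0, 1$, and then we obtain

$$\mathrm{tr}(W_k L_k) = \mathrm{tr}\left({}_{B_O}W_k \, L_k\right) = \mathrm{tr}\left(W_k \; {}_{B_O}L_k\right).$$

Observe that $\mathrm{tr}_{B_O}(L_0 + L_1) = \mathbb{1}_{B_I} \otimes J$, where $J$ is the Choi matrix of a channel $\Phi_J : L(A_I) \to L(A_O)$. Let us define a strategy $S = \{S_0, S_1\}$ such that

$$S_k = {}_{B_O}L_k = \frac{\mathbb{1}_{B_O}}{\dim(B_O)} \otimes \mathrm{tr}_{B_O} L_k, \qquad k = 0, 1.$$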

It is easy to observe that $S_0 + S_1 \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O)$.

It implies that

$$p_{\mathrm{succ}}(W_0, W_1) \ge p_{\mathrm{adapt}}(W_0, W_1), \qquad (41)$$

which completes the proof.

5.3. Process matrices of the form $W^{A \prec B}$ and $W^{B \prec A}$. Now, we present some results for the discrimination task assuming that one of the process matrices is of the form $W^{A \prec B}$ and the other one is of the form $W^{B \prec A}$. We will construct a particular class of such process matrices for which perfect discrimination is possible.
Let us define a process matrix of the form

$$W^{A \prec B} = \rho \otimes |U\rangle\rangle\langle\langle U| \otimes \mathbb{1}, \qquad (42)$$

where $\rho \in \Omega(A_I)$, $|U\rangle\rangle\langle\langle U|$ is the Choi matrix of a unitary channel $\mathrm{Ad}_U : L(A_O) \to L(B_I)$ of the form $\mathrm{Ad}_U(X) = U X U^\dagger$, and $\mathbb{1} \in L(B_O)$. A schematic representation of this process matrix is shown in Fig. 5.

Figure 5. A schematic representation of the process matrix $W^{A \prec B}$ given by Eq. (42).

Proposition 1. Let $W^{A \prec B}$ be a process matrix given by Eq. (42). Let us define a process matrix $W^{B \prec A}$ of the form

$$W^{B \prec A} = P_\pi W^{A \prec B} P_\pi,$$

where $P_\pi$ is the swap operator replacing the systems $A_I \to B_I$ and $A_O \to B_O$. Then, the process matrix $W^{A \prec B}$ is perfectly distinguishable from $W^{B \prec A}$.
Proof. Let us consider the process matrix given by Eq. (42) and depicted in Fig. 5. Without loss of generality, let $d$ be the dimension of each of the systems. Let $\rho = \sum_{i=0}^{d-1} \lambda_i |x_i\rangle\langle x_i|$, where $\lambda_i \ge 0$ and $\sum_i \lambda_i = 1$. Based on the spectral decomposition of $\rho$, we create the unitary matrix $V$ whose $i$-th column is the $i$-th eigenvector of $\rho$, and the measurement $\Delta_V$ in the eigenbasis of $\rho$. We also make use of a permutation matrix. Alice and Bob prepare their discrimination strategy. Alice performs the local channel $\Phi_A$ given by Eq. (45) (see Fig. 6). Meanwhile, Bob performs his local channel $\Phi_B$ given by Eq. (46) (see Fig. 7).

Figure 6. A schematic representation of Alice's discrimination strategy described by Eq. (45).

Figure 7. A schematic representation of Bob's discrimination strategy described by Eq. (46).

Let us consider the case A ≺ B. After Alice's action, the quantum channel $\mathrm{Ad}_U$ is applied (see Fig. 5), then Bob applies his channel, and finally the partial trace is taken over the subspace $B_O$ (see Fig. 5). In the state obtained after the discrimination scenario in the case A ≺ B, if Alice measures her system, she obtains the label $i$ with probability $\lambda_i$, whereas Bob obtains the label $(i + 1) \bmod d$ with the same probability. On the other hand, in the case B ≺ A, Bob and Alice obtain the same label $(i + 1) \bmod d$ with probability $\lambda_i$. The quantum channel $\Phi_K$ (realizing the discrimination strategy $S$) is created as a tensor product of Alice's and Bob's local channels, that means $\Phi_K = \Phi_A \otimes \Phi_B$. Afterwards, they perform the binary measurement $Q = \{Q_0, Q_1\}$. Since the two labels differ in the case A ≺ B and coincide in the case B ≺ A, this measurement tells the two cases apart with certainty. In summary, the process matrices $W^{A \prec B}$ and $W^{B \prec A}$ are perfectly distinguishable by Alice and Bob, which completes the proof.

SDP program for calculating the optimal probability of process matrices discrimination
In the standard approach, we would need to compute the probability of correct discrimination between two process matrices $W_0$ and $W_1$. For this purpose, we use semidefinite programming (SDP). This section presents the SDP for calculating the optimal probability of discrimination between $W_0$ and $W_1$.

Recall that the maximum value of such a probability can be written as

$$p_{\mathrm{succ}}(W_0, W_1) = \frac{1}{2} \max_{S = \{S_0, S_1\}} \left[\mathrm{tr}(W_0 S_0) + \mathrm{tr}(W_1 S_1)\right] \qquad (54)$$

with the requirement that the optimal strategy $S = \{S_0, S_1\}$ is a quantum instrument such that $S_0 + S_1 \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O)$. Hence, we arrive at the primal and dual problems presented in Program 1. To solve this problem we used the Julia programming language along with the quantum package QuantumInformation.jl [35] and SDP optimization via the SCS solver [36,37] with absolute convergence tolerance $10^{-5}$. The code is available on GitHub [38]. It may happen that the values of the primal and dual programs are equal. This situation is called strong duality. Slater's theorem provides a set of conditions which guarantee strong duality [33]. It can be shown that Program 1 fulfills the conditions of Slater's theorem (it suffices to take $Y_0, Y_1 = 0$ and $\alpha > \frac{1}{2} \max\{\Lambda_{\max}(W_0), \Lambda_{\max}(W_1)\}$, where $\Lambda_{\max}(X)$ is the maximum eigenvalue of $X$). Therefore, we can consider the primal and the dual problem equivalently.

Program 1. Semidefinite program for maximizing the probability of correct discrimination between two process matrices $W_0$ and $W_1$.
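For orientation, the primal problem implied by Eq. (54) can be written out as follows. This is a sketch assembled from the constraints stated in the text, not a reproduction of Program 1: the non-signaling requirement on $S_0 + S_1$ is expanded under the assumption that it consists of the two partial-trace conditions together with the channel normalization.

```latex
\begin{aligned}
\text{maximize:}\quad   & \tfrac{1}{2}\,\mathrm{tr}(W_0 S_0) + \tfrac{1}{2}\,\mathrm{tr}(W_1 S_1) \\
\text{subject to:}\quad & S_0 \ge 0, \quad S_1 \ge 0, \\
                        & \mathrm{tr}_{A_O}(S_0 + S_1) = \frac{\mathbb{1}_{A_I}}{\dim(A_I)} \otimes \mathrm{tr}_{A_I A_O}(S_0 + S_1), \\
                        & \mathrm{tr}_{B_O}(S_0 + S_1) = \frac{\mathbb{1}_{B_I}}{\dim(B_I)} \otimes \mathrm{tr}_{B_I B_O}(S_0 + S_1), \\
                        & \mathrm{tr}(S_0 + S_1) = \dim(A_I)\cdot\dim(B_I).
\end{aligned}
```

All constraints are linear in $(S_0, S_1)$ apart from the two positivity conditions, which is what makes the problem a semidefinite program.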

Distance between process matrices
In this section we present the semidefinite programs for calculating the distance in trace norm between a given process matrix $W \in \mathbf{W}^{\mathrm{PROC}}$ and different subsets of process matrices, such as $\mathbf{W}^{A||B}$, $\mathbf{W}^{A \prec B}$, $\mathbf{W}^{B \prec A}$ or $\mathbf{W}^{\mathrm{SEP}}$.

For example, let us consider the case $\mathbf{W}^{A||B}$. Theoretically, the distance between a process matrix $W$ and the set of free process matrices $\mathbf{W}^{A||B}$ is expressed in Eq. (55) as a minimization over all elements of $\mathbf{W}^{A||B}$. Analogously, for the sets $\mathbf{W}^{A \prec B}$, $\mathbf{W}^{B \prec A}$ and $\mathbf{W}^{\mathrm{SEP}}$ the minimization runs over the respective set. Using the results of the previous section (see Program 1) and Slater's theorem, we can rewrite Eq. (55) as the SDP problem presented in Program 2. We use SDP optimization via the SCS solver [36,37] with absolute convergence tolerance $10^{-8}$ and relative convergence tolerance $10^{-8}$. The implementations of the SDPs in the Julia language are available on GitHub [38].

Program 2. Semidefinite program for computing the distance between a process matrix $W$ and $\Upsilon$, which can be one of the sets $\mathbf{W}^{A||B}$, $\mathbf{W}^{A \prec B}$, $\mathbf{W}^{B \prec A}$ or $\mathbf{W}^{\mathrm{SEP}}$. Depending on the selected set, we include additional constraints in the SDP, described by Eq. (16) for $\mathbf{W}^{A||B}$, Eq. (17) for $\mathbf{W}^{A \prec B}$ and $\mathbf{W}^{B \prec A}$, or Eq. (18) for $\mathbf{W}^{\mathrm{SEP}}$.

Example. Let A
Let us consider a causally non-separable process matrix that comes from [5], of the form

$$W^{\mathrm{CNS}} = \frac{1}{4}\left[\mathbb{1}_{A_I A_O B_I B_O} + \frac{1}{\sqrt{2}}\left(\sigma_z^{A_O}\sigma_z^{B_I} + \sigma_z^{A_I}\sigma_x^{B_I}\sigma_z^{B_O}\right)\right], \qquad (56)$$

where $\sigma_x^{X}$, $\sigma_z^{X}$ are Pauli matrices on the space $L(X)$. We have calculated the distance in trace norm between $W^{\mathrm{CNS}}$ and different subsets of process matrices. The numerical computations give us some intuition about the geometry of the set of process matrices. Those results are presented in Fig. 8. Moreover, by using $W^{\mathrm{CNS}}$ given by Eq. (56) it can be shown that the set of all causally non-separable process matrices is not convex. To show this fact, it suffices to observe that for every $\sigma_i, \sigma_j, \sigma_k, \sigma_l \in \{\sigma_x, \sigma_y, \sigma_z, \mathbb{1}\}$ the following equation holds Simultaneously, the average of the process matrices of the form Eq. (61) distributed uniformly states

Figure 8. A schematic representation of the distances between $W^{\mathrm{CNS}}$ defined in Eq. (56) and the sets $\mathbf{W}^{A||B}$, $\mathbf{W}^{A \prec B}$, $\mathbf{W}^{B \prec A}$ and $\mathbf{W}^{\mathrm{SEP}}$.
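The non-convexity argument can be explored with a short numerical experiment. The sketch below makes several assumptions that should be read as such: the process matrix is taken to be the well-known example of [5], $W = \frac{1}{4}[\mathbb{1} + \frac{1}{\sqrt{2}}(\sigma_z^{A_O}\sigma_z^{B_I} + \sigma_z^{A_I}\sigma_x^{B_I}\sigma_z^{B_O})]$ on four qubits ordered as $(A_I, A_O, B_I, B_O)$; the family referred to as Eq. (61) is assumed to consist of conjugations of $W$ by local Pauli operators; and validity is tested with the standard linear conditions of [26]. The code checks that $W$ is a valid process matrix, that it violates the defining invariance of both comb classes, and that the uniform average over all local Pauli conjugations is $\mathbb{1}/4$, which is a free process matrix.

```python
import numpy as np
from functools import reduce
from itertools import product

DIMS = (2, 2, 2, 2)            # assumed ordering (A_I, A_O, B_I, B_O)
AI, AO, BI, BO = 0, 1, 2, 3

def kron(*ops):
    return reduce(np.kron, ops)

def reduce_op(W, systems, dims=DIMS):
    """Trace out the listed subsystems and replace each by 1/dim."""
    n = len(dims)
    T = W.reshape(dims + dims)
    for s in systems:
        d = dims[s]
        traced = np.trace(T, axis1=s, axis2=n + s)
        T = np.moveaxis(np.multiply.outer(traced, np.eye(d) / d), [-2, -1], [s, n + s])
    D = int(np.prod(dims))
    return T.reshape(D, D)

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)

W = (kron(I2, I2, I2, I2) + (kron(I2, Z, Z, I2) + kron(Z, I2, X, Z)) / np.sqrt(2)) / 4

valid = bool(
    np.linalg.eigvalsh(W).min() > -1e-9
    and np.isclose(np.trace(W).real, 4.0)
    and np.allclose(reduce_op(W, [AI, AO]), reduce_op(W, [AI, AO, BO]))
    and np.allclose(reduce_op(W, [BI, BO]), reduce_op(W, [AO, BI, BO]))
    and np.allclose(W, reduce_op(W, [AO]) + reduce_op(W, [BO]) - reduce_op(W, [AO, BO]))
)
not_a_before_b = not np.allclose(W, reduce_op(W, [BO]))   # fails W = _{B_O} W
not_b_before_a = not np.allclose(W, reduce_op(W, [AO]))   # fails W = _{A_O} W

# Uniform average over all 4^4 local Pauli conjugations of W.
paulis = [I2.astype(complex), X, Y, Z]
avg = sum(kron(*p) @ W @ kron(*p).conj().T for p in product(paulis, repeat=4)) / 4 ** 4
avg_is_free = np.allclose(avg, np.eye(16) / 4)
print(valid, not_a_before_b, not_b_before_a, avg_is_free)
```

Failing both comb conditions is consistent with, but does not by itself prove, causal non-separability, since a separable process matrix may be a nontrivial mixture of the two orders; the separability question is the one addressed by the SDP distance to $\mathbf{W}^{\mathrm{SEP}}$.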

Convex cone structure theory
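From a geometrical point of view, we present an alternative way to derive Eq. (26). It turns out that the task of process matrix discrimination is strictly connected with the convex cone structure theory. To keep this work self-contained, the details of the convex cone structure theory are presented in Appendix A.

Let $V$ be a finite dimensional real vector space with a proper cone $C \subset V$. A base $B$ of the proper cone $C$ is a compact convex subset $B \subset C$ such that each nonzero element $c \in C$ has a unique representation in the form $c = \alpha \cdot b$, where $\alpha > 0$ and $b \in B$. The base $B$ determines the corresponding base norm in $V$, and it was shown in [34, Corollary 2] that this base norm admits an equivalent expression; both are recalled in Appendix A. Consider the linear subspace $S \subset V$ of operators satisfying the linear constraints of Definition 1, together with its proper cone $C_S$. Observe that if we fix the trace of $W \in C_S$ such that $\mathrm{tr}(W) = \dim(A_O) \cdot \dim(B_O)$, we obtain the set of all process matrices $\mathbf{W}^{\mathrm{PROC}}$. Hence, $\mathbf{W}^{\mathrm{PROC}}$ is a base of $C_S$.

Proposition 2. Let $\mathbf{W}^{\mathrm{PROC}}$ be the set of process matrices. Then, the set $\widetilde{\mathbf{W}}^{\mathrm{PROC}} = \{X \ge 0 : \mathrm{tr}(WX) = 1 \text{ for all } W \in \mathbf{W}^{\mathrm{PROC}}\}$ is determined by

$$\widetilde{\mathbf{W}}^{\mathrm{PROC}} = \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O).$$

Proof. We want to prove that $\mathrm{tr}(WX) = 1$ for all $W \in \mathbf{W}^{\mathrm{PROC}}$ if and only if $X \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O)$. Let us first take $X \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O)$. Then, from [22, Lemma 1], we note

$$X = \sum_i \lambda_i \, S_i \otimes T_i,$$

where $S_i$ and $T_i$ are Choi matrices of quantum channels and $\lambda_i \in \mathbb{R}$ satisfy $\sum_i \lambda_i = 1$. From the definition of a process matrix and linearity we obtain

$$\mathrm{tr}(WX) = \sum_i \lambda_i \, \mathrm{tr}\left(W (S_i \otimes T_i)\right) = \sum_i \lambda_i = 1.$$

To prove the opposite implication, let us take $W = \mathbb{1}_{A_O} \otimes J$, where $J$ is the Choi matrix of a quantum channel. From [39] we have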

Conclusion and discussion
In this work, we studied the problem of single-shot discrimination between process matrices. Our aim was to provide an exact expression for the optimal probability of correct distinction and quantify it in terms of the trace norm. This value is maximized over all Choi operators of non-signaling channels and poses a direct analogue of the Holevo-Helstrom theorem for quantum channels. In addition, we have presented an alternative way to achieve this expression by using the convex cone structure theory. As a valuable by-product, we have also found the optimal realization of the discrimination task for process matrices that use such non-signalling channels. Additionally, we expressed the discrimination task as a semidefinite program (SDP). Using this formulation, we have created an SDP calculating the distance between process matrices and expressed it in terms of the trace norm. Moreover, we found an analytical result for discrimination of free process matrices. It turns out that the task of discrimination between free process matrices can be reduced to the task of discrimination between quantum states. Next, we considered the problem of discrimination for process matrices corresponding to quantum combs. We have studied which strategy, adaptive or non-signalling, should be used during the discrimination task. We proved that no matter which strategy is chosen, the optimal probability of distinguishing two process matrices that are quantum combs is the same. So, it turned out that we do not need to use any additional adaptive processing in this case. Finally, we discovered a particular class of process matrices having opposite causal order, which can be distinguished perfectly. This work paves the way toward a complete description of necessary and sufficient criteria for perfect discrimination between process matrices. Moreover, it poses a starting point to fully describe the geometry of the set of process matrices, particularly causally non-separable process matrices.

Next, consider a linear space with a fixed inner product. If $\mathcal{X}$ is an inner product space, then the Riesz representation theorem [40] states that the inner product determines an isomorphism between $\mathcal{X}$ and $\mathcal{X}^*$. Therefore, the cone C is equal to C*.

An interior point $e \in \mathrm{int}(C)$ of a cone $C$ is called an order unit if for each $x \in \mathcal{X}$ there exists $\lambda > 0$ such that $\lambda e - x \in C$. A base of $C$ is defined as a compact and convex subset $B \subset C$ such that for every $z \in C \setminus \{0\}$ there exist a unique $t > 0$ and an element $b \in B$ such that $z = tb$. A base of $C$ is determined by an element $e$ if and only if $e$ is an order unit and $e \in \mathrm{int}(C^*)$. Finally, the base defines the base norm, and it can be shown [34] that the base norm admits an equivalent expression.

$A_O$ for Alice, and analogously $B_I$ and $B_O$ for Bob. In general, a label $i$, denoting Alice's measurement outcome, is associated with the CP map $\Phi_{M_i^A}$ obtained from the instrument $\{\Phi_{M_i^A}\}_i$. Analogously, Bob's measurement outcome $j$ is associated with the map $\Phi_{M_j^B}$ from the instrument $\{\Phi_{M_j^B}\}_j$.

Figure 2. A schematic representation of the sets of process matrices $\mathbf{W}^{\mathrm{PROC}}$.

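Theorem 2. Let $W_0^{A \prec B}, W_1^{A \prec B} \in \mathbf{W}^{A \prec B}$ be two process matrices representing quantum combs A ≺ B. Then,

$$p_{\mathrm{succ}}\left(W_0^{A \prec B}, W_1^{A \prec B}\right) = p_{\mathrm{adapt}}\left(W_0^{A \prec B}, W_1^{A \prec B}\right).$$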

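$$\mathrm{tr}_{A_O} X = \mathbb{1}_{A_I} \otimes P,$$

where $P \in \mathrm{Pos}(B_I \otimes B_O)$. Similarly, if we take $W := \mathbb{1}_{B_O} \otimes K$, where $K$ is the Choi matrix of a quantum channel $\Phi_K : L(A_I \otimes A_O) \to L(B_I)$, we obtain

$$\mathrm{tr}_{B_O} X = \mathbb{1}_{B_I} \otimes P, \qquad (74)$$

where $P \in \mathrm{Pos}(A_I \otimes A_O)$. It implies that $X \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O)$, which completes the proof. Due to Proposition 2, we immediately obtain the following corollary.

Corollary 3. The base norm $\|\cdot\|_{\mathbf{W}^{\mathrm{PROC}}}$ between two process matrices $W_1, W_2 \in \mathbf{W}^{\mathrm{PROC}}$ can be expressed as

$$\|W_1 - W_2\|_{\mathbf{W}^{\mathrm{PROC}}} = \max\left\{\left\|\sqrt{N}(W_1 - W_2)\sqrt{N}\right\|_1 : N \in \mathrm{NS}(A_I \otimes A_O \otimes B_I \otimes B_O)\right\}. \qquad (75)$$

where $\widetilde{B} = \{\widetilde{b} \in C : \mathrm{tr}(b\,\widetilde{b}) = 1, \ \forall b \in B\}$.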