Tight security bounds for decoy-state quantum key distribution

BB84 quantum key distribution (QKD) combined with the decoy-state method is currently the most practical protocol, and it has been proved secure against general attacks in the finite-key regime. Statistical fluctuation analysis is central to handling finite-key effects: it directly affects the secret key rate, the secure transmission distance and, most importantly, the security. Two statistical fluctuation tasks arise in decoy-state BB84 QKD. One is bounding the deviation between the expected value and the observed value, given either the expected value or the observed value. The other is bounding the deviation between the phase error rate of the computational basis and the bit error rate of the dual basis. Here, we provide rigorous and optimal analytic formulas for both tasks, resulting in a higher secret key rate and a longer secure transmission distance. Our results can be widely applied to statistical fluctuation analysis in quantum cryptography protocols.

www.nature.com/scientificreports/
method and that obtained from the Gaussian analysis. In order to close this gap, the inverse solution Chernoff bound method 26 was presented, which achieves a performance similar to that of the Gaussian analysis. Here, we should point out that the inverse solution Chernoff bound method also does not seem to be rigorous. An important assumption in the Chernoff bound is that one has prior knowledge of the expected value. However, the problem at hand is the opposite: we need to estimate the expected value for a given observed value. This is why the multiplicative form of the Chernoff bound is somewhat complex and carefully tailored. A direct criterion is that the lower bound given by the inverse solution Chernoff bound is superior to that of the Gaussian analysis when the observed value is small. Note that the result of the Gaussian analysis should be optimal, because the identically distributed assumption is a special case. For the BB84 protocol, one needs to bound the conditional smooth min-entropy 27, which relates to the phase error rate. The phase error rate cannot be observed directly; for security against general attacks, it can only be estimated by using the theory of random sampling without replacement. A hypergeometric distribution method 28 was first proposed to deal with the deviation between the phase error rate of the computational basis and the bit error rate of the dual basis in the finite-key regime. By using the inequality scaling technique, a numerical equation based on the Shannon entropy function 29 was obtained to estimate the phase error rate. Based on this, an analytical solution was obtained for the case of large data size 25. A looser analytical solution uses the Serfling inequality 24. By exploiting the Ahrens map for the hypergeometric distribution, one can use the Clopper-Pearson confidence interval 30 to replace the Serfling inequality.
Recently, a specifically tailored analytical solution was obtained 31, which achieves a large advantage over the Serfling inequality. Here, we should point out that this specifically tailored analytical solution 31 for random sampling without replacement is incorrect: the inequality scaling of the binomial coefficient and Eq. (11) in the supplementary information of Ref. 31 are wrong.
In order to further improve the secret key rate in the high-loss case, some of us have developed the tightest method to solve the above two statistical fluctuation tasks 32. In that method, the numerical equation of the Chernoff bound is used to estimate the observed value for a given expected value. A numerical equation of a variant of the Chernoff bound is exploited to obtain the expected value for a given observed value. A numerical equation relating to the hypergeometric distribution is directly applied to acquire the phase error rate for a given bit error rate. These numerical solutions are very tight, but they are very inconvenient to use. On the one hand, optimizing the system parameters globally by solving transcendental equations is very time consuming. On the other hand, solving transcendental equations in every round of post-processing is a challenge for the hardware of a commercial QKD system. In this work, we present optimal analytical formulas that solve the two statistical fluctuation tasks by using the rigorous inequality scaling technique. Furthermore, we establish the complete finite-key analysis for decoy-state BB84 QKD with composable security. The simulation results show that the secret key rate and secure transmission distance of our method have a significant advantage over previous rigorous methods.

Statistical fluctuation analysis.
We let $x^*$ be the expected value, $x$ be the observed value, and $\underline{x}$ and $\overline{x}$ be the lower and upper bounds of $x$. Here, we first introduce the numerical equation results of Ref. 32. Then we present tight analytical formulas obtained with the rigorous inequality scaling technique, which give slightly looser bounds than those obtained by solving the equations.
Random sampling without replacement. Let $X_{n+k} := \{x_1, x_2, \ldots, x_{n+k}\}$ be a string of $n+k$ binary bits, in which the number of bits with value 1 is unknown. Let $X_k$ be a random sample (without replacement) of $k$ bits from $X_{n+k}$, and let $\lambda$ be the probability of bit value 1 observed in $X_k$. Let $X_n$ be the remaining bit string, where the probability of bit value 1 observed in $X_n$ is $\chi$. In this article, we let $C_i^j = \frac{i!}{j!(i-j)!}$ be the binomial coefficient. For any $\epsilon > 0$, we have the upper tail $\Pr[\chi \ge \lambda + \gamma^U] \le \epsilon$, where $\gamma^U$ represents $\gamma^U(n, k, \lambda, \epsilon)$ and $\gamma^U$ is the positive root of Eq. (1) 32. Solving Eq. (1) gives numerical results for $\gamma^U$, corresponding to the upper bound of the random sampling without replacement. Solving the transcendental Eq. (1) is usually very complicated. Here, we use some mathematical techniques to obtain a rigorous and tight analytical result; the detailed proof can be found in the "Methods" section. For the upper tail, let $0 < \lambda < \chi \le 0.5$; we have the analytical result of Eq. (2), where $A = \max\{n, k\}$ and $G = \frac{n+k}{nk} \ln \frac{n+k}{2\pi nk\lambda(1-\lambda)\epsilon^2}$. Therefore, the upper bound of $\chi$ can be given by $\overline{\chi} = \lambda + \gamma^U$ with a failure probability $\epsilon$. Figure 1 shows the comparison between our method and previous methods 24-26,32, which indicates that our analytic result is optimal and close to the numerical results.
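The closed form of Eq. (2) is not legible in this text, so the following Python sketch rests on an assumption: it takes $\gamma^U$ to be the positive root of $\gamma^2 = G\,z(1-z)$ with $z = \lambda + \frac{A}{n+k}\gamma$, which is consistent with the definitions of $A$ and $G$ given above. The function name `gamma_upper` is hypothetical, and the expression should be checked against the published Eq. (2) before use.

```python
import math


def gamma_upper(n: int, k: int, lam: float, eps: float) -> float:
    """Assumed analytic form of gamma^U(n, k, lambda, eps); not verified against Eq. (2).

    n   : size of the remaining string X_n
    k   : size of the random sample X_k
    lam : observed probability of bit value 1 in X_k (0 < lam < 0.5)
    eps : failure probability
    """
    if not 0.0 < lam < 0.5:
        raise ValueError("lam must lie in (0, 0.5)")
    a = max(n, k) / (n + k)  # A / (n + k), with A = max{n, k} as in the text
    # G as defined in the text
    G = (n + k) / (n * k) * math.log(
        (n + k) / (2 * math.pi * n * k * lam * (1 - lam) * eps ** 2)
    )
    if G <= 0:
        raise ValueError("G must be positive for the bound to be meaningful")
    # Positive root of (1 + a^2 G) g^2 - (1 - 2 lam) a G g - lam (1 - lam) G = 0
    numerator = (1 - 2 * lam) * a * G + math.sqrt(a * a * G * G + 4 * lam * (1 - lam) * G)
    denominator = 2 + 2 * a * a * G
    return numerator / denominator
```

Under this assumption, the upper bound on $\chi$ would be `lam + gamma_upper(n, k, lam, eps)`, evaluated without any root finding.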
Chernoff bound. Let $X_1, X_2, \ldots, X_N$ be a set of independent Bernoulli random variables that satisfy $\Pr(X_i = 1) = p_i$ (not necessarily equal), and let $X := \sum_{i=1}^{N} X_i$. The expected value of $X$ is denoted as $x^* := E[X] = \sum_{i=1}^{N} p_i$. An observed value of $X$ in a given trial is denoted as $x$. Note that $x \ge 0$, $x^* \ge 0$, $x^*$ is known and $x$ is unknown. For any $\epsilon > 0$, we have the upper tail $\Pr[x \ge (1 + \delta^U)x^*] \le \epsilon$, where $\delta^U$ represents $\delta^U(x^*, \epsilon)$ and $\delta^U > 0$ is the positive root of an equation given in Ref. 32. Therefore, the upper and lower bounds of the observed value $x$ for a given expected value $x^*$ can be given by
$$\overline{x} = x^* + \frac{\beta}{2} + \sqrt{2\beta x^* + \frac{\beta^2}{4}}$$
and
$$\underline{x} = x^* - \sqrt{2\beta x^*},$$
each with a failure probability $\epsilon$, where $\beta = \ln \epsilon^{-1}$. Note that we must have the lower bound $\underline{x} \ge 0$. The analytic upper bound of Eq. (5) was also obtained in Ref. 26, while the lower bound of Eq. (6) obtained here is tighter.
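As a minimal sketch, the two closed-form bounds on the observed value can be evaluated as follows. The sketch assumes $\beta = \ln \epsilon^{-1}$, which is what the standard Chernoff tails $e^{-\delta^2 x^*/2}$ and $e^{-\delta^2 x^*/(2+\delta)}$ give when set equal to $\epsilon$. The function name `observed_bounds` is hypothetical.

```python
import math


def observed_bounds(x_star: float, eps: float) -> tuple[float, float]:
    """Return (lower, upper) bounds on the observed value x for a known expected value x_star.

    Each bound is meant to hold except with probability eps.
    Assumes beta = ln(1/eps).
    """
    beta = math.log(1.0 / eps)
    upper = x_star + beta / 2 + math.sqrt(2 * beta * x_star + beta ** 2 / 4)
    # The lower bound is clamped at zero, since an observed count cannot be negative.
    lower = max(x_star - math.sqrt(2 * beta * x_star), 0.0)
    return lower, upper
```

For small expected values the lower bound collapses to zero, which is why the clamp matters for vacuum-related quantities.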
Variant of Chernoff bound. Let $X_1, X_2, \ldots, X_N$ be a set of independent Bernoulli random variables that satisfy $\Pr(X_i = 1) = p_i$ (not necessarily equal), and let $X := \sum_{i=1}^{N} X_i$. The expected value of $X$ is denoted as $x^* := E[X] = \sum_{i=1}^{N} p_i$, and an observed value of $X$ in a given trial is denoted as $x$. For any $\epsilon > 0$, we have the upper tail $\Pr[x^* \ge x + \Delta^U] \le \epsilon$, where $\Delta^U$ represents $\Delta^U(x, \epsilon)$ and $\Delta^U$ is the positive root of Eq. (7) 32. For any $\epsilon > 0$, we have the lower tail $\Pr[x^* \le x - \Delta^L] \le \epsilon$, where $\Delta^L$ represents $\Delta^L(x, \epsilon)$ and $\Delta^L$ is the positive root of Eq. (8) 32. By solving Eqs. (7) and (8), we get numerical results for $\Delta^U$ and $\Delta^L$, corresponding to the upper bound and lower bound. Solving the transcendental Eqs. (7) and (8) is usually very complicated. For the upper tail, by applying an inequality for the logarithm in Eq. (7), we obtain the analytical result of Eq. (9). For the lower tail, a similar inequality for the logarithm yields the analytical result of Eq. (10). Therefore, the lower and upper bounds of the expected value $x^*$ for a given observed value $x$ can be given by $\underline{x}^* = x - \Delta^L$ and $\overline{x}^* = x + \Delta^U$, each with a failure probability $\epsilon$. Note that we must have the lower bound $\underline{x}^* \ge 0$. Utilizing a simple function transformation, the numerical result for the upper bound $\overline{x}^*$ from Eq. (7) is the same as that of Ref. 26 (Eq. (28) in this paper), while the analytic result for the upper bound is tighter in this work. The numerical result for the lower bound $\underline{x}^*$ from Eq. (8) differs from that in Ref. 26, and the difference between the two analytic results for the lower bound is only $\beta$. However, we should point out that our result is always inferior to the Gaussian analysis, while the result of Ref. 26 is superior to the Gaussian analysis for a small observed value; details can be found in Fig. 2. Since the Gaussian analysis should be optimal, this means that our result is rigorous while that of Ref. 26 is not. The case of a small observed value is very important, since the vacuum state is widely used in the decoy-state method, especially in experiments on measurement-device-independent QKD 33.
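The closed forms of Eqs. (9) and (10) are not legible in this text. The sketch below therefore assumes $\Delta^U = \beta + \sqrt{2\beta x + \beta^2}$ and $\Delta^L = \frac{\beta}{2} + \sqrt{2\beta x + \frac{\beta^2}{4}}$ with $\beta = \ln \epsilon^{-1}$, a form in which this variant is commonly quoted. Both the expressions and the name `expected_bounds` are assumptions to be checked against the published equations.

```python
import math


def expected_bounds(x: float, eps: float) -> tuple[float, float]:
    """Return (lower, upper) bounds on the expected value x* for an observed value x.

    ASSUMED closed forms (not recoverable from the text above):
      Delta^U = beta + sqrt(2*beta*x + beta^2)
      Delta^L = beta/2 + sqrt(2*beta*x + beta^2/4)
    with beta = ln(1/eps).
    """
    beta = math.log(1.0 / eps)
    delta_u = beta + math.sqrt(2 * beta * x + beta ** 2)
    delta_l = beta / 2 + math.sqrt(2 * beta * x + beta ** 2 / 4)
    # The expected value of a count is non-negative, so the lower bound is clamped at zero.
    return max(x - delta_l, 0.0), x + delta_u
```

With these assumed forms, an observed count of zero (for instance a vacuum-state count) would give the expected-value interval $[0, 2\beta]$.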
Finite-key analysis for decoy-state BB84 QKD. We exploit our statistical fluctuation analysis methods to deal with finite-key effects against coherent attacks 25,34 for BB84 QKD with two decoy states. Note that the four-intensity protocol 35 usually has better performance. Compared with previous results 24-26, we provide the complete extractable secret key formula. For example, the number of vacuum component events, the number of single-photon component events, and the phase error rate associated with the single-photon component events must all be taken as observed values in the extractable secret key formula, while all or part of them are taken as expected values in Refs. 24-26. That they are observed values is evident, for instance, in a QKD system with a single-photon source 27.
Our protocol is based on the asymmetric coding BB84 protocol, in which the bases Z and X are chosen with biased probabilities, both when Alice prepares the quantum states and when Bob measures them. Furthermore, to simplify the protocol slightly, we let the secret key be extracted only when Alice and Bob both choose the Z basis. For the same purpose, the protocol is built on the transmission of phase-randomized laser pulses and makes use of vacuum and weak decoy states. Below we provide a detailed description of the protocol with active basis choice.
1. Preparation. The first three steps are repeated by Alice and Bob for $i = 1, \ldots, N$ until the conditions in the reconciliation step are satisfied. Alice prepares a weak coherent pulse and encodes it in the $\{Z, X\}$ bases, with an intensity $k \in \{\mu, \nu, 0\}$. Let the probabilities of choosing the Z and X bases be $p_z$ and $p_x = 1 - p_z$. The probabilities of selecting the intensities are $p_\mu$, $p_\nu$ and $p_0 = 1 - p_\mu - p_\nu$, respectively. Alice then sends the weak coherent pulse to Bob through the insecure quantum channel.
2. Measurement. When receiving the pulse, Bob also chooses a basis, Z or X, with probabilities $q_z$ and $q_x = 1 - q_z$, respectively. Then, he measures the state with two single-photon detectors in that basis. An effective event is one in which at least one detector clicks. For a double-click event, he assigns a random bit value.
3. Reconciliation. Alice and Bob share the effective event, basis and intensity information with each other over an authenticated classical channel. We use the sets $Z_k$ ($X_k$), which identify the signals where both Alice and Bob select the basis Z (X) for intensity $k$. Then, they check that $|Z_k| \ge n^Z_k$ and $|X_k| \ge n^X_k$ for all values of $k$. They repeat steps 1 to 3 until these conditions are satisfied. We remark that the vacuum state prepared by Alice carries no basis information.
4. Parameter estimation. After reconciling the basis and intensity choices, Alice and Bob select a size of $n^Z = n^Z_\mu + n^Z_\nu$ to get a raw key pair $(Z_A, Z_B)$. All sets are used to compute the number of vacuum events $s^Z_0$, the number of single-photon events $s^Z_1$ and the phase error rate of single-photon events $\phi^Z_1$ in $Z_A$. After that, the condition that the phase error rate $\phi^Z_1$ is less than $\phi_{tol}$ should be met, where $\phi_{tol}$ is a predetermined phase error rate. If it is not met, Alice and Bob abort and start again. Otherwise, they move on to step 5.
5. Postprocessing. First, Alice and Bob perform error correction, in which they reveal at most $\lambda_{EC}$ bits of information. Then, an error-verification step is performed using a random universal$_2$ hash function that announces $\lceil \log_2 \frac{1}{\varepsilon_{cor}} \rceil$ bits of information 36, where $\varepsilon_{cor}$ is the probability that a pair of nonidentical keys passes the error-verification step. Finally, they apply privacy amplification to their keys, using a random universal$_2$ hash function, to get a secret key pair $(S_A, S_B)$, both of which are $\ell$ bits long.
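The reconciliation step can be illustrated with a short Python sketch that sorts detected rounds into the sets $Z_k$ and $X_k$ and checks the size thresholds. The `Round` record, the function names and the choice to file vacuum pulses under Bob's basis are assumptions made for illustration, not part of the protocol specification above.

```python
from collections import Counter
from typing import Iterable, Mapping, NamedTuple, Tuple


class Round(NamedTuple):
    alice_basis: str  # "Z" or "X"; ignored for vacuum pulses, which carry no basis information
    bob_basis: str    # "Z" or "X"
    intensity: str    # "mu", "nu" or "0"
    detected: bool    # True if at least one detector clicked (an effective event)


def sift(rounds: Iterable[Round]) -> Counter:
    """Count effective events per (basis, intensity) set."""
    counts: Counter = Counter()
    for r in rounds:
        if not r.detected:
            continue
        if r.intensity == "0":
            # Assumption: vacuum events are assigned according to Bob's basis only.
            counts[(r.bob_basis, "0")] += 1
        elif r.alice_basis == r.bob_basis:
            counts[(r.bob_basis, r.intensity)] += 1
    return counts


def thresholds_met(counts: Counter, thresholds: Mapping[Tuple[str, str], int]) -> bool:
    """Check |Z_k| >= n^Z_k and |X_k| >= n^X_k for every required set."""
    return all(counts[key] >= need for key, need in thresholds.items())
```

In this sketch, Alice and Bob would keep repeating steps 1 to 3 until `thresholds_met` returns `True`.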
Before stating how to calculate the security bound, we spell out our security criteria, i.e., the so-called universally composable framework 37. Two criteria ($\varepsilon_{cor}$ and $\varepsilon_{sec}$) determine how secure our protocol is. If $\Pr[S_A \neq S_B] \le \varepsilon_{cor}$, which means that the secret keys are identical except with a small probability $\varepsilon_{cor}$, we call the protocol $\varepsilon_{cor}$-correct. Meanwhile, if $(1 - p_{abort}) \|\rho_{AE} - U_A \otimes \rho_E\|_1 / 2 \le \varepsilon_{sec}$, we call it $\varepsilon_{sec}$-secret. Here, $\rho_{AE}$ is the classical-quantum state describing the joint state of $S_A$ and $E$, $U_A$ is the uniform mixture of all possible values of $S_A$, and $p_{abort}$ is the probability that the protocol aborts. This security criterion guarantees that the pair of secret keys is unconditionally safe to use. We call the protocol $\varepsilon$-secure if it is $\varepsilon_{cor}$-correct and $\varepsilon_{sec}$-secret with $\varepsilon_{cor} + \varepsilon_{sec} \le \varepsilon$.
The protocol is $\varepsilon_{sec}$-secret if the secret key length $\ell$ satisfies the key-length bound of Ref. 25, in which $h(x) := -x \log_2 x - (1-x) \log_2 (1-x)$ is the binary Shannon entropy function. Note that the observed values $\underline{s}^Z_0$, $\underline{s}^Z_1$ and $\overline{\phi}^Z_1$ are the lower bound for the number of vacuum events, the lower bound for the number of single-photon events, and the upper bound for the phase error rate associated with the single-photon events in $Z_A$, respectively. Here, we simply assume an error correction leakage $\lambda_{EC} = n^Z \zeta h(E^Z)$, with error correction efficiency $\zeta = 1.22$ and bit error rate $E^Z$ in $(Z_A, Z_B)$.
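As a small illustration of the classical information revealed during postprocessing, the following Python sketch evaluates the binary Shannon entropy, the assumed error correction leakage $n^Z \zeta h(E^Z)$ and the number of error-verification bits $\lceil \log_2 (1/\varepsilon_{cor}) \rceil$. The function names are hypothetical, and the key-length formula itself is not reproduced.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary Shannon entropy h(x) in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def ec_leakage(n_z: float, e_z: float, zeta: float = 1.22) -> float:
    """Error correction leakage n^Z * zeta * h(E^Z), using zeta = 1.22 from the text."""
    return n_z * zeta * binary_entropy(e_z)


def verification_bits(eps_cor: float) -> int:
    """Bits announced in error verification: ceil(log2(1 / eps_cor))."""
    return math.ceil(math.log2(1.0 / eps_cor))
```

For example, a raw key of $10^6$ bits with a 2% bit error rate would leak roughly `ec_leakage(1e6, 0.02)` bits under this model.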
Let $n^Z_k$ and $n^X_k$ be the observed numbers of bits in the sets $Z_k$ and $X_k$. Let $m^Z_k$ and $m^X_k$ denote the observed numbers of bit errors in the sets $Z_k$ and $X_k$. Note that one cannot obtain $m^Z_\mu$ and $m^Z_\nu$ directly; we use them only hypothetically to estimate the error correction information. The bit error rate is $E^Z = (m^Z_\mu + m^Z_\nu)/n^Z$. By using the decoy-state method for finite sample sizes, we can obtain lower bounds on the expected numbers of vacuum events $\underline{s}^{Z*}_0$ and single-photon events $\underline{s}^{Z*}_1$ in $Z_A$, where $\underline{n}^{Z*}_0$ and $\underline{n}^{Z*}_\nu$ ($\overline{n}^{Z*}_\mu$ and $\overline{n}^{Z*}_0$) are the lower (upper) bounds of the expected values associated with the observed values $n^Z_0$ and $n^Z_\nu$ ($n^Z_\mu$ and $n^Z_0$). We can also calculate the lower bound on the expected number of single-photon events $\underline{s}^{X*}_1$ and the upper bound on the expected number of bit errors $\overline{t}^{X*}_1$ associated with the single-photon events in $X_\mu \cup X_\nu$, where we use the fact that the expected value $m^{X*}_0 \equiv n^{X*}_0/2$. The parameters $\underline{n}^{X*}_0$ and $\underline{n}^{X*}_\nu$ ($\overline{n}^{X*}_\mu$, $\overline{n}^{X*}_0$ and $\overline{m}^{X*}_\nu$) are the lower (upper) bounds of the expected values associated with the observed values $n^X_0$ and $n^X_\nu$ ($n^X_\mu$, $n^X_0$ and $m^X_\nu$). The nine expected values $\underline{n}^{Z*}_0$, $\underline{n}^{Z*}_\nu$, $\overline{n}^{Z*}_\mu$, $\overline{n}^{Z*}_0$, $\underline{n}^{X*}_0$, $\underline{n}^{X*}_\nu$, $\overline{n}^{X*}_\mu$, $\overline{n}^{X*}_0$ and $\overline{m}^{X*}_\nu$ can be obtained by using the variant of the Chernoff bound with Eqs. (9) and (10), each with failure probability $\varepsilon_{sec}/23$; for example, $\underline{n}^{Z*}_\nu = n^Z_\nu - \Delta^L(n^Z_\nu, \varepsilon_{sec}/23)$. Once the four expected values $\underline{s}^{Z*}_0$, $\underline{s}^{Z*}_1$, $\underline{s}^{X*}_1$ and $\overline{t}^{X*}_1$ are acquired, one can exploit the Chernoff bound with Eqs. (5) and (6) to calculate the corresponding observed values $\underline{s}^Z_0$, $\underline{s}^Z_1$, $\underline{s}^X_1$ and $\overline{t}^X_1$, each with failure probability $\varepsilon_{sec}/23$; for example, $\underline{s}^Z_1 = \underline{s}^{Z*}_1 (1 - \delta^L(\underline{s}^{Z*}_1, \varepsilon_{sec}/23))$. By using the random sampling without replacement with Eq. (2), one can calculate the upper bound of the hypothetically observed phase error rate $\overline{\phi}^Z_1$ associated with the single-photon events in $Z_A$.
In order to show the performance of our method in terms of the secret key rate and the secure transmission distance, we consider a fiber-based QKD system model with active basis choice in the measurement. We use the widely used parameters of a practical QKD system 38, as listed in Table 1. For a given experiment, one can directly acquire the parameters $n^Z_k$, $n^X_k$, $m^Z_k$ and $m^X_k$. For simulation, we can use the formulas $n^Z_k = N p_k p_z q_z Q^Z_k$, $n^X_k = N p_k p_x q_x Q^X_k$, $m^Z_k = N p_k p_z q_z E^Z_k Q^Z_k$ and $m^X_k = N p_k p_x q_x E^X_k Q^X_k$, where $Q^Z_k$ and $Q^X_k$ are the gains of the Z and X bases when Alice chooses optical pulses with intensity $k$. For the vacuum state, which carries no basis information, these expressions should be reset [...]. $E^Z_k$ and $E^X_k$ are the bit error rates of the Z and X bases when Alice chooses optical pulses with intensity $k$. Without loss of generality, these gain and bit error rate parameters can be given as in Ref. 23, where we assume that the observed values of the different parameters can be represented by their asymptotic values without Eve's disturbance. $\eta = \eta_d \times 10^{-\alpha L/10}$ is the overall efficiency for fiber length $L$, with the single-photon detector efficiency $\eta_d$ and the attenuation coefficient $\alpha$ (Table 1).
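The simulation formulas above translate directly into code. The sketch below is only illustrative: the gains and error rates are passed in as inputs because their channel model is not reproduced in this text, the function names are hypothetical, and the units (attenuation in dB/km, length in km) are an assumption.

```python
def overall_efficiency(eta_d: float, alpha_db_per_km: float, length_km: float) -> float:
    """Overall efficiency eta = eta_d * 10^(-alpha * L / 10).

    Assumes alpha is given in dB/km and L in km.
    """
    return eta_d * 10 ** (-alpha_db_per_km * length_km / 10)


def simulated_counts(N: float, p_k: float, p_z: float, q_z: float,
                     gain_z: float, gain_x: float,
                     err_z: float, err_x: float) -> dict:
    """Asymptotic counts for one non-vacuum intensity k.

    n^Z_k = N p_k p_z q_z Q^Z_k,        n^X_k = N p_k p_x q_x Q^X_k,
    m^Z_k = N p_k p_z q_z E^Z_k Q^Z_k,  m^X_k = N p_k p_x q_x E^X_k Q^X_k.
    The vacuum intensity needs separate handling, which is not covered here.
    """
    p_x, q_x = 1 - p_z, 1 - q_z
    n_z = N * p_k * p_z * q_z * gain_z
    n_x = N * p_k * p_x * q_x * gain_x
    return {"n_z": n_z, "n_x": n_x, "m_z": err_z * n_z, "m_x": err_x * n_x}
```

The resulting counts can then be fed into the statistical fluctuation bounds to simulate a finite-key experiment.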
To show the advantage of our results compared with previous works 24-26, we plot the secret key rate $\ell/N$ as a function of the fiber length, as shown in Fig. 3. For a given number of signals $10^{10}$, which corresponds to only ten seconds in a 1 GHz system, we numerically optimize $\ell/N$ over all the free parameters. For a fair comparison, we add the estimation step from expected value to observed value for all curves, which is not taken into account in Refs. 24,25. The corresponding methods of Refs. 23-26 for dealing with statistical fluctuation are summarized in Methods. Note that the black dashed line uses the Gaussian analysis to obtain the expected value instead of the inverse solution Chernoff bound method 26. The simulation results show that the secret key rate and secure transmission distance of our method have a significant advantage under security against general attacks.
Table 1. List of simulation parameters. $\eta_d$ is the detection efficiency of the single-photon detector, $\zeta$ is the efficiency of error correction, $\alpha$ is the attenuation coefficient of single-mode fiber, $e_d$ is the misalignment rate, and $N$ is the number of optical pulses sent by Alice.
[Figure: secret key rate per pulse; legend: This work, Ref. [18], Ref. [17], Ref. [19] with Gaussian]

Conclusion
In this work, we proposed almost optimal analytical formulas to deal with statistical fluctuation under security against general attacks. The analytical formulas can be conveniently used for classical postprocessing in practical systems, since they do not introduce complex, resource-consuming calculations. Our methods directly increase the performance without changing the quantum process, and should be widely applicable to quantum cryptography protocols for handling finite-size effects. In order to compare with previous works, we established the complete finite-key analysis for decoy-state BB84 QKD, including the steps from observed value to expected value, from expected value to observed value, and from the observed bit error of the X basis to the hypothetical observed phase error of the Z basis. We remark that the joint constraint method 39 can further decrease the statistical fluctuation. However, we do not consider it in this paper, because it lacks analytical solutions and is therefore difficult to implement in commercial systems. The secret key rate of decoy-state BB84 QKD scales linearly with the channel transmittance $\eta$, in accordance with the repeaterless PLOB bound 40.

Methods
Proof of random sampling without replacement. Here, we use the technique of Ref. 31 to acquire the correct analytical results. We remark that the result of Ref. 31 [...] $\frac{y(1-y)}{\chi(1-\chi)} < 1$ for $n, k > 0$ and $0 < \lambda < y < \chi \le 0.5$. Thereby, the inequality can be given by [...]. By using the Taylor expansion for the case of $n \ge k$, we have [...] $nh(\chi)$ [...] $2\pi nk\lambda(1-\lambda)/(n+k)$ [...]. Note that the above result is always true for all $n, k > 0$ and $0 < \lambda < \chi \le 0.5$.
Method in Ref. 24. The upper bound of the random sampling without replacement can be calculated by using the Serfling inequality,
$$\gamma^U = \sqrt{\frac{(n+k)(k+1)}{nk^2} \ln \epsilon^{-1}}. \tag{22}$$
To simplify this simulation, we consider the case of $\epsilon = \epsilon_1 = \epsilon_2$. For every observed value $x$, we set $\overline{x}^* = x + \Delta^U$ and $\underline{x}^* = x - \Delta^L$, where $\Delta^U$ and $\Delta^L$ are given by Eq. (23). Note that Eq. (23) is not rigorous for small $x$.
Method in Ref. 25. The upper bound of the random sampling without replacement can be calculated by a formula that is valid only when $n$ and $k$ are large. The upper bound and lower bound of the expected value for a given observed value can be calculated by using the tailored Hoeffding inequality for the decoy-state method. Let $x_k$ be the observed value for intensity $k$ and $X = \sum_k x_k$. Therefore, we have $\overline{x}^*_k = x_k + \Delta^U$ and $\underline{x}^*_k = x_k - \Delta^L$. Note that the deviation is the same for all intensities $k$, which leads to a large fluctuation for small intensities, especially the vacuum state.
Method in Ref. 26. The upper bound of the random sampling without replacement can be calculated by solving a transcendental equation. The upper bound and lower bound of the expected value for a given observed value can be calculated by using the Gaussian analysis. Therefore, we have $\overline{x}^* = x + \Delta^U$ and $\underline{x}^* = x - \Delta^L$, with deviations expressed through the inverse complementary error function, where $a = \mathrm{erfcinv}(b)$ is the inverse function of $b = \mathrm{erfc}(a)$ and $\mathrm{erfc}(a) = \frac{2}{\sqrt{\pi}} \int_a^\infty e^{-t^2}\,dt$ is the complementary error function.
Furthermore, the upper bound and lower bound of the expected value for a given observed value can also be calculated by using the inverse solution Chernoff bound. Therefore, we have $\overline{x}^* = x/(1 - \delta^U)$ and $\underline{x}^* = x/(1 + \delta^L)$, where $\delta^U$ and $\delta^L$ can be obtained by solving a transcendental equation, Eq. (28).
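The Gaussian analysis above relies on the inverse complementary error function, which the Python standard library does not provide. A minimal sketch, assuming plain bisection on `math.erfc` is accurate enough for simulation purposes, is given below; the function name `erfcinv` mirrors the notation of the text and is not a standard-library call.

```python
import math


def erfcinv(b: float) -> float:
    """Return a such that erfc(a) = b, for 0 < b < 2, by bisection.

    erfc is strictly decreasing, so the bracket [-10, 10] covers every b
    between roughly 2e-45 and 2; smaller b would need a wider bracket.
    """
    if not 0.0 < b < 2.0:
        raise ValueError("b must lie in (0, 2)")
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if math.erfc(mid) > b:
            lo = mid  # erfc(mid) too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2
```

A library routine such as `scipy.special.erfcinv` could be used instead where SciPy is available.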