## Introduction

The amount of information that can be transmitted through a physical channel depends on the fundamental properties of the channel1,2 and the physical states used as information carriers. Recent work has shown that coherent states of light, routinely produced by lasers, can achieve the ultimate limits of information transfer, classical capacity, in communication channels with loss,3 and phase-insensitive Gaussian noise.4,5 These results provide strong support for using coherent states as the centerpiece for current and future developments of optical communication networks.6,7,8 Moreover, beyond the realm of classical communications, coherent states have been shown to be of great practical use for quantum communications,9,10 including quantum key distribution,11,12,13,14,15,16,17,18,19,20 quantum digital signatures,21 and quantum fingerprinting.22,23 However, despite the theoretical breakthroughs in identifying the capacities for phase-insensitive Gaussian channels, finding the ultimate information rates for other channels, such as noisy channels with a specific non-Gaussian noise that may be encountered in different situations, is still an open problem. Moreover, even in channels for which capacity is known, reaching this ultimate rate for reliable communications requires finding the optimal encoding schemes and optimal measurements over the physical information carriers.24,25 Furthermore, finding optimal encodings and measurements to maximize information transfer in a specific channel with fundamental noise, in addition to technical noise in real devices, would represent a large advance in our understanding of the limits in realistic optical communications.

Quantum mechanics in principle allows for constructing measurements for coherent states surpassing the classical limits of sensitivity and information transfer.2,26 Discrimination strategies for coherent states based on optimized measurements with photon counting have been proposed18,27,28,29,30,31,32,33,34 and demonstrated35,36,37,38,39,40,41,42,43,44 to surpass the conventional limits of detection, the quantum noise limit (QNL), and approach the ultimate quantum limit, the Helstrom bound.26 These nonconventional measurements can enhance information transfer in optical communications43,45 and surpass the classical limits of information transfer using joint measurements over sequences of coherent states.24 Furthermore, photon-counting measurements can be optimized to provide inherent robustness against noise and imperfections of realistic systems in communications.42,46 While these optimized measurements can enhance sensitivities and information transfer with coherent states, the fundamental noise intrinsic in the channel can severely degrade the information encoded in these states. This in turn compromises the potential benefits of these optimized measurements for optical communications.47,48

In this work, we investigate a new approach for optimizing communications in a channel with specific intrinsic noise in addition to unavoidable technical noise, with the goal of maximizing sensitivities and information transfer based on non-Gaussian measurements and coherent states. The central concept of this approach consists of finding optimized communication strategies where measurements and coherent state encodings are jointly optimized to become more robust to the specific noise in the channel, and ultimately maximize sensitivities and information transfer over the noisy channel. As a proof-of-concept demonstration, we investigate optimized communication strategies for communications over a noisy channel with phase diffusion,49,50,51 based on optimized single-shot photon-counting measurements and binary coherent state encodings. Phase diffusion is the most detrimental noise for states of light carrying information in the phase, since it destroys the coherence of the quantum states.52,53,54 We show that an optimized strategy that simultaneously optimizes the non-Gaussian measurement and the binary state alphabet allows for surpassing the limits in performance of an ideal conventional measurement (CM) in terms of probability of error and information transfer per channel use over the non-Gaussian channel.

## Results

### Optimized strategy for a phase diffusion channel

Phase diffusion noise has been extensively investigated in quantum metrology, measurements and communications for phase estimation,49 interferometry,50 state discrimination, and information transfer in communication.55,56,57 This noise is most damaging when information is contained in the coherent properties of the states used as information carriers. In particular, Gaussian phase diffusion makes the task of extracting information more difficult,47,48,49,50,51,52,53,54,58 degrading measurement sensitivities and lowering the achievable information transfer in coherent communications. As a first step for constructing an optimized communication strategy with binary encoding over a channel with phase diffusion, we consider the optimization of the input alphabet to provide robustness to phase diffusion and to other sources of noise and imperfections. This optimization consists of finding the optimal energy distribution in the alphabet to minimize the detrimental effects of phase diffusion, while allowing for measurements to provide high sensitivity.

Figure 1 shows the effect of phase diffusion on three different binary alphabets with coherent states with the same average energy $$\langle n\rangle = \bar n$$: (a) binary phase-shift keying (BPSK) {|−α〉, |+α〉}, with α real and positive; (b) on-off keyed (OOK) alphabet $$\left\{ {|0\rangle ,|\sqrt 2 \alpha \rangle } \right\}$$; and (c) a general binary coherent state alphabet {|α1〉, |α2〉}. We observe that phase diffusion affects equally the states {|−α〉, |+α〉} in the BPSK alphabet, and dramatically reduces their distinguishability, which causes discrimination errors to become very high. On the other hand, when considering the OOK alphabet $$\left\{ {|0\rangle ,|\sqrt 2 \alpha \rangle } \right\}$$, phase diffusion impacts only the state $$|\sqrt 2 \alpha \rangle$$, and leaves the vacuum state |0〉 unaffected. In this case, their distinguishability weakly depends on the phase noise, highlighting the robustness of this alphabet to phase diffusion noise. Therefore, while BPSK has a smaller overlap and better distinguishability than OOK encoding in the absence of phase noise, OOK states have an overlap that is independent of the level of phase diffusion. The optimized alphabet {|α1〉, |α2〉} in Fig. 1c represents a smooth transition and a tradeoff between BPSK with a high degree of distinguishability for low levels of noise, and OOK which is immune to phase diffusion. Figure 1c shows an example of an optimized alphabet {|α1〉, |α2〉}, which is optimized under the average energy constraint $$\bar n = \frac{1}{2}(|\alpha _1|^2 + |\alpha _2|^2)$$ for a given level of the phase noise. The result of this optimization is an alphabet that combines the robustness of OOK with the distinguishability of BPSK.
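The noiseless comparison between the alphabets can be checked with a few lines of Python. The sketch below is illustrative only: it uses the standard coherent-state overlap $$|\langle \alpha _1|\alpha _2\rangle |^2 = e^{ - |\alpha _1 - \alpha _2|^2}$$ and the Helstrom bound for two equiprobable pure states, and it assumes real amplitudes with $$\alpha _1 \le 0 \le \alpha _2$$. The function names (`overlap_sq`, `helstrom_pure`, `binary_alphabet`) are hypothetical helpers, not part of the original work.

```python
import math

def overlap_sq(a1, a2):
    """Squared overlap |<a1|a2>|^2 = exp(-|a1 - a2|^2) of two coherent states."""
    return math.exp(-abs(a1 - a2) ** 2)

def helstrom_pure(a1, a2):
    """Helstrom bound for two equiprobable pure states (no phase noise)."""
    return 0.5 * (1.0 - math.sqrt(1.0 - overlap_sq(a1, a2)))

def binary_alphabet(n_bar, a1):
    """Assumed parametrization: real (a1, a2) with (a1^2 + a2^2) / 2 = n_bar."""
    return a1, math.sqrt(2.0 * n_bar - a1 ** 2)

n_bar = 0.5
bpsk = binary_alphabet(n_bar, -math.sqrt(n_bar))  # a1 = -alpha, a2 = +alpha
ook = binary_alphabet(n_bar, 0.0)                 # a1 = 0, a2 = sqrt(2) * alpha

for name, (a1, a2) in (("BPSK", bpsk), ("OOK", ook)):
    print(name, overlap_sq(a1, a2), helstrom_pure(a1, a2))
```

Sweeping `a1` from $$- \sqrt {\bar n}$$ to 0 in `binary_alphabet` traces the family of alphabets between BPSK and OOK at fixed average energy.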

Optimized non-Gaussian measurements based on photon number resolution (PNR)46 provide robustness against technical noise and imperfections for the discrimination of BPSK states surpassing the QNL. Optimized communication strategies in a non-Gaussian channel with phase diffusion can combine these measurements with an optimized input alphabet in order to minimize the probability of error in the channel. This strategy then optimizes simultaneously the measurement and the alphabet, resulting in a high degree of robustness to phase diffusion while maintaining the benefits of non-Gaussian measurements for surpassing the limits of CMs.

Figure 2a shows the concept of an optimized communication strategy for a binary channel with phase diffusion. The sender (Alice) prepares an input state from a coherent state alphabet {|α1〉, |α2〉}, and sends it to the receiver (Bob) through a non-Gaussian noisy channel. Phase diffusion causes the input states {|αj〉} (j = 1, 2) to become phase-diffused mixed states:54

$$\hat \rho _j(\sigma ) = \mathop {\int}\limits_{ - \infty }^\infty {\frac{{e^{-\frac{\phi ^{2}}{2\sigma ^{2}}}}}{\sqrt {2\pi \sigma ^{2}}}} \left| {\alpha _je^{ - i\phi }} \right\rangle \left\langle {\alpha _je^{ - i\phi }} \right|{\mathrm{d}}\phi,$$
(1)

where the strength of the phase diffusion noise is quantified by the width σ of the Gaussian phase distribution.

At the channel output, the receiver implements an optimized single-shot measurement based on photon counting to discriminate these states with high sensitivity.46 In this strategy, the input state $$\hat \rho _j$$ is displaced in phase space to $$\hat D(\beta )\hat \rho _j(\sigma )\hat D^\dagger (\beta )$$, where the displacement operation, $$\hat D(\beta ) = e^{\beta \hat a^\dagger - \beta ^ \ast \hat a}$$ with $$\hat a$$ ($$\hat a^\dagger$$) as the lowering (raising) operator, is implemented by interference of the input state with a displacement field β in a high transmittance beam splitter.59 Subsequently, the photons in the displaced state are detected by a PNR detector with number resolution m (PNR(m)). Here, m represents the maximum number of photons that a detector can resolve before becoming a threshold detector.46 This measurement strategy uses a maximum a posteriori (MAP) decision rule to infer the input state based on the photon detection outcome k given the mean photon number $$\bar n$$, displacement field |β|, and PNR(m), for a level of phase noise σ.
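As a rough numerical illustration of this measurement, the following Python sketch computes the outcome probabilities of an ideal PNR(m) detector for a displaced phase-diffused state from Eq. (1). It rests on several assumptions that go beyond the text: unit detection efficiency, perfect visibility, no dark counts, real β, and Gauss-Hermite quadrature for the Gaussian phase average. It uses the fact that $$\hat D(\beta )|\alpha \rangle$$ is, up to a phase, the coherent state $$|\alpha + \beta \rangle$$, so photon counts are Poissonian for each fixed phase. The name `pnr_probs` is a hypothetical helper.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss

def pnr_probs(alpha, beta, sigma, m, nodes=80):
    """Ideal PNR(m) outcome probabilities [P(0), ..., P(m-1), P(>= m)]."""
    x, w = hermgauss(nodes)               # nodes/weights for weight exp(-x^2)
    phi = math.sqrt(2.0) * sigma * x      # phases sampled from the Gaussian in Eq. (1)
    weights = w / math.sqrt(math.pi)      # normalized so the weights sum to 1
    # Mean photon number of the displaced state for each phase value
    mean = np.abs(alpha * np.exp(-1j * phi) + beta) ** 2
    probs = []
    for k in range(m):
        poisson_k = np.exp(-mean) * mean ** k / math.factorial(k)
        probs.append(float(np.sum(weights * poisson_k)))
    # Last outcome: the detector saturates at m and acts as a threshold
    probs.append(max(0.0, 1.0 - sum(probs)))
    return np.array(probs)

# Example: state |alpha = 1> displaced by beta = -1, without and with phase noise
print(pnr_probs(1.0, -1.0, 0.0, 3))
print(pnr_probs(1.0, -1.0, 0.5, 3))
```

With this sign convention, β = −α displaces the state |α〉 to vacuum in the absence of phase noise, and phase diffusion leaves residual photons after the displacement.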

The MAP strategy assumes that the correct state is the one with the highest conditional posterior probability $$P(\hat \rho _{j}(\sigma )|\beta ,k,m)$$ obtained through Bayes’ rule:

$$P(\hat \rho _{j}(\sigma )|\beta ,k,m) = \frac{{P(k|\hat \rho _{j}(\sigma ),\beta ,m)P(\hat \rho _{j}(\sigma ))}}{{P(k|m)}}.$$
(2)

Here, P(k|m) is the total probability of detecting k photons given a PNR(m) strategy, and $$P(k|\hat \rho _{j}(\sigma ),\beta ,m)$$ is the conditional probability of detecting k photons given |β| and m. We consider equiprobable input states, so that the prior probabilities become $$P(\hat \rho _{j}(\sigma )) = 0.5$$. The probability of error in the discrimination of the input states for a strategy with PNR(m) is:

$$P_{\mathrm{E}}(\bar n,\{ \hat \rho _j(\sigma )\} ,\beta ,m) = 1 - \frac{1}{2}\sum\limits_{k = 0}^m \max\limits_j (\{ P(k|\hat \rho _j(\sigma ),\beta ,m)\} ).$$
(3)

Here, $$P(k|\hat \rho _j(\sigma ),\beta ,m)$$ is the conditional probability of detecting k photons for the input state $$\hat \rho _j(\sigma )$$ given the displacement |β|, noise level σ, and PNR(m). The error probability PE in Eq. (3) depends on the input alphabet, the intrinsic properties of the channel, and the measurement performed by the receiver. This provides a way to find optimized strategies that simultaneously optimize the alphabet and the measurement to minimize the detrimental effects of the channel noise. The optimized strategies use an optimal displacement $$\hat D(\beta )$$ and an optimal input alphabet {|α1〉, |α2〉} for a given input power $$\bar n$$, PNR(m), and channel noise level σ to minimize the probability of error PE.
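A minimal sketch of how Eq. (3) and the MAP rule can be evaluated numerically is given below. It assumes an idealized receiver (unit efficiency, perfect visibility, no dark counts), real amplitudes, and a Gauss-Hermite average over the phase distribution of Eq. (1); none of these choices are claimed to match the authors' numerical procedure. The helper names `pnr_probs` and `error_probability` are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss

X, W = hermgauss(80)
W = W / math.sqrt(math.pi)   # normalized Gauss-Hermite weights

def pnr_probs(alpha, beta, sigma, m):
    """Ideal PNR(m) probabilities [P(0), ..., P(m-1), P(>= m)] after D(beta)."""
    phi = math.sqrt(2.0) * sigma * X
    mean = np.abs(alpha * np.exp(-1j * phi) + beta) ** 2
    probs = [float(np.sum(W * np.exp(-mean) * mean ** k / math.factorial(k)))
             for k in range(m)]
    probs.append(max(0.0, 1.0 - sum(probs)))
    return np.array(probs)

def error_probability(alphabet, beta, sigma, m):
    """Eq. (3) for equiprobable states, plus the MAP decision for each outcome k."""
    table = np.array([pnr_probs(a, beta, sigma, m) for a in alphabet])
    map_decision = table.argmax(axis=0)           # index j of the most likely state
    p_err = 1.0 - 0.5 * float(table.max(axis=0).sum())
    return p_err, map_decision

alpha = math.sqrt(0.5)                            # BPSK with mean photon number 0.5
print(error_probability((-alpha, alpha), alpha, 0.0, 1))   # exact nulling, no noise
print(error_probability((-alpha, alpha), alpha, 0.4, 1))   # same strategy with noise
print(error_probability((0.0, 1.0), 0.0, 0.4, 1))          # OOK with direct detection
```

For equal priors the MAP rule reduces to picking the state with the larger likelihood for each outcome k, which is why the sum in Eq. (3) takes the maximum over j.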

Figure 2b shows the performance of an optimized communication strategy for a channel with phase diffusion optimized for state discrimination for a strategy with PNR(1) for $$\bar n = 0.5$$, with ideal detection efficiency η = 1.0, an interference visibility ξ = 0.998, which quantifies the technical noise and imperfections in the receiver,42,46 and zero dark count rate ν = 0. To evaluate the performance of this strategy, we compare it with an ideal CM consisting of either homodyne or direct detection to minimize the discrimination error, with its own optimized alphabet (solid gray line). We note that the optimized alphabet for the CM results in either BPSK or OOK for this binary coherent state channel, and that it changes abruptly from BPSK to OOK when the conventional measurement switches from homodyne to direct detection.

As shown in Fig. 2b, while a PNR(1) strategy with BPSK (dashed red) can only outperform the ideal CM for small phase noise σ,54 optimizing the input alphabet to interpolate between BPSK and OOK, shown in Fig. 2c, allows the strategy to outperform the ideal CM for all levels of noise. Moreover, for high levels of noise, the optimized communication strategy approaches the Helstrom measurement with its own optimized alphabet, showing that this optimized communication strategy is asymptotically the optimal quantum measurement.

Optimized communication strategies can also be used to increase information transfer over a noisy channel. These strategies simultaneously optimize the measurement and the input alphabet to maximize mutual information, instead of minimizing probability of error, for a channel with intrinsic noise and technical noise from the devices. Optimized strategies for information transfer for a phase-diffusion channel with binary state encoding are in general different from strategies designed for minimum error, as discussed in Section “Mutual information under phase diffusion”.

### Experimental demonstration

The optimized communication strategies described above can be implemented with current technologies. We demonstrate these strategies in a proof-of-principle experiment for enhancing sensitivities and information transfer for the phase diffusion channel with a binary coherent state encoding with a PNR non-Gaussian measurement, which provides robustness to technical noise and system imperfections.46 The experimental realization uses an interferometric setup to implement the optimized strategies. Coherent state pulses at 633 nm are displaced by interference on a highly transmissive beam splitter, and we use an avalanche photodiode (APD) as a photon number resolving detector. See ref. 46 for a detailed description. To investigate the optimized communication strategies, a controlled level of the phase diffusion noise is applied to the input state (see Supplementary Section 1). Our experiment achieves an overall detection efficiency η = 0.72, an interference visibility ξ = 0.998, and a dark count rate $$\nu = 3.6 \times 10^{-3}$$. Technical noise in the experiment such as reduced visibility and dark counts affects the performance of the optimized strategy (see Supplementary Section 2). However, the levels of noise in our experiment only have a small effect on the strategy’s performance.

We systematically investigate the optimized communication strategies for a channel with phase diffusion by first studying the performance of optimized PNR measurements with a BPSK alphabet46 for this channel. Next, we investigate the optimized communication strategies with an optimized measurement-alphabet method for enhancing measurement sensitivity. Finally, we investigate optimized communication strategies for maximizing the mutual information for a phase-diffusion channel.

### Discrimination with a BPSK alphabet under phase diffusion

Figure 3 shows the experimental error probabilities for the discrimination of states from a BPSK alphabet with an optimized PNR measurement46 with PNR(m) of m = 1, 2, 3, for three mean photon numbers: (a) $$\bar n = 0.5$$, (b) $$\bar n = 1$$, and (c) $$\bar n = 2$$. We observe in all cases that while PNR(1) (red dots) outperforms an adjusted homodyne measurement up to a certain level of noise, as discussed in ref. 54, increasing PNR to PNR(2) (green dots) and PNR(3) (blue dots) extends the level of noise σ where this optimized measurement46 outperforms a homodyne measurement. The increase in robustness with PNR against phase diffusion becomes larger as the mean photon number increases. Figure 3b, c shows that PNR(3) extends the level of noise σ for which this measurement surpasses the homodyne limit by about 1.5 times for $$\bar n = 1$$ and about four times for $$\bar n = 2$$ compared to an on/off PNR(1) strategy.

### Discrimination with an optimized alphabet under phase diffusion

Phase diffusion severely affects measurements for state discrimination in a BPSK alphabet. To reduce the effects of phase diffusion in the channel, a communication strategy can implement an encoding alphabet, which is optimized for a particular level of phase noise. In conventional coherent communication with Gaussian measurements, constellation optimization has been used to mitigate some effects of phase noise.55,56,57 However, in a more general optimized communication strategy using a non-Gaussian measurement, this alphabet can be optimized simultaneously with the displaced photon-counting measurement to reduce errors and enhance information transfer.

Figure 4 shows the performance of the optimized strategy for the discrimination of states from an optimized alphabet with an optimized PNR measurement,46 with PNR(1) and PNR(3) for mean photon numbers: (a) $$\bar n = 0.5$$, (b) $$\bar n = 1.0$$, and (c) $$\bar n = 2.0$$. Experimental data is shown with red (green) dots for PNR(1) (PNR(3)), and expected performance is shown in dotted lines. Error bars represent 1 SD over five experimental runs of over $$10^5$$ independent experiments. While a strategy with PNR(3) and a BPSK alphabet (solid green) can only outperform a CM for a limited range of noise levels σ, optimized strategies with optimal alphabets and measurements allow for outperforming the CM over larger ranges of noise σ. Moreover, optimized strategies with PNR(1) surpass the CM for all levels of noise for $$\bar n = 0.5$$ and $$\bar n = 1.0$$. For higher $$\bar n$$, increasing number resolution m is expected to enable discrimination below the CM at any noise level, as can be inferred from the trend in Fig. 4c.

Figure 4d–f show the optimal alphabet for $$\bar n = 0.5,\,1.0,\,{\mathrm{and}}\,2.0$$, respectively. Discrete jumps in the optimized alphabets for different PNR strategies are the results of optimization of Eq. (3), which requires a global optimization over multiple minima46 of PE. This optimization searches for the values of |α1| and |β|, resulting in the global minimum of PE for a given noise level σ for a PNR(m) strategy. There are levels of noise at which a small increase in σ causes the former global minimum of PE as a function of |α1| and |β| to become a local minimum, and a former local minimum to become the new global minimum (see Supplementary Section 3). These abrupt changes in the global minimum result in the sudden jumps of the optimal alphabet shown in Fig. 4e at σ ≈ 0.36 and σ ≈ 0.38, and in Fig. 4f at σ ≈ 0.20 and σ ≈ 0.42. We note that the optimized alphabets correspond to interpolations between BPSK and OOK alphabets for all $$\bar n$$, and result in large improvements over BPSK. This shows that strategies with optimized alphabets are essential for surpassing the sensitivity limits of conventional measurements in the channels with phase noise.
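The joint search over |α1| and |β| can be illustrated with a brute-force grid scan, which avoids getting trapped in a local minimum of PE at the cost of limited resolution. The sketch below is not the optimization routine used in this work. It assumes an ideal receiver, real amplitudes with $$\alpha _1 \le 0 \le \alpha _2$$, the energy constraint $$\alpha _2 = \sqrt {2\bar n - \alpha _1^2}$$, and an arbitrary displacement range of 0 to 2; `optimize_strategy` and the other names are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss

X, W = hermgauss(80)
W = W / math.sqrt(math.pi)   # normalized Gauss-Hermite weights

def pnr_probs(alpha, beta, sigma, m):
    """Ideal PNR(m) probabilities [P(0), ..., P(m-1), P(>= m)] after D(beta)."""
    phi = math.sqrt(2.0) * sigma * X
    mean = np.abs(alpha * np.exp(-1j * phi) + beta) ** 2
    probs = [float(np.sum(W * np.exp(-mean) * mean ** k / math.factorial(k)))
             for k in range(m)]
    probs.append(max(0.0, 1.0 - sum(probs)))
    return np.array(probs)

def error_probability(a1, a2, beta, sigma, m):
    """Eq. (3) for two equiprobable states."""
    table = np.array([pnr_probs(a1, beta, sigma, m), pnr_probs(a2, beta, sigma, m)])
    return 1.0 - 0.5 * float(table.max(axis=0).sum())

def optimize_strategy(n_bar, sigma, m, grid=41):
    """Grid scan over a1 in [-sqrt(n_bar), 0] (BPSK to OOK) and beta in [0, 2]."""
    best = (1.0, 0.0, 0.0)
    for a1 in np.linspace(-math.sqrt(n_bar), 0.0, grid):
        a2 = math.sqrt(2.0 * n_bar - a1 ** 2)     # average energy constraint
        for beta in np.linspace(0.0, 2.0, grid):
            p = error_probability(a1, a2, beta, sigma, m)
            if p < best[0]:
                best = (p, float(a1), float(beta))
    return best  # (minimum P_E found, a1, beta)

for sigma in (0.1, 0.4, 0.8):
    print(sigma, optimize_strategy(0.5, sigma, 1, grid=21))
```

Because the scan evaluates every grid point, a change in which of two competing minima is lowest shows up as a jump in the returned `a1` between neighboring values of σ.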

### Mutual information under phase diffusion

Optimized communication strategies can also be designed to maximize information transfer over a non-Gaussian noisy channel, for which optimal encoding and decoding are unknown. An optimized communication strategy, which minimizes probability of error, will provide some advantage for increasing mutual information. However, in a noisy channel, the measurement and the alphabet can be optimized in order to maximize mutual information I(X:Y) and will yield a different strategy than for minimum error. Mutual information quantifies the total amount of information between transmitter and receiver, and depends on the encoding alphabet and decoding measurement. For a displaced photon-counting measurement, I(X:Y) can be expressed according to a “soft” decision rule where the number of photons detected is used to infer the input symbol rather than the binary output from a binary decision rule.60 The mutual information for a channel with phase diffusion with a binary coherent state encoding can be expressed as:

$$I(\bar n,\{ \hat \rho _j(\sigma )\} ,\beta ,m) = \sum\limits_{k = 0}^m {\sum\limits_{j = 1}^2 P } (k|\{ \hat \rho _j(\sigma )\} ,\beta ,m)P(\{ \hat \rho _j(\sigma )\} )\log _2\left[ {\frac{{P(k|\{ \hat \rho _j(\sigma )\} ,\beta ,m)}}{{P(k|m)}}} \right],$$
(4)

where $$P(k|\{ \hat \rho _j(\sigma )\} ,\beta ,m)$$ is the conditional probability of detecting k photons. In an optimized communication strategy over a noisy channel, the input alphabet and measurement with PNR(m) are simultaneously optimized to maximize mutual information $$I(\bar n,\{ \hat \rho _j(\sigma )\} ,\beta ,m)$$ under the average energy constraint for a noise level σ.
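Equation (4) can be evaluated directly from the table of conditional photon-counting probabilities. The following Python sketch does so for an idealized receiver (unit efficiency, perfect visibility, no dark counts) with equiprobable real-amplitude inputs and a Gauss-Hermite phase average. These are simplifying assumptions rather than the experimental model, and `mutual_information` and `pnr_probs` are hypothetical names.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss

X, W = hermgauss(80)
W = W / math.sqrt(math.pi)   # normalized Gauss-Hermite weights

def pnr_probs(alpha, beta, sigma, m):
    """Ideal PNR(m) probabilities [P(0), ..., P(m-1), P(>= m)] after D(beta)."""
    phi = math.sqrt(2.0) * sigma * X
    mean = np.abs(alpha * np.exp(-1j * phi) + beta) ** 2
    probs = [float(np.sum(W * np.exp(-mean) * mean ** k / math.factorial(k)))
             for k in range(m)]
    probs.append(max(0.0, 1.0 - sum(probs)))
    return np.array(probs)

def mutual_information(alphabet, beta, sigma, m):
    """Eq. (4) in bits, using all m + 1 photon-count outcomes (soft decision)."""
    prior = 0.5
    cond = np.array([pnr_probs(a, beta, sigma, m) for a in alphabet])  # P(k | rho_j)
    p_k = prior * cond.sum(axis=0)                                     # P(k | m)
    info = 0.0
    for j in range(len(alphabet)):
        for k in range(m + 1):
            if cond[j, k] > 0.0:          # terms with zero probability contribute 0
                info += prior * cond[j, k] * math.log2(cond[j, k] / p_k[k])
    return info

alphabet = (-0.6, math.sqrt(2.0 * 1.0 - 0.36))   # one alphabet with mean photon number 1
for m in (1, 3, 5):
    print(m, mutual_information(alphabet, 0.6, 0.3, m))
```

For a fixed alphabet and displacement, the PNR(1) outcome is a coarse-graining of the PNR(m) outcome for m > 1, so the mutual information cannot decrease when the number resolution increases.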

Figure 5a, b shows the experimental results for the mutual information with optimized strategies for mean photon numbers: (a) $$\bar n = 1.0$$ and (b) $$\bar n = 2.0$$, and PNR(m) of m = 1, 3, 5 in red, green, and blue dots, respectively. The theoretical predictions are shown with dashed colored lines. The mutual information for a conventional measurement (dashed gray) and for BPSK are shown adjusted for our total detection efficiency η = 0.72. Optimized communication strategies surpass the limit in mutual information for a CM at high levels of phase-diffusion noise (σ ≥ 0.7), and for low noise (σ ≤ 0.1). Moreover, optimized strategies with higher PNR detection resolution m provide higher mutual information for all levels of noise. Note that optimized communication strategies with optimized alphabets drastically outperform BPSK for all PNR(m) in terms of mutual information. Figure 5c, d shows the optimized alphabets for (c) $$\bar n = 1.0$$, and (d) $$\bar n = 2.0$$, respectively. We observe that the optimal alphabet interpolates from BPSK to OOK, as in the case of error probability. However, this interpolation is continuous, because the mutual information is a convex function of σ for all PNR(m). In the intermediate level of noise (σ ≈ 0.5), there is a gap between the optimized strategies and the CM. This gap decreases as the PNR(m) of the optimized strategies increases. This suggests that optimized communication strategies with high-enough PNR(m) should provide levels of mutual information at least as high as those that can be achieved with ideal conventional measurements for all levels of phase diffusion noise.

Figure 5e shows the maximum percent difference R(m) between an optimized strategy with PNR(m) and a CM for $$\bar n$$ from 0 to 2.0 for different PNR(m) from m = 1 to m = 20. This corresponds to the percent difference at the level of noise for which a PNR(m) strategy has the worst performance relative to a conventional measurement. R(m) is defined as:

$$R(m) = \mathop {{\mathrm{max}}}\limits_\sigma \left( {\frac{{I_{\mathrm{CM}}(\sigma ) - I_{{\mathrm{PNR}}(m)}(\sigma )}}{{I_{\mathrm{CM}}(\sigma )}}} \right),$$
(5)

where $$I_{{\mathrm{PNR}}(m)}(\sigma )$$ is the mutual information for an optimized communication strategy with PNR(m), and $$I_{\mathrm{CM}}(\sigma )$$ is the mutual information for the conventional measurement. We observe that as the number resolution increases, the percent difference asymptotically approaches zero for all mean photon numbers. The blue regions to the right of the white line correspond to R(m) < 1%, that is, when a PNR(m) strategy is within 1% of the conventional measurement. Figure 5f shows R(m) on a log–log scale for $$\bar n = 0.5,\,1.0,\,1.5,\,{\mathrm{and}}\,2.0$$ in red, green, blue, and black lines, respectively. The straight lines indicate power-law scaling in the convergence of the form $$a\,m^{ - b}$$, with b ≈ 1.1 for all lines. This convergence suggests that for all mean photon numbers, optimized communication strategies with large enough photon resolution m will at worst provide the same mutual information as the ideal CM, which serves as a lower bound for the performance of optimized communication strategies. At the same time, these optimized strategies with moderate PNR provide large advantages for increasing mutual information compared to CM at low noise and high noise levels.
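Given mutual-information curves sampled on a common grid of σ, Eq. (5) and the power-law analysis reduce to a maximum and a straight-line fit in log–log coordinates. The sketch below shows one way to do this with NumPy. The function names are hypothetical, and the numbers in the usage lines are synthetic values chosen only to exercise the code, not data from this work.

```python
import numpy as np

def max_relative_gap(i_cm, i_pnr):
    """Eq. (5): largest relative shortfall of the PNR(m) strategy over the sigma grid."""
    i_cm = np.asarray(i_cm, dtype=float)
    i_pnr = np.asarray(i_pnr, dtype=float)
    return float(np.max((i_cm - i_pnr) / i_cm))

def fit_power_law(m_values, r_values):
    """Least-squares line in log-log space; returns (prefactor, exponent)."""
    log_m = np.log(np.asarray(m_values, dtype=float))
    log_r = np.log(np.asarray(r_values, dtype=float))
    slope, intercept = np.polyfit(log_m, log_r, 1)
    return float(np.exp(intercept)), float(slope)

# Synthetic example: three sigma points for one value of m
print(max_relative_gap([1.0, 0.8, 0.5], [0.9, 0.6, 0.5]))

# Synthetic example: a decaying power law sampled at m = 1..20
m = np.arange(1, 21, dtype=float)
print(fit_power_law(m, 0.3 * m ** -1.1))
```

A decaying R(m) gives a negative fitted exponent, so the slope returned by `fit_power_law` is the quantity to compare with the straight lines in the log–log plot.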

## Discussion

We proposed and demonstrated optimized communication strategies to maximize information transfer and measurement sensitivity over a non-Gaussian noisy channel. These optimized strategies are based on simultaneous optimization of the states used as information carriers with an optimized non-Gaussian photon-counting measurement that surpasses the QNL for state discrimination. Simultaneous optimization of alphabet and measurement provides robustness to intrinsic channel noise, and allows for overcoming the sensitivity limits of conventional measurements and achieving higher information transfer in communications over noisy channels.

We demonstrated in a proof-of-principle experiment the concept of optimized strategies for communication over a channel with phase diffusion for binary coherent state alphabets and single-shot optimized measurements with PNR. These optimized communication strategies provide unexpected benefits to minimize the probability of decoding error and maximize the achievable mutual information in this noisy channel. Moreover, we observed that optimized communication strategies not only provide robustness to intrinsic channel noise but also to technical noise and imperfections in the receiver.

We expect that optimized communication strategies can provide advantages for different problems in coherent communications extending to communication with multiple states and complex measurements. Moreover, optimized communication strategies can be applied to other channels utilizing practically optimized measurements and encodings to maximize information transfer in realistic noisy communication channels for which capacity limits are unknown, but that are encountered in optical communication networks.