Generative Machine Learning for Robust Free-Space Communication

Realistic free-space optical communications systems suffer from turbulent propagation of light through the atmosphere and detector noise at the receiver, which can significantly degrade the optical mode quality of the received state, increase cross-talk between modes, and correspondingly increase the symbol error ratio (SER) of the system. To overcome these obstacles, we develop a state-of-the-art system combining generative machine learning (GML) with a convolutional neural network (CNN), and demonstrate its efficacy in a free-space optical (FSO) communications setting. The system corrects for the distortion effects due to turbulence and reduces detector noise, resulting in significantly lowered SERs and cross-talk at the output of the receiver, while requiring no feedback. This scheme is straightforward to scale, and may provide a concrete and cost-effective technique for establishing long-range classical and quantum communication links in the near future.


Introduction
The field of FSO communications provides an exciting route forward in wireless communication by making use of various multiplexing schemes, including frequency and wavelength division multiplexing and, more recently, spatial multiplexing [1][2][3][4][5][6]. A common method to implement the latter involves making use of orbital angular momentum (OAM), a degree-of-freedom that is in principle unbounded, thus allowing for the use of large alphabets in FSO communication links. For example, by generating and transmitting various superpositions of OAM states, which result in different "petal pattern" images in the spatial domain as shown in Figs. 1 and 2, the alphabet size of the communications system may be significantly increased. An integral aspect of the communications system, however, is the ability to effectively demodulate the signal at the receiver, which in this case corresponds to determining which OAM superposition was sent and received. In practice, realistic FSO communications systems involve the propagation of such signals through turbulence, and include detectors with a nonzero amount of dark noise. As such, the distorted, noisy received signals (images) can result in a degraded SER (the ratio of the number of optical profiles incorrectly classified to the total number of optical profiles received), significantly limiting the ability to implement such systems in a real-world scenario 7,8. Here we make use of generative machine learning techniques to design a communications system that is robust to these real-world hindrances, and demonstrate its ability, in combination with a convolutional neural network, to drastically reduce the effects of turbulence and noise on the SER in a realistic simulated communications setting.
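Since the alphabet is built from these petal patterns, a brief illustration may help: the intensity of an equal superposition of OAM +ℓ and −ℓ varies azimuthally as |e^(iℓφ) + e^(−iℓφ)|² = 4cos²(ℓφ), giving 2ℓ bright lobes. The following NumPy sketch is ours, not the authors' code; the radial profile is the simplified p = 0 Laguerre-Gauss form:

```python
import numpy as np

def petal_intensity(l, n=256, w0=1.0, extent=3.0):
    """Intensity image of an equal superposition of OAM +l and -l
    (Laguerre-Gauss, p = 0): a ring of 2*l bright lobes ("petals")."""
    x = np.linspace(-extent, extent, n)
    X, Y = np.meshgrid(x, x)
    r, phi = np.hypot(X, Y), np.arctan2(Y, X)
    amp = (np.sqrt(2) * r / w0) ** abs(l) * np.exp(-(r / w0) ** 2)
    field = amp * (np.exp(1j * l * phi) + np.exp(-1j * l * phi))
    return np.abs(field) ** 2

def count_petals(l, n_angles=1000):
    """Count azimuthal intensity maxima: |e^{il phi} + e^{-il phi}|^2 = 4 cos^2(l phi)."""
    phi = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
    ring = 4 * np.cos(l * phi) ** 2
    left, right = np.roll(ring, 1), np.roll(ring, -1)
    # a point is a lobe if it is a local maximum on the closed ring
    return int(np.sum((ring > left) & (ring >= right)))

print(count_petals(10))  # an l = +/-10 superposition shows 20 lobes
```

This makes concrete why the text describes the ℓ = ±10 superposition as "a petal pattern with 20 lobes".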
Generative machine learning techniques have recently been developed and applied to a variety of systems, including molecular design, radiotherapy, geophysics, speech recognition and tomography [9][10][11][12][13]. In particular, unsupervised autoencoders have been shown to be useful in a variety of denoising scenarios [14][15][16]. Additionally, several groups have recently investigated the power of convolutional neural networks in the context of optical communications [17][18][19][20][21][22][23][24]. Here we expand significantly upon these works and develop a generative neural network, combined with a convolutional neural network as shown in Fig. 1, which act in concert to increase the robustness and decrease the SER and cross-talk of free-space optical communications links. This receiver-end system is shown to be effective for a wide range of turbulence and detector noise strengths, and requires no feedback to the transmitter of the communications link. Additionally, previous demonstrations have shown that in order to classify unknown distorted optical profiles with high accuracy using only a CNN as a classifier, a training set with a large number of known distorted optical profiles is required, and such a set is in principle unbounded. This limits the classification efficiency of the communication scheme with respect to randomly varying turbulent effects. In contrast, our scheme with the developed GML network generates new, significantly less distorted optical profiles at the receiver, which are then classified by a CNN trained exclusively with undistorted (desired) optical modes with added dark noise alone. As GML is based on unsupervised learning, our technique may in the future be easily extended to demodulate more complex optical profiles that are not only difficult to label but also difficult to classify accurately with current supervised CNN techniques. These new aspects of the current demonstration provide a significant step forward in the realistic implementation of error-correction techniques in free-space optical communications, paving the way toward their robust implementation.

Figure 1. Schematic of the robust free-space optical communications scheme. Simulations include generating a desired optical mode, its propagation through turbulence, and added detector noise. The resultant image at the receiver is then fed into the generative neural network, which generates a new, less distorted and noisy image. This generated image is then classified by a convolutional neural network, allowing for the calculation of the SER and cross-talk between the modes.
The overall design of the communications system is shown schematically in Fig. 1. A laser is simulated to be incident on a spatial light modulator (SLM) with a given phase mask, such that the resultant optical spatial mode profile is in a desired superposition of OAM values ranging from 0 (Gaussian) to ±10 (a petal pattern with 20 lobes). The mode is then simulated to propagate through turbulence, resulting in a distorted profile, after which detector dark noise is added at the receiver. This noisy, distorted image is then fed into the generative neural network (GNN), which generates a new mode profile that is fed into a CNN that classifies which mode was sent and received (that is, which letter of the alphabet). Examples of the distorted, noisy images and GNN-generated images are shown in Fig. 2, for varying signal-to-noise ratios, turbulence strengths, and communication link distances. This process is repeated many times for all spatial modes (with random turbulence and noise realizations added), and the SER of the system is calculated and compared to the SER when the neural network detection system is not used. Additionally, we calculate the cross-talk between the noisy, distorted modes at the receiver, and show a significant enhancement when the generative neural network system is used.

Free-Space Turbulent Propagation and Network Architecture
We describe here in detail the steps used to reconstruct the distorted optical profiles, and thereby improve the SER, using the generative denoising autoencoder. We focus on the effects of turbulent propagation and dark noise, which seriously degrade the OAM mode quality at the receiver.
First, we simulate the free-space propagation of a Gaussian beam G(x, y, w_0) with waist w_0 through an SLM loaded with a phase mask Θ_ℓ1,ℓ2(x, y), corresponding to the superposition of two Laguerre-Gauss modes with OAM azimuthal quantum numbers ℓ1 and ℓ2. This results in intensity profiles at the receiver, a distance Z m away from the transmitter, that are in the desired optical mode. In order to generate the superposition phase mask at the SLM we use equation (1), where "∠" represents the arctangent of the ratio of the imaginary part to the real part, and γ(r) is a function of the radial distance r from the central axis of the beam. Finally, the intensity at the receiver, I_r, is found by using the Fourier propagator given by equation (2), where F is the fast Fourier transformation and H is the transfer function. We use a Kolmogorov phase with the von Kármán spectrum model 25 to simulate the turbulence in the atmosphere, which is given by equation (3), where r_0 = (0.423 k^2 C_n^2 Z)^(-3/5) is the Fried parameter for a propagation distance Z, and k = 2π/λ is the wave-vector for a given wavelength λ of light. Here κ is the spatial frequency, κ_m = 5.92/l_min, and κ_0 = 2π/l_max, with inner (l_min) and outer (l_max) scales of turbulence. Finally, we generate random phase screens using the inverse Fourier transformation as given in equation (4), where ℜ denotes taking only the real part, F^-1 is the inverse fast Fourier transformation, C_NN is a complex random normal number with zero mean and unit variance, and Φ_NN(κ) is the square root of the phase distribution given by equation (3) over the sampling grid of size N × N.
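The display equations (1)-(4) referenced in this paragraph did not survive extraction. The following is a plausible reconstruction consistent with the inline definitions above; the precise form of the phase mask in (1), and the 0.49 prefactor and normalization in (3), are assumptions based on standard Fourier-optics conventions, not recovered from the source:

```latex
\begin{align}
\Theta_{\ell_1,\ell_2}(x,y) &= \angle\!\left[\gamma_{\ell_1}(r)\,e^{i\ell_1\phi}
  + \gamma_{\ell_2}(r)\,e^{i\ell_2\phi}\right], \tag{1}\\
I_r &= \left|\,\mathcal{F}^{-1}\!\left\{\mathcal{F}\!\left\{G(x,y,w_0)\,
  e^{i\Theta_{\ell_1,\ell_2}(x,y)}\right\}H\right\}\right|^2, \tag{2}\\
\Phi(\kappa) &= 0.49\,r_0^{-5/3}\,
  \frac{\exp\!\left(-\kappa^2/\kappa_m^2\right)}
       {\left(\kappa^2+\kappa_0^2\right)^{11/6}}, \tag{3}\\
\theta(x,y) &= \Re\!\left[\mathcal{F}^{-1}\!\left\{C_{NN}\,\Phi_{NN}(\kappa)\right\}\right],
  \qquad \Phi_{NN}(\kappa)=\sqrt{\Phi(\kappa)}. \tag{4}
\end{align}
```

Each symbol here (G, H, r_0, κ_m, κ_0, C_NN, Φ_NN) matches its inline definition in the paragraph above.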
The turbulent environment is simulated by placing a turbulent phase screen, generated by equation (4), one meter away from the SLM plane. Then, we propagate the Gaussian beam G(x, y, w_0) through all the phase screens (SLM and turbulence) and obtain the intensity profile I_r^t, with additive dark noise N(0, σ), at the receiver, Z ≥ 200 m away from the turbulence plane, using equation (5), where H_1 and H_2 are again transfer functions, from the SLM to the turbulence plane and from the turbulence plane to the receiver, respectively. In order to generate the turbulence discussed here, we use w_0 = 4 cm, N = 128, λ = 1550 nm, l_min = 1 mm, l_max = 200 m, and C_n^2 varying from 9 × 10^-15 m^-2/3 to 5 × 10^-13 m^-2/3.
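As a sketch of the phase-screen synthesis of equation (4), the following NumPy fragment filters complex Gaussian noise by the square root of the von Kármán phase spectrum and inverse-transforms it. This is our illustration, not the authors' code: the 0.49 r_0^(-5/3) prefactor, the overall normalization (conventions vary between references), and the grid spacing dx and Fried parameter r0 values are assumptions:

```python
import numpy as np

def von_karman_phase_screen(n=128, dx=0.01, r0=0.05, l_min=1e-3, l_max=200.0, seed=0):
    """Random turbulent phase screen (radians) via the FFT method of equation (4),
    using a modified von Karman phase spectrum (prefactor/normalization assumed)."""
    rng = np.random.default_rng(seed)
    k = 2 * np.pi * np.fft.fftfreq(n, d=dx)      # angular spatial frequency (rad/m)
    kx, ky = np.meshgrid(k, k)
    kappa2 = kx ** 2 + ky ** 2
    kappa_m = 5.92 / l_min                       # inner-scale cutoff, as in the text
    kappa_0 = 2 * np.pi / l_max                  # outer-scale cutoff, as in the text
    psd = 0.49 * r0 ** (-5 / 3) * np.exp(-kappa2 / kappa_m ** 2) \
          / (kappa2 + kappa_0 ** 2) ** (11 / 6)
    psd[0, 0] = 0.0                              # suppress the piston (DC) term
    # C_NN: complex random normal numbers with zero mean
    c_nn = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    dk = 2 * np.pi / (n * dx)                    # frequency-grid spacing
    # keep only the real part, as the text's "R" operation prescribes
    return np.real(np.fft.ifft2(c_nn * np.sqrt(psd))) * n * dk

screen = von_karman_phase_screen()
print(screen.shape)  # (128, 128), matching the N = 128 grid of the text
```

Multiplying the beam field by exp(i·screen) at the turbulence plane, then applying the second transfer function, reproduces the propagation of equation (5) in outline.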

Generative convolutional denoising autoencoder
The distorted received optical profiles I_r^t, as given by equation (5), are fed to the encoder of the GNN, which compresses them into a latent space S as expressed in equation (6), where θ represents a parameter space of w_1^k, w_2^k (weights) and b_1^k, b_2^k (biases) of the k-th feature mappings of the first and second convolutional layers, respectively, and "Max" corresponds to a max-pooling operation. Also, W and B represent the weight and bias of a fully connected layer, and for convenience "*" represents the convolutional/transpose-convolutional operation. Note that we apply the ReLU activation after each convolutional operation. The resulting latent space S is then forwarded to the decoder, which maps it to reconstructed pixels I of the input space as given by equation (7), where primes represent the parameter space of the decoder corresponding to the n-th feature mappings. Here each training optical profile I_r^t(i) is successively mapped into a corresponding latent space S^(i) and a generation I^(i). A squared reconstruction loss L(I^(i), I_r^(i)) is then computed, where I_r is the undistorted optical profile at the receiver given by equation (2). Finally, in order to optimize the parameters, we minimize the average reconstruction loss given by equation (8) using the Adam optimizer of TensorFlow 26.
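The paper's GNN is built in TensorFlow; the following framework-free NumPy sketch traces only the encoder path of equation (6), using the architecture described in the Results section (two stride-2, 5 × 5 convolutions with 3 feature maps, 2 × 2 max-pooling, and a fully connected layer to a 32 × 32 latent space). The weights here are random placeholders, not trained values, and the decoder (which mirrors these stages with a transpose convolution) is omitted:

```python
import numpy as np

def conv2d(x, w, b, stride=2):
    """Zero-padded ('same') 2-D convolution with ReLU, as in equation (6).
    x: (H, W, c_in); w: (kh, kw, c_in, c_out); b: (c_out,)."""
    kh, kw, _, c_out = w.shape
    H, W, _ = x.shape
    Ho, Wo = H // stride, W // stride
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2), (0, 0)))
    out = np.zeros((Ho, Wo, c_out))
    for i in range(Ho):
        for j in range(Wo):
            patch = xp[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2])) + b
    return np.maximum(out, 0.0)                  # ReLU activation

def maxpool(x, size=2):
    """2 x 2 max-pooling (the red block of Fig. 2 (a))."""
    H, W, C = x.shape
    return x.reshape(H // size, size, W // size, size, C).max(axis=(1, 3))

rng = np.random.default_rng(0)
x = rng.random((128, 128, 1))                    # stand-in for a received 128 x 128 profile
w1, b1 = rng.normal(scale=0.1, size=(5, 5, 1, 3)), np.zeros(3)
w2, b2 = rng.normal(scale=0.1, size=(5, 5, 3, 3)), np.zeros(3)
# (128,128,1) -> (64,64,3) -> (32,32,3) -> (16,16,3)
h = maxpool(conv2d(conv2d(x, w1, b1), w2, b2))
W_fc = rng.normal(scale=0.1, size=(h.size, 32 * 32))  # fully connected layer W of the text
latent = np.maximum(h.reshape(-1) @ W_fc, 0.0)        # 32 x 32 latent space S
print(latent.shape)  # (1024,)
```

The shape bookkeeping shows how a 128 × 128 input is compressed to the 32 × 32 latent size that the Results identify as the saturation point.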

Convolutional neural network as a demodulator
In order to train a CNN to classify the generated modes, we simulate optical modes at the receiver without any turbulence using equation (2), for each OAM superposition value ranging from ℓ = 0 to ℓ = ±10. We then manually add random Gaussian noise with σ = 2. Here we keep the noise low in the training and testing sets of the CNN in order to estimate how closely the modes generated by the GNN fit the target modes. Finally, we simulate 150 noisy images for each value of OAM, for a total of 16,500 images. The image set is split into a training set with 130 images and a test set with 20 images, again for each OAM profile. The parameter space of the CNN is then optimized by minimizing a softmax cross-entropy loss using the Adam optimizer. Note that the pre-trained CNN achieves unit accuracy on the test images 22.
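The softmax cross-entropy objective minimized here has a simple closed form; a minimal, numerically stable NumPy version (ours, for illustration, using the log-sum-exp trick) is:

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy loss over a batch, computed stably
    by shifting logits before exponentiating (log-sum-exp trick)."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# With uninformative (all-equal) logits over 11 classes the loss is ln(11) ~ 2.398;
# confident, correct logits drive it toward zero.
labels = np.array([0, 3, 7, 10])
print(softmax_cross_entropy(np.zeros((4, 11)), labels))
```

Minimizing this quantity over the CNN's parameters (here the class count 11 is just an illustrative choice) is exactly the training step the text describes.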

Results
Unlike the traditional autoencoder described in ref. 27, convolutional denoising autoencoders (CDAEs) are able to reconstruct a clean, corrected input from one that is partially distorted 28. The idea behind using such a network design is to learn a hidden representation and extract the important features, which are robust to noise or distortion present in the inputs. The generative network used here consists of three stages: an encoder, a latent space, and a decoder, as shown in Fig. 2 (a-top). The encoder extracts the important features and compresses them into a smaller size, the latent space. The encoded information in the latent space is then forwarded to the decoder, which finally generates the desired clean modes. Our CDAE is built with convolutional layers that encode the noisy inputs into the latent space and a transpose-convolutional layer as a generator that decodes the latent space. The encoder contains two convolutional layers (green blocks in Fig. 2 (a)) with a kernel size of 5 × 5, zero padding, ReLU activation, a stride length of 2, and 3 feature mappings, followed by a max-pooling layer (red block) with a pool size of 2 × 2 and a single fully connected layer (blue circles) to the latent space. The decoder begins with a fully connected layer (blue circles) followed by a convolutional layer with the same parameter settings as described above. In order to regain the original size of the input, a transpose-convolutional layer (magenta block) is applied, again with the same parameter values. Finally, a convolutional block with a single feature mapping generates a clean, corrected mode profile. Note that we apply dropout with a rate of 5% after each layer, except the fully connected layer at the end of the encoder and the final convolutional layer of the decoder. We apply this small dropout rate to avoid overfitting, as well as the possible loss of features extracted by the convolutions. The size of the fully connected layer is the same as that of the latent space. Similarly, we implement a CNN to demodulate the generated, reconstructed clean mode profiles as well as the uncorrected received profiles. This network consists of a single convolutional unit with a kernel of size 5 × 5, zero padding, ReLU activation, a stride length of 2, and a single feature mapping, followed by max-pooling with a 2 × 2 filter attached to a fully connected layer (28 × 28 neurons) and an output layer, as shown in Fig. 2 (a-bottom). No dropout is employed in this network. First, we evaluate the SER improvement with respect to the latent space size of the GNN for various signal-to-noise ratios (SNRs) of the received OAM profiles. In order to generate training sets, we simulate 99 random turbulent phase screens with a strength C_n^2 of 5 × 10^-14 m^-2/3 and a communication link distance Z of 500 m. Note that the 99 simulated phase screens all differ from one another in their phase distributions, such that two different turbulent phase screens produce different scintillation effects on OAM mode propagation even when they have the same turbulence strength. As a result, we have 99 different distorted optical profiles for each superposition OAM mode ranging from ℓ = 0 to ℓ = ±10, for a total of 1,089 images. The resolution of the images is fixed to 128 × 128 pixels for all of the simulations performed in this paper. Also, the total intensity (the sum of all pixel values) of the OAM mode images is normalized to 226,955 at the transmitter, in order to simulate a transmitter with a fixed transmission intensity. Finally, we add random additive Gaussian noise to the distorted, received optical profiles to simulate the effects of dark noise at the receiver. The SNR of the final noisy, distorted optical profiles is then measured as discussed in ref. 22. Note that we train the GNN separately for separate SNR image sets (and that the SNR is decreased by increasing the amount of added detector noise, as the transmitter intensity is held constant). Next, the set of images is split into
separate training and test sets. As a result, the training set contains 50 images for each value of ℓ, for a total of 550 images, and the test set contains 49 images for each value of ℓ, for a total of 539 images. With these training sets, the GNN is pre-trained with a learning rate of 0.008. Then, unknown test sets are fed to the pre-trained GNN, which generates nearly ideal corrected optical profiles as output; some examples, for C_n^2 = 7 × 10^-14 m^-2/3, are shown in Fig. 2 (a). The left, middle, and right columns in Fig. 2 (a) represent the noisy received modes (R) and reconstructed (C) profiles from the GNN when the average SNRs of the images are 0.11 dB (σ = 20), -3.87 dB (σ = 50), and -5.91 dB (σ = 80), respectively. Next, in order to calculate the SER, the noisy test optical images (without corrections) and the reconstructed/generated images from the GNN are forwarded to the pre-trained CNN, and the corresponding SERs are measured. The SER of the test-set images with and without the GNN, at SNR levels from 3.11 dB to -5.91 dB and for different latent sizes of the GNN, is shown in Fig. 3 (a). We find an improvement in SER from 0.13 to 0.07 for the image sets with SNR = 3.11 dB, even at a small latent size of 4 × 4. As expected, we obtain better reconstructions and further improved SERs as we increase the latent size up to 32 × 32, after which the improvement begins to saturate. The reconstructed images shown in Fig. 2 (a) are from the GNN with a 32 × 32 latent size. Finally, we achieve improvements in SER from 0.13 to 0.03 at a latent size of 24 × 24, from 0.16 to 0.05 at a latent size of 24 × 24, from 0.23 to 0.08 at a latent size of 24 × 24, and from 0.32 to 0.14 at a latent size of 72 × 72 for the image sets with SNR = 3.11 dB, 0.11 dB, -3.87 dB, and -5.91 dB, respectively.
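The two figures of merit used throughout, SER and cross-talk, reduce to a misclassification rate and a row-normalized confusion matrix. A minimal sketch of both (our illustration, with toy labels), where classes stand in for the OAM superpositions of the alphabet:

```python
import numpy as np

def symbol_error_ratio(pred, true):
    """Fraction of received profiles classified incorrectly (the SER of the text)."""
    pred, true = np.asarray(pred), np.asarray(true)
    return float(np.mean(pred != true))

def crosstalk_matrix(pred, true, n_classes):
    """Row-normalized confusion matrix: entry [t, p] is the fraction of
    transmitted mode t demodulated as mode p (off-diagonal mass = cross-talk)."""
    M = np.zeros((n_classes, n_classes))
    for t, p in zip(true, pred):
        M[t, p] += 1
    return M / np.maximum(M.sum(axis=1, keepdims=True), 1)

true = [0, 0, 1, 1, 2, 2]
pred = [0, 0, 1, 2, 2, 2]       # one symbol of mode 1 leaked into mode 2
print(symbol_error_ratio(pred, true))  # 1/6 of the symbols are misclassified
```

Running the CNN's predictions for the raw and the GNN-reconstructed test images through these two functions yields exactly the with/without-GNN comparisons reported in Figs. 3 and 4.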
Next, we vary the communication link distance and find the SER improvement with respect to different latent sizes. The same turbulence strength as described in the previous paragraphs is used, but again with different, random phase patterns. Here all optical mode profiles are assumed to be detected with an average SNR = -3.87 dB at the receiver. Noisy and reconstructed OAM profiles from the pre-trained GNN for distances of 400 m, 600 m, and 800 m are shown in Fig. 2 (c). The improvement in SER with respect to various latent sizes of the GNN at communication link distances from 200 m to 800 m is shown in Fig. 3 (b). Here we find significant improvements in the SERs, from 2.4 × 10^-2 to 1.8 × 10^-3 at a latent size of 40 × 40, from 0.15 to 0.05 at a latent size of 88 × 88, from 0.28 to 0.12 at a latent size of 40 × 40, and from 0.33 to 0.15 at a latent size of 24 × 24, for communication link distances of 200 m, 400 m, 600 m, and 800 m, respectively.

Discussion
In conclusion, we have developed an efficient and straightforward scheme to overcome the negative effects of turbulence and detector noise in FSO communications, using a GML network and a CNN in combination. The developed state-of-the-art technique corrects for distortions caused by weak to extreme strengths of turbulence, as well as various strengths of detection noise, resulting in significantly improved SERs at the receiver. We also show an enhancement in SER for various communication link distances. Additionally, by using the same network system, we demonstrate the robustness of the GML approach with respect to the reconstruction of individual OAM profiles, which results in a decrease in the cross-talk between received mode profiles. Moreover, with the aid of generative networks, we have significantly improved SERs with a CNN classifier that is trained solely on undistorted modes with dark noise at the receiver, thereby avoiding the need for extremely large CNN training sets involving various distortions (a set that is in principle unbounded). Furthermore, this unsupervised learning scheme may be extended to demodulate more complex optical profiles that are difficult to label and classify with current supervised techniques. This significant improvement in mode classification and demodulation is integral to the robust performance of realistic FSO communications systems, and we are hopeful that the techniques developed here may be directly applied to quantum systems in the near future [29][30][31][32][33][34].

Figure 4. (a) SER versus C_n^2 for various detector noise levels σ at fixed Z = 500 m. The corresponding SERs at C_n^2 = 9 × 10^-15 m^-2/3 and C_n^2 = 1 × 10^-14 m^-2/3 are shown zoomed in the inset. (b) SER versus latent space size at different turbulence strengths C_n^2 with fixed σ = 50 and, again, Z = 500 m. The inset shows the zoomed-in SER at a latent size of 24 × 24.