Learning diffractive optical communication around arbitrary opaque occlusions

Free-space optical communication becomes challenging when an occlusion blocks the light path. Here, we demonstrate a direct communication scheme, passing optical information around a fully opaque, arbitrarily shaped occlusion that partially or entirely occludes the transmitter’s field-of-view. In this scheme, an electronic neural network encoder and a passive, all-optical diffractive network-based decoder are jointly trained using deep learning to transfer the optical information of interest around the opaque occlusion of an arbitrary shape. Following its training, the encoder-decoder pair can communicate any arbitrary optical information around opaque occlusions, where the information decoding occurs at the speed of light propagation through passive light-matter interactions, with resilience against various unknown changes in the occlusion shape and size. We also validate this framework experimentally in the terahertz spectrum using a 3D-printed diffractive decoder. Scalable for operation in any wavelength regime, this scheme could be particularly useful in emerging high data-rate free-space communication systems.


Introduction
Traditionally, radio frequency (RF) and microwave technologies have dominated wireless communication. To meet the growing need for faster data transfer rates, RF systems employ increasingly complex coding, multiple antennas, and higher carrier frequencies (1). For example, by utilizing higher frequency bands, sixth-generation (6G) technology is predicted to provide 100 to 1000 times faster speeds than the fifth-generation (5G) systems deployed for wireless communication (2). With ever-increasing data rates, maintaining the performance of these schemes will become more challenging. One possible solution is to shift to shorter wavelengths, such as the ultraviolet (UV), visible, or infrared (IR) regions of the electromagnetic spectrum, which provide much wider bandwidths than radio waves or microwaves (1, 3–5). However, free-space optical communication becomes challenging when opaque occlusions block the light path. Non-line-of-sight (NLOS) communication, which exploits diffusely reflected waves from a nearby scattering medium, has been used as a way around the occlusion problem (6–10). However, adapting these solutions to emerging optical communication techniques for channel capacity expansion is challenging since even weak turbulence can cause a significant loss of information (10). Furthermore, the low power efficiency arising from weak scattering or diffuse reflection is another limitation of NLOS communication. Other NLOS systems, e.g., for imaging around corners, also exist (11–24); these approaches, however, involve relatively slow and power-consuming digital methods for image reconstruction. Alternative methods have been developed for image transmission through thick (but transmitting) occlusions, including, e.g., holography (25–27), adaptive wavefront control (28–30), and others (31, 32). However, many of these techniques also involve digital reconstruction of the information, often requiring iterative algorithms. Moreover, they are applicable only to multiple-scattering media that are transmissive and do not address situations where the light path is partially or entirely obstructed by opaque occlusions with zero light transmittance.
Here we demonstrate a novel scheme for directly communicating optical information of interest around zero-transmittance occlusions using electronic encoding at the transmitter and all-optical diffractive decoding at the receiver. In our scheme, an electronic neural network, trained in unison with an all-optical diffractive decoder, encodes the message of interest so that it effectively bypasses the opaque occlusion and is decoded at the receiver by an all-optical decoder using passive diffraction through thin structured layers. This all-optical decoding is performed on the encoded wavefront that carries the optical information or message of interest, after its obstruction by an arbitrarily shaped opaque occlusion. The diffractive decoder processes the secondary waves scattered from the edges of the opaque occlusion using a passive, smart material composed of successive spatially engineered surfaces (33), and performs the reconstruction of the hidden information at the speed of light propagation through a thin diffractive volume that axially spans < 100λ, where λ is the wavelength of the illumination light.
We show that this combination of electronic encoding and all-optical decoding is capable of direct optical communication between the transmitter and the receiver even when the opaque occlusion entirely blocks the transmitter's field-of-view (FOV). We also report an experimental demonstration of this scheme using a 3D-printed diffractive decoder that operates in the terahertz part of the spectrum. Furthermore, we demonstrate that this scheme can be configured to be highly power efficient, reaching diffraction efficiencies of >50% at its output. For opaque occlusions that change their size/shape over time, we also show that the encoder neural network can be retrained to successfully communicate with an existing diffractive decoder, without changing its already-deployed physical structure. This makes the presented concept highly dynamic and easy to adapt to external and uncontrolled changes that might occur between the transmitter and receiver apertures. This framework can be extended to operate at different parts of the electromagnetic spectrum and would find applications in emerging high-data-rate free-space communication technologies, in scenarios where undesired structures occlude the direct channel of communication between the transmitter and the receiver.

Results
A schematic depicting the optical communication scheme around an opaque occlusion with zero light transmittance is shown in Fig. 1a. The message to be transmitted, e.g., the image of an object, is fed to an electronic/digital neural network, which outputs a phase-encoded optical representation of the message. This code is imparted onto the phase of a plane-wave illumination, which is transmitted toward the decoder through an aperture that is partially or entirely blocked by an opaque occlusion. The waves scattered from the edges of the opaque occlusion travel toward the receiver aperture as secondary waves, where a diffractive decoder all-optically decodes the received light to directly reproduce the message/object at its output FOV. This decoding operation is completed as the light propagates through the thin decoder layers. For this collaborative encoding-decoding scheme, the electronic encoder neural network and the diffractive decoder are jointly trained in a data-driven manner for effective optical communication, bypassing the fully opaque occlusion positioned between the transmitter aperture and the receiver.
Figures 1b and 1c provide a deeper look into the encoder and decoder architectures used in this work. As shown in Fig. 1b, the convolutional neural network (CNN) encoder is composed of several convolutional layers, followed by a dense layer representing the encoded output. This dense layer output is rearranged into a 2D array corresponding to the spatial grid that maps the phase-encoded transmitter aperture. We assumed that both the desired messages and the phase codes to be transmitted comprise 28 × 28 pixels unless otherwise stated. The architecture of the encoder remains the same across all the designs reported in this paper. The architecture of the diffractive decoder, which decodes the transmitted and obstructed phase-encoded waves, is shown in Fig. 1c. This figure shows a diffractive decoder comprising three spatially engineered surfaces/layers; however, in this work, we also report results for diffractive decoders with one and five layers, used for comparison. Together with the encoder CNN parameters, the spatial features of the diffractive surfaces of the all-optical decoder are optimized to decode the encoded and blocked/obscured wavefront. In this work, we consider phase-only diffractive features, i.e., only the phase values of the features at each diffractive surface are trainable (see the 'Materials and Methods' section for details). Figure 1 also compares the performance of the presented electronic encoding and diffractive decoding scheme to that of a lens-based camera. As shown in Fig. 1d, the lens images reveal a significant loss of information caused by the opaque occlusion in a standard camera system, showcasing the scale of the problem addressed by our proposed approach.
For all the models reported in this work, the data-driven joint training of the electronic encoder CNN and the diffractive decoder was accomplished by minimizing a structural loss function defined between the object (ground-truth message) and the diffractive decoder output, using 55,000 images of handwritten digits from the MNIST (34) training dataset, augmented by 55,000 additional custom-generated images (see the 'Materials and Methods' section as well as Supplementary Fig. S1 for details). All our results come from blind testing with objects/messages never used during training.

Numerical analysis of diffractive optical communication around opaque occlusions
First, we compare, for various levels of opaque occlusion, the performance of trained encoder-decoder pairs with different diffractive decoder architectures in terms of the number of diffractive surfaces employed. Specifically, for each of the occlusion widths of 32.0λ, 53.3λ, and 74.7λ, we designed three encoder-decoder pairs, with one, three, and five diffractive layers within the decoders, and compared the performance of these designs on new handwritten digits in Fig. 2. This blind testing refers to 'internal generalization' because, even though these particular test objects were never used in training, they are from the same dataset. As shown in Fig. 2, even the single-layer designs can faithfully decode the message for optical communication around these various levels of occlusion. Furthermore, as the number of layers in the decoder increases to three or five, the quality of the output improves further. While the performance of the single-layer design deteriorates slightly as the occlusion width increases, the three- and five-layer designs do not show any appreciable degradation in qualitative performance for such bigger occlusions. Note that the width of the transmitter aperture is 59.73λ; therefore, for an occlusion width of 74.7λ, none of the ballistic photons can reach the receiver aperture since the opaque occlusion completely blocks the transmitter aperture. Nonetheless, the scattering from the occlusion edges suffices for the encoder-decoder pair to communicate faithfully.
To supplement the qualitative results of Fig. 2, we also quantified the performance of the different encoder-decoder pairs designed for increasing occlusion widths, in terms of the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM) (35), averaged over 10,000 handwritten digits from the MNIST test set (never used before); see Figs. 3a and 3b, respectively. With increasing occlusion width, we see a larger decrease in the performance of the single-layer designs compared to the three- and five-layer designs. Interestingly, there is a slight improvement in the performance of the one- and three-layer decoders as the occlusion width surpasses the transmitter aperture width of 59.73λ; this improved level of performance is retained for even larger occlusions, the cause of which is discussed later in the Discussion section.
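As a concrete reference for the PSNR metric used above, the following minimal NumPy sketch computes it between a ground-truth message and a decoder output. The function name `psnr` and the normalization to a peak value of 1 are our own assumptions for illustration, not taken from the paper's code (SSIM, the second metric, requires a windowed computation and is omitted here).

```python
import numpy as np

def psnr(target, output, max_val=1.0):
    """Peak signal-to-noise ratio (in dB) between the ground-truth message and the
    decoder output; max_val is the assumed peak intensity of the normalized images."""
    mse = np.mean((target - output) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, an output that is uniformly 10% below a unit-amplitude target yields a PSNR of 20 dB.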
Next, for the same designs reported in Fig. 2, we explored the external generalization of these encoder-decoder pairs by testing their performance on types of objects that were not represented in the training set; see Fig. 4. For this analysis, we randomly chose two images of fashion products from the Fashion-MNIST (36) test set (top) and two additional images from the CIFAR-10 (37) test set (bottom). As shown in Fig. 4, our encoder-decoder designs generalize well to these completely different object types. Although the outputs of the single-layer decoder designs for occlusion widths of 53.3λ and 74.7λ are slightly degraded, the objects are still recognizable at the output plane even when the transmitter aperture is completely blocked by the occlusion.
We also investigated the ability of these designs to resolve closely separated features in their outputs. For this purpose, we transmitted test patterns consisting of four closely spaced dots, and the corresponding diffractive decoder outputs are shown in Fig. 5. For the top (bottom) pattern, the vertical/horizontal separation between the inner edges of the dots is 2.12λ (4.24λ). None of the designs could resolve the dots separated by 2.12λ; however, the dots separated by 4.24λ were resolved by all the encoder-decoder designs with good contrast, as can be seen from the cross-sections accompanying the output images in Fig. 5. Note that this resolution limit of 4.24λ is due to the output pixel size, which was set to 2.12λ in our simulations. The effective resolution of our encoder-decoder system can be further improved, up to the diffraction limit of light, by using higher-resolution objects and a smaller pixel size during training.

Impact of phase bit depth on performance
Here, we study the effect of quantizing the phases of the encoder plane and the diffractive layers with a finite bit depth. For the results presented so far, neither was assumed to be quantized, i.e., an infinite phase bit depth was assumed. For the three-layer design trained for a 32.0λ-wide occlusion (trained assuming an infinite bit depth), the first row of Fig. 6a shows the impact of quantizing the encoded phase patterns as well as the diffractive layer phase values with a finite bit depth. This represents an "attack" on the design since the encoder CNN and the diffractive decoder were trained without such a phase bit-depth restriction; stated differently, they were trained with an infinite bit depth and are now tested at finite bit depths. For this design, the output quality remains unaffected at a bit depth of 8; however, there is considerable degradation at a bit depth of 4, and complete failure at bit depths of 3 and 2. This sharp performance degradation with decreasing bit depth can, however, be remedied by accounting for the finite bit depth during training. To showcase this, we trained two additional three-layer designs for the same 32.0λ occlusion assuming finite bit depths of 4 and 3; their blind testing performance with decreasing test-time bit depth is reported in the second and third rows of Fig. 6a, respectively. Both of these designs remain robust down to a bit depth of 3 (i.e., 8-level phase quantization at the encoder and decoder layers). Even at a bit depth of 2 (only 4-level phase quantization), the outputs are still recognizable, as shown in Fig. 6. We also quantified the performance (PSNR and SSIM) of these three designs (trained with an infinite bit depth and with bit depths of 4 and 3) as a function of the test-time bit depth; see Figs. 6b and 6c. These quantitative comparisons support the same conclusion: training with a lower bit depth yields robust encoder-decoder designs that preserve their optical communication quality despite a reduction in bit depth, albeit with a relatively small sacrifice in output performance.
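The finite bit-depth test described above can be sketched as a simple uniform phase quantizer. The helper below (`quantize_phase` is a hypothetical name) snaps phases to 2^b uniform levels over [0, 2π); this is one plausible quantization rule consistent with the text, not necessarily the authors' exact implementation.

```python
import numpy as np

def quantize_phase(phi, bit_depth):
    """Quantize phase values (radians) to 2**bit_depth uniform levels over [0, 2*pi)."""
    levels = 2 ** bit_depth
    step = 2 * np.pi / levels
    # wrap into [0, 2*pi), snap to the nearest level, and re-wrap
    phi_wrapped = np.mod(phi, 2 * np.pi)
    return np.round(phi_wrapped / step) * step % (2 * np.pi)
```

A bit depth of 3 thus corresponds to the 8-level quantization mentioned in the text, and a bit depth of 2 to 4 levels.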

Output power efficiency
Next, we investigate the power efficiency of the optical communication scheme around opaque occlusions using jointly trained electronic encoder-diffractive decoder pairs. For this analysis, we defined the diffraction efficiency (DE) as the ratio of the optical power at the output FOV to the optical power departing the transmitter aperture. In Fig. 7a, we plot the diffraction efficiency of the same designs shown in Fig. 3 as a function of the occlusion size. These values are calculated by averaging over 10,000 MNIST test images. These results reveal that the diffraction efficiency decreases monotonically with increasing occlusion width, as expected. Moreover, the diffraction efficiencies are relatively low, i.e., below or around 1%, even for small occlusions. However, this issue of low diffraction efficiency can be addressed at the design stage by adding to the training loss function an additional loss term that penalizes low diffraction efficiency (see the Supplementary Materials). Figure 7b depicts the improvement in diffraction efficiency resulting from increasing the weight of this additive loss term during the training stage. For example, the designs trained with loss weights of 0.02 and 0.1 yield average diffraction efficiencies of 27.43% and 52.52%, respectively, while still being able to resolve various features of the target images, as shown in Fig. 7c. This additive loss weight therefore provides a powerful mechanism for significantly improving the output diffraction efficiency with a relatively small sacrifice in image quality, as exemplified in Figs. 7b-c.
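The DE definition above maps directly onto a short NumPy computation. In the sketch below, `diffraction_efficiency` and its arguments are illustrative names under the stated definition: the fields are complex amplitudes sampled on the simulation grid, and `fov_mask` is a boolean array selecting the output FOV.

```python
import numpy as np

def diffraction_efficiency(output_field, input_field, fov_mask):
    """Ratio of the optical power inside the output FOV to the optical power
    leaving the transmitter aperture (power = sum of |amplitude|**2)."""
    p_out = np.sum(np.abs(output_field[fov_mask]) ** 2)
    p_in = np.sum(np.abs(input_field) ** 2)
    return p_out / p_in
```

During training, such a quantity could then be folded into the loss, e.g., as an added penalty proportional to (1 − DE) with the weight described in the text.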

Occlusion shape
So far, we have considered square-shaped opaque occlusions placed symmetrically around the optical axis. However, our proposed encoder-decoder approach is not limited to square-shaped occlusions and, in fact, can be used to communicate around any arbitrary occlusion shape. In Fig. 8, we show a performance comparison of four different trained encoder-decoder pairs for four different occlusion shapes, where the areas of the opaque occlusions were kept approximately the same. We can see that the shape of the occlusion does not have any perceptible effect on the output image quality. We also plot the average SSIM values calculated for these four models over 10,000 MNIST test images (internal generalization) as well as 10,000 Fashion-MNIST test images (external generalization) in Supplementary Fig. S2, which further confirms the success of our approach for different occlusion structures, including randomly shaped occlusions, as shown in Fig. 8e.

Experimental validation
We experimentally validated the electronic encoding-diffractive decoding scheme for communication around an opaque occlusion in the terahertz (THz) part of the spectrum (λ = 0.75 mm) using a 3D-printed single-layer diffractive decoder (see the 'Materials and Methods' section for details). We depict the setup used for this experimental validation in Fig. 9a. Figures 9b and 9c show the 3D-printed components used to implement the encoded (phase) patterns, the opaque occlusion, and the diffractive decoder layer. As shown in Fig. 9c, the width of the transmitter aperture (dashed red square) housing the encoded phase patterns was selected as ≈ 59.73λ, whereas the width of the opaque occlusion (dashed green square) was ≈ 32.0λ, and the width of the diffractive decoder layer (dashed blue square) was selected as ≈ 106.67λ. The axial distances between the encoded object and the occlusion, between the occlusion and the diffractive layer, and between the diffractive layer and the output FOV were ~13.33λ, ~106.67λ, and ~40λ, respectively. In Fig. 9d, we show the input objects/messages, the simulated lens images, and the simulated and experimental diffractive decoder output images for ten different handwritten digits randomly chosen from the test dataset. Our experimental results reveal that CNN-based phase encoding followed by diffractive decoding resulted in the successful communication of the intended objects/messages around the opaque occlusion (see the bottom row of Fig. 9d).

Discussion
Our optical communication scheme using CNN-based encoding and diffractive all-optical decoding would be useful for communicating information around opaque occlusions caused by existing or evolving structures. In case such occlusions change moderately over time (for example, grow in size as a function of time), the same diffractive decoder that is deployed as part of our communication link can still be used with only an update of the digital encoder CNN. To showcase this, Supplementary Fig. S3 illustrates a three-layer encoder-decoder design originally trained with an occlusion width of 32.0λ (blue boxes), successfully communicating the input messages between the CNN-based phase transmitter aperture and the output FOV of the diffractive decoder when the occlusion width remains the same, i.e., 32.0λ (dashed blue box). The same figure also illustrates the failure of this encoder-decoder pair once the width of the opaque occlusion grows to 40.0λ (dotted blue box); this failure due to the (unexpectedly) increased occlusion size can be repaired, without changing the deployed diffractive decoder layers, by retraining only the CNN encoder; see Supplementary Fig. S3, dashed green box.
The speed of optical communication through our encoder-decoder pair would be limited by the rate at which the encoded phase patterns (CNN outputs) can be refreshed or by the speed of the output detector array, whichever is smaller. The transmission and decoding of the desired optical information/message occur at the speed of light propagation through thin diffractive layers and do not consume any external power (except for the illumination light). Therefore, the main power-consuming steps in our architecture are the CNN inference, the refreshing of the encoded phase patterns at the transmitter, and the detector-array operation.
The communication around occlusions using our scheme works even when the occlusion width is larger than the width of the transmitter aperture since it utilizes CNN-based phase encoding of information to effectively exploit the scattering from the edges of the occlusions. Surprisingly, as the occlusion width surpasses the transmitter aperture width, the performance of the one- and three-layer designs slightly improves, as seen in Fig. 3. This relative improvement might be explained by a switch in the mode of operation of our encoder-decoder pair. When the opaque occlusion is smaller than the transmitter aperture, the pixels at the edges of the transmitter can communicate directly with the receiver aperture and therefore dominate the power balance. In this operation regime, as the occlusion size gets larger, the effective number of pixels at the transmitter aperture that directly communicate with the receiver/decoder gets smaller, causing a decline in the performance of the diffractive decoder. However, when the occlusion becomes larger than the transmitter aperture, none of the input pixels can dominate the power balance at the receiver end by communicating with it directly; instead, all the pixels of the encoder plane are forced to contribute indirectly to the receiver aperture through the edge scattering of the occlusion. This causes the performance to improve for occlusions larger than the transmitter aperture since effectively more pixels of the encoder plane can contribute to the receiver aperture without a major power imbalance among these secondary wave-based contributions (through edge scattering). This turnaround in performance (i.e., the switching between these two modes of operation) is not observed when the diffractive decoder has a deeper architecture (e.g., five layers) since deeper decoders can effectively balance the ballistic photons transmitted from the edge pixels; consequently, the edge pixels of the transmitter aperture do not dominate the output signals even when they can directly 'see' the receiver aperture, since the multiple layers of a deeper diffractive decoder act as a universal mode processor (38–41).
Finally, the success of the simpler decoder designs with a single diffractive layer, as shown in Figs. 2–5 and 9, begs the question of whether such optical communication around opaque occlusions is also feasible with electronic encoding only, i.e., without diffractive decoding. To address this question, we trained two encoder-only designs, for occlusion widths of 32.0λ and 53.3λ, and compared their performance against the single-layer decoder designs in Supplementary Fig. S4. The encoder-only architecture barely succeeds for the 32.0λ occlusion and fails drastically for the 53.3λ occlusion, whereas the single-layer decoder designs provide significantly better performance. This demonstrates the importance of complementing electronic encoding with diffractive decoding for effective communication around opaque occlusions.

Model
In our model, the message/object o to be transmitted is fed to a CNN, which yields a phase-encoded representation φ of the message. The message is assumed to be a 28 × 28 pixel image, and the coded phase φ is likewise assumed to have 28 × 28 elements. These phase elements are distributed over a square transmitter aperture of width w_t ≈ 59.73λ, where λ is the illumination wavelength; the lateral width of each phase element/pixel is therefore w_t/28 ≈ 2.12λ. The phase-encoded input wave exp(jφ) propagates a distance of ≈ 13.33λ to the plane of the opaque occlusion, where its amplitude is modulated by the occlusion function s(x, y) such that

u′(x, y) = u(x, y) · s(x, y),

where u(x, y) = exp(jφ(x, y)) is the field incident on the occlusion plane and s(x, y) equals 0 over the opaque occlusion and 1 elsewhere. The encoded wave, after being obstructed and scattered by the occlusion, travels to the receiver through free space. At the receiver, the diffractive decoder all-optically processes and decodes the incoming wave to produce an all-optical reconstruction ô′ of the original message o at its output FOV. We assume the receiver aperture, which coincides with the first layer of the diffractive decoder, to be located at an axial distance of ≈ 106.67λ from the plane of the occlusion. The effective size of the independent diffractive features of each transmissive layer is assumed to be 0.53λ × 0.53λ, and each of the diffractive layers comprises 200 × 200 such features, resulting in a lateral layer width of ≈ 106.67λ. The layer-to-layer separation is assumed to be 40λ. The output FOV of the diffractive decoder is assumed to be 40λ away from the last diffractive layer and to extend over a square area of width ≈ 59.73λ.
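The encoding-and-occlusion step above can be sketched in a few lines of NumPy, assuming a centered square occlusion of zero transmittance. The helper name `apply_occlusion`, the random stand-in for the CNN output, and the pixel sizes are our own illustrative choices.

```python
import numpy as np

def apply_occlusion(field, occ_width_px):
    """Multiply the field by the occlusion function s(x, y): 1 outside a centered
    square opaque occlusion of occ_width_px pixels, 0 over it."""
    n = field.shape[0]
    s = np.ones_like(field)
    lo = (n - occ_width_px) // 2
    hi = lo + occ_width_px
    s[lo:hi, lo:hi] = 0.0
    return field * s

# stand-in for the CNN's 28 x 28 phase code (the real code is a trained network output)
phi = np.random.default_rng(0).uniform(0, 2 * np.pi, (28, 28))
u = np.exp(1j * phi)              # phase-encoded plane wave at the transmitter
u_occ = apply_occlusion(u, 12)    # field just after a 12-pixel-wide occlusion
```

Outside the blocked region the encoded wave keeps unit magnitude (phase-only encoding); over the occlusion it is exactly zero.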
The diffractive decoding at the receiver involves the consecutive modulation of the received wave by the diffractive layers, each followed by propagation through free space. The modulation of the incident optical wave by a diffractive layer is assumed to be realized passively through its height variations. The complex transmittance t(x, y) of a passive diffractive layer is related to its height h(x, y) according to

t(x, y) = a(x, y) exp(jφ_t(x, y)),

where n and κ are the refractive index and the extinction coefficient, respectively, of the diffractive layer material at λ, and a = exp(−2πκh/λ) and φ_t = 2π(n − 1)h/λ are the amplitude and the phase of the complex field transmittance, respectively. For our numerical simulations, we assume the diffractive layers to be lossless, i.e., κ = 0 and a = 1, unless stated otherwise.
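The height-to-transmittance relation above reduces to a one-line NumPy helper. `layer_transmittance` is an illustrative name; the arguments follow the n, κ, and h symbols defined in the text.

```python
import numpy as np

def layer_transmittance(h, wavelength, n_index, kappa):
    """Complex transmittance t of a passive diffractive layer with height map h:
    amplitude a = exp(-2*pi*kappa*h/wavelength), phase = 2*pi*(n_index-1)*h/wavelength."""
    a = np.exp(-2 * np.pi * kappa * h / wavelength)
    phi = 2 * np.pi * (n_index - 1) * h / wavelength
    return a * np.exp(1j * phi)
```

For a lossless layer (κ = 0), a height of λ/(2(n − 1)) imparts a π phase shift, i.e., t = −1.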
The propagation of the optical fields through free space is modeled using the angular spectrum method (33, 42), according to which the transformation of an optical field U(x, y) after propagation by an axial distance z can be computed as

U(x, y; z) = ℱ⁻¹{ℱ{U(x, y; 0)} · H(f_x, f_y; z)},

where ℱ (ℱ⁻¹) is the two-dimensional Fourier (inverse Fourier) transform operator and H(f_x, f_y; z) is the free-space transfer function for propagation by an axial distance z, defined as

H(f_x, f_y; z) = exp(j2πz √(1/λ² − f_x² − f_y²)), for f_x² + f_y² < 1/λ²,
H(f_x, f_y; z) = 0, otherwise.
In our numerical analyses, the optical fields were sampled at an interval of ≈ 0.53λ along both the x and y directions, and the Fourier (inverse Fourier) transforms were implemented using the Fast Fourier Transform (FFT) algorithm.
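The FFT-based angular spectrum propagation described above can be sketched in NumPy as follows. This is a minimal implementation of the stated transfer function, not the authors' code; the evanescent band with f_x² + f_y² ≥ 1/λ² is zeroed out as in the definition.

```python
import numpy as np

def angular_spectrum_propagate(u, wavelength, dx, z):
    """Propagate a sampled complex field u by an axial distance z using the angular
    spectrum method: FFT, multiply by the free-space transfer function H (with
    evanescent components set to zero), and inverse FFT."""
    n = u.shape[0]
    fx = np.fft.fftfreq(n, d=dx)                    # spatial frequencies (cycles/unit)
    fxx, fyy = np.meshgrid(fx, fx, indexing="ij")
    arg = 1.0 / wavelength ** 2 - fxx ** 2 - fyy ** 2
    # propagating band only; clamp avoids sqrt of negative values in the dead band
    H = np.where(arg > 0,
                 np.exp(1j * 2 * np.pi * z * np.sqrt(np.maximum(arg, 0.0))),
                 0.0)
    return np.fft.ifft2(np.fft.fft2(u) * H)
```

A quick sanity check: a uniform plane wave only has a DC spectral component, so propagating it by z multiplies it by exp(j2πz/λ).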
For the lens-based imaging simulations reported in this work, the plane-wave illumination was assumed to be amplitude-modulated by the object placed at the transmitter aperture, and the (thin) lens was assumed to be placed at the same plane as the first diffractive layer in the encoding-decoding scheme, with the diameter of the lens aperture equal to the width of the diffractive layer, i.e., ≈ 106.67λ.

Experimental design
In our experiments, the wavelength of operation was λ = 0.75 mm. We used a single-layer diffractive decoder with 200 × 200 independent features, where the width of each feature was ~0.53λ ≈ 0.40 mm, resulting in an ~80 mm × 80 mm diffractive layer. The width of the transmitter aperture accommodating the encoded phase messages was ≈ 59.73λ ≈ 44.8 mm, the same as the width of the output FOV. The occlusion width was ≈ 32λ ≈ 24 mm. The distance from the transmitter aperture to the occlusion plane was ≈ 13.33λ ≈ 10 mm, while the diffractive layer was ≈ 106.67λ ≈ 80 mm away from the occlusion plane. The output FOV was 40λ ≈ 30 mm away from the diffractive layer.
The diffractive layers and the phase-encoded messages (CNN outputs) were fabricated using a 3D printer (Objet30 Pro, Stratasys Ltd.). Similar to the implementation of the diffractive layer phase, the phase-encoded messages were implemented through height variations according to h = φλ/(2π(n − 1)). The height variations were applied on top of a uniform base thickness of 0.2 mm, used for mechanical support. The occlusion was realized by pasting aluminum on a 3D-printed substrate (see Fig. 9). The measured complex refractive index n + jκ of the 3D-printing material at λ = 0.75 mm was 1.6518 + j0.0612.
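The phase-to-height conversion for 3D printing can be sketched as below; `phase_to_height` is a hypothetical helper following the h = φλ/(2π(n − 1)) relation and the 0.2 mm base thickness stated above.

```python
import numpy as np

def phase_to_height(phi, wavelength, n_index, base=0.2):
    """Convert a target phase map (radians) into a 3D-print height map (same units
    as wavelength, here mm): h = base + mod(phi, 2*pi) * wavelength / (2*pi*(n_index - 1))."""
    return base + np.mod(phi, 2 * np.pi) * wavelength / (2 * np.pi * (n_index - 1))
```

With the measured index n = 1.6518 at λ = 0.75 mm, a π phase shift corresponds to roughly 0.575 mm of material on top of the base.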
While training the experimental model, the weight of the diffraction efficiency-related loss term was set to zero. To make the experimental design robust against misalignments, we incorporated random lateral and axial misalignments of the encoded objects, the occlusion, and the diffractive layer into the optical forward model during its training (45). The random misalignments were modeled using uniformly distributed random variables Δx ~ U(−Δ, Δ), Δy ~ U(−Δ, Δ), and Δz ~ U(−2Δ, 2Δ), representing the displacements of the encoded objects, the occlusion, and the diffractive layer along the x, y, and z directions, respectively, from their nominal positions.

Terahertz experimental setup
A WR2.2 modular amplifier/multiplier chain (AMC) in conjunction with a compatible diagonal horn antenna from Virginia Diodes Inc. was used to generate continuous-wave (CW) radiation at 0.4 THz by multiplying a 10 dBm RF input signal at 11.1111 GHz by a factor of 36. To resolve low-noise output data through lock-in detection, the AMC output was modulated at a rate of 1 kHz. The exit aperture of the horn antenna was positioned ~60 cm away from the input (encoded object) plane of the 3D-printed diffractive decoder so that the incident THz wavefront was approximately planar. A single-pixel Mixer/AMC, also from Virginia Diodes Inc., was used to detect the diffracted THz radiation at the output plane. To down-convert the detected signal to 1 GHz, a 10 dBm local oscillator signal at 11.0833 GHz was fed to the detector. The detector was placed on an X-Y positioning stage consisting of two linear motorized stages (Thorlabs NRT100), and the output FOV was scanned using a 0.5 × 0.1 mm detector with a scanning interval of 2 mm. The down-converted signal was amplified by 40 dB using cascaded low-noise amplifiers (Mini-Circuits ZRL-1150-LN+) and passed through a 1 GHz (+/- 10 MHz) bandpass filter (KL Electronics 3C40-1000/T10-O/O) to filter out noise from unwanted frequency bands. The filtered signal was attenuated by a tunable attenuator (HP 8495B) for linear calibration and then detected by a low-noise power detector (Mini-Circuits ZX47-60). The output voltage signal was read out using a lock-in amplifier (Stanford Research SR830), where the 1 kHz modulation signal served as the reference. The lock-in amplifier readings were converted to a linear scale according to the calibration results. To enhance the signal-to-noise ratio (SNR), 2 × 2 binning was applied to the THz measurements. We also digitally enhanced the contrast of the measurements by saturating the top 1% and the bottom 1% of the pixel values using the built-in MATLAB function imadjust and mapping the resulting image to a dynamic range between 0 and 1.
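The final contrast-enhancement step can be approximated in NumPy as follows. This is a sketch of the percentile-saturation-and-rescale operation; it mimics, but is not identical to, MATLAB's imadjust, and the name `stretch_contrast` is our own.

```python
import numpy as np

def stretch_contrast(img, low_pct=1.0, high_pct=99.0):
    """Saturate the bottom low_pct and top (100 - high_pct) percent of pixel values
    and linearly rescale the result to the [0, 1] dynamic range."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)
```

After this step the darkest 1% of pixels map to 0 and the brightest 1% map to 1, with the remaining values stretched linearly in between.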
Supplementary Materials: This file contains the training details and Supplementary Figures S1-S4.

Fig. 2
Fig. 2 Generalization of trained encoder-decoder pairs to previously unseen handwritten digit objects. For different values of the occlusion width, the performances of trained encoder-decoder pairs with different numbers of decoder layers are depicted for comparison.

Fig. 3
Fig. 3 Quantification of the performance of encoder-decoder pairs with different numbers of decoder layers, trained for increasing occlusion widths, in terms of (a) PSNR and (b) SSIM between the diffractive decoder outputs and the ground-truth messages. The PSNR and SSIM values are calculated by averaging over 10,000 MNIST test images. The width of the transmitter aperture is indicated for reference.

Fig. 4
Fig. 4 Same as Fig. 2, except that these results reflect external generalizations on object types different from those used during the training.

Fig. 5
Fig. 5 Output resolution of diffractive decoders corresponding to one-, three-, and five-layer designs trained for different occlusion widths. For the test objects, the vertical/horizontal separation between the inner edges of the dots is 2.12λ for the pattern on the top and 4.24λ for the one below. The diffractive decoder outputs are accompanied by cross-sections taken along the color-coded vertical/horizontal lines.

Fig. 6
Fig. 6 Effect of the phase bit depth of the encoded object and the diffractive layer features on the performance of trained encoder-decoder pairs. (a) Qualitative performance of designs trained assuming a certain phase quantization bit depth, reported as a function of the bit depth used during testing. (b) PSNR and SSIM values plotted as a function of the test-time bit depth for designs trained with different bit depths. The PSNR and SSIM values are evaluated by averaging the results of 10,000 test images from the MNIST dataset.

Fig. 7
Fig. 7 Output power efficiency of the electronic encoding-diffractive decoding scheme for optical communication around fully opaque occlusions. (a) Diffraction efficiency (DE) of the same designs shown in Fig. 3. (b) The trade-off between DE and SSIM achieved by varying a training hyperparameter, i.e., the weight of an additive loss term used for penalizing low-efficiency designs. For these designs, an occlusion width of 32λ and a three-layer decoder were used. The DE and SSIM values are calculated by averaging over 10,000 MNIST test images. (c) The performance of some of the designs shown in (b), trained with different loss weights.

Fig. 8
Fig. 8 Performance of encoder-decoder pairs trained for different opaque occlusion shapes. The performances of four designs trained for different occlusion shapes, i.e., a square, a circle, a rectangle, and an arbitrary shape, are shown. The areas of these fully opaque occlusions are approximately equal.

Fig. 9
Fig. 9 Experimental results with a single-layer decoder design for an occlusion width of 32λ operating at a wavelength of λ = 0.75 mm. (a) The terahertz setup comprising the source and the detector, together with the 3D-printed components used as the encoded phase objects, the occlusion, and the diffractive