Abstract
Recognition of multifrequency microwave (MW) electric fields is challenging because of the complex interference of multifrequency fields in practical applications. Rydberg atom-based measurement of multifrequency MW electric fields is promising for MW radar and MW communications. However, Rydberg atoms are sensitive not only to the MW signal but also to noise from atomic collisions and the environment, meaning that solution of the governing Lindblad master equation of light-atom interactions is complicated by the inclusion of noise and high-order terms. Here, we solve these problems by combining Rydberg atoms with a deep learning model, demonstrating that this model exploits the sensitivity of the Rydberg atoms while also reducing the impact of noise without solving the master equation. As a proof-of-principle demonstration, the deep learning enhanced Rydberg receiver allows direct decoding of frequency-division multiplexed signals. This type of sensing technology is expected to benefit Rydberg-based MW field sensing and communication.
Introduction
The strong interaction between Rydberg atoms and microwave (MW) fields that results from their high polarizability makes the Rydberg atom a candidate medium for MW field measurement, e.g., using electromagnetically induced absorption^{1}, electromagnetically induced transparency (EIT)^{2,3} and the Autler–Townes effect^{3,4,5,6}. The amplitudes^{7,8,9,10}, phases^{10,11} and frequencies^{9,10} of MW fields can then be measured with high sensitivity. Based on this measurement sensitivity, the Rydberg atom has been used in communications^{7,8,12,13} and radar^{14} as an atom-based radio receiver. In the communications field, the Rydberg atom replaces the traditional antenna with superior performance aspects that include subwavelength size, high sensitivity, International System of Units (SI) traceability to Planck's constant, high dynamic range, self-calibration and an operating range that spans from MHz to THz frequencies^{7,9,10,15,16}. One application is analogue communications, e.g., real-time recording and reconstruction of audio signals^{13}. Another application is digital communications, e.g., phase-shift keying and quadrature amplitude modulation^{7,8,12}. The channel capacity of MW-based communications is limited by the standard quantum-limited phase uncertainty^{7}. Furthermore, a continuously tunable radio-frequency carrier has been realized based on Rydberg atoms^{17}, thus paving the way for concurrent multichannel communications. Detection and decoding of multifrequency MW fields are highly important in communications for acceleration of information transmission and improved bandwidth efficiency. Additionally, MW field recognition enables simultaneous detection of multiple targets with different velocities from the multifrequency spectrum induced by the Doppler effect. However, because of the sensitivity of Rydberg atoms, noise is superimposed on the message, meaning that the message cannot be recovered efficiently.
Additionally, it is difficult to generalize and scale the bandpass filters to enable demultiplexing of multifrequency signals with more carriers^{16}.
To solve these problems, we use a deep learning model for its accurate signal prediction capability and its outstanding ability to recognize complex information from noisy data without the use of complex circuits. The deep learning model updates its weights via backpropagation and extracts features from massive data without human intervention or prior knowledge of the physics and the experimental system. Because of these advantages, physicists have constructed complex neural networks to complete numerous tasks, including far-field subwavelength acoustic imaging^{18}, value estimation of a stochastic magnetic field^{19}, vortex light recognition^{20,21}, demultiplexing of an orbital angular momentum beam^{22,23} and automatic control of experiments^{24,25,26,27,28,29}.
Here, we demonstrate a deep learning enhanced Rydberg receiver for frequency-division multiplexed digital communication. In our experiment, the Rydberg atoms act as a sensitive antenna and a mixer to receive multifrequency MW signals and extract information^{9,11,12}. The modulated signal frequency is reduced from several gigahertz to several kilohertz via the interaction between the Rydberg atoms and the MW fields, thus allowing the information to be extracted using simple apparatus. These interference signals are then fed into a well-trained deep learning model to retrieve the messages. The deep learning model extracts the multifrequency MW signal phases without any knowledge of the Lindblad master equation, which describes the interactions between atoms and light beams in an open system. The solution of the master equation is often complex because higher-order terms and noise from the environment and from interatomic collisions must be taken into consideration. However, the deep learning model is robust to this noise because of its generalization ability, which exploits the sensitivity of the Rydberg atoms while also reducing the impact of the noise that results from this sensitivity. Our deep learning model is scalable, allowing it to recognize the information carried by more than 20 MW bins. Additionally, once training is complete, the deep learning model extracts the phases more rapidly than direct solution of the master equation.
Results
Setup
We adopt a two-photon Rydberg-EIT scheme to excite atoms from a ground state to a Rydberg state. A probe field drives the atomic transition \(|5S_{1/2},F=2\rangle \to |5P_{1/2},F^{\prime}=3\rangle\) and a coupling light couples the transition \(|5P_{1/2},F^{\prime}=3\rangle \to |51D_{3/2}\rangle\) in rubidium-85, as shown in Fig. 1a. Multifrequency MW fields drive a radio-frequency (RF) transition between the two different Rydberg states \(|51D_{3/2}\rangle\) and \(|50F_{5/2}\rangle\). The energy difference between these states corresponds to 17.62 GHz. The multifrequency MW fields consist of multiple MW bins (more than three bins) with frequency differences of several kilohertz from the resonance frequency. The amplitudes, frequencies and phases of the multiple MW bins can be adjusted individually (further details are provided in the “Methods” section). The detunings of the probe, coupling and MW fields are Δ_{p}, Δ_{c} and Δ_{s}, respectively. The Rabi frequencies of the probe, coupling and MW fields are Ω_{p}, Ω_{c} and Ω_{s}, respectively. The experimental setup is depicted in Fig. 1b. We use MW fields to drive the Rydberg states constantly, producing modulated EIT spectra, i.e., the probe transmission spectra, as shown in the inset of Fig. 1b. The phases of the MW fields correlate with the modulated EIT spectra and can be recovered from these spectra with the aid of deep learning. Specifically, the probe transmission spectra are fed into a well-trained deep learning model that consists of a one-dimensional convolution (1D CNN) layer, a bidirectional long short-term memory (BiLSTM) layer and a dense layer to extract the phases of the MW fields. Figure 1c–e shows these components of the neural network (further details are presented in the “Methods” section). Finally, the bin phases are recovered and the data are read out.
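The 1D CNN → BiLSTM → dense pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the layer widths, kernel size and input length are assumptions, and the final sigmoid simply rescales the predicted phases to [0, 1].

```python
import torch
import torch.nn as nn

class RydbergDecoder(nn.Module):
    """1D CNN -> BiLSTM -> dense, mirroring the pipeline of Fig. 1c-e."""
    def __init__(self, n_phases=4):
        super().__init__()
        # Convolutional front end shortens the long spectrum sequence
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, stride=2),
            nn.BatchNorm1d(16),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        # Bidirectional LSTM keeps the sequential (long-term) information
        self.lstm = nn.LSTM(input_size=16, hidden_size=32,
                            batch_first=True, bidirectional=True)
        # Dense head resizes the output to match the label (phase) size
        self.head = nn.Sequential(nn.Linear(64, n_phases), nn.Sigmoid())

    def forward(self, x):              # x: (batch, 1, N) probe spectra
        h = self.features(x)           # (batch, 16, N')
        h = h.transpose(1, 2)          # (batch, N', 16) for the LSTM
        out, _ = self.lstm(h)          # (batch, N', 64)
        return self.head(out[:, -1])   # phases, scaled to [0, 1]

model = RydbergDecoder()
spectra = torch.randn(8, 1, 1024)      # a batch of 8 synthetic spectra
phases = model(spectra)                # shape (8, 4)
```

The hidden sizes here are placeholders; the paper's hyperparameters are given in its "Methods" section.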
Frequency-division multiplexed signal encoding and receiving
In the experiments, we use a four-bin frequency-division multiplexing (FDM) MW signal for demonstration, where one of the four MW bins is used as the reference bin. The relative phase differences between the reference bin and the other bins are modulated by the message signal. Specifically, the four-bin MW signal is

\(E(t)={\sum }_{i=1}^{4}{A}_{i}\cos [({\omega }_{0}+{\omega }_{i})t+{\varphi }_{i}],\)
where ω_{0} is the resonant frequency, ω_{1,2,3,4} are the relative frequencies, the carrier frequencies are ω_{0} + ω_{1} = 2π × (17.62 GHz − 3 kHz), ω_{0} + ω_{2} = 2π × (17.62 GHz − 1 kHz), ω_{0} + ω_{3} = 2π × (17.62 GHz + 1 kHz) and ω_{0} + ω_{4} = 2π × (17.62 GHz + 3 kHz), the frequency difference between two frequency-adjacent bins is Δf = 2 kHz, the message signal is φ_{1,2,3} = 0 or π, standing for 3 bits (0 or 1), and the reference phase is φ_{4} = 0 (which remains unchanged). The phase list \(\left({\varphi }_{1},\,{\varphi }_{2},\,{\varphi }_{3},\,{\varphi }_{4}\right)\) is a bit string for time t_{0}. By varying the phases φ_{1,2,3} with time, we obtain the FDM signal for binary phase-shift keying (2PSK). Additionally, the amplitudes of the four bins satisfy 0.1A_{4} = A_{1,2,3} to solve the problem that results from the nonlinearity of the atom, in which the probe transmission spectra of two different bit strings, e.g., (0, 0, π, 0) and (0, π, 0, 0), would otherwise be the same (further details are presented in the “Methods” section). By increasing the frequency difference Δf, we can obtain higher information transmission rates. For four bins with Δf = 2 kHz, the information transmission rate is n_{b} × Δf = (4 − 1) × 2 × 10^{3} bps = 6 kbps, where n_{b} is the number of bits. In the experiments, disturbances originate from the environment and atomic collisions. Because of the sensitivity of Rydberg atoms to MW fields, the resulting noise submerges our signal. To use the sensitivity of the Rydberg atoms and simultaneously minimize the effects of noise, the deep learning model is used to extract the relative phases \(\left({\varphi }_{1},\,{\varphi }_{2},\,{\varphi }_{3}\right)\).
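The role of the amplitude asymmetry 0.1A_{4} = A_{1,2,3} can be checked numerically. The sketch below (the sampling grid and symbol length are illustrative assumptions) computes the beat envelope \(\sqrt{E_1^2+E_2^2}\) of a bit string: with equal amplitudes, the strings (0, 0, π, 0) and (0, π, 0, 0) yield identical envelopes (the two fields are complex conjugates), while the asymmetric amplitudes make them distinguishable.

```python
import numpy as np

def envelope(bits, t, amps):
    """Beat envelope sqrt(E1^2 + E2^2) of one FDM-2PSK symbol.

    bits: three 0/1 values mapped to phases 0/pi; the fourth (reference)
    phase is fixed to 0, as in the text.
    """
    rel = 2 * np.pi * np.array([-3e3, -1e3, 1e3, 3e3])      # omega_1..omega_4
    phases = np.array([np.pi * b for b in bits] + [0.0])    # phi_4 = 0
    field = sum(a * np.exp(1j * (w * t + p))
                for a, w, p in zip(amps, rel, phases))
    return np.abs(field)

t = np.linspace(0.0, 5e-4, 1000)            # one 0.5 ms symbol (Delta f = 2 kHz)
unequal = np.array([0.1, 0.1, 0.1, 1.0])    # 0.1*A4 = A_{1,2,3}
equal = np.ones(4)
```

With `equal` amplitudes, `envelope((0, 0, 1), t, equal)` and `envelope((0, 1, 0), t, equal)` coincide, which is exactly the ambiguity removed by the amplitude choice.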
Deep learning
To improve the robustness and speed of our receiver, we use a deep learning model to decode the probe transmission signal. The complete encoding and decoding process is illustrated in Fig. 2a. The Rydberg antenna receives the FDM-2PSK signal and downconverts this signal into the probe transmission spectrum. The information is then retrieved from the spectrum using the deep learning model. The precondition is that different bit strings correspond to distinct probe spectra; this is ensured by setting 0.1A_{4} = A_{1,2,3}, as discussed earlier. We combine the 1D CNN layer, the BiLSTM layer and the dense layer to form the deep learning model (see the “Methods” section for further details)^{30,31}. One reason for using the 1D CNN layer and the BiLSTM layer is that the data sequences are long, which means that prediction of the phases \({{{{{\boldsymbol{\varphi }}}}}}=\left({\varphi }_{1},\,{\varphi }_{2},\,{\varphi }_{3},\,0\right)\) from the spectrum is a regression task and requires a long-term memory for our model. Another reason is to combine the convolution layer's speed with the sequential sensitivity of the BiLSTM layer^{32}. The input sequence is first processed by the 1D CNN to extract features, meaning that a long sequence is converted into a shorter sequence with higher-order features. This process is visualized to show how the deep learning model treats the transmission spectrum; more details are presented in the Supplementary Materials. The shorter sequence is then fed into the BiLSTM layer and resized by the dense layer to match the label size (see the “Methods” section for further details).
Specifically, the probe spectrum \({{{{{\bf{T}}}}}}=\left\{{T}_{0},\,{T}_{\tau },\,{T}_{2\tau },\,\cdots \,,\,{T}_{{{{{{\rm{i}}}}}}\cdot \tau }\cdots \,,\,{T}_{{{{{{\rm{N}}}}}}\cdot \tau }\right\}\) and the corresponding phases \({{{{{\boldsymbol{\varphi }}}}}}=\left({\varphi }_{1},\,{\varphi }_{2},\,{\varphi }_{3},\,{\varphi }_{4}=0\right)\) are collected to form the data set, where T_{i⋅τ} is the ith data point of a probe spectrum and the fourth bit φ_{4} = 0 is the reference bit. Both the spectra and the phases are 1D vectors with dimensions of N + 1 and 4, respectively. These independent, identically distributed data {{T}, {φ}} are fed into our model as a data set. By shuffling this data set and splitting it into three sets, i.e., a training set, a validation set and a test set, we train our model on the training set (feeding both the waveforms and labels {{T}, {φ}}), validate, and test our model on the validation and test sets, respectively (by feeding waveforms without labels and comparing the predictions with ground truth labels). The validation set is used to determine whether there is either overfitting or underfitting during training. Finally, the performance (i.e., accuracy) of the model is estimated by predicting the test set.
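The shuffle-and-split step can be sketched with NumPy as follows (the set sizes and the 70/15/15 ratio are illustrative assumptions; the text does not state the exact split):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_samples, n_points = 1000, 512
T = rng.normal(size=(n_samples, n_points))             # synthetic probe spectra
phi = rng.integers(0, 2, size=(n_samples, 4)) * np.pi  # phase labels (0 or pi)
phi[:, 3] = 0.0                                        # reference phase fixed to 0

# Shuffle before splitting so the noise is i.i.d. across the three sets
idx = rng.permutation(n_samples)
train_idx, val_idx, test_idx = np.split(idx, [700, 850])
```

Shuffling before the split is what makes the systematic-noise pattern statistically identical in the training and test sets, as discussed below.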
The performance of our deep learning model is affected by the number of training epochs and the training and validation set sizes. The training curves on different training sets and validation sets are shown in Fig. 2b, c. Initially, our model performs well on the training set only, implying overfitting. The curves then converge (dashed line) and our model performs well on both the training set and the validation set. The sudden jump in the loss curve in Fig. 2c is caused by the change in the learning rate (further details are presented in the “Methods” section). Use of more training and validation data causes the curves to converge more quickly. The deep learning model performs well after this few-sample training. In Fig. 2d, we show a confusion matrix for prediction of a uniformly distributed test set, which demonstrates an accuracy of 99.38%.
The “noise” shown in Fig. 2a refers to two kinds of noise. One comes from the atoms and the external environment (systematic noise). The other is added on purpose (additional noise). The systematic noise cannot be adjusted quantitatively and is discussed along with its noise spectrum in the Supplementary Materials. Because the noise on the data set is independent and identically distributed (i.i.d.), i.e., the entire data set is shuffled before being split into the training and test sets, the systematic noise pattern is almost the same in both the training set and the test set. The deep learning model has already learned the systematic noise pattern during the training process, which is one of the major advantages of deep learning against systematic noise. However, there is a case where the noise is not i.i.d. (i.e., the case where a specific noise occurs during testing only). This problem can be solved by online learning and the addition of prior knowledge as new features in the data, e.g., data for the temperature, the weather and other factors^{33}. Here, for simplicity, we consider the i.i.d. case only and add white noise with a mean μ and a standard deviation σ. We ignore the 1/f noise in this case because it decays rapidly in the low-frequency range and the signal with which it would interfere is located within the 2–200 kHz range. The additional noise is added to both the training set and the test set of the deep learning model in Fig. 2e, which demonstrates the performance of the deep learning model when used on a data set with biased or unbalanced noise. The results below the red line show the performance of the model after training on a weaker-noise training set when predicting a stronger-noise test set, i.e., generalization to a stronger-noise case. These results indicate that the deep learning model has the generalization ability required to adapt to stronger noise.
In the area above the red line, there is more noise in the training set than in the test set. Theoretically, a small amount of additional noise in the training set will increase the robustness of the deep learning model. However, when the noise increases further, it affects the accuracy, which decays rapidly. Next, the well-trained model is used to reconstruct a QR code. In Fig. 3a–c, the results and the corresponding confusion matrices with their epochs are shown. First, the information is encoded into a QR code. After the code is transmitted, received and decoded using the Rydberg atoms and the deep learning model, the information is reconstructed successfully using the 35-epoch training model in Fig. 3c, but not using the models in Fig. 3a, b. The accuracy is defined as the number of correctly predicted bit strings divided by the total number of bit strings (147 bit strings). After 35 epochs, the accuracy reaches 99.32% and the message is reconstructed successfully from the received QR code.
Comparison between the deep learning method and the master equation
In our case, the master equation that we employ is the commonly used one, without consideration of the noise spectrum. The accuracies of the deep learning model and master equation fitting on noisy data are different. Figure 4 shows the accuracies obtained by the two methods. The deep learning model is trained on a training set without additional noise and tested on a test set with additional white noise of standard deviation σ (the transmission spectra with noise are given in the Supplementary Materials). Here, for simplicity, the data set is composed of the transmission of four MW bins only (one of which is the reference bin) and the frequency difference between adjacent bins is Δf = 2 kHz. The result of the master equation is obtained on the same test set as that of the deep learning model. The deep learning method outperforms the fitting of the master equation on the noisy data set.
Apart from the robustness to noise, when the transmission rate is increased by increasing the number of MW bins or the frequency difference Δf, the deep learning model performs well, while it is difficult to retrieve the messages with high accuracy using the master equation. Specifically, to increase the bandwidth efficiency and the transmission rate, the number of MW bins used to carry the messages must be increased, but the information is still recognizable because of the scalability of the deep learning model. For 20 MW bins, the number of bits is (20 − 1) with one reference bit, giving a (20 − 1) × 2 kbps = 38 kbps transmission rate. The number of combinations of these bits is 2^{19}, which increases exponentially as the number of MW bins increases. Here, for demonstration purposes, only the first 3 bits of the total of 19 bits carry the messages and the other bits, including the reference, are set to 0. To show how well our model performs, we train, validate and test the model on this new data set without varying any other parameters, with the exception of the number of training epochs. The loss curves for training and validation are shown in Fig. 5a. A confusion matrix for epoch 78 is shown in Fig. 5b. The model performs well on this new test set, which was sampled uniformly from eight categories, with an accuracy of 100%. Another method that can be used to increase the information transmission rate involves increasing the frequency difference. In our case, the frequency difference is increased from Δf = 2 kHz to Δf = 200 kHz. The transmission rate increases correspondingly, from (4 − 1) × 2 kbps = 6 kbps to (4 − 1) × 200 kbps = 0.6 Mbps. To detect the high-speed signal, the DD bandwidth is increased, which inevitably leads to increased noise. After the model is trained on this new data set, the training and validation loss curves are as shown in Fig. 5c. A confusion matrix for epoch 83 is shown in Fig. 5d.
Increasing the number of training epochs allows the model to perform well on this new data set, with an accuracy of 98.83% on a uniformly sampled test set.
To compare the performance of the deep learning model and the master equation, we fitted the probe spectra for 20 bins with a frequency difference Δf = 2 kHz and for four bins with a frequency difference Δf = 200 kHz by solving the master equation without considering the higher-order terms and the effects of noise. In each case, 160 probe spectra were fitted that were sampled uniformly from every category. The prediction results are shown in Fig. 5e, f. The prediction accuracy of the master equation is lower than that of the deep learning model. In our case, increasing the number of bins has a greater impact on the fitting accuracy than increasing the DD bandwidth for high-speed signals. The prediction accuracy for a 20-bin carrier with frequency difference Δf = 2 kHz is 20.63%, which is close to the accuracy of random guessing, i.e., 1/8. This implies a disadvantage that comes from the fitting method itself, i.e., it can easily become trapped in local minima. Some type of prior knowledge is required to overcome this disadvantage, e.g., provision of the initial values of the phases before fitting. In contrast, the deep learning model is data driven and does not require any prior knowledge. The local minima problem of deep learning can be overcome using some well-known techniques, including learning rate scheduling and design of a more effective optimizer^{32}. Additionally, the accuracy difference for the 200-kHz-difference MW bins between the deep learning model and the master equation shows that the deep learning model is more robust to noise. Furthermore, the prediction time for the master equation is 25 s per spectrum, while that for the deep learning model is 1.6 ms per spectrum. The master equation is solved using the “FindFit” function in Mathematica 11.1 with both “AccuracyGoal” and “PrecisionGoal” at their default values, while the deep learning code is written in Python 3.7.6.
These codes are run on the same computer with an NVIDIA GTX 1650 GPU and an Intel® Core™ i7-9750H CPU.
Another method to decode the signal uses an in-phase and quadrature (I–Q) demodulator or a lock-in amplifier^{7,12}. However, the carrier frequency must be known when decoding the signal in this case. Additionally, for multiple MW bins, numerous bandpass filters are required. The deep learning method is thus much more convenient.
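For contrast, a digital I–Q (lock-in) demodulation of a single bin can be sketched as below. It works only because the reference frequency is known in advance, which is precisely the limitation discussed above (all parameters here are illustrative):

```python
import numpy as np

fs, f_ref = 1.0e6, 2.0e3              # sampling rate and known bin frequency
t = np.arange(50000) / fs             # 50 ms record (integer number of periods)
signal = np.cos(2 * np.pi * f_ref * t + np.pi / 3)   # toy beat note, phase pi/3

# Mix with in-phase and quadrature references, then average (low-pass filter)
i = 2 * np.mean(signal * np.cos(2 * np.pi * f_ref * t))
q = -2 * np.mean(signal * np.sin(2 * np.pi * f_ref * t))
phase = np.arctan2(q, i)              # recovered phase, here pi/3
```

Demodulating many bins this way requires one such reference (and one bandpass filter) per bin, which is what the deep learning model avoids.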
Discussion
We report a Rydberg receiver enhanced via deep learning to detect multifrequency MW fields. The results show that the deep learning enhanced Rydberg mixer receives and decodes multifrequency MW fields efficiently; these fields are often difficult to decode using theoretical methods. Using the deep learning model, the Rydberg receiver is robust to the noise induced by the environment and atomic collisions and is immune to the distortion that results from the limited bandwidths of the Rydberg atoms (from dipole-dipole interactions and the EIT pumping rate, as studied in ref. ^{7}) for high-speed signals (Δf = 200 kHz). In addition to increasing the transmission speed of the signals, further increases in the information transmission rate are achieved by using more bins, which is feasible because of the scalability of our model. Beyond the transmission rate, this deep learning enhanced Rydberg system is promising for studies of channel capacity limitations. Because spectra that are difficult for humans to recognize as a result of noise and distortion are distinguishable using the deep learning model, Rydberg systems enhanced by deep learning could take steps toward realization of the capacity limit proposed in ref. ^{34}. To obtain high performance (i.e., high signal-to-noise ratio, information transmission rate, channel capacity and accuracy), the number of training epochs and the size of the training set must be increased.
In summary, we have demonstrated the advantages of receiving and decoding multifrequency signals using a deep learning enhanced Rydberg receiver. In a multifrequency signal receiver, rather than using multiple bandpass filters, lock-in amplifiers^{7,12} and other complex circuits, signals can be decoded using the extremely sensitive Rydberg atoms and the deep learning model at high speed and with high accuracy, without solving the Lindblad master equation. One advantage of using the Rydberg atom is that its accuracy approaches the photon shot-noise limit^{35}. In principle, the accuracy of the Rydberg atom is higher than that of a classical antenna. According to recent work based on the atomic superheterodyne method, ultrahigh sensitivity can be obtained^{10}. However, in this proof-of-principle demonstration, there is considerable room for the optimization required to reach that limit (e.g., stabilization of the laser, narrowing of the laser linewidth and temperature stabilization). The sensitivity of the Rydberg atoms is a double-edged sword because it also admits noise. The deep learning model restricts this side effect while taking full advantage of the Rydberg atoms' sensitivity to the signal. Using the automatic feature extraction processes of the neural networks, the spectra are classified in a supervised manner. If the features (e.g., mean value, variance, frequency spectrum) are extracted manually, the spectra can instead be clustered by unsupervised learning methods such as t-distributed stochastic neighbour embedding (t-SNE) or density-based spatial clustering of applications with noise (DBSCAN)^{31}, without training on the training set. Our work will be useful in fields including high-precision signal measurement and atomic sensors.
Additionally, this decoding ability can be generalized further to decode other signals that are encoded by different encoding protocols, e.g., frequency-division multiplexing amplitude-shift keying (FDM-ASK), frequency-division multiplexing quadrature amplitude modulation (FDM-QAM) and IEEE 802.11ac WLAN standard signals for a 5 GHz carrier. The carrier frequency to be decoded can range from several hertz to terahertz, because for Rydberg atoms to receive MW fields of different wavelengths, the only part of the system that needs to be tuned is the frequency of the laser, whereas in classical receivers the wavelength of the received MW field is limited by the size of the antenna^{36,37,38,39}. In addition to communications, our receiver can be used to detect multiple targets from the multifrequency signals caused by the Doppler effect.
Methods
Generation and calibration of MW fields
The MW fields used in our experiments were synthesized by the signal generator (1465FV from Ceyear) and a frequency horn. Each bin in the multifrequency MW field is tunable in terms of frequency, amplitude and phase. The RF source operates in the range from DC to 40 GHz. The frequency horn is located close to the Rb cell. We used an antenna and a spectrum analyser (4024F from Ceyear) to receive the MW fields and then calibrated the amplitudes of the MW fields at the centre of the Rb cell.
The probe transmission spectrum in the time domain when Δ_{p} = 0, Δ_{c} = 0 and Δ_{s} = 0 reflects the interference among the multifrequency MW bins, which results from the beat frequencies of the bins that arise through the interaction between the atoms and light. The Rydberg atoms receive the MW bins by acting as an antenna and a mixer^{9,11,12}. After reception by the atoms, the frequency spectrum of the probe transmission shows that we can obtain the difference-frequency signal from the probe transmission spectrum. This represents an application of our atoms to reduce the modulated signal frequency (from gigahertz to kilohertz magnitude), which allows the signal to be received and decoded using simple apparatus. In our experiment, more than 20 frequency bins can be applied to the atoms, for which the dynamic range is greater than 30 dBm. The amplitudes, phases and frequencies of these bins can be tuned individually. When the bandwidth is increased to detect a signal with an increasing frequency difference Δf, more noise is involved, but this noise is suppressed by the deep learning model. In other words, the signal can still be recognized using the deep learning model when the information transmission rate is increased by raising the frequency difference Δf. These bins are used to send FDM-PSK signals in the “Frequency-division multiplexed signal encoding and receiving” section of the main text.
Master equation
The Lindblad master equation is given as follows: \({{{{{\rm{d}}}}}}\rho /{{{{{\rm{d}}}}}}t=-i\left[H,\rho \right]/\hslash +L/\hslash \), where ρ is the density matrix of the atomic ensemble and H = ∑_{k}H^{(k)} is the atom–light interaction Hamiltonian summed over all the single-atom Hamiltonians in the rotating wave approximation. In the basis \(\{|g\rangle ,|e\rangle ,|r\rangle ,|s\rangle \}\), this Hamiltonian has the following matrix form (up to the sign convention chosen for the detunings):

\(H^{(k)}=\frac{\hslash }{2}\begin{pmatrix}0 & \Omega_{\rm p} & 0 & 0\\ \Omega_{\rm p} & -2\Delta_{\rm p} & \Omega_{\rm c} & 0\\ 0 & \Omega_{\rm c} & -2(\Delta_{\rm p}+\Delta_{\rm c}) & \Omega_{\rm s}(t)\\ 0 & 0 & \Omega_{\rm s}(t) & -2(\Delta_{\rm p}+\Delta_{\rm c}+\Delta_{\rm s})\end{pmatrix},\)
where for the MW signal \(E={A}_{1}\cos [\left({\omega }_{0}+{\omega }_{1}\right)t+{\varphi }_{1}]+{A}_{2}\cos [\left({\omega }_{0}+{\omega }_{2}\right)t+{\varphi }_{2}]+{A}_{3}\cos [\left({\omega }_{0}+{\omega }_{3}\right)t+{\varphi }_{3}]+{A}_{4}\cos [\left({\omega }_{0}+{\omega }_{4}\right)t+{\varphi }_{4}]\), we have the Rabi frequency \({{{\Omega }}}_{{{{{{\rm{s}}}}}}}(t)=\sqrt{{E}_{1}^{2}+{E}_{2}^{2}}\), where \({E}_{1}={A}_{1}\sin [{\omega }_{1}t+{\varphi }_{1}]+{A}_{2}\sin [{\omega }_{2}t+{\varphi }_{2}]+{A}_{3}\sin [{\omega }_{3}t+{\varphi }_{3}]+{A}_{4}\sin [{\omega }_{4}t+{\varphi }_{4}]\) and \({E}_{2}={A}_{1}\cos [{\omega }_{1}t+{\varphi }_{1}]+{A}_{2}\cos [{\omega }_{2}t+{\varphi }_{2}]+{A}_{3}\cos [{\omega }_{3}t+{\varphi }_{3}]+{A}_{4}\cos [{\omega }_{4}t+{\varphi }_{4}]\). The Rabi frequency can be derived as follows:

\(E={E}_{2}\cos ({\omega }_{0}t)-{E}_{1}\sin ({\omega }_{0}t)=\sqrt{{E}_{1}^{2}+{E}_{2}^{2}}\,\cos \left({\omega }_{0}t+\arctan ({E}_{1}/{E}_{2})\right),\)
where the second term (which resonates with the energy levels of the Rydberg atoms) induces the normal EIT spectrum and the first term modulates that spectrum. In the interaction between the atoms and the MW fields, the atoms act as a mixer such that the output signal frequencies (ω_{1}, ω_{2}, ω_{3}) are lower than the input signal frequencies (ω_{0} + ω_{1}, ω_{0} + ω_{2}, ω_{0} + ω_{3}). The modulation signal's nonlinearity is reduced by setting the reference and increasing its amplitude as shown in Eq. (3), which is a precondition for recognition of these phases via deep learning:

\({{{\Omega }}}_{{{{{{\rm{s}}}}}}}(t)=\sqrt{{E}_{1}^{2}+{E}_{2}^{2}}\)
\(\approx \sqrt{{A}_{4}^{2}+2{A}_{4}{\sum }_{i=1}^{3}{A}_{i}\cos [({\omega }_{i}-{\omega }_{4})t+{\varphi }_{i}-{\varphi }_{4}]}\)
\(\approx {A}_{4}+{\sum }_{i=1}^{3}{A}_{i}\cos [({\omega }_{i}-{\omega }_{4})t+{\varphi }_{i}-{\varphi }_{4}],\qquad(3)\)
where the condition for the approximations on the second line and the third line is A_{4} ≫ A_{1,2,3}.
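The accuracy of this large-reference approximation is easy to verify numerically. The sketch below (with illustrative amplitudes and time grid) compares the exact envelope \(\sqrt{E_1^2+E_2^2}\) with the linearized form \(A_4+\sum_i A_i\cos[(\omega_i-\omega_4)t+\varphi_i-\varphi_4]\):

```python
import numpy as np

t = np.linspace(0.0, 1e-3, 4000)
w = 2 * np.pi * np.array([-3e3, -1e3, 1e3, 3e3])  # omega_1..omega_4
A = np.array([0.01, 0.01, 0.01, 1.0])             # A4 >> A_{1,2,3}
phi = np.array([0.0, np.pi, 0.0, 0.0])            # phi_4 = 0 reference

exact = np.abs(sum(a * np.exp(1j * (wi * t + p))
                   for a, wi, p in zip(A, w, phi)))
approx = A[3] + sum(A[i] * np.cos((w[i] - w[3]) * t + phi[i] - phi[3])
                    for i in range(3))
error = np.max(np.abs(exact - approx))  # residual is second order in A_i/A_4
```

The residual scales as \((\sum_i A_i)^2/A_4\), so it shrinks rapidly as the reference amplitude grows.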
The Lindblad superoperator L = ∑_{k}L[ρ^{(k)}] is composed of single-atom superoperators, where L[ρ^{(k)}] represents the Lindbladian and has the following form: \(\frac{L[{\rho }^{(k)}]}{\hslash }=-\frac{1}{2}{\sum }_{m}\left({C}_{m}^{{{\dagger}} }{C}_{m}\rho +\rho {C}_{m}^{{{\dagger}} }{C}_{m}\right)+{\sum }_{m}{C}_{m}\rho {C}_{m}^{{{\dagger}} }\), where \({C}_{1}=\sqrt{{{{\Gamma }}}_{{{{{{\rm{e}}}}}}}}\left|g\right\rangle \left\langle e\right|\), \({C}_{2}=\sqrt{{{{\Gamma }}}_{{{{{{\rm{r}}}}}}}}\left|e\right\rangle \left\langle r\right|\) and \({C}_{3}=\sqrt{{{{\Gamma }}}_{{{{{{\rm{s}}}}}}}}\left|r\right\rangle \left\langle s\right|\) are collapse operators that stand for the decays from state \(\left|e\right\rangle \) to state \(\left|g\right\rangle \), from state \(\left|r\right\rangle \) to state \(\left|e\right\rangle \) and from state \(\left|s\right\rangle \) to state \(\left|r\right\rangle \) with rates Γ_{e}, Γ_{r} and Γ_{s}, respectively. Because we are only concerned with the steady state here, i.e., t → ∞, the Lindblad master equation can be solved by setting dρ/dt = 0. The complex susceptibility of the EIT medium has the form χ(v) = (∣μ_{ge}∣^{2}/ϵ_{0}ℏ)ρ_{eg}, where ρ_{eg} is the element of the density matrix obtained from the master equation. The transmission spectrum of the EIT medium can be obtained from the susceptibility using \(T \sim {e}^{-{{{{{\rm{Im}}}}}}[\chi ]}.\)
Deep learning layers
Our deep learning model consists of a 1D CNN layer, a BiLSTM layer and a dense layer. The mathematical sketches for these layers are given as follows.
The 1D CNN layer is illustrated in Fig. 1c. The input signal is convolved with the kernel in the following form:

\((f\ast g)[m]={\sum }_{n}f[m-n]\,g[n],\qquad(4)\)
where f represents the input data, g is the convolution kernel, m is the input data index and n is the kernel index. The 1D CNN extracts the higher-order features from the input data to reduce the lengths of the sequences fed into the BiLSTM layer. Before flowing into the BiLSTM layer, the data pass through the batch normalization layer, the ReLU activation layer and the max-pooling layer, in that sequence. For a mini-batch \({{{{{\mathcal{B}}}}}}=\left\{{x}_{1\cdots m}\right\}\), the output from the batch normalization layer is y_{i} = BN_{γ,β}(x_{i}) and the learning parameters are γ and β^{40}. The update rules for the batch normalization layer are:

\({\mu }_{{{{{{\mathcal{B}}}}}}}=\frac{1}{m}{\sum }_{i=1}^{m}{x}_{i},\qquad(5)\)

\({\sigma }_{{{{{{\mathcal{B}}}}}}}^{2}=\frac{1}{m}{\sum }_{i=1}^{m}{({x}_{i}-{\mu }_{{{{{{\mathcal{B}}}}}}})}^{2},\qquad(6)\)

\({\hat{x}}_{i}=({x}_{i}-{\mu }_{{{{{{\mathcal{B}}}}}}})/\sqrt{{\sigma }_{{{{{{\mathcal{B}}}}}}}^{2}+\epsilon },\qquad(7)\)

\({y}_{i}=\gamma {\hat{x}}_{i}+\beta ,\qquad(8)\)
where Eqs. (5) and (6) evaluate the mean and the variance of the mini-batch, respectively; the data are normalized using the mean and the variance in Eq. (7) and the results are then scaled and shifted in Eq. (8). Training is accelerated by the batch normalization layer and overfitting is also weakened by this layer. The output then passes through the ReLU activation layer, whose activation function is \({f}_{{{{{{\rm{ReLU}}}}}}}(x)=\max (x,0)\). The vanishing gradient problem is diminished by this activation function. Next, the inputs are downsampled in a max-pooling layer^{30}.
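The convolution and batch-normalization steps above can be sketched in NumPy as follows (a minimal illustration on assumed toy data, not the network's actual parameters):

```python
import numpy as np

def conv1d(f, g):
    """Valid-mode sliding dot product of input f with kernel g, as in a CNN layer."""
    n = len(g)
    return np.array([np.dot(f[m:m + n], g) for m in range(len(f) - n + 1)])

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch normalization following Eqs. (5)-(8)."""
    mu = x.mean(axis=0)                    # mini-batch mean
    var = x.var(axis=0)                    # mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize
    return gamma * x_hat + beta            # scale and shift

out = conv1d(np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
             np.array([1.0, 0.0, -1.0]))   # difference-taking kernel
x = np.random.default_rng(1).normal(5.0, 3.0, size=(64, 8))
y = batch_norm(x)
```

After normalization, each feature of `y` has approximately zero mean and unit variance, which is what stabilizes and speeds up training.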
The LSTM layer and an LSTM cell are shown schematically in Figs. 1d and 6a, respectively. The equations for the LSTM are given as Eqs. (9)–(14)^{32,41}. At time t, the input x_{t} and two internal states C_{t−1} and h_{t−1} are fed into the LSTM cell. The cell first decides what to forget via Eq. (9), which outputs a number between 0 and 1 that represents retaining or forgetting. Next, an input gate (Eq. (10)) decides which values are to be updated from a vector of new candidate values created using Eq. (11). The new value is then added to the cell state and the old value is forgotten in Eq. (12). Finally, the cell decides what to output using Eqs. (13) and (14).
\({f}_{t}=\sigma \left({W}_{f}\cdot [{h}_{t-1},{x}_{t}]+{b}_{f}\right)\) (9)
\({i}_{t}=\sigma \left({W}_{i}\cdot [{h}_{t-1},{x}_{t}]+{b}_{i}\right)\) (10)
\({\tilde{C}}_{t}=\tanh \left({W}_{C}\cdot [{h}_{t-1},{x}_{t}]+{b}_{C}\right)\) (11)
\({C}_{t}={f}_{t}\odot {C}_{t-1}+{i}_{t}\odot {\tilde{C}}_{t}\) (12)
\({o}_{t}=\sigma \left({W}_{o}\cdot [{h}_{t-1},{x}_{t}]+{b}_{o}\right)\) (13)
\({h}_{t}={o}_{t}\odot \tanh ({C}_{t})\) (14)
where σ(x) = 1/(1 + e^{−x}) is the sigmoid function. The sigmoid and \(\tanh \) functions are applied in an element-wise manner. The LSTM is followed by a time-reversed LSTM to constitute a BiLSTM layer, which improves the memory for long sequences.
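Equations (9)–(14) translate directly into a single NumPy time step. The sketch below uses randomly initialized weights purely for illustration; the layer sizes are assumptions, not the tuned values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM time step following Eqs. (9)-(14).
    Each W[k] maps the concatenated [h_prev, x_t] to a gate pre-activation."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])        # forget gate, Eq. (9)
    i_t = sigmoid(W["i"] @ z + b["i"])        # input gate, Eq. (10)
    C_tilde = np.tanh(W["C"] @ z + b["C"])    # candidate values, Eq. (11)
    C_t = f_t * C_prev + i_t * C_tilde        # cell-state update, Eq. (12)
    o_t = sigmoid(W["o"] @ z + b["o"])        # output gate, Eq. (13)
    h_t = o_t * np.tanh(C_t)                  # hidden state, Eq. (14)
    return h_t, C_t

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4                            # illustrative sizes
W = {k: rng.standard_normal((n_hid, n_hid + n_in)) for k in "fiCo"}
b = {k: np.zeros(n_hid) for k in "fiCo"}
h, C = np.zeros(n_hid), np.zeros(n_hid)
h, C = lstm_step(rng.standard_normal(n_in), h, C, W, b)
```

Running the same step over the reversed input sequence and concatenating the two hidden states gives the BiLSTM output.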
The dense layer and a neuron are drawn in Figs. 1e and 6b, respectively, and the corresponding equations are
\(a={{{{{\bf{w}}}}}}\cdot {{{{{\bf{x}}}}}}+b,\qquad y=g(a)=\frac{1}{1+{e}^{-a}},\)
where w is the vector of weights, b is the bias, x represents the input data, g(a) = 1/(1 + e^{−a}) is the sigmoid activation function used to limit the output values to the range (0, 1), and y is the output. The dense layer resizes the data obtained from the BiLSTM to match the size of the label.
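A single neuron of the dense layer is then just a weighted sum followed by the sigmoid; the weight and input values below are arbitrary illustrations.

```python
import numpy as np

def neuron(x, w, b):
    """Dense-layer neuron: a = w·x + b, then the sigmoid squashes a into (0, 1)."""
    a = np.dot(w, x) + b
    return 1.0 / (1.0 + np.exp(-a))

y = neuron(np.array([0.2, -0.4, 0.1]), np.array([1.0, 0.5, -2.0]), 0.3)
```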
The training consists of both forward and backward propagation. During the forward pass, a batch of probe spectra propagates through the 1D CNN layer, the BiLSTM layer and the dense layer. A differentiable loss function is then calculated; in our case, it is the mean squared error (MSE) between the predictions and the ground truth, which is widely used in regression tasks^{32}. The equation for the MSE is
\({L}_{{{\rm{MSE}}}}=\frac{1}{mn}{\sum }_{j=1}^{n}{\sum }_{i=1}^{m}{\left({\varphi }_{i}-f({T}_{i})\right)}^{2},\)
where m is the number of data points in one spectrum, n is the minibatch size, φ_{i} is the ground truth and f(T_{i}) is the model prediction. In backpropagation, the trainable weights of each layer are updated based on the learning rate and the derivative of the MSE loss function with respect to the weights to minimize the loss L_{MSE}, such that
\(W\leftarrow W-\eta \frac{\partial {L}_{{{\rm{MSE}}}}}{\partial W},\)
where η is the learning rate and W is the trainable weight for each layer. The weights of each layer are then updated according to the RMSprop optimizer^{42}.
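The loss and the RMSprop-style update can be sketched on a toy one-parameter model. The decay rate ρ = 0.9, ε = 10⁻⁸ and the learning rate below are common defaults and are assumptions here, not the values used in the work.

```python
import numpy as np

def mse(pred, truth):
    """Mean squared error loss, L_MSE."""
    return np.mean((truth - pred) ** 2)

def rmsprop_update(W, grad, state, eta=0.01, rho=0.9, eps=1e-8):
    """RMSprop: divide the raw gradient step by a running RMS of recent gradients."""
    state = rho * state + (1.0 - rho) * grad ** 2
    return W - eta * grad / (np.sqrt(state) + eps), state

# Fit y = W x (true W = 2) by minimizing the MSE with RMSprop
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x
W, state = 0.0, 0.0
for _ in range(2000):
    grad = np.mean(2.0 * (W * x - y) * x)   # dL_MSE/dW for this model
    W, state = rmsprop_update(W, grad, state)
```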
The network is implemented using the Keras 2.3.1 framework on Python 3.6.11 (ref. ^{30}). All weights are initialized with the Keras default. The hyperparameters of the deep learning model (including the convolution kernel length, the number of hidden variables and the learning rate) are tuned using Optuna^{43}.
Deep learning pipeline
To obtain better fitting results, the data are scaled based on their maximum and minimum values, i.e., \(T^{\prime} =({T}_{i}-\min (T))/(\max (T)-\min (T))\). The labels are encoded as dense vectors with four elements rather than as one-hot encoding vectors to save space^{32}. Each of these elements is either 0 or 1, representing a relative phase of 0 or π for each bin, respectively.
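The min-max scaling and the dense label encoding can be sketched as follows; the spectrum values and the particular bit pattern are illustrative only.

```python
import numpy as np

def min_max_scale(T):
    """Scale a spectrum to [0, 1] using its own extrema."""
    return (T - T.min()) / (T.max() - T.min())

# Dense label: one element per frequency bin, 0/1 encoding relative phase 0/π.
label = np.array([0, 1, 1, 0])            # illustrative bit pattern

T = np.array([2.0, 5.0, 3.5, 8.0])        # illustrative raw transmission values
T_scaled = min_max_scale(T)
```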
A one-dimensional convolution layer (1D CNN), a bidirectional long short-term memory layer (BiLSTM) and a dense layer are used in our deep learning model. The deep learning model structure is shown in Fig. 7. The data size for the input layer is given in the form (batch size, length of probe spectrum, number of features). The batch size is 64 in our case. Because the spectrum spans t = 0 to t = 0.999 ms in steps of τ = 1 μs, the spectrum length is 1000. For a 1D input, the number of features is 1. Therefore, the data size for the input layer is (64, 1000, 1).
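The tensor shapes flowing through such a model can be tracked with simple bookkeeping. In this sketch, the 'same' padding, pool size 2, filter count and hidden-unit count are assumptions, since the tuned hyperparameters are not listed here; only the input shape (64, 1000, 1) and the four-element output follow from the text.

```python
def conv1d_shape(batch, length, filters, padding="same"):
    """Conv1D with 'same' padding preserves the sequence length."""
    return (batch, length, filters)

def max_pool_shape(batch, length, features, pool=2):
    return (batch, length // pool, features)

def bilstm_shape(batch, length, units, return_sequences=False):
    """A BiLSTM concatenates forward and backward hidden states (2 * units)."""
    return (batch, length, 2 * units) if return_sequences else (batch, 2 * units)

def dense_shape(batch, units):
    return (batch, units)

shape = (64, 1000, 1)                    # (batch size, spectrum length, features)
shape = conv1d_shape(shape[0], shape[1], filters=32)   # assumed filter count
shape = max_pool_shape(*shape)
shape = bilstm_shape(shape[0], shape[1], units=64)     # assumed hidden size
shape = dense_shape(shape[0], units=4)   # four bins -> four-element label
```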
During training of this model, four-fold cross-validation is used to make the best use of the limited training data. The data set is split as shown in Fig. 8. First, the data set is split into two parts. The first is the test set (red), which remains untouched during training. The second (purple) is used to train the model. In the cross-validation process, the remaining data (purple) are divided equally into four parts. One of these parts serves as the validation set (green) and the other three are used as training sets (blue). Four models are trained on the different training and validation sets. The best model is then chosen according to its validation performance and is evaluated on the test set. After splitting, the training set, the validation set and the test set all remain unchanged. In every epoch, each model iterates over its training set exactly once; no new data are drawn.
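The split can be sketched as index bookkeeping in plain Python. The 20% test fraction below is an assumption, as the exact proportions are read from Fig. 8.

```python
def train_test_split(indices, test_fraction=0.2):
    """Hold out an untouched test set (red in Fig. 8); the fraction is assumed."""
    n_test = int(len(indices) * test_fraction)
    return indices[n_test:], indices[:n_test]   # (remaining data, test set)

def k_fold(indices, k=4):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    fold = len(indices) // k
    for i in range(k):
        val = indices[i * fold:(i + 1) * fold]                  # green part
        train = indices[:i * fold] + indices[(i + 1) * fold:]   # blue parts
        yield train, val

rest, test = train_test_split(list(range(100)))
splits = list(k_fold(rest, k=4))   # four (train, validation) splits, one per model
```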
The computational graph is cleared before each training run to prevent leakage of the validation data. Gaussian noise (with mean 0 and standard deviation 0.5) is added to the training data to increase the robustness of the proposed model. In addition, the learning rate is adjusted during training to escape local minima, which produces the jump in Fig. 2c in the main text. The initial learning rate is 0.001; if the loss (the mean squared error) on the validation set does not decrease over 10 epochs, the learning rate is multiplied by 0.1. The RMSprop optimizer is used to update the weights of each layer during training^{42}.
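The noise augmentation and the plateau rule for the learning rate can be sketched as follows (a plain-Python illustration of the rule stated above, not the Keras callback used in practice):

```python
import numpy as np

def add_gaussian_noise(batch, rng, sigma=0.5):
    """Noise augmentation: Gaussian noise with mean 0 and standard deviation 0.5."""
    return batch + rng.normal(0.0, sigma, size=batch.shape)

def plateau_schedule(val_losses, lr=0.001, patience=10, factor=0.1):
    """Multiply the learning rate by `factor` whenever the validation loss
    has not improved for `patience` consecutive epochs."""
    best, wait = float("inf"), 0
    for loss in val_losses:
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor
                wait = 0
    return lr

rng = np.random.default_rng(1)
noisy = add_gaussian_noise(np.zeros((2, 8)), rng)
# 12 non-improving epochs after the first -> one reduction: 0.001 -> 0.0001
lr = plateau_schedule([1.0] + [1.0] * 12)
```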
The bidirectional LSTM layer could be replaced with the well-known self-attention layer to further improve the memory of the proposed model^{44}. However, this would require more training time and GPU memory, and the current model meets our requirements.
Data availability
The data are available on GitHub^{45} (https://github.com/ZongkaiLiu/DeeplearningenhancedRydbergmultifrequencymicrowaverecognition). The deep learning results are presented in a Jupyter notebook and the master-equation results are presented in Mathematica notebooks.
Code availability
The code is available on GitHub^{45} (https://github.com/ZongkaiLiu/DeeplearningenhancedRydbergmultifrequencymicrowaverecognition).
References
Liao, K.Y. et al. Microwave electrometry via electromagnetically induced absorption in cold Rydberg atoms. Phys. Rev. A 101, 053432 (2020).
Fleischhauer, M., Imamoglu, A. & Marangos, J. P. Electromagnetically induced transparency: optics in coherent media. Rev. Mod. Phys. 77, 633–673 (2005).
Holloway, C. L. et al. Electric field metrology for SI traceability: systematic measurement uncertainties in electromagnetically induced transparency in atomic vapor. J. Appl. Phys. 121, 233106 (2017).
Sedlacek, J. A. et al. Microwave electrometry with Rydberg atoms in a vapour cell using bright atomic resonances. Nat. Phys. 8, 819–824 (2012).
Autler, S. H. & Townes, C. H. Stark effect in rapidly varying fields. Phys. Rev. 100, 703–722 (1955).
Abi-Salloum, T. Y. Electromagnetically induced transparency and Autler-Townes splitting: two similar but distinct phenomena in two categories of three-level atomic systems. Phys. Rev. A 81, 053836 (2010).
Meyer, D. H., Cox, K. C., Fatemi, F. K. & Kunz, P. D. Digital communication with Rydberg atoms and amplitude-modulated microwave fields. Appl. Phys. Lett. 112, 211108 (2018).
Jiao, Y. et al. Atom-based receiver for amplitude-modulated baseband signals in high-frequency radio communication. Appl. Phys. Express 12, 126002 (2019).
Gordon, J. A., Simons, M. T., Haddab, A. H. & Holloway, C. L. Weak electric-field detection with sub-1 Hz resolution at radio frequencies using a Rydberg atom-based mixer. AIP Adv. 9, 045030 (2019).
Jing, M. et al. Atomic superheterodyne receiver based on microwave-dressed Rydberg spectroscopy. Nat. Phys. 16, 911–915 (2020).
Simons, M. T., Haddab, A. H., Gordon, J. A. & Holloway, C. L. A Rydberg atom-based mixer: measuring the phase of a radio frequency wave. Appl. Phys. Lett. 114, 114101 (2019).
Holloway, C. L., Simons, M. T., Gordon, J. A. & Novotny, D. Detecting and receiving phase-modulated signals with a Rydberg atom-based receiver. IEEE Antennas Wirel. Propag. Lett. 18, 1853–1857 (2019).
Holloway, C. L., Simons, M. T., Haddab, A. H., Williams, C. J. & Holloway, M. W. A real-time guitar recording using Rydberg atoms and electromagnetically induced transparency: quantum physics meets music. AIP Adv. 9, 065110 (2019).
Robinson, A. K., Prajapati, N., Senic, D., Simons, M. T. & Holloway, C. L. Determining the angle-of-arrival of a radio-frequency source with a Rydberg atom-based sensor. Appl. Phys. Lett. 118, 114001 (2021).
Bason, M. G. et al. Enhanced electric field sensitivity of RF-dressed Rydberg dark states. N. J. Phys. 12, 065015 (2010).
Zou, H. et al. Atomic receiver by utilizing multiple radio-frequency coupling at Rydberg states of rubidium. Appl. Sci. 10, 1346 (2020).
Song, Z. et al. Rydberg-atom-based digital communication using a continuously tunable radio-frequency carrier. Opt. Express 27, 8848–8857 (2019).
Orazbayev, B. & Fleury, R. Far-field subwavelength acoustic imaging by deep learning. Phys. Rev. X 10, 031029 (2020).
Khanahmadi, M. & Mølmer, K. Time-dependent atomic magnetometry with a recurrent neural network. Phys. Rev. A 103, 032406 (2021).
Giordani, T. et al. Machine learning-based classification of vector vortex beams. Phys. Rev. Lett. 124, 160401 (2020).
Liu, Z., Yan, S., Liu, H. & Chen, X. Superhigh-resolution recognition of optical vortex modes assisted by a deep-learning method. Phys. Rev. Lett. 123, 183902 (2019).
Doster, T. & Watnik, A. T. Machine learning approach to OAM beam demultiplexing via convolutional neural networks. Appl. Opt. 56, 3386–3396 (2017).
da Silva, B. P., Marques, B. A. D., Rodrigues, R. B., Ribeiro, P. H. S. & Khoury, A. Z. Machine-learning recognition of light orbital-angular-momentum superpositions. Phys. Rev. A 103, 063704 (2021).
Wigley, P. B. et al. Fast machine-learning online optimization of ultra-cold-atom experiments. Sci. Rep. 6, 25890 (2016).
Tranter, A. D. et al. Multiparameter optimisation of a magneto-optical trap using deep learning. Nat. Commun. 9, 4360 (2018).
Mukherjee, R., Xie, H. & Mintert, F. Bayesian optimal control of Greenberger-Horne-Zeilinger states in Rydberg lattices. Phys. Rev. Lett. 125, 203603 (2020).
Mills, K., Ronagh, P. & Tamblyn, I. Finding the ground state of spin Hamiltonians with reinforcement learning. Nat. Mach. Intell. 2, 509–517 (2020).
Wang, Z. T., Ashida, Y. & Ueda, M. Deep reinforcement learning control of quantum cartpoles. Phys. Rev. Lett. 125, 100401 (2020).
Bukov, M. et al. Reinforcement learning in different phases of quantum control. Phys. Rev. X 8, 031086 (2018).
Chollet, F. et al. Keras. https://github.com/fchollet/keras (2015).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Chollet, F. Deep Learning with Python (Manning Publications, 2017).
Sahoo, D., Pham, Q., Lu, J. & Hoi, S. C. H. Online deep learning: learning deep neural networks on the fly. In IJCAI'18: Proceedings of the 27th International Joint Conference on Artificial Intelligence (2018).
Cox, K. C., Meyer, D. H., Fatemi, F. K. & Kunz, P. D. Quantumlimited atomic receiver in the electrically small regime. Phys. Rev. Lett. 121, 110502 (2018).
Kumar, S., Fan, H., Kübler, H., Jahangiri, A. J. & Shaffer, J. P. Rydberg-atom based radio-frequency electrometry using frequency modulation spectroscopy in room temperature vapor cells. Opt. Express 25, 8625–8637 (2017).
Meyer, D. H., Kunz, P. D. & Cox, K. C. Waveguide-coupled Rydberg spectrum analyzer from 0 to 20 GHz. Phys. Rev. Appl. 15, 014053 (2021).
Meyer, D. H., Castillo, Z. A., Cox, K. C. & Kunz, P. D. Assessment of Rydberg atoms for wideband electric field sensing. J. Phys. B Atom. Mol. Opt. Phys. 53, 034001 (2020).
Wade, C. G. et al. A terahertzdriven nonequilibrium phase transition in a room temperature atomic vapour. Nat. Commun. 9, 3567 (2018).
Holloway, C. L. et al. Broadband Rydberg atom-based electric-field probe for SI-traceable, self-calibrated measurements. IEEE Trans. Antennas Propag. 62, 6169–6182 (2014).
Ioffe, S. & Szegedy, C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning (eds. Bach, F. & Blei, D.), Vol. 37 of Proceedings of Machine Learning Research (PMLR), Lille, France, 448–456 (2015).
Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
Hinton, G., Srivastava, N. & Swersky, K. Lecture 6a: overview of mini-batch gradient descent. http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf (2012).
Akiba, T., Sano, S., Yanase, T., Ohta, T. & Koyama, M. Optuna: a next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2019).
Vaswani, A. et al. Attention is all you need. Advances in Neural Information Processing Systems, (eds Guyon, I. et al.) Vol. 30 (Curran Associates, Inc., 2017).
Liu, Z.K. Deep learning enhanced Rydberg multifrequency microwave recognition. https://doi.org/10.5281/zenodo.6202552 (2022).
Šibalić, N., Pritchard, J. D., Adams, C. S. & Weatherill, K. J. ARC: an open-source library for calculating properties of alkali Rydberg atoms. Comput. Phys. Commun. 220, 319–331 (2017).
Acknowledgements
Z.K.L. gratefully acknowledges instructive discussions about deep learning with Yue Chen at the National Engineering Laboratory for Speech and Language Information Processing, and the introduction to machine learning gained while studying with Dr. Lei Gong at the Department of Optics and Optical Engineering, USTC. D.S.D. acknowledges funding from the National Key R&D Program of China (Grant No. 2017YFA0304800), the National Natural Science Foundation of China (Grant Nos. U20A20218, 61525504, 61435011), the Anhui Initiative in Quantum Information Technologies (Grant No. AHY020200), the Youth Innovation Promotion Association of the Chinese Academy of Sciences (Grant No. 2018490), and the major science and technology projects in Anhui Province. B.S.S. acknowledges funding from the National Natural Science Foundation of China (Grant No. 11934013). We thank David MacDonald, MSc, from Liwen Bianji, Edanz Editing China (www.liwenbianji.cn/ac), for editing the English text of a draft of this manuscript.
Author information
Authors and Affiliations
Contributions
D.S.D. conceived the idea for the study. Z.K.L. conducted the physical experiments and designed the deep learning model and communication protocols. Z.K.L. derived the theoretical formula. Z.K.L. analysed the data with assistance from L.H.Z., B.L., and Z.Y.Z. The manuscript was written by Z.K.L. The research was supervised by D.S.D., B.S.S. and G.C.G. All authors contributed to discussions regarding the results and analysis contained in the manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Liu, ZK., Zhang, LH., Liu, B. et al. Deep learning enhanced Rydberg multifrequency microwave recognition. Nat Commun 13, 1997 (2022). https://doi.org/10.1038/s41467022296867
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41467022296867
This article is cited by

Enhanced metrology at the critical point of a manybody Rydberg atomic system
Nature Physics (2022)