Recent advances in the control of single electron spins associated with defects in solids have enabled the sensing, imaging, and control of individual nuclear spins1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16. From a quantum sensing perspective, this has enabled the detection and imaging of nuclear spins with atomic-scale resolution and single-spin sensitivity, in systems of up to 27 spins15,16,17,18,19,20. From a quantum information perspective, controlling individual nuclear spins provides quantum registers for quantum computation and optically connected quantum networks7,21,22,23,24. Proof-of-principle experiments have demonstrated quantum registers with 10+ qubits7,21,22,23,24,25, elementary quantum algorithms and error correction protocols11,26,27,28,29,30,31, and key quantum network protocols such as entanglement distillation32,33.

An important task in both these application fields is to detect and identify the nuclear spins and to characterize the electron–nuclear interaction. For imaging larger, more complex spin structures and for realizing large-scale quantum networks that consist of many multi-qubit devices, objective and automated methods are needed that can efficiently identify signatures of nuclear spins and determine coupling parameters from experimental spectroscopy.

In this work, we develop neural-network-based algorithms that can efficiently and automatically detect nuclear spins by their coupling to a single electron spin. Previously, machine learning algorithms were applied, for example, to adaptively sense a varying magnetic field in real time34 and to reconstruct two-dimensional NMR spectroscopy from sparse sample data35. We focus on Carr–Purcell–Meiboom–Gill (CPMG)-type dynamical decoupling spectroscopy2,9,10,15,22,36,37, which is widely employed for single nuclear spin detection and control9,38 and is a common starting point for more advanced spectroscopy methods3,6,16,19. While our methods are general, we exemplify them through experiments on a single nitrogen-vacancy (NV) center in diamond with nearby naturally abundant 13C nuclear spins9,11,22. We show that our deep learning approach enables fast automatic nuclear spin detection and hyperfine parameter estimation for 31 individual spins.


Theoretical modeling

Figure 1a shows a schematic of the electron-nuclear spin complex considered in this work. The NV center, an impurity in the diamond crystal lattice, acts as a sensitive probe for the surrounding nuclear-spin environment. The ground state electron spin of the NV center can be initialized and measured using spin-dependent fluorescence and can be manipulated by microwaves39. In typical dynamical decoupling spectroscopy, for example, based on the CPMG pulse sequence22 shown in Fig. 1b, the interaction of the electron with its nuclear spin environment leads to sudden and periodic losses of coherence at specific pulse timings. The magnitude and position of each dip in coherence depend on the longitudinal (transverse) hyperfine coupling parameter A (B). The CPMG signal is given by the probability Px that the NV center’s spin state is preserved. In the absence of nuclear–nuclear interactions, this can be described as9,

$${\it{P}}_{\it{x}} = \frac{{1 + \mathop {\prod}\limits_{k = 1}^n {{\it{M}}_{\it{k}}} }}{2} \tag{1}$$
$${\it{M}}_{\it{k}} = 1 - m_{k,x}^2\frac{{(1 - \cos \alpha _k)(1 - \cos \beta )}}{{1 + \cos \alpha _k\cos \beta - m_{k,z}\sin \alpha _k\sin \beta }}\sin ^2\frac{{N\phi _k}}{2} \tag{2}$$
$$\cos \phi _k = \cos \alpha _k\cos \beta - m_{k,z}\sin \alpha _k\sin \beta \tag{3}$$

where \(m_{k,z} = (A_k + \omega _L)/\tilde \omega _k\), \(m_{k,x} = B_k/\tilde \omega _k\), \(\tilde \omega _k = \sqrt {(A_k + \omega _L)^2 + B_k^2}\), \(\alpha _k = \tilde \omega _k\tau\), \(\beta = \omega _L\tau\), τ is half of the delay between π pulses, k indexes the kth nuclear spin, n is the total number of nuclear spins, ωL is the Larmor frequency, and N is the repetition number of the unit CPMG pulse (see Fig. 1b). The CPMG signal is given by the product of the Mk’s of all n nuclear spins, as expressed in Eqs. (1) and (2). This characteristic introduces an additional complexity compared to conventional nuclear magnetic resonance (NMR) signals40,41,42 and, along with the decoherence and environmental noise, makes existing NMR peak decomposition packages43,44,45,46,47,48 ineffective for analyzing the signal.
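As an illustration of Eqs. (1)–(3), the following minimal sketch evaluates the CPMG signal for a list of (A, B) pairs. It is not the authors' implementation: the function names, the unit convention (frequencies given as A/2π, B/2π, ωL/2π in kHz, τ in µs), the Larmor value of about 431.4 kHz (assumed here from the 13C gyromagnetic ratio of roughly 1.0705 kHz/G at Bz ≈ 403 G), and the crude handling of the degenerate point where the denominator of Eq. (2) vanishes are all assumptions made for this example.

```python
import math


def m_factor(a_khz, b_khz, f_larmor_khz, tau_us, n_pulses):
    """Single-spin factor M_k of Eq. (2).

    Frequencies are A/2pi, B/2pi and omega_L/2pi in kHz; tau is in microseconds.
    """
    to_rad_per_us = 2.0 * math.pi * 1e-3           # kHz -> rad/us
    w_l = f_larmor_khz * to_rad_per_us
    a = a_khz * to_rad_per_us
    b = b_khz * to_rad_per_us
    w_tilde = math.hypot(a + w_l, b)
    if w_tilde == 0.0:
        return 1.0                                 # no precession axis, no dip
    m_z, m_x = (a + w_l) / w_tilde, b / w_tilde
    alpha, beta = w_tilde * tau_us, w_l * tau_us
    # Eq. (3); clamp against round-off before taking acos
    cos_phi = math.cos(alpha) * math.cos(beta) - m_z * math.sin(alpha) * math.sin(beta)
    cos_phi = max(-1.0, min(1.0, cos_phi))
    denom = 1.0 + cos_phi
    if denom < 1e-12:
        # Degenerate 0/0 point: the numerator vanishes here as well, so the
        # dip term is simply skipped (a simplification of this sketch).
        return 1.0
    phi = math.acos(cos_phi)
    dip = m_x ** 2 * (1.0 - math.cos(alpha)) * (1.0 - math.cos(beta)) / denom
    return 1.0 - dip * math.sin(n_pulses * phi / 2.0) ** 2


def cpmg_signal(spins_khz, f_larmor_khz, taus_us, n_pulses):
    """P_x of Eq. (1): product of M_k over all spins, for each tau."""
    signal = []
    for tau in taus_us:
        product = 1.0
        for a_khz, b_khz in spins_khz:
            product *= m_factor(a_khz, b_khz, f_larmor_khz, tau, n_pulses)
        signal.append((1.0 + product) / 2.0)
    return signal


# Two spins with hyperfine values quoted in the text, N = 32.
taus = [1.0 + 0.01 * i for i in range(2000)]       # 1 to ~21 us
px = cpmg_signal([(48.0, 8.0), (7.8, 20.0)], 431.4, taus, 32)
```

Because the signal is a product rather than a sum of single-spin contributions, adding a spin to the list changes the depth of every dip it overlaps with, which is the complication relative to conventional NMR spectra noted above.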

Fig. 1: General procedure for identifying hyperfine parameters of 13C nuclear spins.

a Schematic diagram showing the configuration of an electron spin within the nitrogen-vacancy (NV) center magnetic dipole field (blue oval curves) and 13C nuclear spins (green circles) interacting with the NV center via hyperfine interaction. Bz is the external magnetic field strength, ωL is the Larmor frequency, \(\omega _h = \sqrt {A^2 + B^2}\), and \(\tilde \omega = \sqrt {(A + \omega _L)^2 + B^2}\) where A (B) is the longitudinal (transverse) hyperfine interaction parameter. b Typical dynamical decoupling pulse sequence (Carr-Purcell-Meiboom-Gill, CPMG) used for experimental nuclear spectroscopy. The bottom panel shows an example of experimental CPMG data from which electron-nuclear hyperfine interaction is analyzed. c Pseudo-algorithm for training and hyperfine-parameters-prediction sequences including hyperfine parameter classifier (HPC), denoise and signal recovery, regression-based fitting, and fine-tuning models. The flow of experimental processes (computational processes) is on the red (gray) arrows.

Analysis procedure by deep learning models

The main task of our deep learning model is to efficiently encode the features of the kth nuclear spin in Eq. (1). Once successfully trained, the models can determine Ak and Bk of each nuclear spin from the experimental spectroscopy data (see the bottom panel of Fig. 1b for an example). Figure 1c shows the overall procedure to achieve this task. First, the CPMG signals are measured while, in parallel, the training datasets are generated and the deep learning models are trained. The datasets for both the hyperfine parameter classifier (HPC) models and the denoising models are generated using the theoretical model in Eq. (1). Second, using the generated training datasets, denoising models are trained to reduce noise and HPC models are trained to identify whether specific hyperfine parameters exist in the data or not. Third, to enhance the signal-to-noise ratio, the raw noisy CPMG signal is pre-processed by the trained denoising model and the decoherence recovery process and is then fed into the trained HPC models. Fourth, using the outputs of the HPC models, an additional deep learning-based regression model is applied to further restrict the possible hyperfine parameter combinations. Lastly, in the auto fine-tuning phase, the prediction of the regression model is used as the initial values of the hyperfine parameters, and automatic numerical fitting is performed.

Data representation for nuclear spin detection

The qualitative features of a typical dynamical decoupling signal are as follows. First, the coherence dip of the kth nuclear spin is periodic with approximate periodicity9 (local period)

$$TP_k = 2\pi /(\tilde \omega _k + \omega _L) \tag{4}$$

where \(\tilde \omega _k = \sqrt {(A_k + \omega _L)^2 + B_k^2}\), and Ak and Bk are hyperfine parameters of the ‘kth target’ local period (see Supplementary Fig. 1 for a detailed description of TPk and the corresponding (Ak, Bk) values). Second, the envelope of the coherence dip amplitudes as a function of τ is periodic, essentially showing the periodic quantum entanglement evolution with the resonant nuclear spin13 (global period). Third, each coherence dip can show additional fringes depending on the hyperfine interaction strength. In the strong coupling regime, for example when B/2π > 100 kHz, the CPMG signal can exhibit multiple and large fringe oscillations9,11 (Eqs. (2, 3)). While conventional numerical peak detection or Fourier transform analysis is inefficient in the presence of these oscillating signals, below we show that the deep learning approach offers an excellent alternative route to solve the problem.
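To make the local period of Eq. (4) concrete, the sketch below converts an (A, B) pair into TPk and, conversely, recovers A from a given period and a trial B. The function names, the kHz/µs unit convention, the Larmor value of about 431.4 kHz, and the restriction to the branch with A + ωL > 0 are assumptions of this example rather than part of the original analysis.

```python
import math


def local_period_us(a_khz, b_khz, f_larmor_khz):
    """TP_k of Eq. (4) in microseconds; inputs are A/2pi, B/2pi, omega_L/2pi in kHz."""
    f_tilde = math.hypot(a_khz + f_larmor_khz, b_khz)
    return 1e3 / (f_tilde + f_larmor_khz)          # 1/kHz = ms, hence the factor 1e3


def a_from_period_khz(tp_us, b_khz, f_larmor_khz):
    """Invert Eq. (4) for A/2pi, given a period and a trial B/2pi.

    Only the branch with A + omega_L > 0 is returned (an assumption of this sketch).
    """
    f_tilde = 1e3 / tp_us - f_larmor_khz
    if f_tilde < abs(b_khz):
        raise ValueError("no real A for this period and B")
    return math.sqrt(f_tilde ** 2 - b_khz ** 2) - f_larmor_khz


F_LARMOR = 431.4                                   # kHz, assumed value
tp_target = local_period_us(48.0, 8.0, F_LARMOR)
a_back = a_from_period_khz(tp_target, 8.0, F_LARMOR)

# Two spins quoted later in the text have nearly identical local periods.
tp_1 = local_period_us(7.9, 10.0, F_LARMOR)
tp_2 = local_period_us(7.8, 20.0, F_LARMOR)
```

The inverse function shows that a period fixes only a curve in the (A, B) plane: every trial B yields a different A with the same TPk, so the period alone cannot pin down both parameters.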

In principle, supervised learning algorithms can be applied using the theoretical model given by Eqs. (1–3) for this nominally multi-class classification problem49,50,51,52,53. The data preparation and training, however, are challenging in that (1) the number of nuclear spins interacting with the central NV center is not known a priori and (2) the number of possible (A, B) pair combinations for a given number of surrounding nuclear spins is large. Brute-force generation of large datasets with a variable number of nuclear spins is impractical and generally not reliable to represent possible spin configurations unambiguously.

We convert the multi-class classification problem to that of a single class by reorganizing the data so that the deep learning model focuses on identifying a single target spin. Figure 2b shows the general concept of this conversion. By cutting the CPMG signal according to the TPk of a target spin and making a 2D image by stacking multiple slices, the difference of the local periods between two spins can be distinguished. The features of the global period can also be analyzed through the distribution of pixel values along the vertical axis. With this representation, the deep learning model analyzes whether the target spin signal marked by a vertical line exists in the 2D image. Moreover, non-linear oscillations near the main coherence dip in the strong coupling regime, which are difficult to address by hand-crafted coding, generally appear as fringe patterns in this data representation. The deep learning model shows a strong ability to classify target signals in the presence of these interfering patterns through image recognition54,55.
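The slicing-and-stacking step can be sketched as follows. This is a simplified illustration, not the authors' code: the function names, the use of linear interpolation to align rows when the period is not an integer number of samples, and the cosine test trace are assumptions of this example.

```python
import math


def fold_signal(signal, dt_us, period_us, n_rows):
    """Cut a uniformly sampled trace into rows one period long and stack them.

    Column index corresponds to evolution time modulo the target period.
    """
    width = int(period_us / dt_us)                 # samples per row
    image = []
    for row in range(n_rows):
        start = row * period_us / dt_us            # fractional index of the row start
        if int(start) + width + 1 >= len(signal):
            break                                  # not enough samples for a full row
        line = []
        for col in range(width):
            pos = start + col
            i = int(pos)
            frac = pos - i
            # linear interpolation between neighbouring samples
            line.append((1.0 - frac) * signal[i] + frac * signal[i + 1])
        image.append(line)
    return image


def column_spread(image):
    """Largest max-minus-min over rows within any single column."""
    return max(max(col) - min(col) for col in zip(*image))


dt, period = 0.01, 1.098                           # us; illustrative values
trace = [0.5 + 0.5 * math.cos(2.0 * math.pi * i * dt / period) for i in range(2000)]
on_target = fold_signal(trace, dt, period, 10)     # folded at the true period
off_target = fold_signal(trace, dt, 1.15, 10)      # folded at a wrong period
```

When the trace is folded at its own period, every row shows the feature at the same column (a vertical line in the image), so `column_spread(on_target)` is close to zero; folding at a different period makes the feature drift from row to row and the spread becomes large.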

Fig. 2: Individual spin signature identification by hyperfine parameter classifier (HPC) deep learning model.

a Simulated CPMG signal with three spins of different (A, B) values showing general features of nuclear spectra including local and global periods (see text for definitions). b Concept of the data conversion into 2D images by slicing and stacking the data fragments with specified target period TPk. In this 2D image, the x-axis label \(\tilde \tau\) represents evolution time modulo TPk. The 2D image reveals the signature of a target signal as a vertical line with vanishing slope, which is generally superposed with other interfering nuclear spin signals. c Training datasets and architecture of the HPC model in a case of classifying three classes (K = 3), where K is the number of nodes of the last layer. The input data consists of three different classes, where each class corresponds to the number of existing nuclear spins with the target period, and the output data is in one-hot vector form assigned to each class. For example, Class 1 (Class 2) means that no (one) spin with the target period exists. d Example predictions of the HPC model depending on hyperfine coupling strength (first to third panels) and proximity to a similar period (fourth and fifth panels). For all cases, the HPC model predicts correct spin signatures corresponding to input signals, showing good consistency between the predicted vectors and the output vectors (for example, in the first panel, the predicted vector is (0.01, 0.99, 0) and the output vector is (0, 1, 0)). The color scale bar in all 2D images ranges from 0 to 1.

Deep learning model for classification

Focusing on the local period of a specific target spin signal, we develop a set of deep learning models, coined HPC, each of which classifies the existence of a specific period of hyperfine-induced coherence dips in the data. Figure 2c illustrates the structure of the HPC model and training datasets by exemplifying a case of classifying three different classes (see a detailed implementation of generating training datasets in Supplementary Note 1). The input training data is prepared along with three output classes, as shown in Fig. 2c. Class 1 corresponds to data that does not contain a spin with the target period, class 2 is for one spin with the target period existing in the data, and class 3 is for two spins with slightly dissimilar target periods in the data. The output data is denoted in one-hot vector form: (1, 0, 0), (0, 1, 0), and (0, 0, 1) correspond to no, single, and double target periods, respectively. The model is trained to estimate the confidence score of each element of the three-dimensional vector according to the input image. The model consists of stacked Dense layers, Batch Normalization layers56, and LeakyReLU activation functions, as shown in Fig. 2c, and is trained with the AdaBound optimizer57. The detailed procedure of the neural network development is described in Supplementary Fig. 2 and Supplementary Note 1.
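The label convention just described can be written down in a few lines. The helper names below are hypothetical, and taking the largest confidence score as the predicted class is an assumption of this sketch; the source does not state the decision rule.

```python
CLASS_NAMES = ("no target period", "single target period", "double target period")


def one_hot(class_index, n_classes=3):
    """Training label for a class, e.g. index 1 -> [0.0, 1.0, 0.0]."""
    return [1.0 if i == class_index else 0.0 for i in range(n_classes)]


def predicted_class(scores):
    """Index of the largest confidence score (assumed decision rule)."""
    return max(range(len(scores)), key=lambda i: scores[i])


label_single = one_hot(1)
# Confidence vectors of the kind quoted for the HPC model output.
strong_spin = predicted_class((0.002, 0.99, 0.0))
no_spin = predicted_class((0.65, 0.36, 0.01))
```

Here `CLASS_NAMES[strong_spin]` reads "single target period", matching the (0, 1, 0) label for one spin with the target period.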

Figure 2d shows the classification results using our HPC model. The first panel is for the typical case in which a single target period exists without strong disturbance from other spins or the spin bath signal, and the model successfully outputs a vector close to (0, 1, 0). The second panel shows the performance of the model for a strongly coupled single target spin (A/2π, B/2π) = (381, 275) (kHz) in a spin group TPD21 in Supplementary Table 1, taken from existing density functional theory (DFT) calculations58, used as an example. As mentioned above, although the spin signal is superposed with wide fringe patterns and oscillations, the model successfully identifies the signature of the target period with the output vector reaching (0.002, 0.99, 0). The third panel comes from the same signal as in the second panel but cut by the different target period of hyperfine parameters (A/2π, B/2π) = (48, 8) (kHz). It shows that the model also successfully classifies the target period even in the presence of another superposed strongly coupled spin signal (A/2π, B/2π) = (381, 275) (kHz). Furthermore, the fourth and the fifth panels give an example of the performance for input datasets with a single spin, (A/2π, B/2π) = (7.8, 20) (kHz) (fourth panel), and with two spins of similar local period, (A/2π, B/2π) = (7.9, 10) and (A/2π, B/2π) = (7.8, 20) (kHz) (fifth panel). The model successfully distinguishes each case, showing high selectivity of the nuclear spins. Therefore, these results show that our deep learning model provides a promising approach to detect individual nuclear spins with high precision, with high selectivity, and for a wide range of hyperfine strengths.

Noise removal and decoherence effect recovery

Before evaluating the experimental CPMG signal by trained HPC models, we first pre-process the raw experimental data by a denoising model. Figure 3a shows the overall procedure. For the noise removal process, Gaussian noise with the standard deviation σ = 0.05 reflecting the experimental noise is added to the training datasets (see Supplementary Fig. 3). The decoherence effect is modeled by the approximate equation9,

$$P_x = \frac{1}{2}M \cdot \exp \left( { - \frac{\tau }{T}} \right)^n + \frac{1}{2} \tag{5}$$

where M denotes the product of the Mk in Eq. (1), T accounts for dephasing of the electron spin, n is an exponential power obtained by fitting the experimental data, and τ is half of the inter-pulse delay. We use an autoencoder structure59,60, which is an established structure to learn the representations of input data, to encode the features of the noisy input data and generate the denoised data. A one-dimensional convolutional neural network (1D CNN) layer61, which is widely used to capture the features of one-dimensional data such as time-series signals, is employed for building the denoising neural network.
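The construction of a noisy training trace and the later recovery of the decoherence envelope can be sketched as below. Several things here are assumptions of this example and not taken from the source: Eq. (5) is read as a stretched exponential, exp[−(τ/T)^n]; the values T = 60 µs and n = 2 are placeholders; the cosine trace merely stands in for the output of Eq. (1); and the function names are hypothetical.

```python
import math
import random


def apply_decoherence(px_pure, taus_us, t_coh_us, power):
    """Damp the contrast M = 2*P_x - 1 by the envelope of Eq. (5)."""
    decayed = []
    for p, tau in zip(px_pure, taus_us):
        envelope = math.exp(-((tau / t_coh_us) ** power))   # assumed stretched exponential
        decayed.append(0.5 * (2.0 * p - 1.0) * envelope + 0.5)
    return decayed


def recover_decoherence(px_decayed, taus_us, t_coh_us, power):
    """Invert Eq. (5): divide the contrast by the same envelope."""
    recovered = []
    for p, tau in zip(px_decayed, taus_us):
        envelope = math.exp(-((tau / t_coh_us) ** power))
        recovered.append(0.5 * (2.0 * p - 1.0) / envelope + 0.5)
    return recovered


def add_gaussian_noise(px, sigma=0.05, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    rng = random.Random(seed)
    return [p + rng.gauss(0.0, sigma) for p in px]


taus = [0.5 * i for i in range(1, 101)]                      # 0.5 to 50 us
pure = [0.75 + 0.25 * math.cos(0.8 * t) for t in taus]       # stand-in for Eq. (1)
decayed = apply_decoherence(pure, taus, t_coh_us=60.0, power=2.0)
noisy = add_gaussian_noise(decayed)                          # network input
restored = recover_decoherence(decayed, taus, t_coh_us=60.0, power=2.0)
```

Dividing by the envelope also amplifies whatever noise remains at long τ, which is one reason to apply the recovery only after denoising.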

Fig. 3: Denoising and decoherence effect recovery procedure.

a Architecture of the signal recovery model. The pure data is generated by Eq. (1). The noisy data is generated by adding decoherence effects and noise to the pure data using Eq. (5). The model is trained to reproduce denoised data from the noisy data and the decoherence effect is recovered using Eq. (5). b Raw experimental (green), pure (blue), and recovered (orange) CPMG data showing successful recovery of the fringe patterns in the presence of noise with a comparable amplitude. The figure also shows signal recovery performance in the long evolution time regime. c The comparison of the raw experimental data with the noise recovered data in image representation used for the HPC model, showing an enhancement in signal-to-noise ratio and predictability. The color scale bar in all 2D images ranges from 0 to 1.

As shown in Fig. 3b, the signal recovery model effectively removes the noise while retaining the nuclear spin signatures of the experimental data. This is highlighted by its capability of recovering detailed oscillatory features of the data where the amplitudes of the signals are almost equivalent to the fluctuations due to noise. Figure 3c compares the visibility of the spin signal of the raw (left panel) and the processed (right panel) data, showing effective removal of experimental noise and enhancement of the signal-to-noise ratio, leading to higher prediction performance of the HPC model. After denoising the raw experimental data, the decoherence effect is recovered by applying Eq. (5) to the denoised data (see more detail in Supplementary Note 1). We find that the confidence scores of HPC models evaluating denoised experimental data are, in general, a few percent higher than those obtained from raw data (compare (0.02, 0.94, 0) vs. (0, 0.99, 0) in Fig. 3c) and in some cases false predictions from raw data are corrected in denoised data (compare (0.17, 0.71, 0.1) vs. (0.65, 0.36, 0.01) in Fig. 3c), showing the efficiency of our pre-processing model.

Regression-based model and auto fine-tuning

We now discuss the final stage of the deep learning protocol and the application of the overall procedure to experimental dynamical-decoupling spectroscopy signals, as shown in Fig. 4a. After the application of denoising and HPC models to predict possible local periods, we further apply a deep learning-based regression model to restrict the candidate hyperfine parameters for a subsequent fine-tuning process. Since the period information from the HPC model only provides one functional relation between A and B, given by Eq. (4), the purpose of the regression model is to find specific (A, B) values that best explain the shape of the coherence dips as a function of τ. We set a search region for the value B/2π ranging from 10 to 80 kHz for N = 32 (from 2 to 20 kHz for N = 256) and find the best-fitting (A, B) pairs repeatedly for all predicted periods. Since these values are obtained by fitting coherence dips stemming from individual nuclear spins only, we use the complete set of deep learning-based fit results as initial guess values and tune all (A, B) pairs again in the final step to automatically search for a collective list of best-fitting (A, B) pairs. We describe pseudo-code for the fine-tuning method, which uses a particle swarm optimization algorithm62, in Supplementary Note 2.
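The fine-tuning idea can be illustrated with a generic particle swarm optimizer fitted to a synthetic single-spin trace. This is a textbook-style sketch and not the procedure of Supplementary Note 2: the inertia and acceleration coefficients (0.7, 1.5, 1.5), the swarm size, the iteration count, the search box, the synthetic target spin, the Larmor value, and all function names are assumptions made for this example.

```python
import math
import random


def cpmg_px(spins_khz, f_larmor_khz, taus_us, n_pulses):
    """P_x from Eqs. (1)-(3); frequencies (divided by 2*pi) in kHz, tau in us."""
    k = 2.0 * math.pi * 1e-3                       # kHz -> rad/us
    w_l = f_larmor_khz * k
    out = []
    for tau in taus_us:
        prod = 1.0
        for a_khz, b_khz in spins_khz:
            w_t = math.hypot(a_khz * k + w_l, b_khz * k)
            if w_t == 0.0:
                continue
            m_z, m_x = (a_khz * k + w_l) / w_t, b_khz * k / w_t
            al, be = w_t * tau, w_l * tau
            c = math.cos(al) * math.cos(be) - m_z * math.sin(al) * math.sin(be)
            c = max(-1.0, min(1.0, c))
            if 1.0 + c > 1e-12:                    # skip the degenerate 0/0 point
                dip = m_x ** 2 * (1 - math.cos(al)) * (1 - math.cos(be)) / (1.0 + c)
                prod *= 1.0 - dip * math.sin(n_pulses * math.acos(c) / 2.0) ** 2
        out.append((1.0 + prod) / 2.0)
    return out


def pso(objective, bounds, n_particles=20, n_iters=40, seed=0):
    """Minimise objective inside box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    history = [g_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (best_pos[i][d] - pos[i][d])
                             + 1.5 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
        history.append(g_val)
    return g_pos, g_val, history


F_L, N = 431.4, 32                                 # assumed Larmor frequency (kHz), pulse number
taus = [5.0 + 0.02 * i for i in range(200)]
target = cpmg_px([(48.0, 20.0)], F_L, taus, N)     # synthetic stand-in for data


def misfit(params):
    model = cpmg_px([(params[0], params[1])], F_L, taus, N)
    return sum((m - t) ** 2 for m, t in zip(model, target)) / len(taus)


# A is boxed around a hypothetical regression output; B uses the 10-80 kHz region for N = 32.
best, best_val, history = pso(misfit, [(45.0, 51.0), (10.0, 80.0)])
```

In the full problem the parameter vector would hold all (A, B) pairs at once, so that overlapping dips are fitted collectively rather than one spin at a time.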

Fig. 4: Multiple nuclear spin detection from experimental data.

a Procedures for a regression model estimating hyperfine parameters from predicted periods of HPC models. Training datasets of the regression model are generated using (A, B) pairs with the predicted periods by HPC models. The regression model infers the single (A, B) pair that best fits the features of the experimental data (reorganized as a 2D image) including coherence dip amplitudes, envelope function, and fringe patterns. b Multiple nuclear spin detection from the experimental CPMG data for N = 32 (top panel) and N = 256 (bottom panel) using the same NV center. The panels show the superimposed reproduced CPMG signal (solid curves) and the experimental data (dotted curve). The spin numbers (C#) indicated in the figure correspond to the full spin list summarized in Supplementary Table 2. c Confirmation of detected hyperfine parameters for spins with a large number of interfering signatures (1st panel), similar target periods (2nd and 3rd panels), weak local period signature (4th panel), small transverse hyperfine coupling (5th panel), and small longitudinal hyperfine coupling (6th panel). We compare the obtained values to the results reported in ref. 19 (bottom row, see main text). The panels also show examples of spins with small A that were not detected in ref. 19 (3rd, 5th, and 6th panels). The uncertainty in the last digit is given in parentheses. The color scale bar in all 2D images ranges from 0 to 1.

Demonstration with experimental data

We demonstrate the performance of the developed procedures with two experimental datasets with N = 32 and N = 256. These data are collected following the methods described in ref. 22 and in Supplementary Fig. 4. Figure 4b compares the experimental data to the CPMG signal reproduced using the hyperfine parameters predicted by our deep learning protocol. Panels in Fig. 4c show example cases of predicted spins along with the corresponding raw experimental data. The first panel highlights the case where the model can capture the nuclear spin signal and determine (A, B)/2π = (−213.19(5), 4.2(9)) (kHz) even with overlapping signals stemming from other spins. The second and third panels show that the model can accurately distinguish spins with similar periods and that automatic fine-tuning successfully identifies individual (A, B) pairs matching the experiments.

Our analysis returns a total of 48 nuclear spins that together accurately describe the data. However, several of these spins yield near-identical hyperfine parameters. It cannot be excluded that those signals originate from a single spin with a broadened signal due to dephasing and nuclear-nuclear spin interactions, which are not included in the model used here (see Supplementary Note 1 and Supplementary Figs. 5–7 for details). We anticipate that improved selectivity in this regime is possible by using other pulse sequences, for example, non-equally spaced dynamical decoupling sequences38,63, or by taking nuclear-nuclear interactions into account. Here, we chose to count groups of spins with nearly identical parameters as a single spin. In that way, we identify 31 nuclear spins. We summarize the full list of detected nuclear spins and the confidence levels in Supplementary Table 2.
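Counting groups of near-identical spins as one can be sketched with a simple greedy grouping. The source does not state how "nearly identical" was decided, so the tolerances, the comparison against the first member of each group, the example values, and the function name below are all hypothetical.

```python
def group_spins(spins_khz, tol_a_khz, tol_b_khz):
    """Greedily group (A, B) pairs that lie within the given tolerances.

    Each spin is compared with the first member of every existing group;
    the tolerances are illustrative and not taken from the source.
    """
    groups = []
    for a, b in sorted(spins_khz):
        for group in groups:
            ref_a, ref_b = group[0]
            if abs(a - ref_a) <= tol_a_khz and abs(b - ref_b) <= tol_b_khz:
                group.append((a, b))
                break
        else:
            groups.append([(a, b)])                # no match: start a new group
    return groups


# Made-up candidate list (kHz): two near-duplicates plus two distinct spins.
candidates = [(48.0, 8.0), (48.1, 8.2), (7.8, 20.0), (-213.19, 4.2)]
groups = group_spins(candidates, tol_a_khz=0.5, tol_b_khz=1.0)
n_spins = len(groups)
```

With these illustrative tolerances the two candidates near (48, 8) kHz collapse into one group, while spins that share a similar A but differ clearly in B stay separate.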


We compare our results with those obtained by other methods on the same sample. A manual analysis on a similar data set, taken with the same measurement procedure, identified 7 spins22 with parameters that closely match 7 of the 31 spins identified here. The large improvement in the number of identified spins from equivalent experimental data highlights the advantage of our deep learning approach. Additionally, we compare the results to a recent multi-dimensional spectroscopy characterization19, a more demanding experimental technique that accesses nuclear-nuclear interactions. For 23 of the 31 spins, a good match is observed (Supplementary Table 2). The other 8 spins were not previously identified and are in a spectral range that was not accessed in previous experiments. We corroborate the identification of these spins through additional experiments with a different number of decoupling pulses, N = 96 and N = 128 (see Supplementary Fig. 8). On the other hand, 4 spins detected in the previous result are missing in the machine learning results due to the limited signal-to-noise ratio and the intrinsic insensitivity of the CPMG sequence used here to nuclear spins with small B values. Overall, these results show the capability of our deep learning protocol to automatically and accurately identify nuclear spins in complex spin systems and characterize the coupling parameters from dynamical decoupling spectroscopy.

We estimate a total computational time of ~3 h, from generating training datasets and training the HPC models to completing the analysis of one set of experimental CPMG data (see details in Supplementary Fig. 2). Once trained, each HPC model can identify the most probable local periods of nuclear spins from the experimental CPMG data almost instantaneously (<1 s), and the final fitted hyperfine parameters are obtained within ~50 s per spin (detailed specifications of the computational power used are given in Supplementary Note 1). This fast data analysis highlights the potential of deep learning approaches to efficiently scale up the sensing and characterization of large spin systems.

We find that examining dynamical decoupling spectroscopy signals for various numbers of pulses N is important for the following reasons. First, large N makes spins with small B values visible, and this is in general reflected in an increased number of detected spins as shown, for example, in the fifth panel of Fig. 4c. Second, we find that some spins near the Larmor frequency with relatively high B/2π values (>10 kHz) are detectable only for N = 32, since for larger N too many spin signals overlap, as illustrated in the sixth panel of Fig. 4c. The current protocol does not take nuclear–nuclear spin interactions into account. Therefore, our model fails to detect some of the interacting nuclear spins for N = 256, as for large N and long total evolution times, nuclear-nuclear spin interactions are non-negligible and lead to a deviating period in the signal (see Supplementary Fig. 9 and note that part of the nuclear-nuclear interactions are known from ref. 19). For N = 32, the data approximately follow a simple electron-nuclear interaction model and nuclear-nuclear interactions can be neglected. In that case, our protocol successfully detects these spins as shown in the fourth panel of Fig. 4c. We envision future improvement of the deep learning protocol by building a unified model that covers all ranges of hyperfine parameters, various N pulse sequences, and the nuclear–nuclear interactions; the challenge lies in the efficient generation of training datasets and in organizing the datasets to embed the signatures of possible nuclear–nuclear pairs effectively and unambiguously. At the current stage, discrepancies between experimental data for different N, for example between N = 32 and N = 256, can be used as a signature of nuclear–nuclear interaction.

In conclusion, we have proposed and demonstrated a deep learning approach to automatically detect and characterize individual nuclear spins based on dynamical decoupling spectroscopy with a single electron spin sensor. We have tested the method on a single NV center in diamond and have identified 31 individual 13C nuclear spins with a wide range of hyperfine parameters. Our method is able to distinguish spins with strong couplings to the NV center, which are difficult to handle with conventional peak detection algorithms38,64. The proposed models retain the general benefits of deep learning models: it is easy to modify the training procedure or neural network architectures for other types of experimental data, such as spectroscopy data of other defect centers, including those in diamond. Additionally, these results highlight the capacity of deep learning algorithms to efficiently analyze the complex nonlinear signatures in nano-scale and single-spin magnetic resonance and their robustness against realistic distortions, such as experimental decoherence and noise. Therefore, our methodology addresses one of the main challenges for quantum sensing experiments on complex spin structures and for large quantum registers and quantum networks based on spin qubits.


Measurement setup and sample preparation

Our experimental measurement of CPMG data is performed on a single, naturally occurring NV center in a high-purity, chemical-vapor-deposition, homoepitaxially grown diamond (type IIa) with a natural abundance of 13C (1.1%) and a <111> crystal orientation. To improve the photon-collection efficiency, we fabricate a solid immersion lens on top of the NV center and we use an aluminum-oxide anti-reflection coating layer (grown by atomic-layer-deposition)65. We use on-chip lithographically-defined strip lines to apply microwave fields for fast driving of the electron spin transitions.

We apply a static magnetic field, Bz ≈ 403 G, along the NV-axis using a permanent room-temperature neodymium magnet. This magnetic field was chosen to ensure that ωL is larger than the perpendicular hyperfine couplings B in order to reduce the oscillation fringes in the CPMG signal. The electron spin Rabi frequency is 14.31(3) MHz. We use Hermite pulse shapes to obtain effective MW pulses without initialization of the 14N spin66. We alternate the phases of the π-pulses according to the XY-8 scheme67. Although the additional signals that can be caused by such compensation sequences in combination with finite pulse durations68 are negligible in this work, another scheme that randomizes the phases of the pulses69 can be employed to suppress spurious responses and signal distortions. We stabilize the magnetic field strength to <3 mG19 and the magnet is aligned to the NV-axis with an uncertainty of 0.07° using thermal echo sequences (see ref. 19 for details of the alignment procedure).

Our experiments are performed at a temperature of 3.7 K in a commercial closed-cycle cryostat (Montana Cryostation). This enables us to read out the NV electron spin state in a single shot with high fidelity (94.5%), through spin-selective resonant excitation65 (see detailed pulse sequence in Supplementary Fig. 4). The electron spin relaxation time is T1 > 1 h22, the natural dephasing time is \(T_2^ \ast = 4.9(2)\;\mu {\mathrm{s}}\), the spin-echo coherence time is T2 = 1.182(5) ms, and the multipulse dynamical decoupling coherence time is T2DD > 1 s, for an optimized inter-pulse delay 2τ22.

Configuration for HPC and regression models

Although a two-dimensional convolutional neural network is generally employed for image recognition, we use Dense layers for the HPC and regression models to boost computational speed while retaining accuracy. The LeakyReLU activation converges to a slightly lower validation loss than the ReLU activation. A Batch Normalization layer with epsilon = 1e-05 and momentum = 0.1 (default values in PyTorch 1.3.1) shows faster convergence to the minimum loss than Dropout regularization. For the last layer, a Sigmoid layer generally converges to a higher accuracy than a Softmax layer for our datasets.

Configuration for the denoising model

We introduce the auto-encoder structure, which is an established structure to encode the distribution of the input data and generate the targeted data. For both the encoder and decoder parts, 1D CNN and 1D transposed CNN layers are employed rather than RNN layers such as LSTM70 or GRU71 layers, because the 1D CNN layers show lower validation errors and faster convergence to the minimum loss. The kernel size of all CNN layers is 4. In the encoder part, a Maxpooling1D layer with a kernel size of 2 is used after every 1D CNN layer. A batch normalization layer with the same parameters as in the HPC model is used for all CNN layers.
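As a small worked example of how the encoder shrinks a trace, the sketch below applies the standard output-length formulas for a 1D convolution and a 1D max-pooling layer. The kernel sizes 4 and 2 come from the text; the stride of 1, zero padding, pooling stride equal to the pooling kernel, the input length of 1024 samples, and the number of encoder blocks are assumptions of this example, since the source does not specify them.

```python
def conv1d_len(length, kernel=4, stride=1, padding=0):
    """Output length of a 1D convolution (dilation 1)."""
    return (length + 2 * padding - kernel) // stride + 1


def maxpool1d_len(length, kernel=2):
    """Output length of 1D max pooling with stride equal to the kernel size."""
    return (length - kernel) // kernel + 1


def encoder_len(length, n_blocks):
    """Length after n_blocks of (conv, kernel 4) followed by (max pool, kernel 2)."""
    for _ in range(n_blocks):
        length = maxpool1d_len(conv1d_len(length))
    return length


# Hypothetical 1024-sample trace through three encoder blocks:
# 1024 -> 1021 -> 510 -> 507 -> 253 -> 250 -> 125
latent_len = encoder_len(1024, 3)
```

Under these assumptions each block roughly halves the trace length, so the decoder's transposed convolutions have to restore about a factor of two per block.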

In all models, AdaBound57 is employed as the optimizer, and the initial learning rate of 0.00015 is decayed at each epoch with a customized rate (0.5–0.25). For loss functions, binary cross-entropy loss is used for the HPC model and mean squared error loss is used for the regression and denoising models.
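The two loss functions and a per-epoch learning-rate decay can be written out explicitly as follows. The interpretation of the "customized rate" as a constant multiplicative factor per epoch is an assumption of this sketch, as are the function names and the clamping constant `eps`.

```python
import math


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy over the elements of a confidence vector."""
    total = 0.0
    for p, y in zip(pred, target):
        p = min(1.0 - eps, max(eps, p))            # avoid log(0)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(pred)


def mean_squared_error(pred, target):
    """Mean squared error, as used for regression and denoising outputs."""
    return sum((p - y) ** 2 for p, y in zip(pred, target)) / len(pred)


def decayed_lr(epoch, lr0=1.5e-4, rate=0.5):
    """Learning rate after `epoch` multiplicative decays (assumed schedule)."""
    return lr0 * rate ** epoch


label = (0.0, 1.0, 0.0)                            # one spin with the target period
loss_good = binary_cross_entropy((0.01, 0.99, 0.0), label)
loss_bad = binary_cross_entropy((0.65, 0.36, 0.01), label)
lr_after_two_epochs = decayed_lr(2)                # 1.5e-4 * 0.25
```

A confident, correct prediction such as (0.01, 0.99, 0) gives a much smaller binary cross-entropy than a prediction that favours the wrong class, which is what drives the HPC training.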

Usage of trained models and management of total computational time

The denoising model can be reused for other experimental data if the number of unit CPMG pulse sequences (N) and measurement time resolution are kept the same. The classifier model can be reused if the external magnetic field, N, measurement time resolution, and total measurement time length remain the same.

All HPC and denoising models can be trained separately and generating datasets can also be processed independently. Therefore, for example, to reduce total computational time to one-third, three computers can be used independently by dividing the training regions of all TPk into three regions.
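The division of work across machines can be sketched as a simple partition of the list of target periods. The function name, the contiguous-chunk strategy, and the example period values are assumptions of this illustration.

```python
def split_evenly(items, n_workers):
    """Split a list into n_workers contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)  # first `extra` chunks get one more
        chunks.append(items[start:start + size])
        start += size
    return chunks


# Ten hypothetical target periods (us) shared among three computers.
target_periods = [1.090 + 0.002 * i for i in range(10)]
regions = split_evenly(target_periods, 3)
```

Each computer then generates the datasets and trains the HPC models only for the periods in its own region, and the predictions are merged afterwards.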