Multiple machine learning approach to characterize two-dimensional nanoelectronic devices via featurization of charge fluctuation

Two-dimensional (2D) layered materials such as graphene, molybdenum disulfide (MoS2), tungsten diselenide (WSe2), and black phosphorus (BP) provide unique opportunities to identify the origin of current fluctuation, mainly arising from their large surface areas compared with those of their bulk counterparts. Among numerous material characterization techniques, nondestructive low-frequency (LF) noise measurement has received significant attention as an ideal tool to identify a dominant scattering origin such as imperfect crystallinity, phonon vibration, interlayer resistance, the Schottky barrier inhomogeneity, and traps and/or defects inside the materials and dielectrics. Despite the benefits of LF noise analysis, however, the large amount of time-resolved current data and the subsequent data fitting process required generally cause difficulty in interpreting LF noise data, thereby limiting its availability and feasibility, particularly for 2D layered van der Waals hetero-structures. Here, we present several model algorithms that enable the classification of important device information such as the type of channel materials, gate dielectrics, contact metals, and the presence of chemical and electron beam doping using more than 100 LF noise data sets under 32 conditions. Furthermore, we provide insights about the device performance by quantifying the interface trap density and Coulomb scattering parameters. Consequently, the pre-processed 2D array of Mel-frequency cepstral coefficients, converted from the LF noise data of devices undergoing the test, leads to superior efficiency and accuracy compared with that of previous approaches.


INTRODUCTION
Low-frequency (LF) 1/f noise spectroscopy is a nondestructive defect diagnosis tool, which identifies dominant scattering origins. Such scattering origins are caused by imperfect crystallinity, lattice vibration, surface trap distribution, and channel and dielectric defects, in addition to the Schottky barrier inhomogeneity at the metal-semiconductor interface in semiconductor devices [1][2][3][4][5][6] . However, as the size of the channel material decreases, particularly in the case of two-dimensional (2D) layered materials, their atomically thin nature with a large surface-to-volume ratio makes it significantly difficult to investigate them using LF 1/f noise analysis as compared with their bulk silicon counterparts 7,8 . Conventionally, the time-resolved current (I) variation in electronic devices has been ascribed to the carrier number and/or mobility fluctuation 9 : ΔI(t) ∝ qμ(ΔN) + q(Δμ)N, where q, μ, and N denote the elementary unit charge, carrier mobility, and number of charge carriers, respectively. However, since the inherent vulnerability of 2D materials to surrounding interfaces considerably influences the charge fluctuation, this high sensitivity of LF noise features would reflect the individual effects of both channel and dielectric materials in addition to the presence of chemical/electrical doping 10,11 .
Thus far, numerous LF noise features have been reported on 2D materials, such as the presence of electron-hole puddle induced charge scattering on graphene 12 , the Coulomb scattering suppression via high-κ passivation of black phosphorus (BP) 13 , the promotion of charge fluctuation in molybdenum disulfide (MoS 2 ) due to the height and inhomogeneity of the Schottky barrier 11 , the anisotropic LF noise feature of rhenium disulfide (ReS 2 ) 14 , and the thickness-dependent Coulomb scattering parameter of molybdenum ditelluride (MoTe 2 ) 15 . These studies indicate the high feasibility of LF noise spectroscopy as a tool to classify the material and device properties. Nevertheless, the origin of carrier fluctuation, occurring either in the 2D layered material itself or at the interface between the 2D layered material and the gate dielectric, has not been identified clearly. Moreover, it is significantly difficult to identify an individual noise source from the LF noise data without appropriate data processing for the model-dependent LF noise analysis.
Most recently, the combination of artificial-intelligence (AI)-based approaches and scientific data analysis has been widely considered in various applications such as healthcare 16,17 , image recognition 18,19 , voice search 20 , and molecular/material science 21,22 . Further, it has also been determined that these combined techniques are suitable for solving the problems associated with non-linear processes or enormous combinatorial spaces with high efficiency 16,[22][23][24] . This clearly indicates that the machine learning (ML) and deep learning (DL) approaches can provide better optimization and decision-making by converging the scientific data and extracting interpretable models from these data automatically 22,25 . Recently, studies on applying ML or DL to the analysis of 2D layered materials have been widely conducted [26][27][28][29] .
In this study, we introduce an effective technique to classify and infer the characteristics of current fluctuation with high efficiency and precision by combining AI and LF noise spectroscopy. Because 2D FETs share similar fabrication processes, geometries, bandgaps, and mobilities, classifying them using only fundamental DC analysis of transfer curves and output characteristics is very difficult. On the other hand, since LF noise data capture the tiny fluctuations of carriers in the channel over time, they readily characterize an individual FET. Based on the time-resolved ΔI(t) measured from various 2D material-based field-effect transistors (FETs), 2D arrays of the Mel-frequency cepstral coefficient (MFCC) for several electronic properties were considered, and the corresponding features were obtained via a hidden Markov model (HMM). The HMM has the disadvantages that it requires a relatively large amount of data and can hardly express dependencies between hidden states. However, the HMM is suitable for processing a large amount of LF noise data because it has a strong statistical foundation and enables efficient learning from raw sequence data. This approach allows us to automatically identify essential device information such as the type of 2D channel materials and gate dielectrics, interface trap density (N it ), Coulomb scattering parameter (α SC ), and the presence of chemical and electron beam doping. The combination of factors such as channel material, gate dielectric, contact metal, and electron beam irradiation significantly affects carrier fluctuations as a function of time. This combination, which comprises more than 100 LF noise data sets under 32 conditions, becomes a catalyst for machine learning that automatically and effectively classifies the characteristics of various nanoelectronic devices.
In addition, the obtained LF noise spectroscopy data are highly interpretable via machine learning techniques, thereby identifying the contribution of engineered features in characterizing the device information and performance.

RESULTS
Workflow for audio and current signal classification
The decimal data type, measured in the time domain, has been used generally for ML and DL in data science; however, the Fourier transform (FT) of this data is frequently employed in ML algorithms to improve data interpretation 30,31 . The process of transforming raw data into a suitable representation for a learning algorithm is often called featurization. For instance, in speech recognition, proper methodology has been widely studied to convert a signal from the time domain to the frequency domain for more accurate classification and analysis [31][32][33] . A typical data demonstration method that extracts the characteristics of the original audio signal through the Mel-frequency cepstral coefficient is illustrated in Fig. 1a. Each speech frame of the time domain signal is first obtained through the pre-emphasis, framing, windowing, and other processing of the original audio signal as expressed in schematic (i) of Fig. 1a. Subsequently, the speech signals comprising a 30 ms frame window are fast Fourier transformed (FFT) with a Hamming window. Further, each spectrum signal is processed by Mel filters (26 filters) to obtain the corresponding Mel-frequency spectrum. Finally, the Mel-frequency spectrums are processed using the discrete cosine transform (DCT) to acquire the MFCCs in the cepstral domain as shown in schematic (ii) of Fig. 1a.
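The audio MFCC steps described above (Hamming window, FFT, 26 Mel filters, DCT) can be sketched for a single frame as follows. This is a minimal illustration and not the authors' code: the 16 kHz sampling rate, the 512-point FFT, the 13 retained coefficients, the common 2595·log10(1 + f/700) Mel formula, and the unnormalized DCT-II are all assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f_hz):
    # Common Mel-scale formula (assumed here; other variants exist).
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for j in range(n_filters):
        left, centre, right = bins[j], bins[j + 1], bins[j + 2]
        for k in range(left, centre):      # rising edge
            fbank[j, k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge
            fbank[j, k] = (right - k) / (right - centre)
    return fbank

def dct_ii(x, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]
    i = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    return x @ basis.T

def mfcc_from_frame(frame, sample_rate, n_filters=26, n_coeffs=13, n_fft=512):
    windowed = frame * np.hamming(len(frame))                  # Hamming window
    spectrum = np.abs(np.fft.rfft(windowed, n_fft)) ** 2 / n_fft  # power spectrum
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ spectrum
    log_energies = np.log(np.maximum(energies, 1e-12))         # avoid log(0)
    return dct_ii(log_energies, n_coeffs)

# Example: one 30 ms frame (480 samples at the assumed 16 kHz) of a 440 Hz tone.
frame = np.sin(2 * np.pi * 440.0 * np.arange(480) / 16000.0)
coeffs = mfcc_from_frame(frame, 16000)
```

Stacking the coefficient vectors of consecutive frames gives the 2D MFCC array used for learning.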
We employ this data processing algorithm for a number of ΔI(t) data obtained from various 2D material based FETs which have been fabricated and analyzed under various experimental conditions such as different gate dielectrics 11,13,34,35 , temperatures 11,34,36 , channel materials 10,11,13,34,35,[37][38][39] , chemical/electron beam doping 40,41 , and source/drain contact metals 11,34 (see Fig. 1b). More than 100 LF noise data sets of various 2D layered FETs were considered in this study under 32 different conditions at a particular gate (V G ) and drain (V D ) bias condition. In contrast to the audio signal shown in Fig. 1a, after performing the additional signal normalization process, each MFCC of the current signal in the cepstral domain is consequently determined via FFT and DCT as displayed in schematics (i) and (ii) of Fig. 1b. The MFCCs of the audio and current signals, which comprise the 2D array, are respectively used in speech recognition and device classification (materials/characteristics) through the inference process using ML with the optimized algorithm (see Fig. 1c). The device conditions that the trained ML algorithm distinguishes are as follows (see Fig. 1d): BP, graphene, MoS 2 , ReS 2 , MoTe 2 , and tungsten diselenide (WSe 2 ) were used as channel materials; h-BN and SiO 2 were employed as gate oxides; Ti, Au, Pt, and Cr were used as the contact metals; and passivation, temperature variations, triethanolamine (TEOA) doping, and electron beam irradiation were considered as the different external factors.
Process flowchart for learning and classifying 2D transistors
The ΔI(t) of 2D material-based FETs under several conditions was measured at a particular V G and V D (see Fig. 2a) in a shielding metal box (see Fig. S2 and Note 2 in the Supplementary Materials for details of the LF noise measurement system) 42 . The drain current can be defined as the sum of the average (DC) drain current (I D ) and the noise current fluctuations (ΔI D ). Since the amplitude of ΔI D is substantially smaller than I D , ΔI D is generally converted to the voltage signal using the low-noise current-to-voltage preamplifier, as depicted in Fig. 2b. The amplified noise signal was considered as the input ΔI D (t) data used in Python, where the amplitude normalization and pre-emphasis processes were performed, as presented in Fig. 2c. Subsequently, the preprocessed ΔI D (t) data were separated into specific frames with respect to the time domain, and FFT was performed on these data. The transformed data produced by each frame were expressed as power spectral density (S I ) in the frequency domain, and all S I were filtered onto the Mel scale. This transformation of specific frames into S I allowed the evaluation of periodic spectra, and the amount of spectral energy between frequencies could then be obtained by combining the respective frames. The Mel-scale filter interval increased with frequency, i.e., it was narrow around low frequencies and became wider at higher frequencies, indicating that the Mel-scale filter emphasized the energy around low frequencies (see Fig. S4 and Note 3 in the Supplementary Materials) 30,[43][44][45][46][47] .
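The preprocessing chain described above (amplitude normalization, pre-emphasis, framing, and FFT into a power spectral density per frame) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the processing actually used: the 10 kHz sampling rate, the pre-emphasis coefficient of 0.97, the Hamming window, and the 30 ms frame step are hypothetical values, chosen so that a 0.5 s record with 200 ms frames yields 11 frames.

```python
import numpy as np

def preprocess(delta_i, pre_emphasis=0.97):
    """Remove the DC level, normalize the amplitude, and apply pre-emphasis."""
    x = np.asarray(delta_i, dtype=float)
    x = x - x.mean()                      # keep only the fluctuation
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak                      # amplitude normalization to [-1, 1]
    # First-order pre-emphasis filter: y[n] = x[n] - a * x[n-1]
    return np.append(x[0], x[1:] - pre_emphasis * x[:-1])

def frame_signal(x, frame_len, frame_step):
    """Split a 1D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // frame_step
    idx = np.arange(frame_len)[None, :] + frame_step * np.arange(n_frames)[:, None]
    return x[idx]

def frame_psd(frames, sample_rate):
    """One-sided power spectral density of every frame."""
    n = frames.shape[1]
    window = np.hamming(n)
    spectrum = np.fft.rfft(frames * window, axis=1)
    psd = np.abs(spectrum) ** 2 / (sample_rate * np.sum(window ** 2))
    if n % 2 == 0:
        psd[:, 1:-1] *= 2.0               # do not double DC or Nyquist
    else:
        psd[:, 1:] *= 2.0                 # do not double DC
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    return freqs, psd

# Example with synthetic data: 0.5 s sampled at the assumed 10 kHz.
sample_rate = 10_000.0
rng = np.random.default_rng(0)
current = 1e-6 + 1e-9 * rng.standard_normal(5000)   # I_D plus a small fluctuation
frames = frame_signal(preprocess(current), frame_len=2000, frame_step=300)
freqs, s_i = frame_psd(frames, sample_rate)
```

Each row of `s_i` plays the role of the per-frame S I that is subsequently filtered onto the Mel scale.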
The obtained data, which are called Mel-frequency spectrums and mainly used for learning, were consequently more sensitive to the low frequency values, allowing a precise carrier scattering analysis in the devices. Subsequently, the Mel-frequency spectrums were transformed through the DCT and extracted to a finite data point sequence, composed of the current MFCCs in the cepstral domain [43][44][45] . Further, all S I filtered by the Mel scale were overlapped, indicating the existence of correlations between the spectral densities. These correlations could be separated using the DCT method. The current MFCCs transformed by the DCT were expressed as the change in filter energy, and a part of them was extracted to store data as 2D arrays, as demonstrated in the schematic in Fig. 2c 30,47 . Based on the research conducted thus far, the engineered current MFCC features were characterized into 2D arrays with a number between 200 and 1000 for each device class.
Every engineered current MFCC feature was stored by class, based on the device conditions, and the features were learned and classified using ML with an HMM algorithm and DL with an NN, as illustrated in Fig. 2d, e. The HMM based on the Markov chain 30,31,48,49 in Fig. 2d was the first algorithm model used for learning and classifying the data in this study. In HMM, the unaligned training sequences are processed by iteratively evaluating the data stored as current MFCCs. For all the training parameters, the estimates with prior probability distributions are assumed using a maximum a posteriori approach 30,50,51 . The scores for the 32 classes were calculated as Y, as shown in Fig. 2d. Further, the class with the highest score was determined, and could be used to infer the device conditions as shown in Fig. 2f.
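The scoring step described above, in which one HMM per class assigns a score to an MFCC sequence and the highest score selects the class, can be sketched with a log-domain forward algorithm. This is a stand-alone illustration and not the hmmlearn-based implementation used in this study: the diagonal-Gaussian emissions, the two-state models, the class labels, and all parameter values are hypothetical, and the training step is omitted.

```python
import numpy as np

def log_gaussian(obs, means, variances):
    """Log density of each observation under each state's diagonal Gaussian.
    obs: (T, D); means, variances: (n_states, D); returns (T, n_states)."""
    diff = obs[:, None, :] - means[None, :, :]
    return -0.5 * np.sum(diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2)

def logsumexp(a, axis):
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def hmm_log_likelihood(obs, start_prob, trans, means, variances):
    """Forward algorithm in the log domain: log P(obs | model)."""
    log_b = log_gaussian(obs, means, variances)
    alpha = np.log(start_prob) + log_b[0]
    for t in range(1, len(obs)):
        # alpha_j(t) = logsum_i [alpha_i(t-1) + log a_ij] + log b_j(o_t)
        alpha = logsumexp(alpha[:, None] + np.log(trans), axis=0) + log_b[t]
    return logsumexp(alpha, axis=0)

def classify(obs, models):
    """Score obs under every class model and return the best label and all scores."""
    scores = {label: hmm_log_likelihood(obs, *params) for label, params in models.items()}
    return max(scores, key=scores.get), scores

# Two hypothetical classes with 2 hidden states and 3 features per frame.
start = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1], [0.1, 0.9]])
unit_var = np.ones((2, 3))
models = {
    "class_a": (start, trans, np.array([[0.0] * 3, [1.0] * 3]), unit_var),
    "class_b": (start, trans, np.array([[5.0] * 3, [6.0] * 3]), unit_var),
}
sequence = np.random.default_rng(1).normal(0.0, 1.0, size=(11, 3))
label, scores = classify(sequence, models)
```

With 32 trained class models, the dictionary of scores corresponds to the score vector Y.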
c The inference steps after ML for (i) speech recognition and (ii) materials/characteristics classification. d Side views of the device structures, which are fabricated using various materials after being subjected to external factors such as e-beam irradiation, triethanolamine (TEOA) chemical doping, and temperature variations; Au, Ti, Pt, and Cr are used as the source/drain contact metals; silicon dioxide (SiO 2 ) and hexagonal boron nitride (h-BN) are used as the gate dielectrics; the channels are composed of a combination of various atoms such as Mo, W, S, Se, Te, C, and black phosphorus (BP); MoS 2 , MoTe 2 , WSe 2 , ReS 2 , graphene, and BP have thicknesses varying from monolayer through to 40 layers.

The second method used was the NN 30,52,53 , which is one of the DL methods. In this algorithm, the Y values of classes, which had been calculated by HMM, were classified by performing one additional learning step. In this method, the input score vectors, Y, were transferred to the first layer (layer-1) with 32 perceptrons, which is the number of classes, and were then transferred to the second layer (layer-2) by employing a rectified linear unit (ReLU) function as the activation function 54,55 . Instead of the widely used sigmoid function, we considered a ReLU function here as the activation function because of its sparse activation property, whereby it outputs zero for a negative input and is therefore only partially activated 30,54,55 . Subsequently, the score vector data were classified using the softmax function in layer-2, which calculates the probability of each class and thereby provides a normalization effect. The softmax function was obtained by dividing the exponential of each class value by the sum of the exponentials of all classes, as described below 30,54 :
$$\mathrm{softmax}(z_i) = \frac{\exp(z_i)}{\sum_{j} \exp(z_j)}$$
where z i denotes the layer-2 value of class i and the sum runs over all classes. Compared to the HMM method, the second method had the advantage of classifying the score via repetitive training and learning, which was performed to determine the maximum value of the scores, Y, obtained through HMM. Finally, the device conditions could be inferred as indicated in Fig. 2f.
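The two-layer classification of the score vector Y, with a ReLU activation followed by a softmax output, can be sketched as a single forward pass. This is a minimal sketch and not the trained Keras model: the random weights and zero biases are placeholders, and the standard exponential form of the softmax is assumed.

```python
import numpy as np

def relu(z):
    # Outputs zero for negative inputs, which gives the sparse activation.
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def forward(y, w1, b1, w2, b2):
    """Score vector -> layer-1 (ReLU) -> layer-2 (softmax) class probabilities."""
    hidden = relu(w1 @ y + b1)
    return softmax(w2 @ hidden + b2)

n_classes = 32
rng = np.random.default_rng(0)
y = rng.standard_normal(n_classes)                 # placeholder HMM score vector
w1 = rng.standard_normal((n_classes, n_classes))   # placeholder, untrained weights
w2 = rng.standard_normal((n_classes, n_classes))
b1 = np.zeros(n_classes)
b2 = np.zeros(n_classes)

probabilities = forward(y, w1, b1, w2, b2)
predicted_class = int(np.argmax(probabilities))
```

The output probabilities are non-negative and sum to one, which is the normalization effect mentioned above.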

Data featurization
Numerous electrical properties of FETs, such as the carrier type (electrons or holes), field-effect mobility, subthreshold swing, and current on/off ratio of the device-under-test (DUT) can be determined from the I D -V G transfer characteristics of 2D layered FETs (see Fig. 3a and Supplementary Materials Fig. S1). However, a precise classification of the 2D FETs with fundamental DC analysis is significantly challenging, due to the similarities of mobility, bandgap, geometry, and fabrication process of 2D FETs, except under a few specific conditions such as graphene FETs. For instance, ΔI D (t) can be measured during 0.5 s at a particular V G and V D in a device belonging to a specific class (condition) after excluding I D as illustrated in Fig. 3b. It is noteworthy that we only considered ΔI D (t) data where I D was larger than 100 nA to avoid a possible error caused by the minimum detection limit of our system. Subsequently, ΔI D (t) was normalized and divided into 11 frames with a 200 ms window. The S I of each frame was obtained using FFT as shown in Fig. 3c and converted into a vector, x n , possessing 100 current MFCC elements, a nm , via Mel-scale filtering and DCT. The x n for each frame was concatenated to create the current MFCC 2D array of each class (condition), X (class)i , as indicated in Fig. 3d.
$$X_{(\mathrm{class})i} = [x_1, x_2, \ldots, x_{11}], \qquad x_n = [a_{n1}, a_{n2}, \ldots, a_{n100}]^{T}$$
where i depends on the specific voltage applied to the device belonging to the class (condition). In order to examine the high feasibility of our approach, we considered the carrier number fluctuation-correlated mobility fluctuation (CNF-CMF) model to interpret our ΔI D (t) data (see the detailed LF noise theory in Supplementary Materials Note 2). This CNF-CMF model ascribes ΔI D (t) to the carrier number fluctuation (CNF) caused by trapping/detrapping phenomena in the interface traps between the channel and gate dielectric in addition to the correlated mobility fluctuation. More specifically, ΔI D (t) data can be influenced by many factors such as the carrier type of channel, interface quality and condition between the gate oxide and channel, and the presence of doping (see Fig. 3e). According to the CNF-CMF model, the drain current normalized S I can be expressed as follows 5,6,9,56 :
$$\frac{S_I}{I_D^2} = \left(1 + \alpha_{SC}\,\mu_{eff}\,C_{ox}\,\frac{I_D}{g_m}\right)^2 \left(\frac{g_m}{I_D}\right)^2 S_{Vfb}, \qquad S_{Vfb} = \frac{q^2 k T N_{it}}{W L C_{ox}^2 f^{\gamma}} \tag{4}$$
where q is the carrier charge, k is the Boltzmann constant, T is the absolute temperature, f is the frequency, γ is the frequency exponent, C ox is the dielectric capacitance per unit area, g m is the transconductance (=ΔI D /ΔV G ), S Vfb is the flat-band voltage spectral density, μ eff is the effective mobility, and W and L are the channel width and length. The trapped carriers near the channel-gate dielectric interface not only cause variations in S Vfb , but also degrade electron mobility, resulting in modulation of the carrier density. Figure 3f shows the representative S I of each frame in the frequency domain among the fabricated DUTs. The observation of certain harmonics in the S I could be attributed to the carrier trapping/de-trapping process in the gate oxide trap sites. In fact, these harmonics assisted in understanding the characteristics of each device and expressed these characteristics as spectral envelopes with specific peaks 32,33,57,58 . Therefore, N it and α SC between the gate dielectric and the channel of each class (condition) have a significant effect on the spectral envelopes and unique characteristics of the device (see Fig. 3g). As a result, the current MFCC 2D array comprises power spectral sequences for each frequency (amplified for the low frequency region), as demonstrated in Fig. 3h. The HMM algorithm that learns the previous state and infers the next state is efficient for learning the current MFCC 2D array that contains N it , α SC , and γ information according to the frequency sequence.

Fig. 2 Flowchart for learning and classifying characteristics of 2D transistors. a Schematic of a 2D layered FET, which was measured at a given V D and V G under various conditions in the shielded state; b amplification of the measured current signal using a low-noise current amplifier; c process of feature engineering the input data into a suitable representation (current MFCCs) through MFCC (the darker the color, the smaller the value); d ML with HMM algorithms using the current MFCCs, which comprise the 2D array; e deep learning process to re-learn into a neural network (NN) through the score vector (Y) extracted via ML with the HMM algorithm using current MFCCs; and f inference steps of device conditions (channel material, gate material, chemical doping, and e-beam irradiation) through ML with the HMM algorithm and deep learning with the NN algorithm.

Fig. 4a, b. The obtained N it increases by a factor of 10 after electron beam irradiation. Moreover, the engineered current MFCC for α SC as a function of T in the monolayer MoS 2 FET on h-BN is also demonstrated (see (iv) to (vi) in Fig. 4a, b). α SC increases with increasing T from 3.23 × 10 4 (T = 25 K) to 3.08 × 10 5 V s C −1 (T = 200 K) 11,36 . The frequency distributions are presented in a histogram with 20 intervals, as shown in Fig. 4a, using the normalized elements of LF MFCC of classes (i)-(vi) (see Fig. 4b). As N it increases from condition (ii) to (iii), the highest frequency of the histogram shifts to the positive direction. A similar positive frequency shift is observed in cases (iv)-(vi) with the increasing T. Referring to Eq. (4), the S I varies as a function of N it and α SC , and the corresponding current MFCCs can be extracted via featurization, consequently enabling the representation of a specific histogram tendency.
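The dependence of the normalized noise on N it and α SC can be illustrated numerically. The sketch below assumes the commonly used CNF-CMF form, S_I/I_D^2 = (1 + α_SC μ_eff C_ox I_D/g_m)^2 (g_m/I_D)^2 S_Vfb with S_Vfb = q^2 k T N_it / (W L C_ox^2 f^γ); the function names and every device parameter (1 µm × 1 µm channel, 300 nm SiO 2 capacitance, bias point, mobility) are hypothetical example values, not data from this study.

```python
Q = 1.602176634e-19    # elementary charge (C)
K_B = 1.380649e-23     # Boltzmann constant (J/K)

def flat_band_psd(n_it, temperature, width, length, c_ox, freq, gamma=1.0):
    """S_Vfb (V^2/Hz for gamma = 1); n_it in m^-2 J^-1, c_ox in F/m^2, lengths in m."""
    return Q ** 2 * K_B * temperature * n_it / (width * length * c_ox ** 2 * freq ** gamma)

def normalized_current_psd(i_d, g_m, s_vfb, alpha_sc, mu_eff, c_ox):
    """S_I / I_D^2 (1/Hz); alpha_sc in V s/C, mu_eff in m^2/(V s)."""
    correlated = 1.0 + alpha_sc * mu_eff * c_ox * i_d / g_m
    return correlated ** 2 * (g_m / i_d) ** 2 * s_vfb

# Hypothetical device: 1 um x 1 um channel on 300 nm SiO2 at 300 K, f = 10 Hz.
c_ox = 1.15e-4                      # F/m^2
n_it = 1e12 * 1e4 / Q               # 1e12 cm^-2 eV^-1 converted to m^-2 J^-1
s_vfb = flat_band_psd(n_it, 300.0, 1e-6, 1e-6, c_ox, 10.0)
noise = normalized_current_psd(i_d=1e-6, g_m=1e-6, s_vfb=s_vfb,
                               alpha_sc=1e4, mu_eff=1e-3, c_ox=c_ox)

# A tenfold increase in N_it (as reported after e-beam irradiation) scales S_Vfb,
# and therefore S_I / I_D^2, by the same factor in this model.
noise_irradiated = normalized_current_psd(
    1e-6, 1e-6, flat_band_psd(10 * n_it, 300.0, 1e-6, 1e-6, c_ox, 10.0),
    1e4, 1e-3, c_ox)
```

In this form the noise is linear in N it and grows with α SC through the squared correlation term, which is why both parameters leave a signature in the spectral envelope.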
The HMM algorithm, which learns considering the correlation between the previous state and the next state, progresses under the following two learning conditions. The first learning condition is that the specific current fluctuation of each device in a specific class is due to N it and α SC , and the current MFCC contributes to learning by considering the above information. The second learning condition is that the HMM algorithm is trained by considering the correlation between the MFCC of the previous frequency and the MFCC of the next frequency. Thus, in Eq. (4), the exponent γ, which reflects the trap distribution, also influences the learning process with the HMM algorithm, with N it and α SC . Figure 4c displays the classification accuracies and processing time obtained using the HMM algorithm and the HMM score vector learning method employing the NN for a number of data. When the number of data was 7800, the HMM classification accuracy was 76.3% with f1-score and AUC value of <0.78 using fourfold cross-validation (see Supplementary Materials and Note 4). On the other hand, the logistic regression model achieved only a classification accuracy of 88.8% with f1-score of 0.75, AUC value of 0.89. Therefore, provided that the performance of CNN architecture for classifying by learning perceptrons of each layer is acceptable, a transfer learning for any other channel or gate oxide materials can be possible 61 .

Fig. 3 Detailed flowchart of ΔI D featurization. a Transfer characteristics (I D − V G ) of 2D layered FETs measured under various conditions; b ΔI D (t) data measured at a particular V D and V G and then divided into specific frames with sampling period; c current power spectral densities (S I ) in the frequency domain of each frame converted through FFT; d current MFCC comprising 2D array, X (class)i , obtained by concatenating current MFCC vector, x, processed through Mel filter and DCT; e sectional illustration of carrier behavior in a 2D layered FET; f S I (f) at a particular V D for several V G and spectral envelopes (the darker the color, the larger the V G ); g N it and α SC distributions, which were calculated by carrier number fluctuation-correlated mobility fluctuation (CNF-CMF) model of each class (the box plots are defined by the 25th and 75th percentiles); h engineered current MFCC 2D array, which contains carrier behaviors (the darker the color, the smaller the value).
Most of the classes (or labels) were in good agreement with the CNF-CMF model with high averaged cross-validation accuracies of over 90% with f1-score of over 0.86 and AUC value of over 0.84, as presented in Fig. 4d. However, two exceptional classes, i.e., ReS 2 (blue bar) and MoS 2 (red bar) FETs fabricated on h-BN, are present in this figure with low classification accuracies of 74.2% (with f1-score of 0.79 and AUC of 0.75) and 38% (with f1-score of 0.49 and AUC of 0.51). This indicates that the current MFCCs for these classes were misinterpreted in the high current region. To interpret this miscalculation clearly, the corresponding S I /I D 2 curves at f = 10 Hz for both cases are displayed in Fig. 4e. Although they are well fitted to the CNF-CMF model in most of the current regions, the additional contact resistance (R CT ) contributing towards the total LF noise behavior in the high current regions curtails the accuracies in particular 11 . The inset in Fig. 4d shows the confusion matrix of the HMM+NN architecture. Interestingly, some classes for which the effects of additional contact resistance should be considered, such as ReS 2 and MoS 2 FETs using h-BN as gate dielectric, WSe 2 FETs using Au as contact metal, and monolayered MoS 2 FETs, are sometimes confused with each other.

DISCUSSION
Combining LF noise spectroscopy with machine learning algorithms provides an efficient and precise approach to characterize and classify 2D layered FETs. Through the use of an NN based on the hidden Markov model algorithm, we demonstrate that MFCCs, which were converted from the LF noise data of DUTs, can be predicted more precisely than the limits of fundamental measurements. Importantly, this method of applying only a specific voltage can be considered advantageous in both classifying device information and characterizing device performance. The combination of factors such as channel material, gate dielectric, contact metal, and electron beam irradiation has a profound effect on carrier fluctuations, enabling effective learning and training. Further, the learning models using LF noise spectroscopy presented herein are highly interpretable, and aid in identifying how engineered features, including the behaviors between carriers and traps, contribute to characterizing device information and performance. Therefore, the considerable flexibility of this approach makes it adaptable to distinguishing the degree of degradation and reliability of devices and to modeling optimized fabrication conditions and device structures. The carrier transport direction, stacking order, and orientation in 2D heterostructures would be critical factors that significantly influence charge fluctuation, and this approach is expected to enable their improved interpretation in the future. Moreover, the inference of engineered current MFCC features that currently lack sufficient noise data, combined with the CNF-CMF and additional contact noise approaches, and an improved ability to build models from limited experimental data should be possible using the developed model.

Sample fabrication
An appropriately selected chemical vapor deposited monolayer MoS 2 and mechanically exfoliated 2D multilayer materials such as MoS 2 , BP, ReS 2 , MoTe 2 , WSe 2 , and h-BN were transferred onto high-quality 300 nm-thick SiO 2 /p + -Si substrates. To make source and drain metal electrodes on them, standard electron beam lithography was used, and 80 nm-thick Au, Ti, Pt, and Cr were deposited using an electron-beam evaporation system. To suppress the contact resistance effect at the metal-semiconductor interface, all the fabricated devices were annealed under a high vacuum condition for 2 h at 473 K. The trenched graphene FETs in this study were fabricated on a pre-patterned parallel grid structure made of spin-coated poly(methyl methacrylate) A2 via conventional dry transfer methods 39 . The Al 2 O 3 passivation layer was deposited on 2D materials using an atomic layer deposition system.
In-situ measurement with e-beam irradiation
Electron-beam irradiation was conducted under high vacuum conditions (~10 −6 Torr) at 300 K using a scanning electron microscope (SEM) (Quanta 3D FEG) chamber with a nano-manipulator for multilayer MoS 2 , WSe 2 , and monolayer MoS 2 for 30 s with 30 kV and 50 pA. Four tungsten probes installed on the nano-manipulator system were electrically connected to a semiconductor parameter analyzer.

Electrical transport measurement
All the devices, except the Al 2 O 3 passivated MoS 2 FETs, were characterized in a high vacuum-probe station system 10 . Fundamental electrical transport characterizations were performed using semiconductor analyzers (Keithley 4200, Agilent B1500A) with a temperature controllable probe system (335, Lake-Shore). Low-frequency noise characteristics were obtained from a home-made noise measurement system (the system details are presented in Fig. S2 in the Supplementary Materials), consisting of a home-made battery box, a low noise current-to-voltage pre-amplifier (SR570, Stanford Research Systems), and a data acquisition system (DAQ-4431, National Instruments) 42 .

Data processing and training
We used the Python speech features library on GitHub (https://github.com/jameslyons/python_speech_features) to process the LF noise data into MFCC parameters. We only considered data where I D was larger than 100 nA to avoid a possible error. The hyperparameters were optimized by narrowing the search range based on previously studied LF noise analysis and then finding the best result by iterating over the candidates in a for loop. After augmenting the training MFCC dataset using Gaussian noise, we used the hmmlearn library on GitHub (https://github.com/hmmlearn/hmmlearn) to apply the HMM trainer function to the training MFCC data. Through HMM training, the trained data generated for each class were converted into score vectors, and these vectors were trained by a neural network based on TensorFlow Keras (https://www.tensorflow.org/guide/keras). Finally, we also trained a CNN directly on the current MFCC data, likewise based on TensorFlow Keras (https://www.tensorflow.org/guide/keras).
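The Gaussian-noise augmentation mentioned above can be sketched as follows. The noise level is not specified in the text, so the relative standard deviation `rel_sigma`, the number of copies, and the function name are hypothetical choices for illustration only.

```python
import numpy as np

def augment_with_gaussian_noise(mfcc, n_copies, rel_sigma, seed=0):
    """Return n_copies noisy versions of one MFCC 2D array.
    The noise standard deviation is rel_sigma times the array's own spread."""
    rng = np.random.default_rng(seed)
    sigma = rel_sigma * np.std(mfcc)
    noise = rng.normal(0.0, sigma, size=(n_copies,) + mfcc.shape)
    return mfcc[None, ...] + noise

# Example: one 11-frame x 100-coefficient array expanded into 5 training samples.
original = np.arange(1100, dtype=float).reshape(11, 100)
augmented = augment_with_gaussian_noise(original, n_copies=5, rel_sigma=0.01)
```

Each augmented copy keeps the label of the array it was derived from.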

Model validation
We used the 4-fold cross-validation method to train on our MFCC dataset: the training MFCC dataset was randomly divided into 4 subsets of equal size. Of the 4 subsets, a single subset was retained as the test data for evaluating the model, and the remaining three subsets were used for training. The cross-validation process was repeated 4 times, with each of the four subsets used once for testing. The held-out test MFCC datasets were converted into score vectors to evaluate the model trained through the HMM+NN architecture. We obtained not only the accuracy, but also the confusion matrix, receiver operating characteristic (ROC) curves, area under the curve (AUC) value, and f1-score to evaluate the model performance accurately given the imbalance of the data (https://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics).
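The 4-fold splitting procedure can be sketched with the standard library alone. This is an illustrative sketch, not the code used in this study; the function name and seed are hypothetical, and when the number of samples is not divisible by 4 the fold sizes differ by at most one.

```python
import random

def k_fold_indices(n_samples, k=4, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)           # random assignment to folds
    folds = [indices[i::k] for i in range(k)]      # k nearly equal subsets
    for i in range(k):
        test = folds[i]                            # one subset held out for testing
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Example: 20 samples, each used exactly once for testing.
splits = list(k_fold_indices(20, k=4))
for train, test in splits:
    pass  # train the model on `train`, then evaluate it on `test`
```

Averaging the accuracy, f1-score, and AUC over the four test subsets gives the cross-validated metrics.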

DATA AVAILABILITY
Some of the LF noise data that support the findings of this study are available from GitHub (https://github.com/Kookjin-Lee/Kookjin.Sangjin.noiseML), and the test LF noise data are uploaded in subfolders named after each label in the folder (Test data_noise). All LF noise data are available from the corresponding author(s) upon reasonable request.