Human activity recognition using magnetic induction-based motion signals and deep recurrent neural networks.

Recognizing human physical activities using wireless sensor networks has attracted significant research interest due to its broad range of applications, such as healthcare, rehabilitation, athletics, and senior monitoring. There are critical challenges inherent in designing a sensor-based activity recognition system operating in and around a lossy medium such as the human body to gain a trade-off among power consumption, cost, computational complexity, and accuracy. We introduce an innovative wireless system based on magnetic induction for human activity recognition to tackle these challenges and constraints. The magnetic induction system is integrated with machine learning techniques to detect a wide range of human motions. This approach is successfully evaluated using synthesized datasets, laboratory measurements, and deep recurrent neural networks.

Human activity recognition (HAR) aims to provide information on human physical activity and to detect simple or complex actions in a real-world setting. It allows computer systems to assist users with their tasks and to improve the quality of life in areas such as senior care, rehabilitation, daily life-logging, personal fitness, and assistance for people with cognitive disorders [1][2][3][4][5][6] . Two main approaches for deployment of HAR systems are external and wearable sensors 7 . In the external approach, the monitoring devices are set at fixed points, and users are expected to interact with them 8 . The vision-based technique, for example, is one of the well-known external methods that has been extensively studied for human activity analysis 9,10 . However, it faces many challenges in terms of coverage, accuracy, privacy, and cost. It requires infrastructure support, such as the installation of video cameras in surveillance areas, which is usually costly. Additionally, cameras cannot capture any data if the user performs an activity outside their range 11,12 . In the second approach, on-body sensors, such as accelerometers, gyroscopes, and magnetometers, are used to translate human motion into signal patterns for activity recognition [13][14][15] . Recent advances in embedded sensor technology have made it feasible to monitor the user's activity using smart devices. Several research studies have reported the use of smartwatches and smartphones in human activity monitoring, and have presented a satisfactory performance [16][17][18][19] . Although these devices provide a privacy-aware alternative solution that overcomes many disadvantages of the external approach, they still might not be able to address the requirements of a diverse range of applications. A single wearable cannot cover the entire body and therefore fails to obtain adequate information about the mobility of all body segments [20][21][22] . 
For example, inertial sensors embedded in a smartwatch cannot capture the movement of legs, which restricts the capability of the system in classifying activities. Additionally, in systems relying on data from a single device, variations in position can have a significant effect on the performance or lead to the failure of the monitoring system 20,23,24 .
A wireless body area network (WBAN) consisting of wearable devices operating around the human body can tackle these problems 21,25 . In WBANs, sensors are spatially distributed over the human body and collect data from the user. The data are then transmitted wirelessly to a central processing unit for detection. This approach can provide comprehensive information on the mobility of body segments and potentially improve system accuracy. However, WBAN design is challenging as many constraining, and often conflicting, requirements have to be taken into account [26][27][28] . For example, the system has to be inexpensive, accessible to the general public, and meet ergonomic constraints and health requirements. It has to operate under proper guidelines limiting the power exposure to the user since the energy absorption may lead to temperature elevation in biological tissues. To ensure users' safety, it has to satisfy specific absorption rate (SAR) constraints, while providing a reliable wireless link 29 . Moreover, the system should guarantee the security and privacy of the user's data. Wearable devices must be small and lightweight, which puts a restriction on the battery size and longevity. On the other hand, frequent battery recharging may not be practical for sensor networks with multiple sensors in applications such as senior monitoring 7 . Due to the limitation of energy resources, power management has become a critical issue in designing a WBAN. Since wireless communication consumes a considerable portion of the energy 30 , numerous studies have proposed and investigated low-power solutions [31][32][33][34] . The conventional state-of-the-art wireless sensor networks working in the vicinity of the human body adopt radio-wave propagation for signal transmission. This technique is susceptible to the characteristics of the environment, and its signal experiences a high attenuation around a lossy medium, such as the human body. 
This results in higher power consumption, shorter battery life, and lower reliability 33,35,36 . Moreover, radio-wave propagation technologies are prone to interference with adjacent communication links since most of them, such as Bluetooth, operate in the busy 2.4 GHz industrial, scientific, and medical (ISM) band 37,38 . They also have potential security problems, as their signal cannot be stopped from propagating into free space and can therefore be intercepted even at a distance from the transmitter 39 .
We introduce the magnetic induction-based HAR (MI-HAR) system that effectively detects physical movements by magnetic induction (MI) signals. This system represents the motion of human body parts via variations in the MI signals transmitted from the transmitter to the receiver during physical action, instead of spatial data measured by inertial sensors. This approach can overcome several problems associated with conventional sensor-based HAR systems: it eliminates the need for an extra wireless module, and it reduces power consumption and the required bandwidth by combining the data collection and wireless signal transmission steps. Moreover, it has other features that are inherited from the MI-based communication system. Here we verify the capability of the proposed method in identifying human actions. We first synthesize MI motion data corresponding to several physical activities. Then we apply machine learning-based classifiers and deep recurrent neural networks to classify human movements. The results indicate that the MI signals are informative descriptors for the motion of human body parts.

Results
System principle. The MI-based communication system is a short-range wireless physical layer that transmits signals by coupling a non-propagating magnetic field between wire coils rather than by radiating as conventional methods do. The main component of each node is a coil, which is lightweight, portable, inexpensive, simple, and can be worn as accessories such as belts, wristbands, and jewelry 33,40 . The manufacturing cost of an MI module is less than approximately $20, while a Bluetooth IMU costs more than $100 (refs. [41][42][43]). The MI coils have a small radiation resistance, which means that the energy propagated to the far-field is negligible. As a result, multipath fading is not an issue, and the MI system can offer a much better quality of service (QoS) compared to Bluetooth-type systems 33,44,45 . The non-propagating magnetic field produced by the coils falls off in proportion to r^{-3}, instead of r^{-1} for radiating fields, at a transmission distance r. Although the rapid decay limits the coverage range, it can be favorable in short-range applications such as WBANs 46 . It allows the signal to remain in a 'bubble' around the coil, which provides a personalized space for the user. It also minimizes the leakage outside the targeted coverage range, reduces interference, increases security, and enables bandwidth reuse 44,47 . One of the most notable advantages of the MI system is that it works well in lossy dielectric media, such as the human body 48 . In these environments, the MI system experiences much less energy absorption compared to conventional radio-wave propagation technologies 49 . This results in lower SAR for applications working around the human body. Due to smaller path loss, the MI system can transmit a signal with much less power for the same range. This system can be up to six times more efficient in terms of battery power compared to other short-range communication systems (e.g., Bluetooth) 47 . 
This characteristic enables a large variety of novel and demanding applications in harsh environments such as underwater monitoring of scuba divers 39,49,50 .
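As a small illustration of the decay rates quoted above, the sketch below compares how much a field magnitude drops when the distance doubles, for a quasi-static field that scales as r^{-3} and a radiating field that scales as r^{-1}. The function name `field_drop_db` is hypothetical, and the calculation only reflects the power-law scaling, not coil geometry or alignment.

```python
import math


def field_drop_db(distance_ratio: float, exponent: int) -> float:
    """Change in field magnitude (dB) when the distance grows by
    `distance_ratio`, for a field that scales as r**(-exponent)."""
    if distance_ratio <= 0:
        raise ValueError("distance_ratio must be positive")
    return -20.0 * exponent * math.log10(distance_ratio)


# Doubling the distance between source and observation point.
near_field_db = field_drop_db(2.0, 3)  # non-propagating MI field, r^-3
far_field_db = field_drop_db(2.0, 1)   # radiating field, r^-1

print(f"near field: {near_field_db:.2f} dB, far field: {far_field_db:.2f} dB")
```

Under this scaling, each doubling of distance costs the non-propagating field roughly three times as many decibels as a radiating field, which is the rapid decay that keeps the signal in a 'bubble' around the coil.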
The signal generated by an MI coil attenuates as a function of frequency, channel medium, coils' geometry, location, and alignment (see Methods section) 33 . The non-propagating magnetic field is mainly affected by the permeability of the medium, which is close to that of air for non-ferrous materials. The MI channel condition remains constant even in an inhomogeneous lossy medium, such as around the human body 33,49 . For frequencies of up to 30 MHz, the dimension of the human body is relatively small compared to the wavelength, which makes the propagation and scattering effects insignificant 33 . The immunity of the signal in this frequency range to the environment makes the forward voltage gain, $S_{21}$, of the MI system only a function of the coils' locations and alignments for a predefined coil geometry and operating frequency. The gain varies with the distance and alignment between the MI coils, and therefore, relative motion between the MI coils yields patterns in the received MI signal. This unique characteristic of the MI system is the fundamental principle of the proposed MI-HAR system.
System framework. The activity recognition process steps are different depending on the application. The framework used in this paper has two main stages: data acquisition and detection. For the first stage, an MI-based communication system is employed, which enables the integration of sensing and wireless data transfer into a single step. The user wears the receiver (RX) coil, for example, as a belt around the waist, and transmitter (TX) coils can be placed around the other skeleton bones, such as wrists, arms, and legs. The human body bones are spatially translated and oriented during a physical activity, which changes the relative location and alignment of the MI coils around them. Collecting the received MI signals transmitted from the coils enclosing skeleton bones can model the relative motion of human bones to represent motion. Since the spatial variations of skeleton bones over time are discriminative descriptors of human actions 51 , the vector of samples observed by the MI coils over time can be considered as the set of inputs for the activity detection algorithm. Increasing the number of coils around the skeleton bones results in a broader set of input data. It consequently enhances the accuracy of the MI-HAR system in detecting the relative motion of body parts. In the next step, a classification method is applied to the MI motion data for detecting human action.
MI system setup. The MI transceivers adopted in the experiments consist of a coil and a reversed L impedance matching network 52 . The matching network is used to maximize the transmission efficiency of the overall system 52 . The coils are identical, air-cored, single-layer copper coils with a 5 cm radius and 10 AWG wire, and the user can wear them as accessories. The coil radius can be changed depending on the size of the body part around which the coil is designed to be placed. The source and load impedances are 50 Ω, and the resonance frequency is 13.56 MHz. As the operating frequency is lower than 30 MHz, the human body effect is neglected 33 , and the effect of the background medium is considered to be the same as that of air. The reversed L-matching networks consist of a series inductor of 5380 nH and a parallel capacitor of 600 pF.
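For orientation, the component values quoted above can be turned into element impedances and ABCD matrices at the operating frequency. The sketch below is illustrative only: the ideal lossless elements, the helper `cascade`, and in particular the element order inside the matching network (series inductor first, then parallel capacitor) are assumptions, not details taken from the paper.

```python
import math

F0_HZ = 13.56e6         # resonance frequency stated in the text
L_SERIES_H = 5380e-9    # series inductor stated in the text
C_PARALLEL_F = 600e-12  # parallel capacitor stated in the text

omega = 2 * math.pi * F0_HZ
z_series = 1j * omega * L_SERIES_H      # impedance of the series inductor (ohm)
y_parallel = 1j * omega * C_PARALLEL_F  # admittance of the parallel capacitor (S)


def cascade(m1, m2):
    """Multiply two 2x2 ABCD matrices given as ((A, B), (C, D))."""
    (a1, b1), (c1, d1) = m1
    (a2, b2), (c2, d2) = m2
    return ((a1 * a2 + b1 * c2, a1 * b2 + b1 * d2),
            (c1 * a2 + d1 * c2, c1 * b2 + d1 * d2))


abcd_series = ((1, z_series), (0, 1))      # ideal series element
abcd_parallel = ((1, 0), (y_parallel, 1))  # ideal shunt element

# Assumed element order (hypothetical): series inductor, then parallel capacitor.
abcd_match = cascade(abcd_series, abcd_parallel)

print(f"X_L = {z_series.imag:.1f} ohm, X_C = {1 / y_parallel.imag:.2f} ohm")
```

A network built only from such passive elements is reciprocal, so the determinant of its ABCD matrix should equal one, which is a convenient sanity check when cascading stages.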
Synthetic MI motion data. In this study, we have synthesized MI motion data to evaluate the capability of the proposed MI-HAR system in motion detection. The circuit model of the MI system (see Methods section) is used to calculate the forward voltage gain, which is a scaled version of the received MI signal. As the pattern is the same, we used the generated voltage gain patterns of the system as the input features for the detection algorithm. Figure 1 shows the measured and simulated forward voltage gain of two coils during their movement. Since the distance and misalignment between two coils are required as inputs for the model, their location and alignment are captured using video object tracking (see Methods section). Results show that the simulated signal is consistent with the measured data, which is an indication of a valid model for generating time-series MI data. We have performed experiments for 20 different motions that involve both geo-translation and misalignment of coils. The average normalized root-mean-squared error (NRMSE) of the synthesized and measured $S_{21}$ for these experiments is less than 10.3%. The reported NRMSE not only takes into account model error but also includes the error associated with the motion tracking algorithm using video and vector network analyzer (VNA) measurements.
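The agreement metric mentioned above can be sketched as follows. The text does not state which normalization is used for the NRMSE, so dividing the RMSE by the range of the measured signal is an assumption here, and the function name `nrmse` and the sample values are hypothetical.

```python
import math


def nrmse(measured, simulated):
    """Root-mean-squared error between two equal-length series,
    normalized by the range of the measured series (assumed convention)."""
    if len(measured) != len(simulated) or not measured:
        raise ValueError("series must be non-empty and of equal length")
    mse = sum((m - s) ** 2 for m, s in zip(measured, simulated)) / len(measured)
    span = max(measured) - min(measured)
    if span == 0:
        raise ValueError("measured series has zero range")
    return math.sqrt(mse) / span


# Hypothetical gain traces in dB, for illustration only.
measured_db = [-40.0, -35.0, -30.0, -35.0, -40.0]
simulated_db = [-41.0, -35.0, -29.0, -35.0, -40.0]
print(f"NRMSE = {100 * nrmse(measured_db, simulated_db):.1f}%")
```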
To synthesize MI motion data during different human actions, we considered a receiver and eight transmitter coils around the torso, hands, arms, legs, and thighs, respectively (see Methods section). For spatial translation and rotation of human body bones, 3D motion capture (MoCap) datasets are employed. Each pair of markers placed at the joints can define a bone. Hence, the location and alignment of MI coils placed around the body parts can be derived and provided as inputs to the model for synthesizing the corresponding MI motion data. Two publicly available experimental datasets are used here: the Biological Motion Library (BML) 53 with 4 activities and the Berkeley Multimodal Human Action Database (MHAD) 54 with 11 activities. A brief description of these datasets is presented in the Methods section. The generated synthetic forward voltage gain of the MI transceivers corresponding to these datasets is presented in Fig. 2. A point to consider is that we have extended the single-transmitter/single-receiver model to a multi-transmitter/single-receiver scenario, assuming that interferences such as cross-coupling between coils are negligible, because interference mitigation techniques such as time-division multiplexing 55 or frequency splitting 56 can be applied to reduce or ideally eliminate interference between inductive systems. Moreover, interference protocols (e.g., RFID interference protocols) can control communication between transceivers while preventing their interference with one another. Therefore, the model can provide a reasonably accurate estimation of multi-coil system performance.
NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-15086-2 ARTICLE
Performance. Tracking the motion of body parts during physical activity is critical in characterizing an individual's movement, and collecting data that provide a more accurate representation of these motions results in better activity detection. The MI signals express a strong relationship with the geo-translation of body segments since the system gain is directly affected by distance and misalignment between coils. Distributing more coils around the human body provides comprehensive information about the user's body movements and results in a better distinction between similar actions. We used the MHAD dataset to compare the capability of the MI signal and accelerometer data in estimating the location of a body part during physical activity. The accelerometer is considered here as a benchmark because it is the most frequently used wearable sensor modality for human activity monitoring. Six markers placed close to the accelerometers are considered as target points. Then the similarity between the 3D location of each target point and data of its corresponding accelerometer and MI transceiver is calculated. We used $R^2$ as the similarity metric, and the average values over the whole dataset are presented in Fig. 3. The results show that, on average, the MI signal has a stronger relationship with the 3D location of markers compared to the accelerometer data. This characteristic can be useful not only in classifying human activities but also in reconstructing the motion trajectories of body segments. Many studies have adopted IMUs to reconstruct the trajectories of movements for motion analysis in different applications. Examples include handwritten digit recognition 57 , monitoring trunk kinematics during standing up to sitting down 58 , and tracking the motion of body parts on patients who have been affected by neurological conditions for rehabilitation purposes 59 . 
In inertial sensor-based recognition systems, the velocity and positions are computed indirectly by integration over sensor data 59 . On the other hand, the MI motion signal is directly affected by the location and orientation of coils. As a result, the trajectory reconstruction using MI signals does not require integration over measured data, which removes the problem of the cumulative error.
Fig. 3 Average $R^2$ between XYZ of each target point and data of its corresponding accelerometer and magnetic induction (MI) transceiver. The $R^2$ reports the similarity between two sets of data by a number between zero and one, where a higher number shows a stronger relationship between two datasets.
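The text does not spell out how the similarity score is computed between a marker coordinate and a sensor signal. One common reading, shown here purely as an assumption, is the coefficient of determination of a least-squares straight-line fit between two one-dimensional series, which equals the squared Pearson correlation. The function name `r_squared` and the sample values are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination of a least-squares linear fit of y on x,
    computed as the squared Pearson correlation of the two series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical series: a sensor signal and one marker coordinate.
signal = [1.0, 2.0, 3.0, 4.0]
coordinate = [2.0, 4.0, 6.0, 8.0]
print(r_squared(signal, coordinate))
```

The score lies between zero and one, matching the description of the metric, and it is insensitive to the scale and offset of either series.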
To assess the performance of the proposed MI-HAR system in recognizing human activities, we implemented deep recurrent neural networks (RNNs) based on long short-term memory (LSTM) units due to their strong performance in human activity detection, and their capability in learning complex representations of the motion data 60,61 . We compared the results of this method with several commonly used classifiers for activity detection using generated synthetic MI motion data. Table 1 summarizes the performance results of LSTM alongside methods including support vector machines (SVM), K-nearest neighbors (KNN), decision trees (DT), random forests (RF), and logistic regression (LR). The confusion matrices of the classification methods on the BML and MHAD datasets are also presented in Figs. 4 and 5, respectively. The results are compared to other previously introduced methods using different modalities for activity detection. We employed accuracy as an evaluation metric for comparison, as the datasets used in this paper are balanced and have an equal number of samples for each activity. The results presented in ref. 62 show that SVM and Multi-Task Conditional Restricted Boltzmann Machines (MT-CRBMs) classifiers have achieved an accuracy of 41.3% and 54.5% using BML motion capture data, respectively. For the MHAD dataset, ref. 63 has reported an accuracy of 98% by applying SVM on accelerometer data. The random forest classifier has also achieved an accuracy of 96% and 68.2% using MHAD motion capture and audio data, respectively 64 . The accuracy of LSTM using RGB camera images for human activity classification is stated as 92.4% 65 . Our results indicate that the deep LSTM model with optimum hyperparameters outperforms other classifiers by a considerable margin on the generated synthetic MI motion data. The recurrent neural networks can capture sequential and time dependencies between input data, which results in a strong performance. 
The LSTM cells let the model capture even longer dependencies compared to vanilla cells. A deep architecture with an optimal number of layers enables the neural network to extract useful discriminative features from the set of input data and to improve the performance of the model. It should be noted that the datasets used in this paper are diverse, which proves the classifier models are valid for a broad range of activity recognition tasks. Moreover, the actions recorded in the BML dataset, including knocking, lifting, and throwing, are very similar as only one hand is moving. The same movement of human body parts in these activities makes it difficult to distinguish and categorize them. Despite these challenges, the deep LSTM model has achieved high accuracy, and it indicates that the recurrent model is capable of classifying human actions by using MI motion signals.

Discussion
HAR is a powerful technology with a wide range of applications such as healthcare, rehabilitation, sports training, and senior monitoring. We proposed a new wearable-based HAR system using MI for motion capture and wireless signal transmission. This method can tackle existing issues with conventional HAR systems in various aspects, including power consumption, the complexity of implementation, and cost. It can also provide a suitable infrastructure for new applications working in harsh environments, such as underwater. The proposed system is a new sensing approach for capturing human motions, which can also be integrated with other monitoring modalities to provide a more comprehensive HAR system.
To show the capability of the MI-HAR system in detecting human movements, we used the MI system model to generate synthetic MI motion data received from MI transmitters around the user's body during different activities. As mentioned before, the model used for synthesizing MI motion data does not consider cross-coupling between transmitter coils. However, this cross-coupling is not necessarily destructive and can even provide further information regarding the location and alignment of all coils relative to each other. In this scenario, each received signal is a function not only of the transmitter and receiver coils but also of the arrangement of all other coils. Therefore, the movement of even a single body part results in a different signal pattern and can make the system more accurate in detecting actions similar to each other. In the future, we plan to build a realistic deployment-ready prototype of the MI system for capturing MI motion signals during various human activities. Such a system would allow us to perform experiments on real-world MI motion data to demonstrate the accuracy of our method and study the effect of cross-coupling interference on the MI-HAR system. The proposed system can also be integrated with other modalities and monitoring techniques to provide a more comprehensive system for human motion tracking. We employed several commonly used machine learning-based classifiers and deep recurrent neural networks for the detection step. We empirically evaluated the proposed MI-HAR system by conducting experiments on the generated synthetic MI motion dataset and discussed the outcomes in detail. Experimental results reveal that the proposed deep LSTM model shows outstanding performance compared to other approaches. 
One of the benefits of using the deep recurrent neural network for sequence classification is that it can support multiple parallel temporal input data from different sensor modalities such as MI sensors, accelerometers, and gyroscopes. The model can learn complex features directly from raw data and map them to activities. It removes the need for manual feature engineering by experts while it achieves a comparable performance to models with the feature handcrafting step. Besides, the neural network model enables an interactive learning system when the user provides training data even after the initial training step. It allows the user to fine-tune a pretrained neural network model with their personal data. However, the neural network complexity should be assessed where models have to be implemented in embedded systems with limited processing capability. It highlights the importance of trade-off between computational cost and detection accuracy to ensure real-time feedback.

Methods
Theoretic circuit modeling of the MI system. The MI system consisting of two coils can be modeled as a two-port network shown in Fig. 6. Coils are attached to impedance matching networks, called input and output matching networks, to maximize the transmission efficiency of the overall system 52 . The closed-form expressions of these circuit parameters are reported in ref. 33 to facilitate performance analysis of the MI-based communication system around the human body. The model is validated by simulations and measurements performed for various coils in different locations and alignments relative to each other 33 . The average error of all experiments compared with the simulated signal attenuation results is lower than 10% in the frequency range below 30 MHz. The more advanced version of the expressions without any simplification is also calculated and reported in this work.
Assume that the transmitter coil with number of turns $N_{TX}$, area $S_{TX}$, and current $I_{TX}$ is centered at $C_{TX}$, and its surface normal is $\hat{n}_{TX}$. The receiver coil with number of turns $N_{RX}$ and area $S_{RX}$ is centered at $C_{RX}$, and its surface normal is $\hat{n}_{RX}$. The mutual inductance between the coils in a linear, homogeneous, isotropic background medium with permeability $\mu$ and complex propagation constant $\gamma$ can be calculated from 33,66

$$M = \frac{\mu N_{TX}}{I_{TX}} \int_{S_{RX}} \mathbf{H}_{TX} \cdot d\mathbf{S}_{RX}.$$

By using the exact expressions for the magnetic field generated by the TX coil $\mathbf{H}_{TX}$ and applying a procedure similar to ref. 33 , one can derive the mutual inductance without any simplification as follows:

$$\ldots\; \rho\, d\phi\, d\rho \,\Big[ -\rho^{2}\cos\alpha - \rho\sin\phi\,(1+\cos^{2}\alpha)(c_{rx}\cdot\hat{y}) - \rho\sin\phi\sin\alpha\cos\alpha\,(c_{rx}\cdot\hat{z}) - 2\rho\cos\phi\cos\alpha\,(c_{rx}\cdot\hat{x}) - \cos\alpha\,(c_{rx}\cdot\hat{x})^{2} - \cos\alpha\,(c_{rx}\cdot\hat{y})^{2} - \sin\alpha\,(c_{rx}\cdot\hat{y})(c_{rx}\cdot\hat{z}) \Big] \cdot \Re\!\left\{ \frac{1+\gamma r+\gamma^{2}r^{2}}{r^{5}}\, e^{-\gamma r} \right\}$$
$$+\; \Big[ \cos\alpha\,(c_{rx}\cdot\hat{z})^{2} - \sin\alpha\,(c_{rx}\cdot\hat{z})(c_{rx}\cdot\hat{y}) - \rho\sin\phi\,\sin^{2}\alpha\,(c_{rx}\cdot\hat{y}) + \rho\sin\phi\cos\alpha\sin\alpha\,(c_{rx}\cdot\hat{z}) \Big] \cdot \ldots$$

where $r$ is the distance between the origin and the observation point and can be defined in the cylindrical coordinates as follows:
The parameters used in the above expressions are calculated from location and alignment of TX/RX coils as follows:
where $R_x(\theta_x)$, $R_y(\theta_y)$, $R_z(\theta_z)$ are rotation matrices that rotate vectors by an angle $\theta_x$, $\theta_y$, $\theta_z$ about the x-, y-, or z-axis using the right-hand rule. The self-inductance and resistance, which comprises DC resistivity, skin depth $\delta_w$, and proximity effects, of a coil with radius $a$, length $b$, number of turns $N$, circular cross-section wire, core-material permeability $\mu$, wire diameter $\phi_w$, and wire resistivity of $\rho_w$ can be expressed as follows 33,67 :
The scattering matrix S is another set of two-port parameters defined in terms of incident and reflected waves at ports. 
One of the matrix elements is the forward voltage gain $S_{21}$, which shows the voltage of the network at port 2 divided by the voltage at port 1. Converting the ABCD parameters to S-parameters, the forward voltage gain of the MI system can be determined as follows 68 : where A, B, C, D are the ABCD parameters of the overall MI system including the MI coils and the matching circuits.
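For readers who want to experiment with the conversion from ABCD parameters to the forward gain, the sketch below uses the standard textbook relation for a two-port terminated by equal, real reference impedances $Z_0$ at both ports, $S_{21} = 2 / (A + B/Z_0 + C Z_0 + D)$. That the paper uses exactly this form is an assumption; the 50 Ω default follows the source and load impedances stated in the system setup, and the function names are hypothetical.

```python
import math


def s21_from_abcd(a, b, c, d, z0=50.0):
    """Forward gain of a two-port from its ABCD parameters, assuming the
    same real reference impedance z0 at both ports."""
    return 2.0 / (a + b / z0 + c * z0 + d)


def to_db(s21):
    """Magnitude of a (possibly complex) gain in decibels."""
    return 20.0 * math.log10(abs(s21))


# Two simple checks: a direct connection and a single 50-ohm series resistor.
s21_thru = s21_from_abcd(1, 0, 0, 1)
s21_series = s21_from_abcd(1, 50.0, 0, 1)
print(f"thru: {to_db(s21_thru):.2f} dB, series 50 ohm: {to_db(s21_series):.2f} dB")
```

In a full model, the ABCD matrix passed in would be the cascade of the input matching network, the coupled coils, and the output matching network, so any change in mutual inductance caused by coil motion shows up directly in the computed gain.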
Measurement. The forward voltage gain of two coils is measured for 30 s via a VNA with a resolution of 1800 points. The corresponding synthetic $S_{21}$ is also generated by using the system model for comparison. All parameters of the model are predefined based on the MI system setup except the distance and misalignment between coils, which vary during the movement. Hence, the two coils are labeled with red markers and placed in front of a green screen. The motion of the coils is captured via an iPhone's built-in camera at 30 fps, and the videos are processed offline to extract the markers, their centers, and alignment, as shown in Fig. 7.
Since only one camera is used, without loss of generality, coils only move in 2D such that the camera can capture their motion. The extracted pixel-wise movement of coils is then converted to the spatial translation using a predefined length 'calibration label'. The ratio of the calibration label's length to its size extracted from video provides a meter to pixel ratio. As the camera is fixed during the experiment, this ratio remains constant for all frames of the video. The recorded distance between coils covers up to 60 cm range. The generated synthetic MI data are synchronized with measured data by minimizing the NRMSE. The code used to track coils and calculate the forward voltage gain of the system based on the circuit model reported in this work is implemented in MATLAB.
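The pixel-to-distance conversion described above can be sketched in a few lines. The label length, pixel coordinates, and function names below are hypothetical values chosen for illustration; the only idea taken from the text is that a calibration label of known length fixes a constant meter-per-pixel ratio for a stationary camera.

```python
import math


def meters_per_pixel(label_length_m: float, label_length_px: float) -> float:
    """Scale factor derived from a calibration label of known physical length."""
    if label_length_px <= 0:
        raise ValueError("label length in pixels must be positive")
    return label_length_m / label_length_px


def pixel_distance_to_meters(p1, p2, scale: float) -> float:
    """Euclidean distance between two pixel coordinates, converted to meters."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1]) * scale


# Hypothetical example: a 10 cm label spans 200 pixels in the video frame.
scale = meters_per_pixel(0.10, 200.0)
coil_distance_m = pixel_distance_to_meters((100, 300), (700, 1100), scale)
print(f"{coil_distance_m:.2f} m")
```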
Simulation. Figure 8a depicts the location of coils considered around the human body for generating synthetic MI motion data. The location of markers required to track coils motion is also displayed in Fig. 8b. Assuming that the coils are located at the midpoint of bones, we can calculate their center by averaging the location of corresponding paired markers. For example, Fig. 8c shows the right leg, its corresponding transmitter coil, and markers. The center of the transmitter coil TX 8 can be calculated as $c_{TX_8} = (M_{14} + M_{15})/2$. The coils are around the human bones, which indicates that the alignment of the line passing through the markers is the same as the surface normal of its corresponding coil. Therefore, the surface normal of the transmitter TX 8 can be written as $\hat{n}_{TX_8}$ ...
Fig. 8 ... (2,9), (3,4), (4,5), (10,11), (11,12), (6,7), (7,8), (13,14), (14,15) define two ends of the torso, left arm, left hand, left thigh, left leg, right arm, right hand, right thigh, and right leg, respectively. Consequently, these pairs can be utilized to calculate the location of coils ($C_{TX_i}$, $C_{RX}$) and their alignment ($\hat{n}_{TX_i}$, $\hat{n}_{RX}$). c The center and alignment of a bone and its corresponding coil can be calculated using markers locations.
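The marker-to-coil mapping described in this subsection can be sketched as follows. The coil center is the midpoint of the two markers, as stated above, and the surface normal is taken along the line through the markers. The direction (sign) of the normal, the function name `coil_pose`, and the marker coordinates are assumptions made for illustration.

```python
import math


def coil_pose(marker_a, marker_b):
    """Return (center, unit normal) of a coil placed at the midpoint of the
    bone defined by two 3D markers. The normal points from marker_a to
    marker_b, which is an assumed sign convention."""
    center = tuple((a + b) / 2.0 for a, b in zip(marker_a, marker_b))
    diff = tuple(b - a for a, b in zip(marker_a, marker_b))
    length = math.sqrt(sum(d * d for d in diff))
    if length == 0:
        raise ValueError("markers coincide; bone direction is undefined")
    normal = tuple(d / length for d in diff)
    return center, normal


# Hypothetical marker positions (meters) for the two ends of the right leg.
m14 = (0.1, 0.0, 0.5)
m15 = (0.1, 0.0, 0.1)
center_tx8, normal_tx8 = coil_pose(m14, m15)
print(center_tx8, normal_tx8)
```

Applying the same function to every marker pair in each MoCap frame yields the time series of coil locations and alignments that the circuit model needs as input.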
approximate recording length of activities varies from 2 to 15 s. The dataset consists of data from four microphones with a sampling rate of 48 kHz; six accelerometers fixed on wrists, hips, and ankles with a sampling rate of 30 Hz; the optical motion capture system with a sampling rate of 480 Hz; cameras with a sampling rate of 22 Hz; and depth sensors with a sampling rate of 30 Hz. In our experiments, we used the MoCap data down-sampled to 60 Hz.
Data preprocessing. In our experiments, we have used the magnitude of MI signals as input for the classifiers. Data samples are processed before being fed into the classification models. The processing methods are implemented using Python 3.6. For data cleaning, the missing values are substituted with previous non-missing values, and a 5-point quadratic (order 4) polynomial Savitzky-Golay filter is applied for denoising. Then the baseline offset is removed from the time-series data. In the MHAD dataset, 3% of the signals are removed from the end of each data sample, as the reported experiments show an improvement in accuracy 63 .
Classification. The classifier models are implemented using Python 3.6. They are trained and evaluated on the generated synthetic motion datasets of eight bones using the leave-one-subject-out cross-validation (LOSO-CV) method. For the experiments on the BML and MHAD datasets, six and two subjects, respectively, are used for validation and the rest for training.
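Two of the cleaning steps described above, filling missing values with the previous valid sample and removing the baseline offset, can be sketched in plain Python. Treating the baseline offset as the mean of the series is an assumption, as are the function names and the handling of a series that starts with a missing value. The Savitzky-Golay denoising step is left out of the sketch; in a Python workflow it would typically be done with a library routine such as `scipy.signal.savgol_filter`.

```python
import math


def forward_fill(samples):
    """Replace missing values (None or NaN) with the previous valid sample."""
    filled = []
    last = None
    for value in samples:
        missing = value is None or (isinstance(value, float) and math.isnan(value))
        if missing:
            if last is None:
                # Assumed policy: a leading gap cannot be filled from the past.
                raise ValueError("series starts with a missing value")
            filled.append(last)
        else:
            filled.append(value)
            last = value
    return filled


def remove_baseline(samples):
    """Subtract the series mean (assumed definition of the baseline offset)."""
    mean = sum(samples) / len(samples)
    return [value - mean for value in samples]


raw = [1.0, None, 3.0, float("nan")]
clean = remove_baseline(forward_fill(raw))
print(clean)
```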
Machine learning-based classifiers: The machine learning-based classifiers are implemented using the Python library scikit-learn 69 . The multi-class models are non-linear SVM with a polynomial kernel, KNN, decision trees, random forests, and logistic regression. We used the bag-of-words (BoW) representation to characterize the time-series data with different lengths. First, the synthetic MI motion data are divided into fixed-length segments of 1 second using the sliding window technique with 0.8 second overlap. Attributes are then computed for the time domain, frequency domain, and time-frequency domain of each window segment. Frequency domain and time-frequency domain representations of the signal are calculated by the fast Fourier transform (FFT) and the single-level discrete wavelet transform (DWT) based on the Daubechies2 wavelet filter, respectively. The attributes considered here are extremes, mean, median, standard deviation, lower quartile, upper quartile, skewness, kurtosis, and the correlation between each pair of signals. As each action is associated with eight data samples, the resulting feature vector for each segment is generated by the concatenation of eight feature sets. Features are also scaled using the min-max scaling method to bound values in the range of 0-1. The scaling makes the weight of all features equal in the process of classification. Next, the feature vectors from the training data are clustered using k-means clustering to define a codebook that contains the cluster centers, which are called codewords. Then, each window segment is assigned the closest codeword, and a time-series is represented as a histogram of codewords. The bag-of-words representations of synthetic MI motion data are used as inputs for the machine learning-based classification models. In our experiments, we quantized the training data of BML and MHAD datasets to 100 and 20 codewords, respectively.
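The windowing and bag-of-words steps above can be illustrated with a deliberately reduced sketch. It uses a single signal, only three time-domain attributes per window, and a hand-written two-entry codebook, all of which are simplifications for illustration. In the pipeline described in the text the codebook would come from k-means clustering of the training features, and the sampling rate used below is hypothetical.

```python
def sliding_windows(signal, fs_hz, window_s=1.0, overlap_s=0.8):
    """Split a series into fixed-length windows with the given overlap."""
    size = round(window_s * fs_hz)
    step = round((window_s - overlap_s) * fs_hz)
    if size <= 0 or step <= 0:
        raise ValueError("window and step must span at least one sample")
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def window_features(window):
    """A reduced attribute set: minimum, maximum, and mean of the window."""
    return (min(window), max(window), sum(window) / len(window))


def nearest_codeword(feature, codebook):
    """Index of the codeword closest to the feature vector (squared distance)."""
    def sq_dist(codeword):
        return sum((f - c) ** 2 for f, c in zip(feature, codeword))
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k]))


def bag_of_words(signal, fs_hz, codebook):
    """Histogram of codeword assignments over all windows of the series."""
    histogram = [0] * len(codebook)
    for window in sliding_windows(signal, fs_hz):
        histogram[nearest_codeword(window_features(window), codebook)] += 1
    return histogram


# Hypothetical 4 s signal sampled at 10 Hz and a hand-written codebook.
signal = [0.0] * 20 + [1.0] * 20
codebook = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
print(bag_of_words(signal, 10, codebook))
```

Because the histogram has one bin per codeword regardless of how many windows a recording produces, recordings of different lengths map to feature vectors of the same size, which is what the conventional classifiers require.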
Recurrent neural network: A schematic diagram of the neural network structure is shown in Fig. 9. The deep LSTM model is implemented in the TensorFlow framework. We used the mean cross-entropy between the ground truth labels and the predicted class membership probability vector as the loss function, and the network parameters are updated by minimizing this loss function. The model is trained using batch gradient descent with the RMSprop updating rule. In each epoch of training, the entire training set is passed through the neural network model to update the model with an exponentially decaying learning rate. The dropout regularization technique is also applied to all nodes in the network to avoid overfitting. The dropout keep-probability determines the probability of keeping a node during training. After each epoch, the performance of the model is evaluated on the validation set. We evaluated the influence of several hyperparameters related to the network architecture and the learning process. The models for both datasets are trained with an optimizer decay rate of 0.95, an initial learning rate of 0.01, an exponential decay rate of 0.98, an exponential decay step of 100, and a dropout keep-probability of 0.8.
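To make the learning-rate schedule concrete, the sketch below evaluates an exponential decay with the reported initial rate, decay rate, and decay step. It assumes the common form lr = initial_lr * decay_rate^(step / decay_steps); the `staircase` option and the function name are illustrative assumptions, since the text does not state whether the decay is applied continuously or in discrete steps.

```python
def decayed_learning_rate(step, initial_lr=0.01, decay_rate=0.98,
                          decay_steps=100, staircase=False):
    """Exponentially decayed learning rate at a given training step."""
    # With staircase=True the rate drops only at multiples of decay_steps.
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_lr * decay_rate ** exponent


for step in (0, 100, 1000):
    print(step, decayed_learning_rate(step))
```

Under this assumed form, the learning rate shrinks by 2% every 100 update steps.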

Data availability
The data that support the findings of this study can be reproduced using the code developed in this work and are also available on Figshare (https://doi.org/10.6084/m9.figshare.c.4844517). The raw data from which our synthetic MI motion data were derived are available in the public domain: BML dataset (http://paco.psy.gla.ac.uk); MHAD dataset (http://tele-immersion.citris-uc.org/berkeley_mhad).

Code availability
Computer code supporting the findings of this study is available on GitHub: synthesizing MI data (https://github.com/negargolestani/Synthesize_MI_data); activity detection (https://github.com/negargolestani/Activity_Detection).

Fig. 9 Architecture of deep recurrent neural network (RNN). The set of magnetic induction (MI) signals observed by the coils at time t is considered as the input vector $x_t$. A time window of 1 s (T = 1 s) slides over the data with 0.5 s overlap, feeding the truncated subsequences of input data within the window to the batch normalization layer. The normalized input data $(x_{t-T+1}, \ldots, x_{t-1}, x_t)$ are then fed to the deep long short-term memory (LSTM) model. The network outputs a sequence of vectors $(y^L_{t-T+1}, \ldots, y^L_{t-1}, y^L_t)$, where each output vector gives the prediction score of its corresponding input sample. Assuming the input signals are sequenced to N samples, the overall score of the entire window can be calculated by averaging all of the scores within the window into a single prediction vector of scores $\hat{y}_t$ 70 . The prediction scores are then converted into class membership probabilities $\hat{O}_t$ by applying a softmax layer. The predicted class membership probability vector contains the probability of every class generated by our model. Finally, the most probable class is selected as the predicted activity label for the given input data within the time window.
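The last stage described in the caption, averaging the per-sample scores within a window and applying a softmax, can be sketched as follows. The score matrix is made up for illustration and the function name `window_prediction` is hypothetical; the sketch covers only the score aggregation, not the LSTM itself.

```python
import numpy as np


def window_prediction(scores):
    """Average (N, n_classes) per-sample scores and apply a softmax.

    Returns the class membership probabilities and the predicted class index.
    """
    mean_score = scores.mean(axis=0)            # single score vector per window
    shifted = mean_score - mean_score.max()     # subtract max for stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    return probs, int(probs.argmax())


# Made-up prediction scores for N = 3 samples and 3 activity classes.
scores = np.array([[2.0, 0.5, 0.1],
                   [1.5, 0.7, 0.2],
                   [2.5, 0.3, 0.0]])
probs, label = window_prediction(scores)
print(probs, label)
```

Averaging before the softmax yields one probability vector per window, so each window receives a single activity label.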