Automatic radar-based 2-D localization exploiting vital signs signatures

In light of the continuously and rapidly growing senior and geriatric population, the research of new technologies enabling long-term remote patient monitoring plays an important role. For this purpose, we propose a single-input-multiple-output (SIMO) frequency-modulated continuous wave (FMCW) radar system and a signal processing technique to automatically detect the number and the 2-D position (azimuth and range information) of stationary people (seated/lying down). This is achieved by extracting the vital signs signatures of each single individual, separating the Doppler shifts caused by the cardiopulmonary activities from the unwanted reflected signals from static reflectors and multipaths. We then determine the number of human subjects present in the monitored environment by counting the number of extracted vital signs signatures while the 2-D localization is performed by measuring the distance from the radar where the vital signs information is sensed (i.e., locating the thoracic region). We reported maximum mean absolute errors (MAEs) of 0.1 m and 2.29\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{\circ }$$\end{document}∘ and maximum root-mean-square errors (RMSEs) of 0.12 m and 3.04\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{\circ }$$\end{document}∘ in measuring respectively the ranges and azimuth angles. The experimental validation demonstrated the ability of the proposed approach in monitoring paired human subjects in a typical office environment.

The continuous and rapid growth of the combined senior and geriatric population has resulted in an increase of age-related chronic diseases, such as congestive heart failure, chronic obstructive pulmonary disease, sleep disorders, arthritis, osteoporosis, and dementia 1 . There are currently more than 1 billion people over the age of 60 worldwide, a number that is expected to double within the next 30 years 2 . This scenario translates into a shortage of healthcare personnel, in tandem with the ever-increasing demands for healthcare services. Coupled with the expected rise in healthcare cost and given that only a minority can afford private home-care personnel, the need for technologies enabling remote patient monitoring is certainly on the rise 3,4 .
In the last years, radar has become one of the most promising telemedicine technologies for both home and clinical environments enabling long-term smart monitoring of patients [5][6][7][8][9][10][11] . Practical applications are: sleep monitoring; contactless monitoring of patients in multi-bed; monitoring elderly people in domestic environment or in nursing homes; monitoring household members in quarantine or patients in departments of infectious diseases to reduce contamination risks; detecting whether people are respecting social distancing. Research focuses mainly on vital signs monitoring and indoor localization. Novel and sophisticated radars have been proposed to properly demodulate the phase shift caused by the subject's movements (i.e., cardiopulmonary activity, walking, running, etc) and embedded into the reflected radar signal (i.e., Doppler effect). The first devices were based on continuous wave (CW) architectures [12][13][14][15][16][17] . However, they were only able to monitor one single subject without providing any information on their position. To solve this limitation, several studies have been conducted on ultra-wideband (UWB) architectures, in particular on frequency-modulated CW (FMCW), stepped-frequency CW (SFCW), phase-modulated CW (PMCW), and UWB impulse-ratio (UWB-IR) radars [18][19][20][21][22][23][24][25][26][27] . Single-inputmultiple-output (SIMO), multiple-input-multiple-output (MIMO) and beamforming UWB architectures have been preferred over classic single-input-single-output (SISO) solutions as they can provide azimuth and range information, namely two-dimensional (2-D) localization, of multiple targets [28][29][30] . At the same time, many novel signal processing methods have been proposed to tackle the challenges that practical circumstances impose. Most research focused on monitoring a single subject. However, to better cope with many complex everyday life applications (e.g., monitoring people lying on their beds in the hospital, elderly people in nursing homes, household members in quarantine, etc), multi-people monitoring has become an important research direction. One of the biggest challenges in radar-based remote patient monitoring is to automatically estimate the number and the 2-D positions of stationary people (i.e., seated/lying down and normally breathing). With this respect, in recent years, some significant works have been published in literature. The authors proposed a SISO FMCW radar, integrating two frequency scanning antennas, which allows determining the 2-D positions and the vital signs of people 31 38 . Although these works achieved high accuracy in monitoring multiple individuals, one limitation is that the experiments were conducted in conditions that significantly reduce the effects of the multipath propagation [39][40][41] . This was achieved with no objects in between the radar and the person or with objects placed strategically to not generate significant interferences. This reduces the probability that the signals reflected from different individuals interfere each other and hence the subjects can be treated independently. If this (rare) condition is not met, especially in everyday indoor environments, those approaches would result in non-linear combinations of the Doppler signals generated by the subjects, making correlationbased algorithms unable to eliminate the multipaths 36 . This because, current approaches aim at directly extracting the Doppler signals from the received radar signals which are the overall results of the sums of reflections due to direct paths (desired information), multipath, and static reflectors (i.e., clutter, furniture, objects, etc), whose Doppler (phase) information combines non-linearly. In this condition, the radar may: (1) fail to determine the right number of subjects in the room; (2) erroneously conclude people being located elsewhere than the actual location; (3) detect a non-existing person (i.e., a radar ghost) or even represent a life-less object as alive (hence, static objects cannot be simply treated as stationary); (4) assign disturbed signals to an individual 42 .
In this article, we propose a signal processing algorithm to automatically detect the number and the 2-D position (azimuth and range) of seated people. This method, demonstrated experimentally using a millimeterwave (mmWave) SIMO FMCW radar, is capable to isolate the Doppler signals caused by the cardiopulmonary activity (vital signs signatures) of each single subject from the reflections of static reflectors and multipaths. We also propose a phase demodulation technique based on geometric fitting which is performed of each isolated Doppler signal. The localization is then performed by measuring the linear distances from the radar where the vital signs information is sensed (i.e., locating the thoracic regions). Figure 1 shows a graphical illustration of the antenna array of a generic SIMO radar with uniform equispaced linear array (ULA). The radar consists of one transmitter (TX) antenna and N RX receiver (RX) antennas, forming an antenna array of N RX elements equally spaced of d meters. Assuming collocated antennas and a target in the far field at a distance r and angle α from the first element of the array (considered as the reference), the signal reflected from the target should travel an additional distance of (n RX -1)·d sin α to reach the n RX -th antenna. This corresponds to a phase difference of 2π/ ·dsin α among the signals received by adjacent antennas. Therefore, there is a linear progression in the phase of the signals across the array.

Signal model and data cube generation
A SIMO FMCW radar radiates a series of signals, called chirps, whose instantaneous frequency increases linearly over time. Mathematically, a chirp can be expressed as: www.nature.com/scientificreports/ where a T is a complex number indicating the amplitude and the initial phase, ρ = B/T is the sweeping rate with B and T being respectively the total bandwidth and duration, and f 0 is the initial frequency. The chirps are transmitted with a certain pulse repetition interval (PRI) and are received over N RX multipath channels. The corresponding received signals s R (t, n RX , m) can be modelled as the convolution between the transmitted signal s T (t, m) and the channel impulse responses h(t, n RX , m) , as: with where m = 0, ..., M-1 is the slow time index, M is the number of chirps per TX-RX combination that should be received before starting any data processing, β is the complex path gain which indicates the overall attenuation and phase shift, τ is the propagation path, i is the index corresponding to the i-th target/object, l is the path index, δ(·) is the Dirac delta function, c 0 is the speed of light, and y(m) is the chest surface vibration caused by the cardiopulmonary activity. Equation (3) models a multipath channels as proposed by Jakes 43 and includes the essential propagation parameters (i.e., namely magnitude, frequency and phase). Since a room has a limited size, we considered only the first L range bins. As consequence, we assume that the number of possible path delays is also equal to L. This assumption is valid since, due to the small size of a range bin, the differences in delays among the electromagnetic waves falling into a range bin are very small and translate into phase shifts. The digitized time domain beat signals s B (n, n RX , m) , obtained mixing the received signals with the replicas of the transmitted signal of each TX-RX pair and following a low-pass filter and an analog-to-digital (ADC), can be expressed as: where n = 0 , ..., N − 1 is the index in fast time, with N being the number of samples acquired per beat signal and depends both on T and on the sampling time T f of the ADC. In Eq. (5), the gain (or loss) of the mixer was included in β . Moreover, the contribution of −πρτ 2 i,l,n RX (m) is negligible for short-range applications as τ is in the order of few nanoseconds. After performing the Fast Fourier Transform (FFT) in fast time, the range profile is obtained. The resulting frequency domain signal X(k, n RX , m) becomes: with where k = 0, ..., K-1, K corresponds to the maximum unambiguous range, F is the fast Fourier transform operator, w(n) is a rectangular window function in fast time, w(n) and W(k) are a Fourier pair. Since the rectangular window in frequency domain is a sinc function with gradients close to zero around 2πρτ i,l , the frequency domain window function W i,j (k) can be considered as a fixed one in slow time. Assuming P subjects and Q static clutter in a room, equation (7) can be rewritten as: The block diagram of the data cube generation is shown in Fig. 2. Per each TX-RX pair, a range profile matrix is created performing a K-point FFT to M consecutive beat signals. The range profile matrices are then stacked together to form the data cube, which is the starting point for the signal processing algorithm described in "Methods".

Methods
The block diagram of the proposed algorithm for automatic detection of the number of human subjects and 2-D localization is shown in Fig. 3. In order to reduce the computational complexity and memory usage, due to the limited size of the room, we consider only the first L of the possible K range bins.
Rough beamforming. The conventional beamformer is performed in frequency domain over the data cube described in (8) for all the antenna elements and for each range bin. We obtain the angle data cube which can be expressed as:   www.nature.com/scientificreports/ where n a = 0, ..., N A -1 is the angle index corresponding to a certain angle, N A is the number of considered angles, and × indicated the vectorial product operation. The graphical illustration of the beamforming is shown in Fig. 4. From the initial data cube, a sub-matrix, containing all the range bins information of all antennas at a certain slow time index, is extracted. It is multiplied by a weighting vector W(α) of N RX element, whose coefficient have been calculated considering a certain angle and using the following formula: This results in an L-element vector which is inserted in the angle data cube at the (:,n a ,m) indexes. This operation is repeated until all the M slow time indices have been considered. This corresponds to focus the beam to the first angle under study (Fig. 4a). The very same process is then repeated with a next angle, and then with a new weighting vector, until all the angles of the rough beamforming have been covered. This results in the angle data cube depicted in Fig. 4b. Static reflectors removal. The sub-matrices of (13) along the n a dimension can be modelled as: where is an L × P complex mixing matrix whose elements are described in equation (10), is a P × M complex matrix containing the Doppler signals equation (11), and hence the vital signs information caused by the P subjects at each T s , and is an L × M matrix with identical columns containing the direct current (DC) information in slow time resulting from static reflections equation (12), the superscript T indicates the transpose, and 1 is a length M all-ones column vector. In presence of additive noise, the data model becomes as: (15) X angle (:, n a , :) = X A = HS + C,  www.nature.com/scientificreports/ where N is an L × M matrix containing the additive white Gaussian noise assumed to be independent and identically distributed (I.I.D.) for different range-azimuth bins. In order to remove the DC contribution of the static reflectors (e.g., clutter, static parts of the human body, ...), we perform alternate current (AC) coupling to 'Eq. (19). The latter is achieved subtracting the mean value along the slow time for each range and angle. It should be noted that the variable cardiopulmonary signals are preserved. Each sub-matrix can be then expressed as: where H determines the linear combinations of the sources in S , so the magnitudes of the elements in H indicate the energy of the sources in every range bin.
Singular value decomposition. The angle data cube X angle is re-arranged in a 2-D matrix X 2D . This is performed by transposing each of the N A X A sub-matrices and then concatenating them by columns. The result is an M × (L · N A ) modelled as: where Z is the result of the concatenation of the N A H sub-matrices. The observation matrix X 2D contains signals which can be direct paths, multipaths, or combinations of them. Those signals are linear combinations of the P independent sources S which are the information to retrieve. We use the economy-sized singular value decomposition (SVD) as the first step of the proposed methodology to determine the number of persons P. It is applied to Eq. Circle fitting. Before evaluating the most significant components of U, we perform phase demodulation in slow time (i.e., per columns). Considering the operating wavelengths (i.e., mmWave range) and the typical values of the mechanical displacements of the lungs and the heart, the Doppler signal produced by the cardiopulmonary activity describes a circle in a complex plane [16][17][18] . For a proper phase demodulation, and hence to avoid distortions, this circle should first be centered to the origin. The AC coupling applied to Eq. (19) (introduced in "Static reflectors removal") should already centre the circle to the origin. However, due to the presence of noise, this may not be optimal 44 . Therefore, we propose the following technique which is applied to each column of U. The coordinates of the center z c (in the complex plane) and the radius r c of a circle can be estimated by resolving a nonlinear least-squares geometric fitting problem 45 . Assume u = u R , u I being an M × 2 matrix containing the real (R) and imaginary (I) parts of the complex signal corresponding to a column of U, the objective function becomes: with representing the geometric distance between the mth sample and the circle. The best circle is then iteratively computed. A good starting vector is the solution of minimizing the algebraic distance 45 . The algebraic representation of a circle is defined as: where d is a nonzero number and e ∈ IR 2 . Given u , we can compute the circle parameters d , ê , f . Only when all the samples are on one circle we can find a unique solution. Otherwise, it is an overdetermined problem when the sample length is more than 3. Let   www.nature.com/scientificreports/ After knowing the coordinates of the center, the circle can be shifted to the origin of the complex plane. As a result, we can demodulate the phase by directly computing the angular information of the complex signal as: This operation is applied to all the M columns of U , obtaining M φ (m) . This method works well also with radio-frequency (RF) radars, where the vital signs information describes an arc in the complex plane.
Target number estimation. The spectrum of a canonical vital signs signal extracted using radar techniques consists mainly of the respiration fundamental, some respiration harmonics, decreasing in magnitude (normally up to three), and the very small heartbeat fundamental. The signals also contain the heartbeat harmonics, however their magnitudes are so weak that they can be neglected. The energy of the signal is essentially contained in the fundamental and first harmonic of the respiration. We determine P by calculating the signalto-noise (SNR) of the M φ (m) . We estimate the signal power considering the spectrum within the respiration fundamental and its first harmonic, while the remaining spectrum is used to determine the noise power. We also perform an additional check on the spectra's local maxima. First, if the main peak, which in a canonical spectrum indicates the respiration rate, is outside the typical medical ranges of 0.1-0.4 Hz, we conclude that this source is noise. Secondly, we determine the magnitude ratio of the strongest peak (expected to be the respiration fundamental) and its first harmonic. We assume as noise any source with a ratio less than 2. The latter comes with the observation that the respiratory physiology involves signals consisting of a dominant fundamental and smaller (more than half) harmonics 19,33,34 . In the two aforementioned situations, we set the SNR to the minimum of -20 dB. The SNR values are stored in a vector, whose n-th element is related to the n-th uncorrelated sources in U. The final step is to scan this vector starting from the first position. We stop before the first value is below a threshold which, in this work, was determined empirically and set to 10 dB. We assign the corresponding index to P. We denote the first P sources of U as U S .

Independent component analysis. The independent component analysis (ICA) allows separating the
statistically independent sources S from the set of observations U S . The latter are a linear combination of the sources and they can be expressed as: where A is called mixing matrix. Therefore, knowing the U S , it is possible to determine Ŝ by evaluating the unmixing matrix, namely Â −1 , as: Hence, the ICA determines Ŝ estimating Â −1 , while U S is provided by the SVD. The sources Ŝ are the vital signs signatures we use to localize the subjects.

2-D localization.
The 2-D localization is performed on the angle data cube X angle . More precisely, to each of its sub-matrices X A , we estimate Ĥ , which contains the energy of the sources in every range bin (i.e., the channel information), by minimizing the residual error, as: where ζ is the penalty coefficient which represents a trade-off between the residual error and the sparsity, whose value was determined empirically. The results of this operation are N A Ĥ propagation channels, one per each n a -th considered angle. The final results are hence the responses of each subject (i.e., source) in every azimuth-range bin from which it is possible to perform the 2-D localization (example in "Results and discussion").
Once the subjects have been 2-D located, it is possible to improve the accuracy of the azimuth information through a fine beamforming. This means re-running the proposed algorithm and performing the conventional beamformer only around the angles where the subjects were detected during the rough beamforming with a fine angular step. The beamforming algorithms are generally computationally heavier, that's why the fine beamforming is not applied directly as first option.

Results and discussion
We conducted the experimental validation using the commercial Texas Instruments IWR6843ISK mmWave radar sensor, configured to operate as a SIMO FMCW architecture with a ULA antenna array of 4 elements. The distance between the TX antenna and the adjacent RX antenna is of 5 mm. Therefore, we can safely assume colocated radar. The system parameters are: f 0 = 60.645 GHz, B = 3.25 GHz, T = 64 µ s, PRI = 50 ms and T f = 0.25 µ s. We implemented the algorithm in MATLAB. In order to reduce the spectral leakage, before applying the FFT in fast time, each beat signal is multiplied by a Hann window function. We fixed L equal to 111, corresponding to a maximum range of 4.5 m (the range resolution is of 4.05 cm). All procedures in this study protocol adhered to the ethical principles of the Declaration of Helsinki. Written informed consent was provided by all patients before they were enrolled in the study. The IMEC ethical board reviewed and approved the study protocols (IP-19-WATS-TIP2-056). All the collected data were pseudonymized.  www.nature.com/scientificreports/ Figure 5 shows the results of an experiment with two volunteers in an office room whose 2-D positions are 1.5 m / 16.85 • and 2.7 m / -18.71 • , respectively (Fig. 5a). We used a measuring tape to determine the absolute distances between the radar and middle of the chest area of the subjects (expected results). We determined the azimuth information using geometric calculation. The radar position was considered as the origin, namely 0 m/0 • in the polar coordinate system. The orientation was calculated considering the line of sight (LoS) of the radar as 0 • . Clockwise angles were treated as positive while counterclockwise ones as negative. Subject 1 (the closest to radar) is seated right next to a metal wall, while Subject 2 (the farthest to radar) is seated in between an office desk and a large metal cabinet. This clutter causes a strong spreading of the transmitted and reflected signals in the whole room, generating significant multipaths. Due to the closer proximity to the radar, the Subject 1 is not affected by the multipaths of Subject 2 but their Doppler signal is strongly influenced by the metal wall (very strong reflector). Furthermore, the latter might reflect the multipath signals generated by Subject 1 which can have identical delays (time of flight) as the direct path signal of Subject 2, involving non-linear combinations of their phase contents. This generates uncorrelated signals that can be interpreted as independent targets (i.e., radar ghosts). Figure 5b shows the 2-D map (range vs. angle) obtained after the rough beamforming (we considered angular steps of 10 • ). It is possible to see the strong contribution of Subject 1, two significant responses nearby the expected location of Subject 2, and other effects due to the multipath, side lobes, and FFT spreading. In such a situation, it is not trivial to determine the right number of subjects in the room and their positions. It should be noted that, depending on the radar cross section (RCS), the multipath signal generated by a closer target might appear much stronger than the direct path of a distant target. Figure 5c shows the first five components www.nature.com/scientificreports/ produced by the SVD. As expected, the latter cannot provide exclusively the two vital signs signatures. In fact, we can clearly see periodicities in these five components which may resemble vital signs signals. Figure 5d shows the output of the target number estimation operation, from which only the first two components of the SVD are considered valid. This sets P = 2. These signals are then processed by the ICA, whose results are shown in Fig. 5e. At this point, we face an ordering ambiguity issue. We are still not able to indicate which source (i.e., vital signs signature) corresponds to which subject. We solve this with the 2-D localization operation, where we determine the responses of each source (i.e., target), as shown in Fig. 5f,g. In Fig. 5g, it is possible to see the subject's response and the very strong multipath effect. We remove the outliers resulting from multipaths by detecting the shortest direct path, while the outliers originating from additive noise are small and can easily be excluded. We finally perform the fine beamforming with angular steps of 2 • . The final 2-D localization map is shown in Fig. 5h. Subject 1 was localized at 1.54 m/16 • while Subject 2 at 2.75 m/−18 • . These results are in fair agreement with the expected values. We validated the proposed approach conducting experiments on 6 subjects, differing in height (170-195 cm), in weight, and in age (30-50 years). The subjects were seated on chairs with their chest regions facing the radar. Furniture and objects were present near and in between the volunteers, who were grouped in random pairs and they could randomly chose a seat. Two measurements have been collected at the same 2-D location. The reference values for the distances and for the angles where the subjects were expected to be located are reported in Table 1 under "Expected Results". For the first 12 experiments, we considered 3 absolute distances (ranges) with the subjects' chest centers at 0.5 m away from the LoS of the radar. For the remaining experiments, we selected random positions in the room. The experimental results, reported in Table 1 under "Measured Results", demonstrate that the proposed approach was able to accurately determine the 2-D location of the subjects. In Table 2, we reported as maximum mean absolute errors (MAEs) 0.1 m and 2.29 • and as maximum root-meansquare errors (RMSEs) 0.12 m and 3.04 • in measuring respectively the ranges and azimuth angles. Converted to meters, the maximum angular errors corresponds to 0.11 m for the MAE and 0.15 for the RMSE. Compared to the typical size of the human bodies, these small errors can be considered acceptable. Finally, in Table 3, we compared some relevant state-of-the-art works for automatic localization.

Conclusion.
In this work, we proposed a signal processing algorithm, demonstrated using a mmWave SIMO FMCW radar, to automatically determine the number and the 2-D positions of human subjects. This method aims at separating the radar reflections (direct paths from multipaths) to retrieve the vital signs signatures of the subjects present in the monitored environment. We determine the number of people by counting the number www.nature.com/scientificreports/ of sensed cardiopulmonary activities, while the 2-D localization is performed by measuring the distances from the radar to the thoracic regions of the subjects. The experimental validation has proven the ability of the proposed approach in monitoring people in a typical office room, reporting maximum MAES of 0.1 m and 2.29 • and maximum RMSEs of 0.12 m and 3.04 • in measuring respectively the ranges and azimuth angles. This radar system and signal processing technique can be considered a useful technology for the development of future telemedicine.

Data availability
The data that support the plots within this paper and other findings of this study are available from the corresponding author upon reasonable request.