Introduction

To continuously monitor the cardiorespiratory function of preterm infants in the neonatal intensive care unit (NICU), three-lead electrocardiogram (ECG) and impedance monitoring are used, necessitating the application of adhesive electrodes.1 As preterm infants have a weak dermal–epidermal junction, the strong adhesive bond between the electrode and epidermis can strip the epidermis on removal, causing trauma and pain.2 While alternatives to these electrodes are manufactured without an adhesive layer, insecure skin attachment can influence reliability.3

Cost and accuracy issues have additionally driven the innovation of non-contact modalities for monitoring physiological data. As ‘single-use products’, ECG leads and electrodes create a significant cost, and the practice of repeated use has been reported in low-resource countries, with potential risk of infection or breakdown.4 In addition, reduced amplitude of chest wall expansion can affect the respiratory rate (RR) input signal from the impedance lead.5 ECG is also sensitive to artefacts produced by infant movement or by cardiac activity, which influences monitoring accuracy.6

In response to these ECG monitoring limitations, research in the field of biomedical engineering has focused on the development and application of non-contact methods to detect physiological data for preterm infants with mixed results. The accuracy of non-contact monitoring has been influenced by variables such as participant movement, the need for a full view of the participant, and difficulties monitoring in low-light environments.1,7,8,9

In this paper, we describe a preliminary study of a proposed non-contact system based on imaging photoplethysmography (PPGi) and motion magnification. PPGi is a technique whereby the peripheral pulse can be detected optically. The subtle colour and motion signals involved fall outside human visual limits under most lighting conditions, either because of limited resolution of intensity variations or lack of sensitivity to lower spatial and temporal frequencies. Yet, these invisible signals can be highly informative, especially in clinical and biomedical applications. When human tissue is illuminated, haemoglobin in the blood absorbs more light than the surrounding tissue, especially during systole when skin blood flow is greatest.10 Blood circulation causes invisible skin colour variations that can be amplified to measure cardiac activity.11,12,13,14,15 Therefore, when human tissue is videoed and skin colour magnification is applied, the heart rate (HR) can be measured.16,17

Another example is the motion generated by inhalation and exhalation, which may have such low spatial amplitude in video that it is difficult to recognize. Filming the movement of the thorax during each respiratory phase and then applying magnification yields the RR. The diaphragm typically rises by 4 to 12 mm during inspiration, in a frequency band between 0.2 and 2.0 Hz that differs from the cardiac band; thus, measurement of this movement will determine the RR. The heart muscle movement can also be transmitted to the chest, leading to a subtle movement with an amplitude of 0.2–0.5 mm,18 which is useful for extracting HR.

Our non-contact system framework involves skin colour and motion magnification, region of interest (ROI) selection, spectral analysis and peak detection, which has been outlined in previous work.19 When magnification is applied to videos, images may be subject to varying degrees of degradation resulting from background noise. We thus apply techniques to enhance image quality and to remove noise, and these algorithms have been previously described.20

The aim of this preliminary study was to evaluate the agreement between HR and RR measurements obtained using a non-contact computer vision system (the index test) based on PPGi and motion magnification and those obtained by a reference standard, the chest lead cardiorespiratory monitor. Specific objectives were to identify factors that may influence method accuracy, and to provide recommendations for method improvement and insight into how this proposed non-contact system may be implemented in real-time biomedical applications. We hypothesized that the proposed non-contact computer vision system would accurately detect HR and RR in the preterm neonatal population.

Methods

Study design

A single-centre cross-sectional observational study was performed over 2 days during May and June of 2018 at Flinders Medical Centre Neonatal Unit, a tertiary newborn service in Adelaide, South Australia. Approval for this study was obtained from the Southern Adelaide Local Health Network Research Committee (HREC/17/SAC/340; SSA/17/SAC/341) and the University of South Australia Human Research Ethics Committee (Protocol number 0000034901). Written consent was obtained from the parents of the infants prior to filming, after a full explanation of the study procedures had been provided.

Reference standard

For validation purposes, ECG was utilized in this study as the reference standard against which the accuracy of the non-contact system was compared for all infants studied. The impedance lead of the ECG measures the variation in electrical impedance associated with chest wall movement to derive an RR.21 Although it has the limitations described above, it was used for validation to avoid disruption to the participants.

Patient selection

Investigation of the validity of the proposed system for monitoring HR and RR was performed on 10 infants who were recruited using convenience sampling. Demographic variability was ensured by recruiting participants with potential confounders such as those receiving phototherapy or continuous positive airway pressure (CPAP) with a face mask. All preterm infants who were monitored using the unit’s regular cardiorespiratory monitor at the time of data collection were eligible. Preterm infants were targeted as they are prone to episodes of bradycardia and apnoea due to idiopathic apnoea of prematurity,6 which ensured the system could be rigorously tested. Infants who were ventilated were excluded as their HR and RR are supported. Infants with major congenital abnormalities that could make them identifiable from images were excluded. A sample size of 10 infants was used to obtain pilot data for this non-contact system, which has not been tested previously with premature infants.

Experimental set-up

A Nikon D610 camera aimed at the infants was positioned at a distance of 1–2 m, as shown in Supplementary Fig. 1. The camera was mounted on a tripod and used to video the infants for 10 min at a resolution of 1920 × 1080 and a frame rate of 30 fps, while their cardiorespiratory monitor was simultaneously captured by another digital camera (Nikon D5300) for validation purposes. Recording began on both cameras at the same time point to ensure that data points from the ECG and non-contact system could be synchronized. The infants did not need to be repositioned or disturbed for the purposes of the study as any sleeping position or coverings were acceptable for data collection. The entire body was captured in the camera’s frame with focus on the face and chest. The illumination levels varied between infants during videoing from normal ambient fluorescent lighting to scheduled periods of dim lighting used to facilitate neurological development. The automatic exposure mode on each camera was used for videoing in response to variations in environmental illumination.

Data analysis

Immediately after capture, the videos were downloaded to a laptop. Phase 1 of data analysis was to extract the cardiorespiratory signals from two different ROIs (see Supplementary Fig. 2) at a distance of ≤2 m, using the physiological variations caused by cardiorespiratory activity. The schematic diagram of the proposed monitoring system is presented in Supplementary Fig. 3.

The first variation relied on changes in skin colour caused by cardiorespiratory activity. HR was extracted from the video recording using the reflectance properties of human skin, measured directly from changing brightness values in the image sequences. Since the skin colour variation caused by cardiorespiratory activity is impossible to see with the naked eye, the Eulerian Video Magnification (EVM) technique22 was used to magnify the recorded videos. Some modifications were made to the standard EVM22 to suit the proposed system, including wavelet pyramid decomposition and an elliptic band-pass filter, which provided some improvement in execution time and some reduction in motion artefacts. The second processing step was to detect and manually localize ROIs using MATLAB’s command ‘ginput’. The brightness values of the pixels within the selected ROI were then spatially averaged to obtain three source signals from the R, G and B components, as follows:

$$i_{\mathrm{R}}\left( t \right),i_{\mathrm{G}}\left( t \right),i_{\mathrm{B}}\left( t \right) = \frac{{\mathop {\sum }\nolimits_{x,y \in {\mathrm{ROI}}} I\left( {x,y,t} \right)}}{{\left| {{\mathrm{ROI}}} \right|}},$$
(1)

where I(x,y,t) is the brightness pixel value at image location (x,y) at time t, and |ROI| is the size of the selected ROI. The \(i_{\mathrm{G}}(t)\) signal generated from the G channel was then chosen to estimate the cardiac signal, since the frequency spectrum of this component corresponds best to the anticipated range of the cardiac frequency band.12,13,23 A spectral analysis method based on the Fast Fourier Transform (FFT) was then applied to transform the selected signal from the time domain to the frequency domain, followed by an ideal band-pass filter with selected frequencies of 0.5–3 Hz, corresponding to 30–180 beats per minute (b.p.m.). The inverse FFT was then applied to the filtered signal to obtain the cardiac signal. Finally, peak detection based on the MATLAB built-in function ‘findpeaks’ was performed to identify the periodicity, locations and number of peaks in the acquired signal.
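The filtering and peak-counting steps described above can be sketched as follows. This is a minimal illustration only, not the study's MATLAB implementation: it uses a naive discrete Fourier transform to stay dependency-free, the helper names (dft, ideal_bandpass, count_peaks) are hypothetical, the peak counter is a simplified stand-in for ‘findpeaks’, and the input trace is synthetic.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (O(n^2)); adequate for short windows."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft_real(spectrum):
    """Inverse DFT, keeping the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def ideal_bandpass(signal, fps, low_hz, high_hz):
    """Zero every frequency bin outside [low_hz, high_hz], then invert."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # Bins above n/2 hold the negative frequencies; fold them back.
        freq = min(k, n - k) * fps / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0
    return inverse_dft_real(spectrum)


def count_peaks(signal):
    """Count strict local maxima (simplified stand-in for 'findpeaks')."""
    return sum(
        1
        for i in range(1, len(signal) - 1)
        if signal[i - 1] < signal[i] > signal[i + 1]
    )


# Synthetic green-channel trace (assumed values): a 2.5 Hz pulse,
# a slow 0.1 Hz drift and a constant brightness offset.
fps = 30
n_frames = 300  # 10 s of video at 30 fps
trace = [
    100.0
    + 1.0 * math.sin(2 * math.pi * 2.5 * t / fps)
    + 5.0 * math.sin(2 * math.pi * 0.1 * t / fps)
    for t in range(n_frames)
]

cardiac = ideal_bandpass(trace, fps, 0.5, 3.0)  # cardiac band from the text
n_peaks = count_peaks(cardiac)
```

With this synthetic input, the drift and offset fall outside the 0.5–3 Hz band and are removed, leaving 25 peaks in 10 s, which corresponds to 150 b.p.m.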

The second effect relied on variations in chest movement from inhalation and exhalation, which are directly associated with spatial changes of intensity values in the image sequences. The motion magnification system proposed by Al-Naji et al.24 was used to amplify the recorded video before data analysis. The next processing step was to detect and manually localize ROIs using MATLAB’s command ‘ginput’. The magnified video signal from the Y component of the YCbCr colour space was obtained by averaging intensity pixel values over the image sequences as follows:

$$i_{\mathrm{Y}}\left( t \right) = \frac{{\mathop {\sum }\nolimits_{x,y \in {\mathrm{ROI}}} I\left( {x,y,t} \right)}}{{\left| {{\mathrm{ROI}}} \right|}},$$
(2)

where I(x,y,t) is the intensity pixel value at image location (x,y) at time t, and |ROI| is the size of the selected ROI. A spectral analysis method based on FFT was then applied to transform the selected signal from the time domain to the frequency domain, followed by an ideal band-pass filter with selected frequencies of 0.2–2 Hz, corresponding to 12–120 r.p.m.
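The spatial averaging in Eq. 2 can be sketched as below. This is an illustration under stated assumptions rather than the study's code: the ROI is taken to be a rectangle, frames are plain nested lists of (R, G, B) tuples, and the luma is approximated with the BT.601 weights without the range scaling that a full YCbCr conversion may apply. The function names are hypothetical.

```python
def luma(r, g, b):
    """Approximate Y (luma) from R, G, B using BT.601 weights (an assumption)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def roi_mean_luma(frame, roi):
    """Spatially average luma over a rectangular ROI (Eq. 2 for one frame).

    frame: 2-D list of (r, g, b) tuples indexed as frame[y][x].
    roi:   (x0, y0, x1, y1), with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = roi
    total = 0.0
    for y in range(y0, y1):
        for x in range(x0, x1):
            total += luma(*frame[y][x])
    return total / ((x1 - x0) * (y1 - y0))  # divide by |ROI|


def roi_signal(frames, roi):
    """One averaged value per frame gives the time series i_Y(t)."""
    return [roi_mean_luma(frame, roi) for frame in frames]


# Tiny synthetic clip: three 4x4 grey frames with brightness 100, 110, 120.
frames = [[[(v, v, v)] * 4 for _ in range(4)] for v in (100, 110, 120)]
signal = roi_signal(frames, roi=(1, 1, 3, 3))
```

Each element of signal is the mean ROI brightness for one frame; the resulting series is what the band-pass filter and peak detection then operate on.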

The inverse FFT was then applied to the filtered signal to obtain the respiratory signal, followed by peak detection to measure RR. The measured value (Mv) of the HR or RR per minute can be calculated using the following steps:

1. Time (s) = \(\frac{{\mathrm{no.}}\;{\mathrm{of}}\;{\mathrm{frames}}}{{\mathrm{frame}}\;{\mathrm{rate}}}\).

2. The average number of peaks per second (p) = \(\frac{{\mathrm{no.}}\;{\mathrm{of}}\;{\mathrm{peaks}}}{{\mathrm{time}}}\).

3. Mv = p × 60, in beats/min for HR or breaths/min for RR.
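The three steps above amount to a short calculation, sketched here with a hypothetical function name and made-up input values (12 peaks detected in 150 frames at 30 fps).

```python
def rate_per_minute(n_peaks, n_frames, frame_rate):
    """Convert a peak count into a per-minute rate following steps 1 to 3."""
    duration_s = n_frames / frame_rate       # step 1: time in seconds
    peaks_per_second = n_peaks / duration_s  # step 2: p
    return peaks_per_second * 60             # step 3: Mv = p x 60


# Illustrative values only: a 5-s window (150 frames at 30 fps) with 12 peaks.
hr = rate_per_minute(n_peaks=12, n_frames=150, frame_rate=30)
```

Here the window lasts 5 s, so 12 peaks give 2.4 peaks per second, or approximately 144 per minute. The same function applies to RR when the peaks come from the respiratory signal.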

The algorithms were implemented in MATLAB R2018a (MathWorks, NSW, Australia) on a Microsoft Windows 10 operating system.

Supplementary Figure 4 shows the MATLAB-based graphical user interface of the proposed monitoring system.15

The Philips IntelliVue monitor used to validate our results determines the HR by averaging the 12 most recent HR intervals from the ECG, and the RR is calculated by averaging the last eight detected breaths.25 Therefore, during phase 2, the HR and RR measurements over a 5-s period were analysed using the calculation previously described. Measurements were compared for agreement using Bland–Altman analysis26 in MedCalc Statistical Software (version 18.2.1), which was performed by researcher K.G.
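For readers unfamiliar with the method, the core Bland–Altman quantities (the bias and the 95% limits of agreement, taken as the bias ± 1.96 standard deviations of the paired differences) can be sketched as below. This is a simplified illustration, not the MedCalc procedure used in the study: it treats every pair as independent, whereas the study's data include repeated measurements per infant. The index-minus-reference sign convention is assumed, and the paired values are made up for illustration.

```python
import statistics


def bland_altman(index_values, reference_values):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    # Differences are taken as index test minus reference (assumed convention).
    diffs = [i - r for i, r in zip(index_values, reference_values)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired HR readings in b.p.m. (not study data).
index_hr = [150, 142, 160, 155]
reference_hr = [148, 140, 151, 153]
bias, lower, upper = bland_altman(index_hr, reference_hr)
```

For these illustrative values the differences are 2, 2, 9 and 2 b.p.m., giving a bias of 3.75 b.p.m. and a standard deviation of 3.5 b.p.m.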

Researchers previously determined the clinically acceptable difference (CAD) to be close to zero (−c,c).27 However, due to the potential for ECG measurement inaccuracy, this may be regarded as unachievable and clinical judgement would therefore be necessary.

Results

With one exception, all infants were ≤37 weeks gestational age. One infant of 40 weeks gestation being cuddled by their parent was also included (participant 6). Gestational ages at birth varied from 23 to 40 weeks, with postnatal ages from 2 to 95 days. The weight of the infants ranged from 800 to 3020 g. Two infants were receiving CPAP during filming and one was under phototherapy to treat neonatal hyperbilirubinaemia. See Supplementary Table 1 for demographic data.

Clustered regression on the bias was conducted. For the HR readings, the observed mean difference of 4.5 b.p.m. resulted in p < 0.005. For RR, the observed mean bias of 0.8 r.p.m. resulted in p < 0.586. Bland–Altman analysis of the RR and HR utilizing magnification techniques is shown in Supplementary Figs. 4 and 5. The limits of agreement for HR were −8.3 to +17.4 b.p.m. and for RR, −22 to +23.6 r.p.m. Analysis of the data without magnification techniques applied was also conducted. The limits of agreement for HR were −8.5 to +19.8 b.p.m. and for RR, −21.9 to +23.9 r.p.m.

Although apnoea detection was not a primary outcome measure, we analysed one infant receiving CPAP who had an apnoeic event. We could not detect any respiratory effort for 7.5 s (Supplementary Fig. 6). However, the impedance monitor determined the RR to be 14–17 r.p.m. during this 10-s period, which was assumed to be artefact generated by infant movement; the monitor therefore alarmed for bradycardia and desaturation only.

Discussion

The aim of this preliminary study was to assess the accuracy of the proposed non-contact system to measure HR and RR in 10 infants. A clinically significant finding that we did not anticipate was that our system was able to accurately detect apnoea when the ECG monitor did not. While the system has been demonstrated to be accurate in measuring physiological data in controlled scenarios among adult participants,28,29,30 our findings indicate that this novel non-contact technique requires further development to achieve the accuracy necessary for use with neonates. Factors such as reduced ROI, low-lit environments and camera movement proved challenging in this study and are documented confounding variables in others.

We also compared data extracted with and without our magnification algorithms applied. The limits of agreement were slightly wider without magnification, thus indicating the value of adding this technique to the system. The purpose of Video Magnification (VM) is to establish that the ROI is valid and to remove noise that is outside the temporal band covered by the VM. If the infants are moving, VM will cause more noise and may lead to inaccurate results, particularly during PPGi use.

Accurate apnoea detection was illustrated in one infant in our study where the non-contact system correctly identified apnoea when movement artefact resulted in a falsely reassuring respiratory trace from the skin electrodes. Therefore, in some circumstances non-contact monitoring may be superior to impedance monitoring using skin electrodes where motion artefact can still influence the RR signal.10,31 This noteworthy finding supports the need for further study of infants who have regular episodes of apnoea to evaluate how this proposed non-contact system could be best utilized in the detection of apnoea.

In scenarios where the infants had their forehead exposed making selection of the ROI for PPGi feasible (participants 1–6 and 10), data measurement was more accurate with a mean difference of −4 to +7 b.p.m. In comparison, for those infants who had their head partially covered by a CPAP mask making the ROI unclear (participants 8 and 9), HR data needed to be extracted from motion magnification of the chest. The volumetric changes of the heart muscle can also be transmitted to the chest leading to a subtle movement with an amplitude of 0.2–0.5 mm.18

Perhaps because of the chest moving during respiration, and in conjunction with low light levels, the mean differences for HR were 8 and 9 b.p.m., respectively. Lack of exposed skin area was an identified challenge in the study by Villarroel et al.1 and influenced the accuracy of their results. These challenges may be alleviated by utilizing advanced signal processing techniques to select the ideal ROI.

Low ambient light levels may have additionally impacted the results, as they have in other studies.1,8 Filming in low-light environments may result in a low signal-to-noise ratio that degrades the images. The integration of unobtrusive near-infrared LEDs into the system may accommodate this variable.32 We have previously monitored the RR of paediatric participants asleep in a dark room with the use of a Microsoft Kinect sensor in the study by Al-Naji et al.33 That study yielded positive results with a cross-correlation coefficient of 0.9812, and the sensor proved reliable for detecting motion, making it perhaps more viable for RR detection. Digital cameras were used for videoing in the current study instead of the Kinect sensor as they provide images of better quality and resolution for PPGi. However, measurement was not adequately accurate in this study due to the confounders described above. These factors will need to be considered when developing a prototype.

For participants 3 and 4, the researcher needed to video without the stability of the tripod due to limitations in space in the neonatal unit. This resulted in a slight camera shake and potentially contributed to the larger mean differences of 6–7 b.p.m. and 7–9 r.p.m. Future work will need to ensure that the camera is not subject to shake; mounting a small camera such as a GoPro to neonatal equipment may alleviate this issue.

Method revision and future research to improve algorithms to accommodate these influences are vital if this system is to be utilized in the neonatal unit. However, this non-contact system may prove accurate in detecting HR in some situations where confounding variables are limited, such as when performing newborn resuscitation on a resuscitaire. This environment is generally well illuminated, the infant is exposed and not moving significantly, and the resuscitaire provides a stable surface to mount a small camera, which eliminates all factors that have been identified to influence accuracy. Future research in these environments may well demonstrate superiority to current monitoring systems.

Strengths and limitations

We have studied a real-life population with many of the variables demonstrated to confound measurement commonly occurring in preterm infants. A strength of the study is that data on all patients were reported and were not excluded if affected by poor video quality or movement. This study provides necessary data to improve the proposed non-contact system for neonatal monitoring.

A limitation of the study is that the pre-existing impedance monitoring, with its known inaccuracies, was utilized to validate RR data. This was important in order to minimize disruption to participants. For future work, the application of additional modes of RR monitoring known to be more accurate, such as chest impedance belts or airflow detection methods, may be considered.31

The study is limited by the small number of infant subjects and, in particular, the small number of infants receiving intensive care interventions such as CPAP or mechanical ventilation. Additionally, the researcher first determined the reference data from the ECG prior to extracting data derived from the non-contact system. There may have been potential bias from analysing the reference and index test simultaneously.

Conclusion

Improvement in algorithms for processing videos to accurately detect neonatal HR and RR in the NICU environment is necessary. Variables such as movement and limited visibility of the head can produce inaccuracies in this non-contact method. This non-contact system may be more feasible in controlled settings where equipment and lighting are standardized and consistent. Further research is required to improve techniques to reduce noise artefacts due to motion and changes in ambient illumination in order to develop a system that can accommodate these variables.