Introduction

Despite advances in non-invasive respiratory support, mechanical ventilation remains an important therapy in Neonatology: about 1.2% of all newborn infants receive mechanical ventilation due to prematurity or a critical illness in the neonatal period.1 Large tertiary neonatal intensive care units (NICUs) frequently have over 1500 ventilator days annually. Modern ventilators are equipped with powerful computers and in addition to calculating and showing several ventilator parameters, their graphical user interfaces also display real-time ventilator waveforms and loops.2,3 These data are invaluable for analysing effectiveness of ventilation, to determine whether ventilator parameters are appropriate and to detect adverse ventilator–patient interactions.4 However, busy clinicians frequently ignore ventilator waveforms or only review them for very short periods.5 As these data are not routinely downloaded or stored, they cannot be reviewed later.

An alternative to inspecting ventilator screens over long periods would be to develop computational methods to study the effectiveness of mechanical ventilation and individual patient–ventilator interactions. This approach requires access to raw ventilator data (airway pressure and flow) at a sampling rate high enough to computationally re-generate and analyse waveforms and loops. Indeed, new ventilator models enable such data to be downloaded.6,7,8,9 In order to interpret these raw data, as a first step, they need to be split into individual ventilator inflations, which can then be further segmented into sub-phases (i.e. lung inflation, inspiratory hold, lung deflation etc.). Identification and separation of these segments enables statistical analysis of their characteristics over longer periods and automatic detection, characterisation and quantitative analysis of patient–ventilator interactions during different phases of the respiratory cycle. Recently, several different approaches have been reported to computationally analyse adult ventilator data and characterise patient–ventilator interactions.10,11,12,13,14,15,16,17 However, neonatal mechanical ventilation uses different ventilator modes and has different characteristics from adult ventilation and the tools developed for adults cannot be directly used with neonatal data.

In this paper, we describe and validate a novel computational approach to segment high-throughput raw data downloaded from the ventilators of critically ill babies into individual ventilator inflations and to further split the inflations into different sub-phases.

Methods

Data collection

Ventilator data were collected from infants ventilated on the Neonatal Intensive Care Unit, Rosie Hospital, Cambridge. The study was approved by the Bromley (London) Research Ethics Committee of the Health Research Authority of the United Kingdom (reference: 18/LO/2182). Informed consent was obtained from parents. All procedures were performed in accordance with the ethical standards of the Research Ethics Committee and the amended Helsinki Declaration (1983).

Ventilator data were downloaded to a laptop computer via a cable attached to one of the serial communication ports of the Dräger Babylog VN500 neonatal ventilator (Dräger Medical, Lübeck, Germany) using the recording software developed by the “Technology and Intellectual Property” Department of Dräger Medical. This is for experimental and scientific purposes only and it is not commercially available. All data downloaded by the software carry a timestamp with millisecond precision. Data are exported into comma-separated value (.csv) text files.

The software retrieves airway pressure, flow and volume data with 100 Hz frequency (100 per second). Airway pressure is measured and calculated by the ventilator’s sensors. Flow is measured by the proximal flow sensor connected at the patient’s wye piece. Volume data are generated by the ventilator using time integration of the flow data. The 100 Hz sampling rate is sufficient for waveforms of individual breaths and inflations to be reconstructed,18 also see Fig. 1a and Supplemental Figs. S5 and S6 (online). As the recording software was designed to be compatible with paediatric and adult ventilators using larger tidal volumes, due to bandwidth limitations the smallest difference in tidal volume that can be retrieved is 1.35 mL, which is too large for neonatal studies. Therefore, our software (Ventiliser) reconstructs the volumes from the obtained flow data.
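The volume reconstruction described above can be illustrated with a minimal sketch. It is not the Ventiliser implementation: it assumes flow is expressed in L/min (the unit used for the flow threshold later in the Methods) and uses simple trapezoidal integration. The function name `reconstruct_volume_ml` is hypothetical.

```python
import numpy as np

SAMPLING_RATE_HZ = 100  # sampling rate stated in the text


def reconstruct_volume_ml(flow_l_per_min, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Cumulative volume (mL) from a flow trace, by trapezoidal integration.

    Assumption: flow is in L/min and samples are evenly spaced.
    """
    flow = np.asarray(flow_l_per_min, dtype=float)
    flow_ml_per_s = flow * 1000.0 / 60.0  # L/min -> mL/s
    dt = 1.0 / sampling_rate_hz           # seconds between samples
    # Area of each trapezoid between consecutive samples
    increments = 0.5 * (flow_ml_per_s[1:] + flow_ml_per_s[:-1]) * dt
    # Volume starts at zero at the first sample
    return np.concatenate(([0.0], np.cumsum(increments)))


# One second of constant 6 L/min flow (101 samples spanning 1.00 s)
volume = reconstruct_volume_ml(np.full(101, 6.0))
print(round(volume[-1], 3))
```

In practice the integration would be restarted at the beginning of each inflation, because leak around the endotracheal tube makes a running integral drift over time.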

Fig. 1: Ventilator waveforms reconstructed from pressure and flow data obtained at 100 Hz sampling rate.

a Phases and sub-phases of a ventilator cycle. This was a backup ventilator inflation without any patient contribution. Ventilator mode was AC-VG (assist-control with volume guarantee). Sub-phases of the inflation are demarcated by dotted lines and numbered as follows: (1A) Lung inflation; (1B) Inspiratory hold; (2A) Lung deflation; (2B) Expiratory hold. See Table 1 for more details. Tidal volume has been calculated by integrating flow. The failure of the tidal volume waveform to return to zero at the end of expiration indicates the presence of some leak around the endotracheal tube. b, c Synchronised and backup ventilator inflations. Time point zero on the horizontal axis corresponds to the start of the ventilator cycle as detected by the algorithm. The vertical dotted lines represent the start of pressure rise as detected by the algorithm. During a synchronised inflation (b), the positive (inspiratory) flow is initiated by the patient and therefore it precedes the rise of airway pressure above the PEEP level. There is a slight transient drop in PEEP as the patient starts breathing in from the circuit (arrow). During a backup inflation (c), the positive (inspiratory) flow is initiated by the ventilator and therefore it quickly follows the rise in airway pressure.

In addition to 100 Hz waveform data, the recording software also downloads calculated ventilator parameters at 1 Hz frequency, including mandatory, spontaneous, inspiratory, expiratory tidal and minute volumes, peak inflation pressure, mean airway pressure, positive end expiratory pressure, inspiratory and expiratory times, and fraction of inspired oxygen. The parameters obtained are those of the last full inflation that has been applied to the patient or made spontaneously by the patient before the timestamp. Minute volumes are directly calculated from the ventilator flow data over 20-second windows with appropriate filters. The recording tool also retrieves alarm data with a timestamp when an alarm was triggered and again when the issue triggering the alarm has been resolved. Changes in the ventilator and alarm settings are recorded with a timestamp showing the time the changes were made.

Data processing and analysis

Data were processed and analysed using Python (version 3.8, https://www.python.org) and its add-on packages. In addition to modules that are part of the Python standard library, we used NumPy (version 1.18.3, http://www.numpy.org), pandas (version 1.0.3, http://pandas.pydata.org) and SciPy (version 1.4.1, www.scipy.org). We created a Python package named Ventiliser to organise the ventilator data segmentation pipeline code. Data were analysed using Jupyter Notebooks (version 1.0.0) and Spyder (version 3.3.1) installed as part of the freely available Anaconda distribution (Continuum Analytics, http://docs.continuum.io/anaconda/pkg-docs). Visualisation was done with matplotlib (version 3.2.1, http://matplotlib.org). The graphical user interface was built using PyQt5 (version 5.9.2, https://www.riverbankcomputing.com/software/pyqt/) and pyqtgraph (version 0.10.0, http://www.pyqtgraph.org/). All software is open source and freely available. The Ventiliser package can be found at https://github.com/barrinalo/Ventiliser and the data analysis notebooks can be downloaded from https://github.com/gbelteki/ventilator_data_segmentation.

Definition of a ventilator cycle, its phases and sub-phases

Before implementing the algorithm, we defined the concept of a ventilator cycle and its phases and sub-phases (Table 1 and Fig. 1a). The start of a ventilator cycle was defined as the start of the positive (inward) flow that is accompanied by a ventilator-assisted and ventilator-cycled positive pressure inflation. The ventilator cycle can be triggered by the baby and synchronised, in which case the increase in flow precedes the start of the pressure rise (trigger delay), or it can be a ventilator-initiated backup cycle, in which case the airway pressure increases first, rapidly followed by the positive flow (Fig. 1b, c). A ventilator cycle lasts until the start of the next ventilator cycle (commencement of next positive flow). In some modes (e.g. synchronised intermittent mandatory ventilation (SIMV)), there can be unassisted or pressure-supported spontaneous breaths between ventilator inflations. The inspiratory phase of the cycle includes the period when the lungs are being inflated and the inspiratory hold, which, if present, corresponds to inflated lungs with the airway pressure being maintained at the peak inspiratory pressure (PIP) level and with no air flow. The expiratory phase is the period when the lungs are deflating to the level of the functional residual capacity (FRC) and the time with the lungs at FRC and the airway pressure at the positive end expiratory pressure (PEEP).
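The distinction between synchronised and backup cycles reduces to the order of two time points, as the following minimal sketch shows. The function name and the handling of a tie (treated here as a backup cycle) are assumptions made for illustration, not part of the published definition.

```python
def classify_inflation(flow_start_ms, pressure_rise_start_ms):
    """Label an inflation from the order of flow onset and pressure rise.

    Returns the label and the trigger delay in ms (pressure rise start minus
    flow start). Assumption: a delay of zero or less is labelled "backup".
    """
    trigger_delay_ms = pressure_rise_start_ms - flow_start_ms
    if trigger_delay_ms > 0:
        # Flow rose first: the patient initiated the cycle
        return "synchronised", trigger_delay_ms
    # Pressure rose first (or simultaneously): ventilator-initiated cycle
    return "backup", trigger_delay_ms


print(classify_inflation(flow_start_ms=0, pressure_rise_start_ms=40))
print(classify_inflation(flow_start_ms=20, pressure_rise_start_ms=0))
```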

Table 1 Phases and sub-phases of a ventilator cycle defined by direction and change in airway flow.

The StateMapper algorithm

After defining ventilator sub-phases, we also defined flow and pressure states associated with them (Table 2). To reduce noise and processing time, the raw time series was then discretised by associating flow and pressure states with these sub-phases. Mapping states to raw time series was achieved by piecewise aggregate approximation using mean and standard deviation with a window size of 3 data points (30 ms). The mean (Wmean_i) and standard deviation (Wstd_i) of each segment (W_i) were then compared to those of the next one (W_(i+1)). Stationary states were determined by comparing the difference in mean between windows (ΔWmean = Wmean_(i+1) − Wmean_i) against Wstd_i. If ΔWmean < 2 × Wstd_i or ΔWmean < a threshold (T), then the state is stationary, and if not, it is non-stationary (moving). T is defined as 0.1 L/min for flow and 10% of set PEEP for pressure. The type of the non-stationary states is determined by the sign of ΔWmean with positive flow and pressure states being “Inspiration initiation”, “Expiration termination” and “Pressure rise”, respectively, and negative flow and pressure states being “Inspiration termination”, “Expiration initiation” and “Pressure drop”, respectively. For stationary flow states, the “Peak inspiratory flow” state has positive Wmean_i, the “Peak expiratory flow” state has negative Wmean_i and the “No flow” state has Wmean_i within ±0.1 L/min of 0 L/min. For stationary pressure states, the average between the previous peak inflating pressure (PIP) and the set PEEP was used, such that if Wmean_i < the average, it was state PEEP, and if not, it was state PIP. Details can be seen in Supplemental Figs. S1 and S2 (online).
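The windowing and stationarity test for the flow signal can be sketched as follows. This is an illustration under stated assumptions rather than the StateMapper code itself: the magnitude of the difference in window means is compared with the thresholds (the text does not state whether the absolute value is taken), windows do not overlap, and moving states are labelled only as "rising" or "falling" because the rule that separates, for example, “Inspiration initiation” from “Expiration termination” is not spelled out here. The function name `map_flow_states` is hypothetical.

```python
import numpy as np

WINDOW = 3    # samples per window (30 ms at 100 Hz), from the text
FLOW_T = 0.1  # L/min, stationarity threshold for flow, from the text


def map_flow_states(flow, window=WINDOW, threshold=FLOW_T):
    """Assign one coarse flow state to each window except the last."""
    flow = np.asarray(flow, dtype=float)
    n = len(flow) // window
    # Non-overlapping windows (assumption); trailing samples are dropped
    windows = flow[: n * window].reshape(n, window)
    means = windows.mean(axis=1)
    stds = windows.std(axis=1)

    states = []
    for i in range(n - 1):
        delta = means[i + 1] - means[i]
        # Assumption: the magnitude of the change is what is thresholded
        stationary = abs(delta) < 2 * stds[i] or abs(delta) < threshold
        if stationary:
            if abs(means[i]) <= threshold:
                states.append("no_flow")
            elif means[i] > 0:
                states.append("peak_inspiratory_flow")
            else:
                states.append("peak_expiratory_flow")
        else:
            states.append("rising" if delta > 0 else "falling")
    return states


# Idealised square-wave flow: rest, inspiratory plateau, rest
trace = [0, 0, 0, 0, 0, 0, 5, 5, 5, 5, 5, 5, 0, 0, 0]
print(map_flow_states(trace))
```

The same scheme would apply to pressure, with the threshold set to 10% of the set PEEP and stationary windows assigned to PEEP or PIP by comparison with the midpoint between the previous PIP and the set PEEP.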

Table 2 Flow and pressure states.

Segmentation algorithm

After discretisation, breaths were demarcated into inspiration–inspiration intervals, determined by encountering an “Inspiration initiation” or “Peak inspiratory flow” state and also fulfilling one of the following two criteria: (1) having already encountered an expiratory state (“Expiration initiation”, “Peak expiratory flow” or “Expiration termination”); or (2) having encountered an “Inspiratory hold” or “Expiratory hold” state of >50 time units (=500 ms). This allows Ventiliser to recognise the start of a new ventilator inflation even after positive flow was previously encountered without any negative flow, which may be due to a very large (>90%) leak around the endotracheal tube or a ventilator artefact. Inflations with <500 ms expiratory time will still be missed but the user can set this value differently. There is also an optional post-processing step which merges adjacent breaths if they are discovered to have significantly different inspiratory and expiratory volumes (default value is 66%), which further reduces artefacts, but also combines breaths with large leaks around the endotracheal tube into one.
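The breath demarcation rule can be expressed as a small state machine. The sketch below is illustrative only: it assumes the discretised signal has already been collapsed into runs of `(state, start_index, length_in_samples)`, that the very first inspiratory state opens the first breath, and that one time unit equals one 10 ms sample. The function name and the snake_case state labels are hypothetical.

```python
INSPIRATORY_STARTS = {"inspiration_initiation", "peak_inspiratory_flow"}
EXPIRATORY = {"expiration_initiation", "peak_expiratory_flow",
              "expiration_termination"}
HOLDS = {"inspiratory_hold", "expiratory_hold"}
MIN_HOLD_SAMPLES = 50  # >50 time units (500 ms at 100 Hz), from the text


def find_breath_starts(runs, min_hold=MIN_HOLD_SAMPLES):
    """Return the start indices of breaths from an ordered list of state runs."""
    starts = []
    seen_expiration = False  # criterion (1)
    seen_long_hold = False   # criterion (2)
    for state, start, length in runs:
        if state in INSPIRATORY_STARTS:
            # Assumption: the first inspiratory state always opens a breath
            if not starts or seen_expiration or seen_long_hold:
                starts.append(start)
                seen_expiration = False
                seen_long_hold = False
        elif state in EXPIRATORY:
            seen_expiration = True
        elif state in HOLDS and length > min_hold:
            seen_long_hold = True
    return starts


runs = [
    ("inspiration_initiation", 0, 10),
    ("peak_inspiratory_flow", 10, 20),   # same breath: no expiration seen yet
    ("expiration_initiation", 30, 10),
    ("peak_expiratory_flow", 40, 20),
    ("inspiration_initiation", 60, 10),  # new breath via criterion (1)
    ("inspiratory_hold", 70, 60),        # long hold, no negative flow (leak)
    ("peak_inspiratory_flow", 130, 10),  # new breath via criterion (2)
]
print(find_breath_starts(runs))
```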

Identifying inflation sub-phases is further complicated by the fact that in many cases the pressure and flow waveforms of ventilated infants do not show the regular shapes presented in Fig. 1 due to spontaneous breathing effort, splinting, coughing or movement of the baby and artefacts due to kinking of the endotracheal tube or condensed water in the ventilator circuit. Therefore, an actual sub-phase is frequently not a contiguous series of consecutive single flow and pressure states as shown in Table 2. Instead, it may be a region dominated by one particular state and interspersed with others. For example, an inflation may start with an “Inspiration initiation” flow state but if the patient is splinting the chest or is trying to breathe out briefly, inspiratory flow will decrease or even stop, resulting in “Peak inspiratory flow” or “Inspiration termination” flow states, respectively, before increasing again to reach the true peak inspiratory flow. To address this, Ventiliser iterates over the set of ordered states (flow or pressure) and finds the split along the time axis that maximises the information gain of the current state with respect to all other states. Each subsequent split is performed on the data after the previous split to ensure order.
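The idea of splitting a noisy run of states by information gain can be illustrated with a minimal sketch. It assumes a binary view of the sequence (current state versus all other states) and Shannon entropy in bits, and it searches a single split point; the actual Ventiliser code may differ in detail. The function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy (bits) of a Bernoulli variable with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_split(states, target):
    """Find the index k that best separates `target` from all other states.

    Returns (k, gain), where states[:k] is the left part and states[k:] the
    right part, and gain is the information gain of that split.
    """
    flags = [1 if s == target else 0 for s in states]
    n = len(flags)
    if n < 2:
        return 0, 0.0
    total = sum(flags)
    parent = binary_entropy(total / n)

    best_k, best_gain = 0, 0.0
    left = 0  # number of target states in states[:k]
    for k in range(1, n):
        left += flags[k - 1]
        right = total - left
        children = ((k / n) * binary_entropy(left / k)
                    + ((n - k) / n) * binary_entropy(right / (n - k)))
        gain = parent - children
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k, best_gain


# A region dominated by state "A" with one interspersed "B", followed by "B"
sequence = ["A", "A", "B", "A", "A", "B", "B", "B", "B"]
k, gain = best_split(sequence, "A")
print(k, round(gain, 3))
```

In this example the split falls after the fifth element, so the isolated "B" stays inside the region dominated by "A" instead of ending the sub-phase prematurely.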

Results

Output and running time of the segmentation algorithm

We developed Ventiliser, a computational pipeline segmenting neonatal ventilator data into individual inflations, their phases and sub-phases. Spontaneous breaths between ventilator inflations are also recognised, if present. Ventiliser uses a rule-based algorithm as described in the “Methods” section. Ventiliser generates a report in table format exported as a .csv file (Supplemental Table S1 (online)). The report lists all identified ventilator inflations and spontaneous breaths with start time and duration of the various flow and pressure states and sub-phases. Moreover, additional parameters such as inspiratory and expiratory time, lung inflation and deflation time, peak inspiratory and expiratory flow, inspiratory and expiratory tidal volumes are also reported. The timing of sub-phases allows the user to distinguish between synchronised and backup inflation (see Fig. 1b, c). The correlation coefficient between pressure and flow is also reported; spontaneous breaths between ventilator inflations (if present) are characterised by absent or negative correlation as in their case the positive inspiratory flow is associated with no increase or even some decrease of the pressure from PEEP level, unless they are pressure-supported.
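The use of the pressure–flow correlation coefficient can be illustrated with a short sketch. It assumes a Pearson correlation computed over the samples of one respiratory cycle (the window over which Ventiliser computes it is not specified here) and returns zero for a flat trace; both choices, and the function name, are assumptions.

```python
import numpy as np


def pressure_flow_correlation(pressure, flow):
    """Pearson correlation between pressure and flow samples of one cycle."""
    pressure = np.asarray(pressure, dtype=float)
    flow = np.asarray(flow, dtype=float)
    # Assumption: a flat trace is reported as zero ("absent") correlation
    if pressure.std() == 0 or flow.std() == 0:
        return 0.0
    return float(np.corrcoef(pressure, flow)[0, 1])


# Ventilator inflation: pressure and flow rise together
r_inflation = pressure_flow_correlation([5, 8, 11, 14], [0, 2, 4, 6])

# Unsupported spontaneous breath: pressure dips below PEEP as flow rises
r_spontaneous = pressure_flow_correlation([5.0, 4.8, 4.6, 4.8, 5.0],
                                          [0, 2, 4, 2, 0])

print(r_inflation > 0, r_spontaneous < 0)
```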

In addition to the report in table format, Ventiliser also has an evaluation module that a user with appropriate programming skills can use for quality control of Ventiliser’s output (Supplemental Fig. S3 (online)). Documentation of the module is available in the same repository as Ventiliser.

Ventiliser runs on a mid-range personal computer with an approximate speed of 2 min per ventilation day (corresponding to 8.64 million data points) and the running time scales linearly with the duration of the recording. Speed and performance of the program are shown for some actual recordings in Supplemental Fig. S4 (online).

Algorithm validation

To validate the segmentation algorithm, we have built a graphical user interface (GUI) capable of importing raw ventilator data. The GUI shows pressure and flow waveforms in a time window set by the user (Fig. 2). The user can manually identify and label transition points between the flow and pressure states described in Table 2. The manual annotation can be exported and stored as a .csv file and compared with computational annotation provided by the pipeline.

Fig. 2: Graphical user interface to display and manually annotate ventilator waveform data.

Airway pressure (top panel) and flow (bottom panel) waveforms can be manually annotated with numbers corresponding to the start of pressure and flow states, respectively, as described in Table 2. The annotation can be exported and stored as a comma-separated value (.csv) file and compared with computational annotation provided by the pipeline.

To evaluate the performance of the algorithm, three random 5-min samples were extracted from longer recordings of three infants: two were ventilated using SIMV with volume guarantee, one using assist-control with volume guarantee (AC-VG). In all three, the infant had spontaneous breathing effort and therefore interacted with the ventilator in a complex way. These recordings were not used during the development of Ventiliser (out-of-sample validation). The samples were manually annotated by a medical student after receiving formal training about ventilator waveforms from G.B. The algorithm successfully identified >97% of the manually labelled flow and pressure states with a mean error between 10 and 40 ms (representing 1–4 data points) for all except the expiratory hold start key point, which had a mean error of 49.4 ms (Table 3). Performance on the individual samples is shown in Supplemental Tables S2–S4 (online). The length of sub-phases identified by Ventiliser showed some deviation from the manual annotation (Table 4). Overall, the difference between the mean duration of sub-phases identified by the two methods was <50 ms in 83.33% (25/30) of the tests.

Table 3 Overall performance of algorithm against manual annotations.
Table 4 Comparison of the duration of sub-phases identified by Ventiliser with those identified by manual inspection and annotation.

Reconstruction and in-depth analysis of ventilator waveforms and loops

To demonstrate the utility of Ventiliser, we present analysis of a 39-hour-long ventilator recording obtained from a term infant ventilated using AC-VG mode. The respiratory parameters calculated and displayed by the ventilator (i.e. respiratory rate, tidal volume, peak inspiratory pressure) are shown in data-rich time series plots (Fig. 3a–c). From the 14,044,274 pressure and flow data points, Ventiliser identified 143,260 respiratory cycles: 128,663 ventilator inflations (mean: 55/min) and 14,597 unsupported spontaneous breaths. As the actual ventilator rate was between 60 and 65/min over the whole recording (Fig. 3a), Ventiliser detected ~85–90% of ventilator inflations overall. We further analysed two 1-h periods of the recording in more detail (Table 5). During period 1, the baby had some breathing effort, but the majority of inflations were initiated by the ventilator as the high backup rate allowed only a short time window for the baby to trigger inflations by generating inspiratory (positive) flow. During period 2, stronger breathing effort appeared, there were more synchronised inflations and some unsupported spontaneous breaths also appeared. Moreover, during period 2 ventilator waveforms and loops became more variable and irregular (Fig. 4). Identification of individual inflations by Ventiliser makes it possible to visualise any single inflation (Supplemental Figs. S5 and S6 (online)). Segmentation of inflations into sub-phases allows for quantitative analysis of these sub-phases over the period (Table 5). For example, we have found that the actual pressure rise time (PRT) was on average longer than the set value (80 ms). During period 2, the median time spent with the pressure at the PIP level was significantly shorter and the pressure drop to PEEP level and lung deflation time were also significantly shorter. In accordance with this, during period 1 a larger number of inflations had an inspiratory hold (pressure at the PIP level with no air flow). Also, during period 1 more inflations had no expiratory hold, that is, the next inflation started immediately after deflation of the lung.

Fig. 3: Reconstruction of ventilator trends from ventilator parameter data downloaded with 1 Hz sampling rate.

Data were downloaded from a term infant ventilated for respiratory distress. The infant was ventilated with assist-control ventilation using volume guarantee (AC-VG). a–c Time series plots of ventilator rate (a), pressures (b) and tidal volume (c). After the first 4 h, the baby triggered more ventilator inflations (RRmand) than the ventilator backup rate (RRset). A few spontaneous breaths (RRspon) also appeared. These were ones that did not reach the trigger threshold (0.2 L/min) or fell within the refractory period of 0.12 s after a previous inflation (a). As this was VG ventilation, the peak inflating pressure (PIP) shows large short-term variability, particularly when spontaneous breathing effort appeared (b). The leak-compensated expired tidal volume (VTmand) also shows significant short-term variability and deviations from its target (VTset) when the infant was breathing, with tidal volumes <2 and >10 mL/kg occurring frequently (c). Dashed lines and numbers show the two time periods studied in more detail.

Table 5 Characterisation of ventilation inflations in a 39-h-long recording and in two 1-h periods.
Fig. 4: Reconstruction of ventilator loops and waveforms from downloaded ventilator flow and pressure data sampled at 100 Hz.

Composite pressure–volume loops over two 1-h periods. Each data point represents a single pressure–volume data pair. a, b. During period 1, regular pressure–volume loops are produced with little variability both for synchronised and for backup inflations. c, d With stronger breathing effort of the baby (period 2), the waveforms become more irregular, and there are more synchronised inflations requiring lower inflating pressures.

Discussion

In this paper, we describe a computer program (Ventiliser) that identifies and characterises individual inflations from airway pressure and flow data downloaded from neonatal ventilators. We developed and tested the software with raw data obtained from the Dräger Babylog VN500 ventilator; however, as the input to our programs is platform agnostic, requiring only flow and pressure data in a tabular data format and obtained at a high sampling rate, Ventiliser can be used with any neonatal ventilator from which flow and pressure data can be downloaded at a high sampling rate. We have also successfully used Ventiliser on data obtained with a 125-Hz sampling rate from the fabian +ncpap (Vyaire) neonatal ventilator (Belteki et al. unpublished observations). The GUI can also be used with flow and pressure data downloaded from any ventilator.

Ventiliser uses Python, a popular computer language, and its freely available data science libraries, allowing us to make it freely available to clinicians and researchers. The use of Python also makes it possible to interface with existing open source platforms such as ventMap10 and to install Ventiliser on devices such as the Raspberry Pi to enable real-time distributed processing and data collection at the bedside.

We used a rule-based rather than a machine learning algorithm. Machine learning models frequently perform better when tested on new data not seen during algorithm development.19 However, supervised machine learning methods require a training data set that can only be produced via manual annotation by domain experts. As ventilator data are complex and noisy due to ventilator–patient interactions, thousands of inflations would need to be manually annotated by clinicians having significant expertise in neonatal ventilation. Moreover, our program correctly identified >97% of inflations and their sub-phases in three short samples, which were not used during algorithm development. In any case, Ventiliser can be used for benchmarking segmentation algorithms developed in the future and the GUI can be used for producing a manually annotated training data set.

Existing software solutions for ventilation waveform analysis have used adult ventilator data and primarily focused on detection of specific adverse ventilator–patient interactions (such as double triggering or ineffective respiratory efforts) via a rule-based algorithm,10 Hidden Markov model11 or machine learning.12 Furthermore, most of them are not freely available11,12 or may be tied to specific ventilator platforms.10 None of these methods has been validated in neonates. The physiology of adult and neonatal ventilation is significantly different: neonates are usually intubated with un-cuffed endotracheal tubes and have some leak around the tube, can breathe during ventilation, the ventilator modes and settings used are different from those used in adults and ventilator–patient interactions are also different.

Recently, BreathMetrics,13 a program written in Matlab programming language, sought to address the issue of providing a flexible but standard way to perform basic processing and analysis of respiratory waveforms. However, it was developed for the purpose of analysing spontaneous breathing in mice and adult humans from nasal respiration data. Moreover, the program does not use airway pressure, only air flow, and the Matlab software is not freely available.

Our out-of-sample validation on manually annotated samples shows that Ventiliser is able to identify the inflation key points (boundaries between sub-phases) with good accuracy. The length of each sub-phase is determined by the interval between key points, and hence the error from the prediction of key points is summed when identifying sub-phase lengths as seen in Table 4. The expiration termination and expiratory hold sub-phases had considerable deviation from manual annotation (>50 ms in some cases). This probably reflects the more variable nature of the expiratory phase in infants with spontaneous breathing effort, which means trying to fit them to an ideal expiration is more challenging.

The PRT (also known as slope time) calculated by Ventiliser was significantly longer than the one set by the clinician. We cannot tell if our algorithm overestimates the PRT or if the actual PRT is longer than the set value due to patient–ventilator interactions or other factors. In the manually annotated samples, the PRTs returned by Ventiliser were close to the manually identified values, which were also longer than the set slope time (80 ms); however, these were only short samples. The clinical significance of different PRTs in neonatal ventilation is uncertain.20

We have also found that a significant proportion of ventilator inflations during AC-VG ventilation contain periods of inspiratory hold or contain no expiratory hold, the occurrence and significance of which in neonates are not known as there are few studies analysing short periods.21 Ventiliser allows for quantitative analysis of inflations over long periods and will facilitate such studies in future.

In summary, we have developed a program to automatically identify and analyse ventilation inflations from neonatal ventilator data. Our software can be used in future studies to analyse ventilation patterns and ventilator–patient interaction over longer periods (hours or days) of mechanical ventilation.