key: cord-0706722-bk71ijc3
authors: Hagan, Rachael; Gillan, Charles J.; Spence, Ivor; McAuley, Danny; Shyamsundar, Murali
title: Comparing Regression and Neural Network techniques for personalised predictive analytics to promote lung protective ventilation in Intensive Care Units
date: 2020-10-08
journal: Comput Biol Med
DOI: 10.1016/j.compbiomed.2020.104030
sha: 928e2ec70a74f21fee1dd787b1668217c559fb5a
doc_id: 706722
cord_uid: bk71ijc3

Mechanical ventilation is a lifesaving tool and provides organ support for patients with respiratory failure. However, injurious ventilation due to inappropriate delivery of high tidal volume can initiate or potentiate lung injury. This could lead to acute respiratory distress syndrome, longer duration of mechanical ventilation, ventilator associated conditions and finally increased mortality. In this study, we explore the viability of machine learning methods, and compare them, for generating personalized predictive alerts indicating violation of the safe tidal volume per ideal body weight (IBW) threshold that is accepted as the upper limit for lung protective ventilation (LPV), prior to application to patients. We process streams of patient respiratory data recorded per minute from ventilators in an intensive care unit and apply several state-of-the-art time series prediction methods to forecast the behavior of the tidal volume metric per patient, one hour ahead. Our results show that boosted regression delivers better predictive accuracy than the other methods that we investigated and requires relatively short execution times. Long short-term memory neural networks can deliver similar levels of accuracy but only after much longer periods of data acquisition, further extended by several hours of computing time to train the algorithm. Utilizing Artificial Intelligence, we have developed a personalized clinical decision support tool that can predict tidal volume behavior within 10% accuracy. We compare alerts recorded from a real world system to highlight that our models would have predicted violations one hour ahead, and can therefore conclude that the algorithms can provide clinical decision support.

Intensive Care Units (ICU) globally. The projected US national estimates for mechanical ventilation suggest 790,257 hospitalizations of adult patients involving mechanical ventilation in 2005, with an estimated national cost of $27 billion, accounting for 12% of all hospital costs [1]. In England, Wales and Northern Ireland during 2012, 44.5% of adult patients admitted to ICUs were mechanically ventilated, which equated to 116,000 cases [2].

Despite its undisputed role as a lifesaving organ support tool, injurious mechanical ventilation has been shown to both initiate and potentiate lung injury [3], [4], [5]. Injurious ventilation leads to lung injury secondary to excessive lung stress and strain due to both volume and pressure related factors. The harm from high tidal volume has been clearly demonstrated in the pivotal lung protective ventilation (LPV) trial from the ARDSNet investigators, which established the role of LPV by using low tidal volume and appropriate use of positive end expiratory pressure in patients with acute respiratory distress syndrome (ARDS) [6].

There is accumulating supportive evidence for the use of LPV to prevent development of lung injury in all patients. A meta-analysis comparing LPV with conventional ventilation demonstrated a reduced incidence of lung injury as well as lower mortality in non-ARDS patients [7]. Similarly, a 28% reduction is seen in the occurrence of ARDS with an associated increase in ICU free days, hospital free days and mortality benefit [8]. Current data suggest clear harm from tidal volumes > 10 ml/kg body weight, with various systematic reviews suggesting that a lower tidal volume is associated with better clinical outcomes.

Development of significant deterioration secondary to worsening ARDS, or the development of ARDS, is characterised by a reduction in lung compliance [9] . A clinical decision support tool (CDS) which is efficient at determining breaches in thresholds of tidal volume for a set pressure will also be able to detect improvement in lung compliance. Seamless integration of a CDS that promotes compliance with LPV leading to early detection of physiological improvement to facilitate early implementation and support of ventilator weaning is crucial during periods of unprecedented pressure on critical care services such as the COVID-19 pandemic.

LPV is essential to prevent further lung injury in patients with severe COVID-19 related respiratory failure, while aggressive weaning is essential to reducing the duration of mechanical ventilation, currently a median of 17 days, to allow better utilisation of ventilatory resources [10], [11].

Despite robust evidence, LPV is still poorly implemented, with a third of patients receiving injurious ventilation [12]. A recent multicentre observational study has confirmed ongoing poor adherence to LPV, at 50%, which further reduces to 15% if ARDS is unrecognised [13]. While previous studies of CDS include displaying safe thresholds for LPV at time of ventilator set up [14], change of ventilator parameters [15], or even default set ups [16], these solutions do not consider the mode of ventilation, do not provide decision support to detect a developing condition such as ARDS, and ignore the potential changes in physiology between the ventilator interactions. The failure of current methods to improve practice provides further support for the use of automated systems both to diagnose and to change practice through a physician independent alert system [17].

Intensive care units routinely collect vast volumes of physiological data on their patients. Prior research has shown that these streams have very valuable information buried in them. This trend started in the 1950s at the University of Southern California, when physicians realized that the critically ill may have substantially better chances of survival when minute to minute monitoring of vital signs is available [18]. Research in the intervening years has delivered metrics and protocols which are now routinely used in the ICU [19]. In pioneering work, McGregor demonstrated the viability of monitoring physiological parameters to detect sleep apnea in the neonatal ICU [20], [21].

Artificial Intelligence (AI) has shown promise in various fields and has potential in the ICU, which is a data-rich environment [22], [23]. Multiple studies in the areas of ECG analysis, delirium detection, sedation and identification of septic patients have highlighted the potential superiority of AI over routine clinical decision making [24]. In the field of mechanical ventilation, a treatment policy developed using AI techniques was shown to predict extubation readiness [23]. Similar benefit from clinical decision support tools has been demonstrated in the management of patients with sepsis and in the detection of renal impairment [25], [26].

Further, artificial intelligence has been proven to aid in the prediction of mortality and outcomes in ICUs [27], [28], [29].

It is essential that alerts are clinically relevant, and studies have suggested artefact related alert rates of 30% [31]. Multiple false alerts could lead to alarm fatigue and clinical inattention [16], [32] and increased response time [33], and there is evidence that the clinician override rate is up to 96% [30]. In a critical care study of a CDS to improve ventilation practices, the positive predictive value was only 59% [34]. This demonstrates the need to improve the quality of the alerts generated, along with other measures, such as pausing alerts for a specific individual and situation, to reduce alert fatigue.

Around 40% of patients in intensive care are supported on invasive mechanical ventilation at any given hour. The ventilators can have many settings that need to be monitored closely and it is important to wean patients off ventilation as soon as possible to avoid dependency or infections. Researchers have utilised numerous machine learning techniques to aid in extubation and ventilator support [23], detect deteriorating patients [35], and distinguish patients at risk [36] and with diseases such as ARDS and ALI [37].

While there has been a significant amount of work carried out analysing medical data and improving patient outcomes, there has, in so far as we know, been no work carried out on the prediction or monitoring of the tidal volume metric for mechanical ventilation to ensure lung protection. The main novelty in our work lies in examining the viability of several machine learning methods to construct a personalised predictive alert system for violation of tidal volume thresholds during periods of mechanical ventilation. Our work uses patient data collected in an ICU over several years but is preclinical in the sense that there is no subsequent clinical intervention.

In a previous paper we introduced the VILIAlert system [38] , a quality improvement project, presenting an analysis of the performance of the database systems which underpin the collection of streams of patient respiratory data. The data was collected in the Regional Intensive Care Unit (RICU), Royal Victoria Hospital, Belfast over a three year period. RICU is the regional medical surgical ICU and the regional trauma centre.

Hence the data and trends observed will be generalisable and representative of most ICUs. VILIAlert monitors patients in real-time by continuously computing a set of metrics from the received streams of ventilation data.

Mathematical kernels process the data streams to allow patients to be monitored against the thresholds for lung protective ventilation (LPV). When a threshold is violated consistently (which we defined initially as a period of 60 minutes), an alarm is immediately raised and sent by SMS message to clinical staff. The aim of the VILIAlert system is to give the clinician an opportunity to intervene early and mitigate the potential damage of over ventilation.

In this paper we turn our attention to the challenge of predicting violations of the LPV thresholds based on the time series of patient readings from the ventilator. By adding this to the VILIAlert system we can send an alert to a clinician before potential damage from over ventilation starts to occur. We operated the VILIAlert system for nearly three years, recording in excess of four million per-minute tidal volume readings for almost one thousand patients. We define the LPV violation threshold to be a tidal volume per ideal body weight (IBW) greater than 8 ml/kg IBW. We employ a pipeline of well-known methods, including ensemble methods built on decision trees and the long short-term memory (LSTM) form of neural networks, the details of which are already comprehensively described in the literature and for which software in the Python language is available. These newer methods based on supervised learning have proven to be superior to the older ARIMA models [40].
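The threshold test itself is simple arithmetic and can be sketched as below. This is only an illustration: the text does not state how IBW is computed, so the sketch assumes the widely used ARDSNet predicted body weight formula, and the function names are hypothetical.

```python
LPV_THRESHOLD_ML_PER_KG = 8.0  # violation threshold stated in the text


def ideal_body_weight_kg(height_cm: float, sex: str) -> float:
    """ARDSNet predicted body weight (assumed here; the paper does not give its IBW formula)."""
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    base = 50.0 if sex == "male" else 45.5
    return base + 0.91 * (height_cm - 152.4)


def violates_lpv(tidal_volume_ml: float, height_cm: float, sex: str) -> bool:
    """True when tidal volume per kg IBW exceeds the LPV threshold."""
    per_kg = tidal_volume_ml / ideal_body_weight_kg(height_cm, sex)
    return per_kg > LPV_THRESHOLD_ML_PER_KG
```

For a 175 cm male patient this gives an IBW of roughly 70.6 kg, so a delivered tidal volume of 600 ml (about 8.5 ml/kg) would count as a violation while 500 ml (about 7.1 ml/kg) would not.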

The tidal volume data set for each patient is a set of $N$ discrete observations $V_j$ recorded at per-minute intervals $t_j$, $j \in \{1, \ldots, N\}$. $N$ corresponds to recording periods ranging from several hours to many days. We divide each calendar day into 96 periods of 15 minutes duration and average the recorded tidal volumes within each period, in line with previous work in the field [41]. This smooths out random fluctuations in the data caused by a patient taking occasional deep breaths or moving in the bed. We denote these averaged readings by $\bar{V}_i$.
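The binning step can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name and the representation of readings as (minute of day, value) pairs are assumptions.

```python
from statistics import mean


def bin_average(readings, bin_minutes=15):
    """Average per-minute readings into fixed-width bins.

    readings: iterable of (minute_of_day, tidal_volume_ml_per_kg) pairs,
              where minute_of_day runs from 0 to 1439.
    Returns a dict mapping bin index (0..95 for 15-minute bins) to the bin mean.
    """
    bins = {}
    for minute, value in readings:
        bins.setdefault(minute // bin_minutes, []).append(value)
    return {index: mean(values) for index, values in sorted(bins.items())}


# One isolated deep breath (22.0) inside an otherwise steady first bin.
readings = [(m, 7.0) for m in range(14)] + [(14, 22.0)]
readings += [(m, 9.0) for m in range(15, 30)]
smoothed = bin_average(readings)
```

In this example the single spike of 22.0 is absorbed into a bin mean of 8.0 for the first bin, while the second bin keeps its steady value of 9.0.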


Following Friedmann [42] , we may state the problem as follows. Output variableV i , dependent on a vector of n input variables x = x 1 , x 2 , x 3 , . . . , x n through some function, F (x), the form of which is unknown. This function represents the behaviour of the respiratory system of the patient. However, we have a set of m observations, each of which associates one input vector x i with one output,

This set of observations forms our training set from which we seek to find an approximation, F * (x) to the true function F (x).

We studied five different regression tree approximations, including bagging and boosting approaches which combine simpler decision trees in various ways to obtain the most accurate predictions. In the bootstrap aggregation approach, also known as bagging, we explore the Bagging, ExtraTrees and RandomForest methods. The alternative boosting approach is covered with the AdaBoost and GradientBoosting methods. For each of the methods we utilized the tsfresh software toolkit [43] , to extract the features used as input into the models.

As an alternative to using regression methods, we investigated the use of long short-term memory (LSTM) neural networks, using the Keras and TensorFlow libraries [44]. The LSTM form of the recurrent neural network (RNN) architecture is quite complex relative to the original Elman form [45] of the RNN, and this enables LSTMs to store information over longer periods, an ideal attribute for modelling time series. Instead of working with the feature vectors derived from the time series, in this approach we work directly with the time bins and the observed values $y_i$. Gorr [46] argued that machine learning methods do not require preprocessing of the observed data to achieve stationarity of both the mean and the variance. This is the approach that we therefore follow; however, we note that there is a contrary view expressed by some authors [47].

The key parameters which distinguish one neural network model from another in our work are: the number of layers and the shape of the input and the output layers.

We investigated two models which used different numbers of intermediate layers. Each LSTM layer in our model has 50 nodes; this was chosen as 2.5 times the number of input neurons, a figure that we believe to be representative of the patient's recent breathing history. We trained both models using 70% of the available data and then made predictions for the remaining 30% of the data. We trained the models for 500 epochs, and for both models we performed computations using the direct forecasting method [48], where we used 20 input points to forecast 4 steps ahead. Our network used the relu (rectified linear unit) activation function and the adam optimizer. The rectified linear activation function is a piecewise linear function that outputs the input directly if it is positive and outputs zero otherwise. It is commonly used because it is easy to compute relative to other activation functions [49, 50]. For ModelNeuralA the above choices created a network with 10,604 trainable parameters and for ModelNeuralB there was a total of 51,004 trainable parameters.
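The quoted parameter totals can be checked with a short calculation. The sketch below is a consistency check under stated assumptions, not a description of the authors' code: it assumes a univariate input (one tidal volume value per time step), standard LSTM layers of 50 units, and a final dense layer with 4 outputs (one per forecast step). The layer layout is not given in the text; the helper names are hypothetical.

```python
def lstm_params(input_dim: int, units: int) -> int:
    """Trainable weights in a standard LSTM layer: 4 gates, each with
    input weights, recurrent weights and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, units: int) -> int:
    """Trainable weights in a fully connected layer (weights plus biases)."""
    return input_dim * units + units


def total_params(n_lstm_layers: int, units: int = 50,
                 n_features: int = 1, horizon: int = 4) -> int:
    """Total for a stack of LSTM layers followed by one dense output layer."""
    total, in_dim = 0, n_features
    for _ in range(n_lstm_layers):
        total += lstm_params(in_dim, units)
        in_dim = units  # each later layer receives the previous layer's output
    return total + dense_params(units, horizon)
```

Under these assumptions a single LSTM layer gives 10,604 parameters and three stacked LSTM layers give 51,004, which matches the totals quoted for ModelNeuralA and ModelNeuralB respectively.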

All analysis for this work was carried out using the Python programming language. For each method we calculated the root mean square error (RMSE) between the observed values at a time point and the values predicted for that time point an interval ahead. We repeated this process for all points in the time series, for each patient, and report the RMSE value per method. We used RMSE [51], a commonly used tool in regression analysis, to quantify the accuracy of our predictions for each patient. RMSE is sometimes criticized as being sensitive to outliers, but we see this property as valuable in clinical application work since it flags more significant differences between the predicted and the actual patient readings when they occur [52]. Figure 1 shows the process of our work.
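For reference, the RMSE computation can be sketched in a few lines. The function name and the example values are illustrative only.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Three exact predictions and one miss of 2 ml/kg.
observed = [8.0, 8.0, 8.0, 8.0]
predicted = [8.0, 8.0, 8.0, 10.0]
error = rmse(observed, predicted)
```

Here the mean absolute error would be 0.5 but the RMSE is 1.0, which illustrates the sensitivity to large individual misses that the text treats as a useful property.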

From our selection of 22 patients that represent a coverage of the profiles seen we were able to identify two cohorts representing the two most frequently used modes of ventilation. From visual analysis we identified cohort 1 to mimic controlled followed by support mode of ventilation and cohort 2 to be pure support mode.

While patients can show any combination of controlled and support modes, and can even move from one to the other more than once as demanded by disease progression, clinical need etc., we restrict ourselves to just two patterns: controlled followed by support, or support alone. Patients 1 to 11 represent cohort 1 and patients 12 to 22 are labelled as cohort 2.

As discussed in Section 2, we apply smoothing to our raw data in order to extract the true trends in the patient's tidal volume. As described, we take averaged 15 minute bins as our patient data going forward. Figure 2 shows how smoothing the data can remove the large anomalies and variation that would potentially throw off our predictive models while still capturing the overall trend of the patient's data.

Initially we investigated predicting one fifteen-minute time step ahead for each of the patients. We set the number of trees to be generated at 10 to prevent overfitting. We compared the depth of the trees for each bagging regressor approximation. The mean depths of the RandomForest, ExtraTrees and Bagging trees were 32 ± 5, 37 ± 4 and 33 ± 5 respectively. The boosted regression methods, however, create trees of depth four by default.

We therefore compared the effect of increasing the number of trees created for these models for patient 1, finding a decreasing trend of RMSE for an increasing number of trees, as expected. We next investigated predicting up to four time steps ahead. One might at first expect that the further into the future one predicts, the larger the RMSE would be. However, the change in RMSE is in the second decimal place for all five methods (Tables 1 and 2).

Figure 5 shows one of the ten regressor trees created by the AdaBoost method for the prediction 4 time steps ahead for patient 1. The tree splits the data at each node based on the condition given, derived from the features shown, and arrives at a prediction by asking a series of questions of the data. The features utilised in Figure 5 have been extracted using tsfresh as the most significant features for the prediction of tidal volume for the given patient. It is interesting here to note what some of the features can mean (see Table 6 in the appendix).

Based on the results in Tables 1 and 2, and further on the basis of the easier interpretation of the decision trees made, we selected the AdaBoost regressor to examine the effectiveness of predicting one hour ahead for all patients.

Figure 5: One of the ten regression trees generated by the AdaBoost kernel for patient 1. Refer to Table 6 in the appendix for feature explanations.

We then proceeded to analyse our two LSTM models as described in Section 2. It is important to note that, because our LSTM models use 70% of the data for training and use 20 input points to predict 4, predictions cannot be made for patient 4, as this patient had only 40 data points. This is a drawback of the method.
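The data requirement can be made concrete with a sketch of the windowing step. This is an illustration under assumptions: the function names are hypothetical, the split is taken chronologically, and windows are built separately within the training and test portions.

```python
def make_windows(series, n_inputs=20, n_outputs=4):
    """Slide over the series, pairing n_inputs values with the next n_outputs values."""
    span = n_inputs + n_outputs
    return [
        (series[i:i + n_inputs], series[i + n_inputs:i + span])
        for i in range(len(series) - span + 1)
    ]


def split_and_window(series, train_percent=70, n_inputs=20, n_outputs=4):
    """Chronological train/test split followed by windowing of each part."""
    cut = len(series) * train_percent // 100
    train = make_windows(series[:cut], n_inputs, n_outputs)
    test = make_windows(series[cut:], n_inputs, n_outputs)
    return train, test


# A patient with only 40 binned readings, as for patient 4.
short_train, short_test = split_and_window(list(range(40)))
# A patient with 100 binned readings (25 hours of ventilation).
long_train, long_test = split_and_window(list(range(100)))
```

With 40 readings the test portion holds only 12 points, fewer than the 24 needed for a single input and output window, so no test prediction can be formed. With 100 readings the same procedure yields 47 training windows and 7 test windows.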

The VILIAlert system created an SMS alert to the clinicians whenever the LPV threshold was violated for four consecutive time bins. All of the generated alerts were stored in the database and used for comparison with our predictions. We take the time point of each generated alert and cross reference it with our predictions; if the predictions for the four previous time points are all greater than the defined LPV threshold, then our system would have predicted the alert one hour ahead. We therefore compared the recorded SMS alerts with the predictions, as shown in Table 5a.
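The cross-referencing rule can be sketched as follows. This is a minimal illustration, not the VILIAlert code: the function name, the list-of-bins representation and the example values are hypothetical, and it assumes that the four previous time points are the four 15-minute bins immediately before the bin in which the SMS alert was raised.

```python
def alert_predicted(predictions, alert_bin, threshold=8.0, n_bins=4):
    """Return True if the n_bins predictions before alert_bin all exceed the threshold.

    predictions: predicted tidal volume (ml/kg IBW), indexed by 15-minute bin.
    alert_bin:   index of the bin at which the recorded SMS alert was raised.
    """
    if alert_bin < n_bins:
        return False  # not enough prediction history before the alert
    window = predictions[alert_bin - n_bins:alert_bin]
    return all(value > threshold for value in window)


predictions = [7.5, 7.8, 8.2, 8.4, 8.3, 8.6, 8.1]
recorded_alert_bins = [5, 6]
true_positives = sum(alert_predicted(predictions, b) for b in recorded_alert_bins)
false_negatives = len(recorded_alert_bins) - true_positives
```

In this made-up example the alert at bin 6 is preceded by four predictions above 8 ml/kg and counts as predicted, whereas the alert at bin 5 is not, because the prediction for bin 1 (7.8) is below the threshold.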

Because our LSTM models use 70% of the data for training, we can only test their predictions on the remaining 30% of the data for each patient, which accounts for the differences in the results shown in Table 5a.

With true positives (TP) being alerts generated by the VILIAlert system that our model would have predicted one hour ahead, and false negatives (FN) being alerts that would not have been predicted, we can evaluate the accuracy of our predictive models. We report the accuracy using equation 2 in Table 5b.

We can see from Table 5a that our AdaBoost model performs accurately for the prediction of alerts generated by the VILIAlert system. Of the 84 alerts generated for patient 1, 81 would have been predicted an hour ahead of time; the violations could therefore have been prevented, ensuring safer ventilation of the patient. Further, our results show that the different modes of ventilation, and thus the cohorts of patients, do not have an effect on the predictive accuracy. In turn, this would indicate that the tidal volume of any patient, with any tidal volume profile, can be predicted within an accuracy of 10% using our AdaBoost model.
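Equation 2 is not reproduced in this text. Since only true positives and false negatives are defined, a plausible reading, and the one assumed in this sketch, is accuracy = TP / (TP + FN). The function name is hypothetical.

```python
def alert_accuracy(true_positives: int, false_negatives: int) -> float:
    """Fraction of recorded alerts that the model would have predicted.

    Assumes accuracy = TP / (TP + FN); the paper's equation 2 is not shown here.
    """
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no recorded alerts to evaluate")
    return true_positives / total


# Patient 1: 84 recorded alerts, 81 of which would have been predicted.
patient_1_accuracy = alert_accuracy(81, 84 - 81)
```

Under this assumption the patient 1 figures correspond to an accuracy of about 0.96.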

Ventilation is a valuable tool for treatment of patients in the ICU but has to be managed so that it does not in itself lead to lung injury. Early recognition of the potential for such damage is vital to assist the clinician.

In this paper we have studied the viability of methods for the prediction of tidal volume, methods based on machine learning techniques to provide early warning of over ventilation. We further utilised smoothing techniques and have demonstrated a smart alert system that has a predictive accuracy within 10% of true values.

It is important to ensure the quality of alerts in clinical decision support tools in order to reduce alarm fatigue. As discussed, alarm fatigue can increase response times and can even lead to alerts being overridden.

The accuracy depends on the tidal volume profile. For patients with values oscillating around the 8 ml/kg IBW threshold the accuracy is low; for example, in patient 6 the accuracy is only 0.27. Thus, while we are predicting these values with an RMSE of 1.43, they may not always be flagged as alerts. Going forward, we would deem it appropriate to have a threshold range of ±0.5 ml/kg IBW in order to improve true alert detection. Further, we would propose using a traffic light alert system in real time: green suggesting no breaches predicted, amber indicating that the 4 predictions for the next 1 hour period are within 8 ± 0.5 ml/kg, and red if all 4 predicted values for the next hour are above the 8 ml/kg threshold.
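The proposed traffic light rule can be sketched as follows. This is one literal reading of the proposal rather than a specified design: the function name is hypothetical, red is checked before amber, and amber is taken to mean that all four predictions lie within the ±0.5 ml/kg band around the threshold.

```python
def traffic_light(next_hour_predictions, threshold=8.0, margin=0.5):
    """Classify the four 15-minute predictions for the coming hour."""
    if len(next_hour_predictions) != 4:
        raise ValueError("expected four 15-minute predictions")
    if all(p > threshold for p in next_hour_predictions):
        return "red"    # every predicted value is above the threshold
    if all(abs(p - threshold) <= margin for p in next_hour_predictions):
        return "amber"  # every predicted value is inside the band around the threshold
    return "green"      # no breach predicted under this reading
```

Mixed cases, for example three values well above the threshold and one just below it, fall through to green under this literal reading; a deployed system would need an explicit rule for them.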

Our data was collected as part of an observational study and as historical data, has allowed us to investigate a significant number of data points, with over 4 million per minute tidal volume readings recorded.

We compare two different machine learning methods for the prediction of this metric 1 hour ahead, ensuring enough time for clinical intervention to prevent a threshold breach. We have found that decision trees are an adequate solution in as much as they deliver relevant predictions of threshold breaches within a few hours of starting ventilation and require minimal computational resources. Furthermore, we identified that the magnitude of the Ricker wavelet is a critical determinant in the analysis of the tidal volume waveform. This wavelet analysis arises in the study of seismic wave propagation through viscoelastic homogeneous media, under the approximation that Newtonian viscosity is valid. The viscoelastic characteristics of lung parenchyma and of additional fluid components, such as the haematocrit of blood in the pulmonary circulation, have been studied [53]; however, the changes associated with the development of extra-alveolar oedema are yet to be studied.

Development of an automated detection tool based on changes in viscoelastic properties could enable rapid detection of the development of cardiogenic and non-cardiogenic oedema in the lungs. Earlier and automated detection will guide fluid balance strategies, which are associated with clinical outcomes [54], as well as earlier institution of investigations such as lung ultrasound to assess extravascular lung water [55]. We aim to investigate this further in future studies where we select patients with specific lung pathophysiologies. More generally, we are building on the foundational work in this paper in a new project which seeks to optimize mechanical ventilation to deliver lung protective ventilation, predict the development of ventilator associated conditions and guide weaning. While our work is limited by being a single centre, retrospective study, we are confident that our techniques and models work independently of the patient profile or mode of ventilation utilised. We therefore expect our models to be generalisable, although they have not yet been externally validated, which we plan to do in future work.

Human physiology is a complex dynamic system and thus we will incorporate additional physiological data streams to predict deterioration more accurately and gain a more in depth understanding of patient states.

Building such systems involves consideration of many variability points and, within those, several configuration settings. In our work each of the tree based models is an ensemble in its own right. Other systems have also been studied recently, proving the benefit of machine learning in healthcare [56]. In the financial domain, Krauss and co-workers [57] found that applying a higher level of ensemble proved to be a powerful model, and we intend to investigate similar ensembles applied to physiology in future work, as highlighted in Figure 6.

Rachael Hagan acknowledges funding for a PhD studentship from the Department for the Economy Northern Ireland and Dr. Murali Shyamsundar is an NIHR Clinical Scientist fellow.

MS and CJG designed the VILIAlert system and CJG implemented the software to gather the data. RH created and implemented all of the analytics software which generated the results reported in this paper. MS and DMcA provided clinical input on the algorithms and analysis of the results, while CJG and IS provided input on the software implementation and computational validation of the results. RH led the writing of the manuscript with feedback from all co-authors.

The authors declare no competing interests.

Data and code will be made available on request and will be released for the purpose of replicating the results.

The epidemiology of mechanical ventilation use in the United States

Number of mechanically ventilated patients during

Ventilator-induced cell wounding and repair in the intact lung

Mechanisms of ventilator-induced lung injury

Intermittent positive-pressure hyperventilation with high inflation pressures produces pulmonary microvascular injury in rats

Ventilation with lower tidal volumes as compared with traditional tidal volumes for acute lung injury and the acute respiratory distress syndrome

Association between use of lung-protective ventilation with lower tidal volumes and clinical outcomes among patients without acute respiratory distress syndrome: a meta-analysis

Lung-protective ventilation with low tidal volumes and the occurrence of pulmonary complications in patients without acute respiratory distress syndrome: A systematic review and individual patient data analysis

The pulmonary physician in critical care 8: Ventilatory management of ALI/ARDS

Clinical guide for the management of cancer patients during the coronavirus pandemic

Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study

Epidemiology, patterns of care, and mortality for patients with acute respiratory distress syndrome in intensive care units in 50 countries

Clinician Recognition of the Acute Respiratory Distress Syndrome: Risk Factors for Under-Recognition and Trends Over Time

Effect of a clinical decision support system on adherence to a lower tidal volume mechanical ventilation strategy

Better ventilator settings using a computerized clinical tool

Increasing compliance with low tidal volume ventilation in the ICU with two nudge-based interventions: Evaluation through intervention time-series analyses

Underuse of lung protective ventilation: Analysis of potential factors to explain physician behavior

The History of Critical Care Medicine: The Past Present and Future, Intensive and Critical Care Medicine

Monitoring the Critically Ill Patient

Classifying neonatal spells using real-time temporal analysis of physiological data streams: Algorithm development

Real-Time Analysis for Intensive Care

Artificial intelligence in the intensive care unit

A reinforcement learning approach to weaning of mechanical ventilation in intensive care units

An Interpretable Machine Learning Model for Accurate Prediction of Sepsis in the ICU

The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care

A Clinically Applicable Approach to Continuous Prediction of Future Acute Kidney Injury

Advanced analytics for outcome prediction in intensive care units

Interpretable Deep Models for ICU Outcome Prediction. AMIA

Prediction of mortality from respiratory distress among long-term mechanically ventilated patients

Effects of workload, work complexity, and repeated alerts on alert fatigue in a clinical decision support system

The wolf is crying in the operating room: Patient monitor and anesthesia workstation alarming patterns during cardiac surgery

Changes in Default Alarm Settings and Standard In-Service are Insufficient to Improve Alarm Fatigue in an Intensive Care Unit: A Pilot Project

Patient monitoring alarms in the ICU and in the operating room

Limiting ventilator-induced lung injury through individual electronic medical record surveillance

Prediction of imminent, severe deterioration of children with parallel circulations using realtime processing of physiologic data

Using artificial intelligence to predict prolonged mechanical ventilation and tracheostomy placement

Personalized Healthcare through Technology

Expediting assessments of database performance for streams of respiratory parameters

Individual patient data analysis of tidal volumes used in three large randomized control trials involving patients with acute respiratory distress syndrome

Comparison of ARIMA and Random Forest time series models for prediction of avian influenza H5N1 outbreaks

Limiting ventilator-induced lung injury through individual electronic medical record surveillance

Greedy Function Approximation

Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests

Python deep learning : exploring deep learning techniques and neural network architectures with PyTorch, Keras, and TensorFlow

Finding structure in time

Research prospective on neural network forecasting

Neural network forecasting for seasonal and trend time series

Statistical and Machine Learning forecasting methods: Concerns and ways forward

Deep learning

Understanding deep neural networks with rectified linear units

Another look at measures of forecast accuracy

Components of information for multiple resolution comparison between maps that share a real variable

Airway mechanics and lung tissue viscoelasticity: effects of altered blood hematocrit in the pulmonary circulation

Comparison of two fluid-management strategies in acute lung injury

Clinical review: the role of ultrasound in estimating extra-vascular lung water

Machine learning for early prediction of circulatory failure in the intensive care unit

Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S and P 500

Figure 1: Flow diagram highlighting the process and methodology used, as described in Section 2, for the prediction of tidal volume.

Table 6: Feature explanations for the features calculated using tsfresh for patient 1.

Calculated feature: Explanation
Ricker wavelet: Calculates a continuous wavelet transform for the Ricker wavelet.
Linear least-squares: Calculates a linear least-squares regression for values of the time series.
Change A: Calculates the average, absolute value of consecutive changes of the series x inside a window.
Friedrich coefficient: Coefficients of the polynomial h(x), which has been fitted to the deterministic dynamics of the Langevin model.