key: cord-0517952-acwgyxlx authors: Mehrabadi, Milad Asgari; Aqajari, Seyed Amir Hossein; Azimi, Iman; Downs, Charles A; Dutt, Nikil; Rahmani, Amir M title: Detection of COVID-19 Using Heart Rate and Blood Pressure: Lessons Learned from Patients with ARDS date: 2020-11-12 journal: nan DOI: nan sha: deeaeb6080f749cd9003c613cc14a208a7e1595e doc_id: 517952 cord_uid: acwgyxlx
The world has been affected by the COVID-19 coronavirus. At the time of this study, the number of infected people in the United States is the highest globally (7.9 million infections). Within the infected population, patients diagnosed with acute respiratory distress syndrome (ARDS) are in more life-threatening circumstances, resulting in severe respiratory system failure. Various studies have investigated COVID-19 infection and ARDS by monitoring laboratory metrics and symptoms. Unfortunately, these methods are limited to clinical settings, and symptom-based methods have been shown to be ineffective. In contrast, vital signs (e.g., heart rate) have been used for the early detection of different respiratory diseases in ubiquitous health monitoring. We posit that such biomarkers are informative in identifying ARDS patients infected with COVID-19. In this study, we investigate the behavior of COVID-19 in ARDS patients using simple vital signs. We analyze the long-term daily logs of blood pressure and heart rate from 70 ARDS patients admitted to five University of California academic health centers (42506 samples for each vital sign) to distinguish subjects with positive and negative COVID-19 test results. In addition to the statistical analysis, we develop a deep neural network model to extract features from the longitudinal data. Using only the first eight days of data, our deep learning model achieves 78.79% accuracy in classifying the vital signs of ARDS patients infected with COVID-19 versus other ARDS patients.
Acute respiratory distress syndrome (ARDS) is a life-threatening consequence of infection with SARS-CoV-2, the novel coronavirus that causes COVID-19 [1]. ARDS is characterized by an overwhelming immune response and non-cardiogenic pulmonary edema that compromise gas exchange, resulting in severe respiratory failure. ARDS mortality ranges from 40% to 60%; however, it is unclear whether mortality is substantially higher when ARDS is associated with COVID-19 infection, as reported rates vary from 28.8% to 62% [1, 2]. Currently, more than 38 million people worldwide have been infected with SARS-CoV-2 [3]. In the United States, 7.9 million people have been infected, with over 216,000 deaths [3]. The impact of the COVID-19 pandemic is considerable, and efforts to mitigate its spread through early detection cannot be overemphasized.
COVID-19 infections have conventionally been investigated in clinical settings by monitoring laboratory metrics and symptoms [4, 5]. These studies have relied on large numbers of subjective questionnaires and invasive laboratory test results. For example, Jehi et al. [4] used a large number of features extracted from demographics, comorbidities, immunization history, symptoms, travel history, laboratory variables, and medications to predict infection with COVID-19. Li et al. [1] show that the oxygenation index and respiratory system compliance can be leveraged to study ARDS patients infected with COVID-19. Force et al. [6] propose that ARDS caused by factors other than COVID-19 results in reduced lung compliance. However, reduced lung compliance in ARDS is typical of the disease [1]. Such diagnostics are the gold-standard methods for investigating COVID-19 and ARDS patients; however, they are limited to hospitals and clinical settings. Moreover, subjective symptom-based analyses were shown to be an ineffective strategy for assessing an individual's likelihood of contracting COVID-19 [5].
In contrast, various studies have shown that vital signs such as heart rate and blood pressure can be exploited for early detection of infections and respiratory diseases [7]. We posit that such biomarkers are informative in identifying ARDS patients infected with COVID-19. These biomarkers can be collected continuously and remotely thanks to recent advances in wearable electronics and Internet-of-Things-based devices. If they prove effective for early COVID-19 detection, monitoring services can therefore be extended to remote settings. Understanding and leveraging these clinical measurements also plays a significant role in preventive care and treatment [8, 9].
Recognizing COVID-19 infections from large volumes of sensor data requires novel modeling and analysis techniques. State-of-the-art studies often use traditional statistical models to predict COVID-19 infections. These studies have mostly examined linear statistical relationships and associations between health parameters or features extracted from the subject's demographics, symptoms, laboratory tests, and medications [4, 5]. For example, a full multi-variable logistic model is constructed in [4] to predict COVID-19 using extracted features. However, data with a complex, intensive longitudinal structure and temporal characteristics need to be investigated using nonlinear and more advanced methods. Machine learning algorithms, including artificial neural networks, can be tailored to extract linear and nonlinear correlations in the data collected throughout health monitoring. In this paper, we investigate the behavior of COVID-19 in ARDS patients using three longitudinal features: systolic blood pressure, diastolic blood pressure, and heart rate. We compare individuals who developed ARDS with and without COVID-19 to assess potential markers that could be used in early detection and prevention strategies.
We use the University of California COVID Research Data Set (UC-CORDS) [10], which contains comprehensive, structured information from patients admitted to the University of California Health's five academic health centers (i.e., UC Davis Health, UC San Diego Health, UC Irvine Health, UCLA Health, and UCSF Health). We utilize statistical features and neural networks to distinguish between ARDS caused by COVID-19 and ARDS caused by other factors. The biomarkers investigated in this study (i.e., heart rate and blood pressure) have the potential for scalable COVID-19 prevention and monitoring in everyday settings, thanks to the ubiquitous availability of inexpensive, non-invasive portable and wearable sensors. For instance, the Omron® HeartGuide wristband [11] is an FDA-cleared example of a wearable device capable of monitoring all three biomarkers. These observations have potential applications across community settings and for those living in collective housing.
In this section, we discuss the results obtained by statistical analysis and neural networks. We measured basic statistical features of blood pressure (BP) and heart rate (HR) and compared them with COVID-19 test results. Table 1 shows the Point Biserial correlation of these features and of age with the test results, and Table 2 presents the 95% confidence interval (CI) of these features for each test group. Table 1 suggests significant negative correlations between the test results and both the resting HR (min HR) and the maximum value of diastolic blood pressure (DBP). Fig. 1 illustrates the difference in the distribution of resting HR between the positive and negative test result groups. The average resting HRs were 55.23 and 48.06 for the negative and positive test groups, respectively. Although resting HR shows a significant correlation with the test results, the distributions of this feature for positive and negative results overlap.
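The basic per-patient features discussed here can be illustrated with a minimal sketch. It assumes each patient's readings are available as plain lists keyed by channel name; the function names, the toy readings, and the choice of the sample standard deviation are assumptions made for illustration and are not the study's code or data.

```python
from statistics import mean, stdev


def summarize(values):
    """Return mean, min, max, and sample std of one vital-sign series."""
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "std": stdev(values),  # assumption: sample (n-1) standard deviation
    }


def patient_features(record):
    """Flatten per-channel summaries into one feature dict per patient.

    `record` maps a channel name ("HR", "SBP", "DBP") to a list of readings.
    """
    features = {}
    for channel, values in record.items():
        for stat_name, stat_value in summarize(values).items():
            features[f"{stat_name}_{channel}"] = stat_value
    return features


# Hypothetical toy readings for a single patient (not UC-CORDS data).
toy_record = {
    "HR": [62, 58, 71, 55],
    "SBP": [118, 125, 121, 130],
    "DBP": [76, 80, 79, 85],
}
toy_features = patient_features(toy_record)

# The text uses the minimum HR as the resting HR.
resting_hr = toy_features["min_HR"]
```

Each patient then contributes one row of twelve features (four statistics for each of the three channels), which can be compared across the positive and negative test groups.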
Due to the longitudinal nature of the data, we consider a deep neural network architecture to predict the test results using only BP and HR. The accuracy of this model reached 74.32% on the entire test data. We also evaluated the model on test sets restricted to N days of data (N ∈ {2, 4, ..., 28}) to assess its performance when only a limited number of days is available. Fig. 2a shows the accuracy of the classification model with respect to the number of days, and Fig. 2b illustrates the corresponding area under the curve (AUC). Fig. 2a shows an initial increase in the model's accuracy, starting from 59.85% and reaching as high as 78.79% on the 8th day. To better visualize the features extracted by the neural network, we used the t-SNE method [12] to reduce the feature space to two dimensions. We applied this method to the output of the dense layer with 100 neurons. Fig. 3 visualizes the test data for different numbers of included days. Fig. 3a shows that, using the features extracted by the deep neural network, the positive and negative cases are almost separated. As the number of samples increases, determining the decision boundary becomes more challenging; however, the clusters are still distinguishable (Fig. 3b).
A few of our observations warrant additional discussion. First, monitoring of blood pressure and heart rate may provide a useful strategy for individuals living in collective communities, such as nursing homes or rehabilitation facilities, as well as for healthy community-dwelling adults. The potential impact could be to mitigate the spread of COVID-19 and to allow early detection of complications associated with infection, particularly in those at greater risk for ARDS.
Second, we assessed the presence of comorbidities in COVID-19-positive patients with ARDS and found that comorbid diagnoses such as type 2 diabetes mellitus, hyperglycemia, chronic obstructive pulmonary disease, elevated transaminase and lactic acid dehydrogenase, bradycardia, acute ST-segment elevation myocardial infarction, and metabolic derangements were more prevalent (data not shown). This observation is in line with other reports [1, 2, 13] demonstrating increased vulnerability among those with chronic health conditions, as well as reported metabolic derangements observed with COVID-19 infection, especially among adults over 60 years of age. Third, there are other potential applications in modeling COVID-19. Specifically, there has been discussion of how early COVID-19 arrived in the United States; the first cases were reported in California. It would be possible to review data from before the first reported cases in the U.S. to validate the presence or absence of COVID-19 in our communities prior to January 2020. This is important because the viral genome sequence was confirmed in late January 2020, which allowed the use of polymerase chain reaction to detect viral genetic material [14]. Antibody testing, which has been shown to be inconsistent, was used in the preceding months, raising the question of how early COVID-19 was present in the United States.
Related studies in the literature propose prediction models for a patient's infection with COVID-19 in laboratory settings. Jehi et al. [4] created a statistical model to predict infection with COVID-19 using data from 11672 patients tested before April 2, 2020. A full multi-variable logistic model was initially constructed to predict COVID-19 using features extracted from demographics, comorbidities, immunization history, symptoms, travel history, laboratory variables, and medications before testing.
Although their c-index ranged from 0.839 to 0.863, their statistical model requires a broad set of features to predict a patient's infection with COVID-19. One drawback of their work is that some of these features can only be measured in clinical laboratory settings. In contrast, we considered two easily accessible vital signs (blood pressure and heart rate) and used a deep learning method to capture the short- and long-term dependencies in the time series data. There is a correlation between the simple statistical features and the test results. However, simple logistic regression models are insufficient due to the overlap in the feature space. Leveraging the nonlinear features extracted by our proposed neural network, we distinguished negative and positive COVID-19 test results with an AUC as high as 0.83 using only blood pressure and heart rate values. Moreover, Callahan et al. [5] investigated whether symptom-based screening is feasible for prioritizing testing. To assess feasibility, they started by predicting test results for common respiratory viruses that can co-infect patients positive for SARS-CoV-2 at Stanford Healthcare. They evaluated symptoms mentioned in clinical notes at the time the test was performed. For the respiratory viruses, the area under the receiver operating characteristic curve (AUROC) on the test data ranged from 0.60 to 0.77. They concluded that two of the non-SARS-CoV-2 viruses (i.e., influenza type A and RSV) were moderately predictable given presenting symptoms. However, SARS-CoV-2 and the remaining common respiratory viruses were not highly predictable (AUROCs below 0.70). Although they concluded that symptom-based screening is an ineffective strategy for predicting a person's infection with COVID-19, they did not investigate the use of vital signs (i.e., heart rate and blood pressure). In contrast, in this study we focused mainly on ARDS patients as a population.
Although our findings are based only on this population, they could guide future research investigating the aforementioned vital signs for COVID-19 prediction tasks in other populations as well.
Another category of detection models focuses on identifying the characteristics of patients with COVID-19 over a specific time window, usually 1-2 months [15] [16] [17]. Some of these studies aim to use the identified characteristics to predict critical cases of COVID-19, specifically those most likely to require hospitalization or intensive care unit (ICU) admission. In [18] [19] [20], the authors attempted to find the best predictors of ICU admission among patients infected with COVID-19.
In conclusion, we investigated the nonlinear patterns in simple vital signs, namely blood pressure and heart rate, which can be measured easily and reliably without the need for skilled medical professionals, in ARDS patients with positive and negative COVID-19 test results. Our proposed neural network-based model achieved 78.79% accuracy using only the first eight days of data. Such prediction methods can reduce the number of visits to hospitals or care sites, as well as the chance of virus spread. Using wearable devices, it is possible to monitor the vital signs of subjects in everyday settings without visiting a hospital or a care site. The proposed model could thus allow early detection of COVID-19 cases in free-living conditions.
The data set plays an essential role in any prediction task. The UC-CORDS data set provides comprehensive, structured information on patients admitted to the hospital at the University of California Health's five academic health centers (i.e., UC Davis Health, UC San Diego Health, UC Irvine Health, UCLA Health, and UCSF Health). This data set provides a wide range of information, including different observations, measurements, and COVID-19 test results of patients.
Notably, the vital signs are recorded every 15, 30, or 60 minutes, depending on the time of day. For this study, we aimed to select participants with ARDS hospitalized after January 1st, 2020. The earliest available data in UC-CORDS for COVID-19-positive inpatients with ARDS was from March 21st, 2020. Therefore, we only considered hospitalized ARDS patients (IDs 4195694 and 4191650 from the SNOMED vocabulary [21]) tested between March 21st, 2020 and August 1st, 2020. Since observations with negative COVID-19 test results outnumbered those with positive results, we considered only patients with negative test results after July 1st, 2020. This re-sampling resulted in a more balanced data set (i.e., 19449 data points for each feature in the positive group and 23057 in the negative group). As of August 1st, 2020, this led to 32 and 38 participants with positive and negative test results, respectively. Table 3 shows the age distribution of patients for each test result. Another valuable aspect of this data set is the longitudinal monitoring of vital signs. Fig. 4 shows the distribution of the duration, in days, of the available data (i.e., blood pressure and heart rate) for each test result group. Throughout the remainder of this paper, a value of 0 represents a negative result and 1 a positive result. Fig. 5 shows the cumulative number of samples per day for each test group.
The data was jointly reviewed by the Institutional Review Boards of all UC Health campuses and was determined to be non-human subjects research. Moreover, UC-CORDS does not contain any patient identifiers such as name or phone number. However, all original service dates (e.g., the date of the COVID-19 test) are preserved, and partial address information is available (i.e., town or city, state, and zip code). As such, UC-CORDS is a HIPAA Limited Data Set.
In this study, we were interested in predicting COVID-19 test results using longitudinal heart rate and blood pressure monitoring.
To perform the prediction, we used a deep neural network architecture combining Convolutional and Recurrent Neural Networks (CNN and RNN). Such a model was used to leverage the embedded structure of longitudinal data. We considered three channels of vital signs, i.e., heart rate (HR) and systolic and diastolic blood pressure (SBP and DBP), as the inputs, and the test result as the network's output. Fig. 6 illustrates the detailed structure of the proposed network. It consists of two 1-dimensional convolutional layers, followed by a max-pooling layer, a long short-term memory (LSTM) [22] layer, and finally two fully connected dense layers. We randomly selected 80% of patients as the training data and the rest as the test data. We labeled positive COVID-19 test results with 1 (13918 samples in the training data, 43.86%, and 3483 samples in the test data, 55.28%) and negative ones with 0. The TensorFlow package for Python was used for training and testing. To evaluate the prediction's effectiveness, we also tested our model on different time intervals for the test subjects. In other words, we were interested in whether the test result could be predicted from only a limited number of days of samples. To answer this question, test samples of different lengths (from 2 days of data up to 28 days) were extracted. Finally, for visualization purposes, the t-SNE method [12] was applied to the output of the dense layer of the neural network to reduce the feature space to two dimensions.
To show the correlation between the features (i.e., blood pressures and heart rates) and the test results, statistical features were extracted. We measured basic features, including the mean, minimum (min), maximum (max), and standard deviation (std) of DBP, SBP, and HR. We then computed the Point Biserial correlation between these features and the test results.
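As a minimal sketch of how such a coefficient can be computed, the function below implements the textbook point-biserial formula, r_pb = (M1 - M0) / s_n * sqrt(n1 * n0 / n^2), using only the Python standard library. Here M1 and M0 are the group means of the continuous variable, s_n is its population standard deviation, and n1 and n0 are the group sizes. The function name and the toy values are hypothetical and are not taken from UC-CORDS; this is an illustration, not the study's implementation.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(binary, continuous):
    """Point-biserial correlation between 0/1 labels and a continuous variable."""
    if len(binary) != len(continuous):
        raise ValueError("inputs must have the same length")
    group1 = [x for b, x in zip(binary, continuous) if b == 1]
    group0 = [x for b, x in zip(binary, continuous) if b == 0]
    if not group1 or not group0:
        raise ValueError("both label groups must be non-empty")
    spread = pstdev(continuous)  # population std over all observations
    if spread == 0:
        raise ValueError("continuous variable has zero variance")
    n1, n0, n = len(group1), len(group0), len(continuous)
    return (mean(group1) - mean(group0)) / spread * sqrt(n1 * n0 / n**2)


# Hypothetical toy data: 0 = negative test, 1 = positive test,
# paired with made-up resting heart rates.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_resting_hr = [58, 55, 60, 47, 50, 49]
r_toy = point_biserial(toy_labels, toy_resting_hr)
```

On this toy data the coefficient is negative because the positive group has the lower resting heart rates, which matches the direction, though not necessarily the magnitude, of the association reported for resting HR.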
This correlation is similar to Pearson's correlation and is used when one of the variables is binary and the other is continuous [23].
This study's data set is not publicly available, as it contains protected patient health information and is institutional property.
The code developed for this study may be made available upon request for non-commercial use.
The project described was supported by the National Center for Research Resources and the National Center for Advancing Translational Sciences, National Institutes of Health, through Grant UL1 TR001414. Charles A. Downs is also supported by NR016957. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
The authors declare no competing interests.
[1] Acute respiratory failure in COVID-19: is it "typical" ARDS?
[2] Comparison of hospitalized patients with ARDS caused by COVID-19 and H1N1
[3] COVID-19 dashboard by the
[4] Individualizing risk prediction for positive COVID-19 testing: results from 11,672 patients
[5] Estimating the efficacy of symptom-based screening for COVID-19
[6] Acute respiratory distress syndrome
[7] Early sepsis detection in critical care patients using multiscale blood pressure and heart rate dynamics
[8] Comparison of the clinical course of COVID-19 pneumonia and acute respiratory distress syndrome in 2 passengers from the cruise ship Diamond Princess in February 2020
[9] COVID-19 pneumonia: ARDS or not
[10] University of California Health creates centralized data set to accelerate COVID-19 research
[11] Wearable Blood Pressure Monitor and Watch, HeartGuide by OMRON
[12] Visualizing data using t-SNE
[13] Risk factors associated with acute respiratory distress syndrome and death in patients with coronavirus disease 2019 pneumonia in Wuhan, China
[14] Whole genome of novel coronavirus, 2019-nCoV, sequenced
[15] COVID-19 in critically ill patients in the Seattle region: case series
[16] Clinical characteristics of COVID-19 in New York City
[17] The Northwell COVID-19 Research Consortium. Presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with COVID-19 in the New York City area
[18] Predictors of intensive care unit admission in patients with coronavirus disease 2019 (COVID-19)
[19] Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy region, Italy
[20] Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China
[21] SNOMED Clinical Terms
[22] Long short-term memory
[23] Handbook of parametric and nonparametric statistical procedures