key: cord-0477538-hs202c1r authors: Rashed, Essam A.; Kodera, Sachiko; Shirakami, Hidenobu; Kawaguchi, Ryotetsu; Watanabe, Kazuhiro; Hirata, Akimasa title: Knowledge discovery from emergency ambulance dispatch during COVID-19: A case study of Nagoya City, Japan date: 2021-02-17 journal: nan DOI: nan sha: 9f81422df01ca3b00203f34a6cb2b104794bb397 doc_id: 477538 cord_uid: hs202c1r Accurate forecasting of medical service requirements is an important big data problem that is crucial for resource management in critical times such as natural disasters and pandemics. With the global spread of coronavirus disease 2019 (COVID-19), several concerns have been raised regarding the ability of medical systems to handle sudden changes in the daily routines of healthcare providers. One significant problem is the management of ambulance dispatch and control during a pandemic. To help address this problem, we first analyze ambulance dispatch data records from April 2014 to August 2020 for Nagoya City, Japan. Significant changes were observed in the data during the pandemic, including the state of emergency (SoE) declared across Japan. In this study, we propose a deep learning framework based on recurrent neural networks to estimate the number of emergency ambulance dispatches (EADs) during a SoE. The fusion of data includes environmental factors, the localization data of mobile phone users, and the past history of EADs, thereby providing a general framework for knowledge discovery and better resource management. The results indicate that the proposed blend of training data can be used efficiently in a real-world estimation of EAD requirements during periods of high uncertainties such as pandemics. An ambulance is one of the most important healthcare tools providing an essential life-saving role on a daily basis. 
The management and location of the ambulance dispatch center are known to reduce the rate of fatalities, particularly during a national emergency caused by a natural disaster or wide-spread rate diagnosis and treatment. To prevent virus infection at hospitals, an early pandemic protocol was considered [6]. Emergency service management should be included in this protocol. Later, guidance on the prevention of health worker infection was released by the World Health Organization (WHO) [7]. Paramedics have also suffered from the spread of this novel virus; thus, more careful management than usual is required [8] [9] [10]. In Japan, the first state of emergency (SoE) was declared nationwide on April 16, 2020, and was revoked on May 25, 2020. Note that, during this pandemic, a complete closure policy was not adopted in Japan; rather, voluntary segregation and seclusion with community cooperation was applied. During the SoE, outdoor activities were reduced, particularly in commonly crowded regions such as major train stations. From April 18, 2020, NTT Docomo, Inc. (a mobile phone operator in Japan) started to provide publicly available data on activities based on the user locations of mobile phones or smartphones in major traffic stations nationwide. The careful management of healthcare infrastructure is required for fair allocation and usage, especially where these resources are limited [11]. Owing to a lack of data on ambulance services in such a pandemic era, it was unclear how many allocations were needed for ambulances, including the number of emergency ambulance dispatches (EADs) and special requirements for additional processes, such as disinfection after the transport of potentially positive cases.
The open questions here are i) how accurate EAD forecasts can be when applying environmental factors, ii) how the changes caused by abnormalities such as a SoE during a pandemic should be handled, and iii) what the main reasons for such changes are, including the relationship with human activities. If such information is made available, additional measures for ambulance management can be implemented. For example, it will be possible to strictly limit the use of certain ambulance units for potential COVID-19 patients. This study first analyzed EAD data recorded from April 1, 2014, to August 18, 2020, in Nagoya City, Japan. This analysis has led to a high-quality estimation of the number of EADs based on data gathered before the COVID-19 era using machine learning approaches. A comparison of the estimated and actual EADs observed during the pandemic clarifies the differences caused by the COVID-19 outbreak. A data analysis model can provide a better understanding of the potential approaches used to estimate the number of EADs during a pandemic and calls during a SoE. The main factor for this is also discussed with a view to potential future pandemics, including third waves of COVID-19. To the best of the authors' knowledge, this is the first study that applies a deep learning approach to forecast the daily number of EADs when considering environmental factors, even during non-pandemic states. The main contributions of this study can be summarized as follows:
• A machine learning architecture for accurate EAD forecasting in urban areas from environmental factors
• A trained long short-term memory (LSTM) network for EAD estimation in Nagoya City, Japan, with a potential extension to other regions based on data availability
• The introduction of a new social factor (i.e., mobile phone usage) that can be used to fine-tune the EAD forecasting during a pandemic

2. Materials and methods

Nagoya is a major city in Japan located in the central region of Honshu Island (Fig.
1) with a population that varied from 2,272k to 2,316k (2014-2020), which makes Nagoya the fourth largest city nationwide 1 . All daily EAD data were collected by authorities at the Fire Department of the City. Fig. 2 illustrates the daily total number of EADs from April 2014 to August 2020. The total number of EADs can be thought of as a U-shaped plot with two annual peaks during the summer and winter, which are highly associated with the upper and lower temperature peaks. Weather data, including the maximum and minimum daily temperatures and other related factors such as humidity, were collected for Nagoya City from the online resources of the Japan Meteorological Agency 2 . We also processed data representing the variations (in percentage) in the number of people around the major transport stations in Japan, collected by the mobile carrier NTT Docomo, Inc. and released on the web 3,4 . These data were made available from April 18, 2020, immediately after the emergency declaration. Note that the national market share of NTT Docomo is approximately 36.9% (ranked first) 5 . The data are based on the estimated statistical population generated from mobile terminal network operational data [12]. It is well known that the number of ambulance dispatches is related to the daily average ambient temperature [13] [14] [15] [16] [17] [18] [19]. The effects of the daily maximum and minimum ambient temperatures, as well as the relative humidity, on the number of daily EADs are shown in Fig. 3. A common U-shaped curve can be observed for both the maximum and minimum temperatures, whereas the relative humidity is associated with a slightly higher number of EADs within the middle range. However, the effect of environmental factors can be split into different patterns based on the cause or illness corresponding to the EADs. In addition, weekends and holidays are other factors that characterize human social activities. Furthermore, the age of the population affects such activities.
The typical retirement age in Japan is currently approximately 60-65 years. In addition, an age of over 65 is classified as elderly; thus, 65 years is used as a reference value in this study. The aforementioned factors were used as input when applying machine learning to the data of the previous 5 years. The statistics are computed separately for the populations younger and older than 65 years. The learning-based estimation is applied as a reference value for 2020 and then compared with the observed value determined by the city fire department. Unexpected pandemics are known to cause extensive requirements for medical care services and special resource management [20]. The data presented in Fig. 4 clearly show that the number of EADs in 2020 is lower than that during the past 5 years. Figure 5 shows the number of EADs according to the maximum daily ambient temperature. Data were divided into categories based on the pickup site (i.e., indoor or outdoor). The normal pattern observed since April 2014 changed significantly during the pandemic, as smaller numbers of dispatches were observed in both location categories. A further reduction was also recognized during and after the first SoE (up to August 19). One potential reason for the reduction in EADs during the pandemic is the change in social activities. However, there are several other factors, such as enhanced hygiene [21] and the avoidance of medical facilities to prevent potential COVID-19 infection [22]. It is difficult to single out the main reason, but the reduction is likely caused by a combination of many factors, including those mentioned above. Even with models based on artificial intelligence (or machine learning), such data patterns are challenging to estimate when the models are trained on data measured under normal situations. Recurrent neural networks (RNNs) are a broad family of network architectures with interneuronal connections that form a memory-like unit [23].
LSTM is a commonly used class of RNN that performs well with long-term dependencies (Fig. 6). To address this problem, we consider an LSTM-based network architecture. Consider a time-series data sample in which each entry is the measurement of the ith variable at the tth time frame. The computation within a single LSTM node is expressed in terms of the input gate, output gate, forget gate, memory gate, and node state s^(t) at time frame t (here, day index). W and b are the node weight (parameter) and bias matrices, respectively. σ and φ are the sigmoid and tanh functions, respectively. LSTM layers are trained using time-series data to optimize the parameters for future forecasting tasks. The proposed architecture consists of two LSTM layers (with 50 and 30 nodes) and three fully connected (FC) layers (with 300, 100, and K nodes). The input data vector contains the daily maximum ambient temperature, the average relative humidity, and a binary label that distinguishes working days from nonworking days (weekends or national holidays). The network output consists of K nodes, one for each future day to be predicted. The network parameters (W) and biases (b) are initialized with zero matrices. In addition, the Adam algorithm [24] is used for data fitting during training along with the cross-entropy loss function and an automatically estimated learning rate. The network architecture was implemented using Wolfram Mathematica ® ver. 12.1 on a workstation with four Intel ® Xeon CPUs running at 3.60 GHz, with 128 GB of memory and three NVIDIA GeForce 1080 GPUs. We investigated two scenarios that consider the pre-pandemic and ongoing pandemic periods. The pre-pandemic results are used to validate the feasibility of the LSTM architecture in estimating the number of daily EADs when considering the different categories and conditions.
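The single-node LSTM computation described above can be sketched in NumPy. The node equations are not reproduced in this text, so the gate formulas below follow the standard LSTM formulation and are an assumption rather than the authors' exact notation. The separate internal cell value `c`, the dictionary layout of `W` and `b`, the function names, and the sample input values are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid (sigma in the text)."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, s_prev, c_prev, W, b):
    """One time step of a standard LSTM layer (assumed formulation).

    x      : input vector at time frame t, shape (n_in,)
    s_prev : node state (layer output) at t-1, shape (n_nodes,)
    c_prev : internal cell value at t-1, shape (n_nodes,)
    W[g]   : weights of gate g, shape (n_nodes, n_in + n_nodes)
    b[g]   : bias of gate g, shape (n_nodes,)
    """
    z = np.concatenate([x, s_prev])            # current input and previous state
    i = sigmoid(W["input"] @ z + b["input"])   # input gate
    o = sigmoid(W["output"] @ z + b["output"]) # output gate
    f = sigmoid(W["forget"] @ z + b["forget"]) # forget gate
    m = np.tanh(W["memory"] @ z + b["memory"]) # memory gate (phi = tanh)
    c = f * c_prev + i * m                     # updated internal cell value
    s = o * np.tanh(c)                         # updated node state
    return s, c

def zero_params(n_in, n_nodes):
    """Zero-initialized weights and biases, as stated in the text."""
    gates = ("input", "output", "forget", "memory")
    W = {g: np.zeros((n_nodes, n_in + n_nodes)) for g in gates}
    b = {g: np.zeros(n_nodes) for g in gates}
    return W, b

# Illustrative daily input: max temperature, relative humidity, working-day label.
x = np.array([31.5, 60.0, 1.0])
W, b = zero_params(n_in=3, n_nodes=50)         # 50 nodes, as in the first LSTM layer
s, c = lstm_step(x, np.zeros(50), np.zeros(50), W, b)
```

Under this assumed formulation, zero weights make every gate equal to 0.5 and the memory gate equal to tanh(0) = 0, so the first step from a zero state leaves the node state at zero regardless of the input.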
We further analyzed the effect of the COVID-19 pandemic and demonstrated a fine-tuning step that effectively handles the bias factors resulting from such abnormalities. The forecast accuracy is validated using the correlation coefficient (CC) and the mean absolute error (MAE), both computed between u and v, the n-size vectors representing the real and estimated numbers of EADs, respectively. Under the first (pre-pandemic) scenario, we consider data from April 1, 2014, to December 31, 2018, for training and 2019 data for testing. The output is split into different categories considering the pickup location (indoor/outdoor) and age categories: children (0 ≤ age ≤ 15), adult (15 < age ≤ 65), and elderly (age > 65). Training runs for 500 epochs with a batch size of 8 samples. The results are presented in Fig. 7, and the descriptive statistics and quality metrics are listed in Table 1. It can be observed that the number of elderly patients is estimated with the highest accuracy compared with the other age categories. This may reflect the high sensitivity of elderly citizens to changes in environmental factors. In addition, indoor patients can be estimated with a higher accuracy than outdoor patients because outdoor EADs include many cases that are unrelated to weather data, such as road accidents. Under the second (ongoing pandemic) scenario, we consider the data from April 1, 2014, to December 31, 2019, as the training set and the data from January 1 to August 19, 2020, for testing. The network is trained in the same way as in the first scenario; however, a significant error is found because the number of EADs decreased significantly in 2020 (Fig. 8). To overcome this problem, the data on the mobile users' location are included as an additional input term. The available data provided by NTT around Nagoya's main station (April 18 to August 18, 2020) are shown in Fig. 9 along with the profile of COVID-19 positive cases in Japan 6 .
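The two accuracy metrics named in this section (CC and MAE) can be computed as in the following sketch. The CC is taken here to be the Pearson correlation coefficient. The MAE values in this study are reported in percent, so the sketch assumes that the absolute error is normalized by the mean of the observed values; that normalization, the function names, and the sample counts are assumptions and are not confirmed by the text.

```python
import numpy as np

def correlation_coefficient(u, v):
    """Pearson correlation between observed (u) and estimated (v) counts."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    du = u - u.mean()
    dv = v - v.mean()
    return float((du @ dv) / np.sqrt((du @ du) * (dv @ dv)))

def mean_absolute_error(u, v):
    """Mean absolute difference, in the same units as the counts."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.mean(np.abs(u - v)))

def mae_percent(u, v):
    """MAE relative to the mean observed count, in percent (assumed form)."""
    return 100.0 * mean_absolute_error(u, v) / float(np.mean(u))

# Hypothetical daily EAD counts, for illustration only.
observed = [300, 320, 310, 290]
estimated = [310, 310, 300, 300]

cc = correlation_coefficient(observed, estimated)
mae = mean_absolute_error(observed, estimated)
mae_pct = mae_percent(observed, estimated)
```

If the study instead normalizes each day's error by that day's observed count, `mae_percent` would need to average the per-day ratios rather than divide the averaged error by the mean.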
We assume that the mobile phone usage is 100% during January 2020, and the other missing data were linearly interpolated. Figure 8 shows the forecast results with and without mobile data for different data groups. We observed a significant improvement in the estimation accuracy. Table 2 presents the descriptive statistics and assessment metrics. These data clearly demonstrate that mobile data usage, as a surrogate of social activities, is useful in improving the forecasting of EADs in different data groups. To clearly demonstrate the forecast accuracy, a plot representing the estimated number of daily EADs in association with the daily maximum temperature and average relative humidity is shown in Fig. 10. The estimated data show almost the same pattern in both cases; however, including the mobile users' location data significantly reduces the error caused by the abnormalities. To validate how different variables contribute to the estimation of daily EADs, we conducted an ablation study. The experiment presented in Fig. 8 is repeated, excluding a single variable each time. We consider the exclusion of the mobile usage data, maximum temperature, average humidity, and day label (working day/off day). The network is retrained with each of these sets of variables, and the data of 2020 are estimated for each case. The results are shown in Fig. 11 along with box plots of the MAE in each case. These results demonstrate that training using all variables leads to an MAE of 4.70% and that the mobility data, which represent a surrogate of social activities during the pandemic, are the most dominant variable: excluding them increases the MAE to 19.12%. The temperature, humidity, and day label are of comparable importance; excluding them increases the MAE to 8.44%, 7.66%, and 8.04%, respectively. In some cases, long-term forecasting that covers more than a single day is required. The network architecture is designed to estimate K successive days.
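A K-day target layout of this kind can be sketched as a sliding-window construction over the daily series, where each training sample pairs a block of past days with the following K daily counts. The look-back length `n_past`, the function name `make_windows`, and the synthetic arrays are hypothetical; the text does not state how many past days are fed to the network.

```python
import numpy as np

def make_windows(features, targets, n_past, k):
    """Build (input, target) pairs for K-day-ahead forecasting.

    features : array of shape (n_days, n_features), daily input variables
    targets  : array of shape (n_days,), daily EAD counts
    n_past   : number of past days in each input window (assumed value)
    k        : number of successive future days to predict
    Returns X of shape (n_samples, n_past, n_features)
    and Y of shape (n_samples, k).
    """
    features = np.asarray(features, dtype=float)
    targets = np.asarray(targets, dtype=float)
    n_samples = len(targets) - n_past - k + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested window sizes")
    X = np.stack([features[i:i + n_past] for i in range(n_samples)])
    Y = np.stack([targets[i + n_past:i + n_past + k] for i in range(n_samples)])
    return X, Y

# Synthetic placeholder data: 40 days, 3 input variables per day.
n_days = 40
features = np.arange(n_days * 3, dtype=float).reshape(n_days, 3)
targets = np.arange(n_days, dtype=float)

X, Y = make_windows(features, targets, n_past=14, k=7)
```

With this layout, consecutive samples overlap, so a given calendar day appears in up to K different target windows.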
The training session is repeated with different architectures (K = 3, 7, 14, and 28), with all other parameters fixed as in the experiment shown in Fig. 8, and the number of EADs for 2020 is estimated once more. Figure 12 shows the results obtained with the different values of K. The future K days are computed for each day from the beginning of 2020. Therefore, each day receives multiple estimates (from 1 to K). We show the average, minimum, and maximum estimated daily values. It is clear from these data that a good estimation can be achieved within a short time period (e.g., 3 days); however, the estimation error accumulates when the estimation period is extended further. This can be seen in the regions labeled with the dashed-line ellipses in Fig. 12. The MAE associated with K = 3, 7, 14, and 28 is 7.32%, 7.79%, 8.02%, and 8.46%, respectively. In this study, the number of people transported by ambulance was evaluated in relation to population activities to support planning for the third wave of COVID-19 and future pandemics. As one notable feature in Japan, the government did not lock down the city but requested people to apply voluntary constraints. Nagoya is the principal city of the third largest area in Japan, following the Kanto (Tokyo) and Kansai (Osaka) areas. In addition, ambulance use is free in Japan; thus, the number of EADs corresponds approximately to the real number of patients who need such transport. The number of patients transported by ambulance during the state of emergency was generally smaller than that during previous years. This difference may be attributable to the significant reduction in the activities of adults; the percentage of teleworking in Japan is normally approximately 1%, whereas it was 30% in April 2020. The change in activities was not uniform across the population: teleworking became common around the city center but not in the suburbs. The number of patients younger than 65 years should be well correlated with the activities of the population in the city center.
Similarly, a reduction was observed even in the elderly. We then demonstrated that environmental factors such as the maximum temperature and relative humidity can be used to estimate the number of people transported by ambulance. During a pandemic, special care (e.g., disinfection) is required even for emergency services. During this particular pandemic, ambulances were disinfected when transporting potential COVID-19 patients, at least in Nagoya. Thus, resource management was critical during the pandemic, and maintaining the number of dispatches below a certain level has been essential. During the pandemic, the number of patients transported was reduced by at most 20%, whereas the amount of human activity around the central station was suppressed by 80%. The findings here will be useful to estimate or plan ambulance allocations. The second SoE in Japan was declared active from January 7 up to March 7, 2021 (tentative schedule). The average daily mobility reduction rates at Nagoya main station during the first SoE, the second SoE (computed from data for January 8 to February 14, 2021), and the normal period in between were 63%, 29.1%, and 18.8%, respectively. This indicates a positive public response, to differing degrees, to the emergency calls to voluntarily reduce social activities. As a limitation of this study, the EAD data were not classified into specific diseases, some of which may be highly related to environmental factors (such as heat stroke or respiratory system failure), whereas others may not be related to the same degree. Splitting the data and further generalizing the proposed model to handle this problem remain as future work. Moreover, a comparison with data obtained from different cities would provide a better understanding of the outlined framework. The source code used in this study, including the trained network, will be made publicly available after publication.
This study investigated the correlation between environmental factors, such as the ambient temperature and relative humidity, and the daily number of EADs in Nagoya City, Japan. Data collected from April 2014 indicate a good correlation that may be useful in the future forecasting of required EAD facilities based on weather data. A machine learning framework based on an LSTM network architecture was used for time-sampled forecasting, and promising results were obtained under normal conditions. This finding presents for the first time an affordable method for estimating the number of EADs from environmental factors using the LSTM architecture. Moreover, a strong bias was recognized when forecasting the number of EADs required during the COVID-19 pandemic. To handle this problem, additional data indicating a reduction in mobile phone usage in major crowded areas, such as train stations, were used as a surrogate for the reduction of social activity during the pandemic. Including these data can significantly reduce the forecasting error during a time of uncertainty, such as during unexpected pandemics.

[1] Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: Summary of a report of 72314 cases from the Chinese Center for Disease Control and Prevention
[2] The Italian health system and the COVID-19 challenge
[3] Critical supply shortages - the need for ventilators and personal protective equipment during the Covid-19 pandemic
[4] COVID-19 pandemic and comparative health policy learning in Iran
[5] World Health Organization declares global emergency: A review of the 2019 novel coronavirus (COVID-19)
[6] Proposed protocol to keep COVID-19 out of hospitals
[7] World Health Organization, Prevention, identification and management of health worker infection in the context of COVID-19
[8] Responding to a cardiac arrest: Keeping paramedics safe during the COVID-19 pandemic
[9] COVID-19: What paramedics need to know!
[10] Paramedics and pneumonia associated with COVID-19
[11] Fair allocation of scarce medical resources in the time of Covid-19
[12] Population estimation technology for mobile spatial statistics
[13] Emergency ambulance dispatches and apparent temperature: A time series analysis
[14] The relationship between temperature and ambulance response calls for heat-related illness in
[15] Impacts of temperature change on ambulance dispatches and seasonal effect modification
[16] Effects of high ambient temperature on ambulance dispatches in different age groups in Fukuoka
[17] Impact of extreme temperatures on ambulance dispatches in London
[18] Joint effects of heatwaves and air quality on ambulance services for vulnerable populations in Perth, Western Australia
[19] The impact of extreme heat and heat waves on emergency ambulance dispatches due to external cause in Shenzhen, China
[20] Development of prehospital, population-based triage-management protocols for pandemics
[21] The potential impact of enhanced hygienic measures during the COVID-19 outbreak on hospital-acquired infections: A pragmatic study in neurological units
[22] Potential indirect effects of the COVID-19 pandemic on use of emergency departments for acute life-threatening conditions - United States
[23] A novel connectionist system for unconstrained handwriting recognition
[24] Adam: A method for stochastic optimization