key: cord-0985139-0ok57r3i authors: Jamshidi, Babak; Jamshidi Zargaran, Shahriar; Bekrizadeh, Hakim; Rezaei, Mansour; Najafi, Farid title: Comparing Length of Hospital Stay during COVID-19 Pandemic in the USA, Italy, and Germany date: 2021-03-18 journal: Int J Qual Health Care DOI: 10.1093/intqhc/mzab050 sha: 5e11be3d2b9254ae38fffc4186edba025f12edd3 doc_id: 985139 cord_uid: 0ok57r3i BACKGROUND: COVID-19 is the most informative pandemic in history. These unprecedented recorded data give rise to some novel concepts, discussions, and models. Macroscopic modeling of the period of hospitalization is one of these new issues. METHODS: Modeling of the lag between diagnosis and death is done by using two classes of macroscopic analytical methods: the correlation-based methods based on Pearson, Spearman, and Kendall correlation coefficients, and the logarithmic methods of two types. Also, we apply eight weighted average methods to smooth the time series before calculating the distance. We consider five lags with the least distance. All the computations are conducted on Matlab R2015b. RESULTS: The length of hospitalization for the fatal cases in the USA, Italy, and Germany are 2–10, 1–6, and 5–19 days, respectively. Overall, this length in the USA is two days more than in Italy and five days less than in Germany. CONCLUSION: We take the distance between the diagnosis and death as the length of hospitalization. There is a negative association between the length of hospitalization and the case fatality rate. Therefore, the estimation of the length of hospitalization by using these macroscopic mathematical methods can be introduced as an indicator to scale the success of the countries fighting the ongoing pandemic. Up to the end of May 2020, over 6 million people have been infected by COVID-19, and around three-eighths of a million people have died of this infectious disease [1] . The epidemic COVID-19 is the most informative pandemic throughout history. 
These unprecedented data have given rise to novel concepts, discussions, and models. Modeling the period of hospitalization with a macroscopic approach is one of these novel issues. Throughout the paper, we use the word "hospital" in the following sense: "The hospital is an integral part of social and medical organization, the function of which is to provide for the population complete health care, both curative and preventive, and whose out-patient services reach out to the family in its home environment. The hospital is also a center for the training of health workers and bio-social research" [2]. Accordingly, all hospitals providing full COVID-19 services are accepted, regardless of size, quality, and level of facilities. Governments and health care authorities around the world are seeking evidence to evaluate their performance. They regularly explore the best implementation strategy for quality indicators and quantify the effect of quality indicators as a tool to improve the quality of hospital care [3]. Interest in comparative quality measurement and evaluation has grown considerably over the past three decades [4]. Introducing new scales improves both quality indicators and comparative quality measurements. The length of hospitalization is one such scale, reflecting the functioning of a country's health system. Normally, we associate a longer hospitalization of fatal cases with better care for the admitted patients, provided that the demographic and background variables of such patients are similar. The present paper is the first attempt to model the length of hospitalization for fatal cases of SARS-CoV-2 using a macroscopic method. There are two main approaches to studying indicators such as the length of hospitalization: microscopic and macroscopic. The microscopic approach is based on detailed data, while macroscopic studies rely on limited information about the population rather than about individuals.
Therefore, we classify studies such as the systematic review of Rees et al. [5] and the 52 studies included in Table 1 of that review [5] as microscopic. Our analysis is a macroscopic study because we base it on two available general variables of the intended countries: the number of confirmed cases and the number of deaths. On the one hand, microscopic methods are more accurate and more reliable. If it is possible to conduct a detailed study such as a cohort study, we prefer the microscopic approach. On the other hand, macroscopic studies are faster and more economical in both time and budget, and are therefore suitable for countries poor in data. In particular, this method is much more effective when we want to conduct a correlational research project across several communities. For instance, to carry out the research described in the title of the present paper with a microscopic approach, we would have to collect data from patients hospitalized in at least 20-30 hospitals in the three countries over a two- or three-month period. The key novelty of this study is the adoption of a macroscopic approach to deal with the characteristics of the pandemic. For this study, we investigate the data from three major epicenters [6] with high-quality data collection systems: Germany, Italy, and the USA. In addition, they are remarkable from different points of view: the USA with the highest number of tests, confirmed cases, and deaths, Italy with the greatest case-fatality rate, and Germany with the highest recovery percentage worldwide [1]. Modeling the lag between the dates of diagnosis and death is done using two classes of macroscopic methods:
- The correlation-based methods based on Pearson [7], Spearman [8], and Kendall [9] correlation coefficients [10], and
- Three logarithmic methods of two types [11].
Applying the concept of cross-correlation to find the lag in periodic series has a long history.
It is frequently applied in pattern recognition, single particle analysis, electron tomography, averaging, cryptanalysis, and neurophysiology. The study of relationships between simultaneous time series, particularly those involving continuous human perceptions and performance, has been ongoing for several decades in many fields of medical science, such as psychology. Many researchers have applied the methods of the first class in medical fields to find the delay between the control and response variables [12] [13] [14] [15] [16]. There are different reports on the length of hospitalization in the countries of interest, and we use these studies to evaluate the accuracy of our correlational methods. Regarding Germany, the average time from the first symptoms to death was 14 days [17]. Also, the report of IHME (2020) showed that the length of hospitalization for fatal cases in Italy was 1-2 days less than in the USA, and the averages for both countries were around 10 days [18]. It has been estimated that the mean length of hospitalization for fatal cases in the USA was 15 days [19]. Besides, hospital stays lasted an average of 10.7 days for survivors and 13.7 days for non-survivors [20] [21]. Finally, the median length of stay in the ICU has been reported to be approximately five days for patients who survive COVID-19, and six days for those who do not survive [22]. Allowing three to four days between confirmation and transfer to the intensive care unit, the estimated length of hospitalization is between 9 and 10 days. The questions that we address are:
- What are the findings from macroscopic analysis methods about the length of hospital stay in Italy, the USA, and Germany?
- Which macroscopic method is the most consistent with the reports?
The data are collected from the website Worldometer [1], and all calculations are done using Matlab R2015b. Figure 1 illustrates the rationale behind the macroscopic method.
The diagram shows that the cases that enter hospitals on a specific day will die $l-2$ to $l+2$ days later with the probabilities $p_{-2}$ to $p_{+2}$, respectively.
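As a rough illustration of this delay structure, the following Python sketch spreads each day's fatal admissions over a five-day window centred on a lag. It is a minimal sketch under stated assumptions, not the authors' Matlab code: the function name, the admission series, the lag, and the five probability values are all hypothetical, since the paper does not report them.

```python
def expected_deaths(fatal_admissions, lag, probs):
    """Spread each day's fatal admissions over days lag-2 .. lag+2 later.

    fatal_admissions: hypothetical daily counts of admissions that end in death.
    lag: assumed central delay in days between admission and death.
    probs: five assumed weights for the offsets -2, -1, 0, +1, +2 (sum to 1).
    """
    if len(probs) != 5 or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must hold five weights that sum to 1")
    if lag < 2:
        raise ValueError("lag must be at least 2 so that no delay is negative")

    # The last admission day can still produce deaths lag + 2 days later.
    horizon = len(fatal_admissions) + lag + 2
    deaths = [0.0] * horizon
    for day, count in enumerate(fatal_admissions):
        for offset, p in zip(range(-2, 3), probs):
            deaths[day + lag + offset] += count * p
    return deaths


# Illustrative values only: 10 fatal admissions on day 0, central lag of 3 days.
print(expected_deaths([10, 0, 0], 3, [0.1, 0.2, 0.4, 0.2, 0.1]))
```

Under this reading, the death series is a shifted and slightly blurred copy of the admission series, which is why a lag can be sought by comparing the two curves.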
Initially, we need to justify the presumptions using some data as evidence. Unfortunately, it is not simple to demonstrate Presumptions 2 and 3, because most studies report the total and the current number of hospitalizations, which cannot be separated into discharges, recoveries, deaths, and new admissions. Moreover, almost all reports that aim to find the fatality per case address the proportion of deaths in a group of cases admitted to particular hospitals. Therefore, published reports worldwide lack information on the total count of deaths in the countries. Scatterplots of Figure 2 which is the basis of our mathematical model. Note that because smoothing methods collect and concentrate normally distributed data around the central points, it is possible to ignore the fourth presumption. Consequently, modeling of the lag between diagnosis and death is done by using two classes of methods:
- The correlation-based methods based on Pearson [7], Spearman [8], and Kendall [9] correlation coefficients [10], and
- The logarithmic methods, including three methods [11].
Algorithm A presents the methods of the former class, and Algorithms B and C describe the first and second types of the latter class, respectively. The difference between Algorithms B and C lies in whether standardization or division is performed first. Algorithm B yields logarithmic method 1 or logarithmic method 2 depending on whether the division is by the standard deviation or by the mean of the deviations.
Algorithm A (the Pearson / Spearman / Kendall correlation-based method) [11]
A.1. i=1.
Algorithm C (the logarithmic method 3) [11]
To remove noise and smooth the curves, we can use weighted averages of the 3, 5, or 7 nearest points based on:
- Uniform weights: 1, 1, 1, …, 1.
(In Table 1, (3), (5), and (7) denote the average of the 3, 5, and 7 nearest points, respectively.)
- Geometric weights: $(1/a^{k}, \dots, 1/a^{2}, 1/a, 1, 1/a, 1/a^{2}, \dots, 1/a^{k})$ (In Table 1, (1/2).3, (1/2).5, (1/3).3, and (1/3).5 denote the averages calculated from exponential weights with base 1/2 over the 3 and 5 nearest points, and with base 1/3 over the 3 and 5 nearest points, respectively) [11].
Note that Algorithms B and C are applicable to positive series. Hence, we can apply them to the data of the USA, Italy, and Germany from 4 March, 22 February, and 13 March 2020, respectively. Similarly, we calculate the correlation-based scales for the pairs of series after the aforementioned dates. After smoothing and applying the algorithms, we record the five fittest lags (Table 1). Here, "lag" means the number of days that pass from diagnosis to death, and we study the lags 1 to 25. Table 1 summarizes the results of applying six methods for calculating similarity and nine smoothing methods (including real data).
Table 1. The ranking based on the distance of the two series as a function of the lag
The correlation-based algorithm introduces 1-7, 1-5, and 5-14 days as candidates for the lag in the USA, Italy, and Germany, respectively (Table 1). The logarithmic methods 1 and 2 work similarly. They estimate the lags 4-8, 2-6, and 6-19 days as alternatives for the USA, Italy, and Germany, respectively. According to the third logarithmic method, the lags 1-12, 1-12, and 6-19 days are the most probable delays between the two variables for the USA, Italy, and Germany, respectively. Overall, for the USA, Italy, and Germany, the most frequent lags in order of frequency are (6, 5, 7, 4, 8), (4, 5, 3, 6, 2), and (12, 11, 13, 10, 9), respectively. Generally, the lags calculated using the logarithmic methods are greater than the lags obtained by the correlation-based methods.
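The smoothing and lag-selection steps described above can be sketched in Python as follows. This is a hedged illustration, not the authors' Algorithm A, whose steps are not reproduced in this text: the function names, the edge handling in `smooth`, and the choice to rank lags by the highest Pearson correlation (rather than by a distance derived from it) are all assumptions made for the example.

```python
import math


def geometric_weights(base, n_points):
    """Symmetric weights base**|j|, e.g. base 1/2 over 3 points -> 0.5, 1, 0.5."""
    half = n_points // 2
    return [base ** abs(j) for j in range(-half, half + 1)]


def smooth(series, weights):
    """Centred weighted average; near the edges only the weights that fit are used
    (an assumption, since the paper does not describe its edge handling)."""
    half = len(weights) // 2
    out = []
    for i in range(len(series)):
        num = den = 0.0
        for j, w in enumerate(weights):
            k = i + j - half
            if 0 <= k < len(series):
                num += w * series[k]
                den += w
        out.append(num / den)
    return out


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lags(cases, deaths, max_lag=25, keep=5):
    """Rank lags 1..max_lag by the correlation of cases[t] with deaths[t + lag]."""
    scored = []
    for lag in range(1, max_lag + 1):
        x, y = cases[:-lag], deaths[lag:]
        if len(x) < 3:  # too few overlapping points to correlate
            break
        scored.append((pearson(x, y), lag))
    scored.sort(reverse=True)
    return [lag for _, lag in scored[:keep]]
```

A typical call would smooth both series with the same weights, for example `smooth(cases, geometric_weights(1/2, 5))`, and then pass the smoothed series to `best_lags`. Spearman or Kendall coefficients could be substituted for `pearson` without changing the rest of the sketch.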
Figures 3, 4, and 5 illustrate the comparison of the methods for scaling similarity and the regions under study.
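The "most frequent lags in order of frequency" reported in the results amount to a tally over the five fittest lags returned by every combination of smoothing method and similarity method. A minimal Python sketch of such a tally is given below; the function name, the tie-breaking rule, and the sample lists are hypothetical and are not taken from Table 1.

```python
from collections import Counter


def most_frequent_lags(top_lists, keep=5):
    """Count how often each lag appears across the per-method top lists."""
    counts = Counter(lag for lags in top_lists for lag in lags)
    # Sort by descending count, then ascending lag, so that ties are
    # resolved deterministically (an assumed rule).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [lag for lag, _ in ranked[:keep]]


# Made-up top lists from three method combinations, for illustration only.
print(most_frequent_lags([[6, 5, 7], [6, 5, 4], [6, 8, 9]], keep=2))
```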
days, and 10 days are the most probable. The lag plot of Germany differs from those of the other two countries. The mode is around 12, and the similarity curve is almost uniform from a six-day lag to a thirty-day lag.
Figure 5. The frequency of the calculated lags (in days) between confirmation and death, based on the nine smoothing methods and six methods for calculating similarity, for Italy (5.A), the USA (5.B), and Germany (5.C)
Figure 5 shows that the USA and Italy have the same domain for the calculated lags. The difference between their bar plots is that Italy's leans toward the left. Moreover, the patterns of the USA and Germany are similar, except for a six-day shift. The histograms of Germany and the USA are approximately normal, whereas the graph of Italy is skewed. Overall, for the USA, Italy, and Germany, the most frequent lags in order of frequency are (6, 5, 7, 4, 8), (4, 5, 3, 6, 2), and (12, 11, 13, 10, 9), respectively. The approximately normally distributed lags of the USA and Germany are similar except for the six-day shift, while the graph of Italy is skewed. Numerous factors affect the relationship between the number of confirmed cases and the number of deaths from a viral disease, including the count of hospitalized cases, the quality of care in each country, the background risk of patients such as age and other health conditions, the preparedness of health systems, the ratio of patients to nursing staff, and the number of available ICU beds. We did not discuss these factors, and roughly assumed that they are overall the same for the countries under study. Our macroscopic analysis was based on two available general variables of the intended countries: the number of confirmed cases and the number of deaths. Compared with detailed studies, macroscopic studies are faster, more economical in both time and budget, and less accurate. Therefore, this approach is suitable for countries poor in data. In particular, this method is much more effective when we want to conduct a correlational research project across several communities.
Considering the decreasing trend in the severity of the disease, the fact that the reported studies were conducted before May 2020, the interval between the start of hospitalization and confirmation, and the statistical probabilities, the average lag between diagnosis and death for fatal cases in the USA, Italy, and Germany is 11-13, 9-12, and 15-18 days, respectively. In addition, the skewness of the plot of Italy, in contrast to the normal plots of Germany and the USA, may be explained by the older age structure and by the healthcare systems in some regions being pushed to full capacity and beyond. Based on our model, the skewness in the plot of the number of deaths is also accompanied by skewness in the plot of cases; therefore, it does not affect our analyses. Because smoothing methods collect and concentrate normally distributed data around the central points, it is possible to ignore the fourth presumption of the model. Macroscopic studies are faster and more economical in both time and budget, and are therefore suitable for countries poor in data. From a public health perspective, the new macroscopic method is insightful and helps policy-makers compare their healthcare systems with those of the other affected countries and make decisions, including seeking advice from the more successful countries. In this case, the Italian and American authorities may look for the reasons for the superiority of the German system according to the introduced indicators. Finally, if it is possible to collect data from most patients and hospitals, and enough money and time are available, it is preferable to adopt microscopic approaches to solve similar issues. If we consider only the lag with the highest similarity, logarithmic methods 1 and 2 perform better than the alternatives.
Alternatively, if we take several lags into account or ignore the first day as the solution, the third logarithmic method works much better than the others. Finally, note that we take the distance between diagnosis and death as the length of hospitalization. In addition, there is a negative association between the calculated length of hospitalization and case fatality rates. None to report. We have no conflict of interest to declare.
The cases that enter hospitals on the $(t-l)$-th day will die on one of the days $t-2$, $t-1$, $t$, $t+1$, and $t+2$ with the probabilities $p_{-2}$, $p_{-1}$, $p_{0}$, $p_{+1}$, and $p_{+2}$, respectively.
2-World Health Organization
Using quality indicators to improve hospital care: a review of the literature
Beyond the initial indicators: lessons from the OECD Health Care Quality Indicators Project and the US National Healthcare Quality Report
COVID-19 length of hospital stay: a systematic review and data synthesis
Mathematical modeling the epicenters of coronavirus disease-2019 (COVID-19) pandemic
Notes on regression and inheritance in the case of two parents
The proof and measurement of association between two things
On triangular inequalities of correlation-based distances for gene expression profiles. bioRxiv
Some new methods to estimate the lag between two related variables
The analysis of dependencies between series in psychological experiments
Functional connectivity between brain stem midline neurons with respiratory-modulated firing rates
The referential dynamics of cognition and action
Functional connectivity between cerebellum and primary motor cortex in the awake monkey
How high and long will the COVID-19 wave be? A data-driven approach to model and predict the COVID-19 epidemic and the required capacity for the German health system
Incidence, clinical outcomes, and transmission dynamics of severe coronavirus disease 2019 in California and Washington: prospective cohort study
ICNARC report on COVID-19 in critical care. London: Intensive Care National Audit and Research Centre
24-The King's Fund Website
We thank the reviewers for their thorough review and highly appreciate the comments and suggestions, which significantly contributed to improving the quality of the publication. Also, we are grateful to Mohsen Kakavandi, Azad Sheikhi, and Goodarz Alinia (MA in English) for helping us improve the English of the manuscript.