key: cord-0739713-9pn30z0k authors: Wang, Z.; Zheutlin, A. B.; Kao, Y.-H.; Ayers, K. L.; Gross, S. J.; Kovatch, P.; Nirenberg, S.; Charney, A. W.; Nadkarni, G. N.; O'Reilly, P. F.; Just, A. C.; Horowitz, C. R.; Martin, G.; Branch, A. D.; Glicksberg, B. S.; Charney, D. S.; Reich, D. L.; Oh, W. K.; Schadt, E. E.; Chen, R.; Li, L. title: Analysis of hospitalized COVID-19 patients in the Mount Sinai Health System using electronic medical records (EMR) reveals important prognostic factors for improved clinical outcomes date: 2020-05-04 journal: nan DOI: 10.1101/2020.04.28.20075788 sha: 14ef141c02507d0f373397762cba1e27e42544a0 doc_id: 739713 cord_uid: 9pn30z0k COVID-19 is a novel threat to human health worldwide. There is an urgent need to understand patient characteristics of having COVID-19 disease and evaluate markers of critical illness and mortality. Objective: To assess association of clinical features on patient outcomes. Design, Setting, and Participants: In this observational case series, patient-level data were extracted from electronic medical records for 28,336 patients tested for SARS-CoV-2 at the Mount Sinai Health System from 2/24/ to 4/15/2020, including 6,158 laboratory-confirmed cases. Exposures: Confirmed COVID-19 diagnosis by RT-PCR assay from nasal swabs. Main Outcomes and Measures: Effects of race on positive test rates and mortality were assessed. Among positive cases admitted to the hospital (N = 3,273), effects of patient demographics, hospital site and unit, social behavior, vital signs, lab results, and disease comorbidities on discharge and death were estimated. Results: Hispanics (29%) and African Americans (25%) had disproportionately high positive case rates relative to population base rates (p<2e-16); however, no differences in mortality rates were observed in the hospital. Outcome differed significantly between hospitals (Gray's T=248.9; p<2e-16), reflecting differences in average baseline age and underlying comorbidities. Significant risk factors for mortality included age (HR=1.05 [95% CI, 1.04-1.06]; p=1.15e-32), oxygen saturation (HR=0.985 [95% CI, 0.982-0.988]; p=1.57e-17), care in ICU areas (HR=1.58 [95% CI, 1.29-1.92]; p=7.81e-6), and elevated creatinine (HR=1.75 [95% CI, 1.47-2.10]; p=7.48e-10), alanine aminotransferase (ALT) (HR=1.002, [95% CI 1.001-1.003]; p=8.86e-5) and body-mass index (BMI) (HR=1.02, [95% CI 1.00-1.03]; p=1.09e-2). Asthma (HR=0.78 [95% CI, 0.62-0.98]; p=0.031) was significantly associated with increased length of hospital stay, but not mortality. Deceased patients were more likely to have elevated markers of inflammation. Baseline age, BMI, oxygen saturation, respiratory rate, white blood cell (WBC) count, creatinine, and ALT were significant prognostic indicators of mortality. Conclusions and Relevance: While race was associated with higher risk of infection, we did not find a racial disparity in inpatient mortality suggesting that outcomes in a single tertiary care health system are comparable across races. We identified clinical features associated with reduced mortality and discharge. These findings could help to identify which COVID-19 patients are at greatest risk and evaluate the impact on survival. Coronavirus disease 2019 (COVID-19) is a global pandemic that has already infected over 2 million individuals, including over 630,000 in the US as of Apr 15, 2020 1 . Given recent reports on the high rate of COVID-19 infections among those who remain asymptomatic, however, the true rate of infection is expected to be significantly higher than reported 1,2 . More than 26,000 US residents have died, the majority in epicenters like New York City (NYC), where 118,302 cases and 8,455 deaths have occurred to date 1 . Mortality rates have been disproportionately high among Hispanic and African American individuals. In NYC, death rates for these groups are nearly double those in Caucasians or Asians, but the factors contributing to this racial disparity remain unclear 2 . Furthermore, reducing mortality among all critically ill individuals is the highest priority, although some clinical risk factors have been noted in recent publications 3, 4 , there remains much to learn about which patients are at highest risk, what factors are most indicative of disease progression and prognosis, and which interventions may be effective. A systematic review of studies predicting coronavirus-related outcomes concluded that all of the publications were biased in some way, limiting their utility in practice 3 . They noted that prognostic models often excluded patients for which no outcome was yet determined (e.g., patients that had neither recovered nor died) leading to selection bias, used relatively small samples (e.g., 26-577 patients) increasing risk of overfitting, or in many cases did not use features or timepoints that could be measured prospectively (e.g., last available vital sign) 3 . Furthermore, none of these studies were conducted in the US where population factors, healthrelated behaviors, cultural differences, and hospital standard of care protocols may be different 4 . Indeed, case characteristics among hospitalized patients in China seem meaningfully different than those in the US 5 -for example, mortality and certain mortality rates were much lower. There have been several recent descriptive reports of COVID-19 clinical characteristics among patients admitted to US hospitals 6 , including an ongoing, population-based report from the Center for Disease Control and Prevention 7 and another from NYU Langone Health 5 . While data is mounting regarding racial disparities and COVID-19 5 most of these studies have not offered in depth analysis by race to investigate racial disparities in mortality. Since no proven effective . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint therapies exist for COVID-19 6 , up-to-date clinical outcomes and data on factors influencing risk for mortality and recovery over time are urgently needed. Given the high mortality rate in NYC and the uncertainty of COVID-19 progression during inpatient stays, accurately evaluating prognosis of both mortality and discharge among hospitalized individuals is critically needed. Identifying which patients are at highest risk for mortality will enable clinicians to target interventions, allocate resources, and make more informed triage decisions. Highlighting factors most associated with mortality will also help prioritize factors to monitor during hospital admission. Prognostic models have the potential to improve the standard of care for COVID-19, open opportunities for testing investigational drugs in clinical trials, and guide clinical acumen. In this study, the largest and most racially diverse US-based COVID-19 case series to date, we provided descriptive statistics on laboratory-confirmed cases (N = 6,158) and hospitalized patients (N = 3,273) from the Mount Sinai Health System (MSHS). In particular, we investigated the impact of race on mortality given the racial disparities in mortality rates observed in the US 2 . We also evaluated the effects of demographics, social behavior, and clinical variables on mortality and recovery among hospitalized patients admitted to one of MSHS hospitals in Manhattan, Brooklyn, and Queens. As in previous studies [5] [6] [7] , we considered disease comorbidities, vital signs, and lab results for COVID-19 disease and potential to improve patient outcomes. We reported hazard ratios for all risk factors estimated using a competing risk model that simultaneously considered two outcomes resulting in hospitalization termination, death and discharge. We also developed a prognostic model using exclusively data available at baseline, the first such model in a diverse NYC with diverse population. Together, we believe these estimates can help inform stakeholders which COVID-19 patients are at greatest risk for poor outcomes and evaluate the impact on survival. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint This study utilized the de-identified EMR data and was exempted from the Mount Sinai Institutional Review Board due to the COVID-19 pandemic. We obtained de-identified data from the electronic medical record (EMR; EPIC Systems) via the Mount Sinai Data Warehouse through April 15, 2020. The EMR dataset included patients with at least one encounter at a Mount Sinai facility who had been diagnosed with COVID-19, who were under investigation for COVID-19 or who had been screened negative for COVID-19. Patients included in the dataset were those who had an encounter (either in person or virtual) at a Mount Sinai facility in which a COVID-19 test was ordered or a COVID-19 related diagnosis, including suspected diagnosis like 'Suspected 2019-Ncov diagnosis', was given. In total, 28,336 patients were included in the de-identified EMR dataset. Demographics including age, sex, race, ethnicity and smoking status, as well as disease comorbidities were extracted from the EMR. Comorbidities were defined as the presence or absence of the following chronic conditions recorded as "Active" in the Epic Problem List: diabetes mellitus, hypertension, asthma, chronic obstructive pulmonary disease (COPD), human immunodeficiency virus (HIV) infection, obesity and cancer. For each encounter, initial measurements of vital signs including BMI, temperature, systolic blood pressure (BP), diastolic BP, O2 saturation, heart rate and respiratory rate were provided. Laboratory test orders and results throughout these encounters were also extracted; common lab test orders included complete blood count (CBC) and differentials, metabolic panels, blood lactate dehydrogenase (LDH), ferritin, fibrin degradation dimer (D-Dimer), serum procalcitonin, hepatic function panel, blood culture, fibrinogen, C reactive protein (CRP). Since vital signs can shift during a single encounter, we also extracted the maximum temperature and minimum O2 saturation for each encounter, as well as the length of stay for inpatients. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 4, 2020. A confirmed case of COVID-19 was defined as a positive test result from real-time reverse- transcriptase-polymerase-chain-reaction (RT-PCR) assay nasopharyngeal swab specimens. A patient under investigation (PUI) of COVID-19 was clinically defined as patients who experienced 1) fever and/or cough, shortness-of-breath (SOB), sore throat, nasal congestion not related to typical seasonal allergies or 2) fever and/or cough and a history of exposure to COVID-19. Collectively, our study included 6,158 COVID-19 patients and 428 PUIs with confirmed positive or presumptive positive COVID-19 RT-PCR test results. We compiled a background population from the MSHS EMR as a demographic reference for the residents in the areas of NYC served by Mount Sinai. We selected all patients with encounters of any types (outpatient, inpatient, emergency) recorded in the MSHS EMR since 2016 and retrieved their self-reported races, ethnicities and age in years. Patients with unknown race/ethnicity information were excluded from this analysis. In total, the background population was composed of 1.6 million people, with 48.1% white, 21.6% African American/Black, 15.3% other, 8.5% Asian/Pacific Islander and 6.4% Hispanic/Latinos (Table 1) . Next, we tallied the numbers of SARS-CoV-2 infected patients defined previously and those who were deceased as of April 15, 2020 (Table 1) . To test for racial disparities in positive test frequencies and COVID-19 mortality, we compared observed rates of COVID-19 diagnosis and mortality relative to the MSHS race frequencies. We performed a Chi-square test of independence of variables in contingency tables under the null hypothesis that the COVID-19 diagnosis rates or mortality rates were not different among the five racial groups. We also fitted a multivariate logistic regression (implemented in 'glm' function in R version 3.6.1) to adjust for confounding variables such as age on diagnosis rates. Finally, competing risks survival analysis was employed to analyze the mortality rate and discharge rate over time. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 4, 2020. We included COVID-19 patients (N=3,061) and PUIs (N=212) with confirmed positive or presumptive positive COVID-19 RT-PCR test results who were admitted as inpatients and stayed at least one day in the hospital. We recorded the durations of hospital stay and the number of days since SARS-CoV-2 positive. The diagnosis day was defined as the date when the COVID-19 test with positive results was ordered. Three possible outcomes were defined for our hospitalized COVID-19 cohort: in-hospital death (deceased), discharged to home or other locations not concerning intensive medical care (recovered), and continued hospitalization (rightcensored). For individual patients, we extracted all hospital encounters up to April 15, 2020 that were related to COVID-19 diagnostic tests or hospital admissions. We were first interested in observing any demographic or hospital site-or specialty-specific effects on mortality and discharge individually over time. To do this, we estimated the cumulative incidence functions (CIFs) for in-hospital death and discharge using a univariate competing risks survival analysis for each covariate individually: race, sex, hospital, and care area within the hospital. Next, we assessed which factors were associated with death among hospitalized patients. For this analysis, we only included subjects with known outcomes (death: N = 742; discharge: N = 1,706). We fitted a multivariate logistic regression on clinical outcome (deceased=1, recovered =0) using duration of stay, demographic factors, vital signs, comorbidities, care in ICU unit, and common laboratory tests ordered at time of hospital admission. To assess the impact of clinical variables on survival, we modeled the outcomes of hospitalized COVID-19 patients using competing risks survival analyses, which treats the two distinctive outcomes, in-hospital death and cured as two competing causes for the same event, i.e. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint termination of hospitalization. Competing risks models are recommended over Kaplan-Meier survival analysis for studying events with multiple underlying causes [7] [8] [9] . More formally, we denote the duration of hospital stay since diagnosis as ! and the total duration from diagnosis to termination of hospitalization as ! ! . The goal of competing risks survival analysis is to estimate the CIF for each individual cause k. CIF is a function of time defined as the probability of a patient who stayed at hospital for a duration of ! ! : , where ((!) is the survival function and ℎ " (!) is the hazard function for the cause -. To estimate the CIFs for in-hospital death and recovery for COVID-19 patients, we used the "cuminc" function in the R package "cmprsk". Gray's test 10 was conducted to determine if there were statistical differences among CIFs corresponding to subgroups of patients, under the null hypothesis that the CIFs under consideration were not different from each other. To identify covariates associated with the two clinical outcomes -represented in the covariate vector . -we performed multivariate statistical analyses to estimate the contribution of each potential covariate to the cause-specific hazard function ℎ " (!) as follows: is the baseline hazard of case -, and β ",' is the coefficient for covariate 8. When estimating the β ",' for competing risks analyses, we applied two families of models: causespecific hazard models and subdistribution hazard models 11 . Cause-specific hazard models were estimated using the Cox proportional hazard approach 12 using the "coxph" function in the R package "survival" by treating one cause as the event and the other as right-censored, whereas subdistribution hazard models were fitted using the "cmr" function in the R package "cmprsk" 13 . The R software used throughout this research is in version 3.6.1. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint To help with prediction, we performed the same analysis used for estimating effects on survival described already ("Multivariate analyses on inpatient mortality: Competing risk survival analysis"), but limited features to only those available at baseline when patients were admitted to the hospital. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint Our cohort included 28,336 individuals tested (Table 1) We investigated the impact of race and age on COVID-19 diagnosis rates and mortality. We then assessed the influence of a variety of factors on inpatient mortality. First, we estimated the individual effect of race, sex, hospital, and specialty unit on inpatient mortality over time. Next, we identified the strongest predictors of inpatient mortality considering all possible indicators regardless of time, which can help prioritize clinical features to monitor. Then, to assess the impact of each feature on survival, we estimated hazard ratios from a competing risks survival analysis. Finally, to offer a prognostic tool for prioritizing highest risk patients, we used baseline information to forecast mortality. We found Hispanics and African Americans were over-represented in the SARS-CoV-2 infected cohort, accounting for 28.5% and 24.6%, respectively, of all the infected patients with known self-reported races, which were both significantly higher rates than expected based on the population base rate (Chi2=5340.5; p < 2e-16; Table 1 ). We also noted the age distributions of infected patients within each race were different relative to our reference population (Fig. S1A) . Furthermore, the Caucasian deceased cohort shows a different age distribution from the other race groups. Its density increases continuously as age increases, forming a triangle shape, while African Americans shows the highest density at the age of 74 years ( Fig S1B) . However, even after adjusting for age in logistic regression, we found that Hispanics, African Americans and people identified as other races have significantly higher odds of being infected by SARS-CoV-2 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint compared to Caucasian individuals (Table S1 ). Age-adjusted COVID-19 diagnosis rates in Caucasian and Asian American groups were not statistically different. Together, this has clearly demonstrated that some minority groups, including Hispanics and African Americans, were at higher risk of being infected by SARS-CoV-2 in the New York metropolitan area served by the MSHS. Interestingly, we found a bimodal age distribution in infected cohorts of each race, with the first peak in the age group of 20-40 years and second peak in the age group of 50-70 ( Fig S1A) . Within the deceased population, Caucasians showed a triangular shape of age distribution with higher density when age increases, while African Americans showed the highest density around the age of 75 years ( Fig S1B) . Importantly we found no differences in mortality rates across all racial groups (p =0.056; Table 1 ). We also found that older age increased risk for mortality ( Fig. 1) , as has been previously reported elsewhere 3,4,6 . Furthermore, we did not find an effect of race on clinical outcome after adjusting for underlying covariates (Fig. 3) , which further suggests racial parity in terms of inhospital mortality in our cohort. To identify factors involved in the progression and prognosis of the disease, we focused on COVID-19 patients admitted to the hospital. Our hospitalized COVID-19 cohort contains 3,273 patients with at least one day in the hospital, among whom 742 died, 1,706 were discharged (presumed recovered), and 825 were still hospitalized as of April 15, 2020 (Table 2) . We summarize the demographic features, comorbidities, vital signs and laboratory tests at admission, as well as the distribution of patients by hospital site, care in the intensive care unit (ICU) area, and number of days from diagnosis to discharge or last follow-up (Table 2) . . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint Effects of demographics, hospital site, and care area on time to death or discharge. To assess the effects of individual covariates on clinical outcomes over time, we estimated the cumulative incidence functions (CIFs) for in-hospital death and discharge using a univariate competing risks survival analysis. In our full cohort, we estimated that by 10 days postdiagnosis, patients have a 51.1% chance of being recovered and being discharged from the hospital and a 21.7% chance of death (Fig S2) . By grouping our cohort using different demographic and hospitalization factors, we found the differences in time-adjusted mortality among different racial groups (p=0.003, Gray's test) is more enriched in Caucasian due to its older age group (Fig. S1B ), but no differences in the CIFs of the recovered patients (p=0.408) ( Fig. 2A) . We also found females have better outcomes compared to males; they have trend with lower mortality rates (p=0.064) and significantly higher discharge rates (p=2.36e-4) over time ( Fig. 2B) , which translates to shorter in-hospital stays. Hospitalized patient outcomes also differed by hospital and care area types within hospitals, including ICUs, medical and surgical units, and other specialties (Fig. 2C-D) . We found that patients admitted to hospitals in Brooklyn and Queens experienced significantly worse outcomes (Gray's T=248.9; p<2e-16) compared to three hospitals based in Manhattan (Fig. 2C ). This can be attributed at least partly due to the relatively older COVID-19 cohorts in Brooklyn and Queens compared with the Manhattan hospitals (Table S2) . As expected, we also observed drastically worse outcomes for all patients admitted to ICU care areas compared to other care areas (Fig. 2D ). Multivariate associations between mortality and discharge. Next, we identified etiologic factors associated with death among hospitalized patients with known outcomes (death: N = 742; discharge: N = 1,706). We found many significant associations (Fig. 3 CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. Fig. 3 ). Since 'care in ICU area' showed the highest OR, we then divided our samples into patients treated in the ICU or not, as these groups have very different rates of mortality (Fig. 2D) (Fig. S5A) , whereas all the previously discovered significant covariates persist in patients not admitted to the ICU (Fig. S5B) . Competing risks survival analyses for inpatient mortality or discharge. We employed a competing risks survival analysis to estimate the effects of covariates on both mortality and discharge using two Cox proportional hazard models, yielding cause-specific hazard ratios (HR) for each of the two events across all etiologic factors under consideration. This allowed us to dissect the effect of those factors on 1) reducing in-hospital mortality rate, and 2) shortening hospitalization (early discharge). Survival analyses also allowed us to leverage the additional information for patients who were still hospitalized by treating them as right-censored (i.e. the final outcome of the patients cannot be determined by the end of the study period). The cause-specific hazards models showed advanced age significantly increased the risk of inhospital death (HR=1.05 [95% CI, 1.04-1.06]; p=1.15e-32) and decreased the probability of discharge (HR=0.98 [95% CI, 0.978-0.986]; p=4.07e-21) (Fig. 4) all increased the risk of in-hospital death while prolonging the hospital stay (Fig. 4) . Interestingly, we found that some etiologic factors only significantly influenced one outcome. For instance, BMI (death: HR=1.02, [95% CI 1.00-1.03]; p=0.021) showed a significant risk and . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. (Fig. 4) . Prognostic model to forecast mortality or discharge for inpatients. Given the large sample size, richness in clinical baseline measurements, and well-defined outcomes, we developed a prognostic model for COVID-19. Several significant predictors increased the risk for in-hospital death and decreased chance of recovery including age, BMI, oxygen saturation, elevated respiratory rate, WBC, creatinine and ALT (Fig. 5) . We additionally found that Hispanic and African American patients have slightly reduced mortality rates compared to people of other races when controlling for the initial baseline physiological measurements and comorbidities (Fig. 5A) . Moreover, we found history of cancer increased the length of hospitalization and delayed recovery, although it did not significantly increase risk of mortality (Fig. 5B ). This prognostic model may be of significant practical value for clinicians and hospital management. We also assessed many laboratory tests extracted for the hospitalized COVID-19 patients in addition to WBC, creatinine and AST on their association with mortality. These laboratory tests were only available for 30% to 75% of the patients in our hospitalized cohort, therefore, are not incorporated as covariates for the previous analyses to avoid potential availability biases and decreased sample size. To estimate their associations with mortality, we fitted individual multivariate regression models for each of these lab tests controlling for age, sex and race in the corresponding subsets of cohort where these lab tests are available at baseline. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. (Table S3 ). Using the largest and most racially diverse US case cohort to date, we have evaluated the impact of demographics and clinical characteristics on inpatient mortality in the Mount Sinai Health System. Among 6,158 positive or presumed positive diagnosed cases, 3,273 (50%) were admitted to one of five hospitals; of those admitted, 742 died (22%) and 1,706 recovered and discharged (52%) by the end of the study period on April 15, 2020. While we did observe higher rates of COVID-19 diagnosis among African American and Hispanic individuals, we did not observe any impact of race on mortality among inpatients. Consistent with previous reports 3,4 , we found that older individuals and men were at higher risk for mortality, as were critically ill patients cared for in the ICU. We also found that mortality varied by hospital. We identified many clinical features significantly associated with morality that may be important factors to monitor during hospital admission including respiration, temperature, heart rate, white blood cell count, creatinine, and ALT. We also estimated hazard ratios for survival, identifying oxygen saturation, ICU care, elevated creatinine and ALT as strong predictors of mortality. Finally, we developed a prognostic model to forecast risk for mortality using only baseline features, which we hope will help clinicians and hospitals identify individuals at highest risk earlier on in disease progression. Case prevalence in African Americans and Hispanics is disproportionately high in New York 14 , which is reflected in our cohort. However, in this cohort, the disparity in positive COVID-19 diagnosis rates did not translate in our cohort to any differences in mortality, suggesting that inpatient care in the hospital does not further this disparity. While this study cannot prove it, this data suggests that there are no intrinsic biological differences which explain the racial disparities in mortality. Differences in rates of positive cases may be due to true differences in infection rates, which is consistent with higher case density in areas like Brooklyn and Queens 15 , where . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint relatively more African American and Hispanic individuals live 16 . These areas are also more densely populated (e.g., persons per household is higher relative to Manhattan), making viral isolation more challenging 16 . It is also consistent with racial differences in occupation -African American and Hispanic residents are less likely than residents of any other race to be able to work from home, according to the Bureau of Labor Statistics 17 . This puts these groups at increased risk of infection through elevated exposure. Positive case rate differences may also in part be due to limitations on number of available tests in high case density places. According to testing rates made available by NY State (https://github.com/nychealth/coronavirus-data) combined with population estimates 15 , 3.5 residents in Brooklyn and 2.7 residents in Queens per 100 have been tested, while 4.2 per 100 have been tested in Staten Island, for example. Test availability may have resulted in biasing testing among those residents to individuals more likely to be positive, thus artificially raising the positive rate by testing fewer individuals seen as less likely to be positive (e.g., asymptomatic individuals). Fewer tests also reduce the ability to track and contain the virus, so this may contribute to higher case rates, as well. While we are encouraged to report no in-hospital differences in mortality by race, the burden of mortality will remain disproportionately held by African American and Hispanic individuals until rates of infection can be targeted and reduced. We also found that where patients were treated affected their risk for mortality. Patients admitted to hospitals in Brooklyn or Queens were at higher risk than those at one of the Manhattan hospitals. The outer boroughs have consistently higher rates of comorbidities relevant to COVID-19 infection compared to Manhattan 18 . We found that average age in patients admitted in Brooklyn or Mount Sinai Queens was higher and oxygen saturation is lower than those in Manhattan (Table S2 ). Rates of comorbidities were also higher in Queens, as well (56% with any comorbidity versus 36-47% at other hospitals). Another contributing factor may be case prevalence differences by borough. There are nearly twice as many cases in Brooklyn (>31,000) and Queens (>36,000) -the counties where the Mount Sinai Brooklyn and Queens hospitals are located, respectively -than in Manhattan (>15,000) 15 . Hospitals in New York are already overwhelmed and struggling to provide sufficient staff and hospital resources to meet the need of the pandemic. It may be that the density of cases outside of Manhattan have put an additional strain on those hospitals or that only more severe patients are able to be admitted. This is an . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint important health disparity for policymakers to address, as it may contribute to the disproportionate death of individuals living in certain neighborhoods. The strongest predictors of hospital termination -either death or discharge -in our data were older age, higher BMI, lower oxygen saturation, care in ICU, elevated creatine and ALT levels, and history of COPD (Fig. 4) . These findings were highly similar to descriptive differences reported between critically and non-critically ill hospitalized patients at NYU 3 , with the exception of COPD (not significant) and ALT (not reported). They also reported many significant differences in lab values, including C-reactive protein, d-dimer, ferritin, and procalcitonin. These lab tests were not routinely collected on all of our patients, however, among patients with available measures, we were able to corroborate their evidence that higher baseline measures of all aforementioned labs were associated with mortality (Table S3 ). Given the consistency of these findings, we suggest oxygen saturation, creatine, C-reactive protein, ddimer, ferritin, and procalcitonin are good targets to monitor throughout hospital admission as they are sensitive to clinical outcome across reported data. We additionally found that high temperature, abnormally high respiratory rate, high WBC, and history of asthma significantly lengthened hospital stays (Fig. 4) , suggesting that these vital signs and labs may also be of use for monitoring disease progression and severity. One major reason for high mortality rates from COVID-19 is that no proven effective therapies exist as yet 6 . Many new treatments have been quickly developed or adapted, and despite incomplete evidence of efficacy, have been incorporated into current clinical practice. For instance, heparin was administered as part of standard of care at Mount Sinai as of April 10, 2020 due to the negative impact of SARS-CoV-2 on coagulation and consequent professional recommendations 19 . Therefore, the efficacy of the use of heparin is limited based upon the data availability and subsequent follow up. While results from ongoing clinical trials are still unknown, there is an immediate need for any useful information on patient response to these pharmacologic treatments. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint Finally, we discovered several significant baseline predictors increasing risk for in-hospital death: older age, higher BMI, lower oxygen saturation, elevated respiratory rate, elevated WBC, elevated creatinine and elevated ALT (Fig. 5 ). Assessing these patient characteristics immediately upon hospital admission may help identify individuals at the highest risk and help determine clinical action. We hope this prognostic model may be of significant practical value for clinicians and hospital management. Our study should be considered in light of many limitations. First, many patients were missing key inflammatory markers like C-reactive protein and d-dimer, so we could not include them in our multivariate models without losing significant power. We have provided results for these labs analyzed individually (Table S4) that are in line with other reports 3 , however future work incorporating these labs with other measures would yield a more comprehensive ranking of significant indicators of mortality. Additionally, clinical course trajectories of lab values and vital signs across hospitalization are likely to give valuable insight into disease severity and progression, but analysis at this level requires substantially more repeated measurements per patient than were available at the time of this study. We also expect outcomes to change as local outbreaks become contained (or not) and additional information on novel treatments is made available. However, given the urgent need for prognostic indictors amidst the ongoing pandemic, we believe this report, the largest and more diverse population to date, provides an important initial summary of clinical features associated with mortality and can facilitate risk assessment and care. While further studies are imperative, our ability to use modeling techniques to assimilate and analyze large data sets quickly and efficiently provide a complimentary approach as we await vital data from clinical trials. We anticipate that emerging reports across the country will be combined and compared, as health-related behaviors (both personal and mandated by state and local governments), as well as hospital protocols and resources vary considerably. In this study, we estimated the effect of key clinical characteristics on mortality among patients hospitalized in one of five Mount Sinai hospitals in New York, Brooklyn, and Queens. Based on these findings, first, we recommend considering for hospital admission patients with the following characteristics: older age, higher BMI, lower oxygen saturation, elevated respiratory rate, and elevated lab parameters (WBC, creatinine and ALT) as prognostic indicators for . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint increased risk for mortality. Second, we identified changes in respiration, temperature, heart rate, white blood cell count, creatinine, and ALT as particularly important features to monitor during hospital admission to track risk for increased mortality. Together, we hope these estimates can help inform clinicians and hospitals early on which patients are at greatest risk, what ongoing clinical features track with disease progression. Administrative, technical, or material support: Li, Wang, Kao, Zheutlin, Kovatch, Nirenberg, Supervision: Li, Chen, Schadt . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint Tables Table 1 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 4, 2020. . Figure 1 . In-hospital mortality rates of COVID-19 patients break-down by self-reported races and age groups. In-hospital mortality rate is defined by the number of deceased patients during hospitalization divided by the total number of Covid-19 patients and PUIs in our cohort. The mortality rates of different age groups are plotted across different racial groups indicated in the legend. The 95% confidence intervals (CI) for the mortality rates across race groups are estimated using bootstrap by sampling the patients in that a rolling 10-year age window 500 times. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint for individual covariates in logarithmic scale. The estimated HR and p-values are indicated in the tick labels for those covariates. Covariates with significant elevated HR (HR > 1 and p-value<0.05) or decreased HR (HR < 1 and p-value<0.05) are highlighted in red and blue, respectively. Error bars indicate 95% confidence intervals (CI). (A, B) . The estimated coefficients from the logistic regression model, also known as log odds, are plotted for the covariates. An intercept term was included in the model but excluded from the plot. Error bars indicate 95% CI. patients without ICU stays. The HR from the cause-specific hazard model with competing risks (death (A) and cured (B)) are plotted for individual covariates in logarithmic scale. The estimated HR and p-values are indicated in the tick labels for those covariates. Covariates with significant elevated HR (HR > 1 and p-value<0.05) or decreased HR (HR < 1 and p-value<0.05) are highlighted in red and blue, respectively. Error bars indicate 95% CI. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 4, 2020. . https://doi.org/10.1101/2020.04.28.20075788 doi: medRxiv preprint Fig. 1 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review) A B C D Fig. 2 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review) Fig. 3 A B Fig. 4 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review) A B . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review) Spread of SARS-CoV-2 in the Icelandic Population Hydroxychloroquine in patients with COVID-19: an openlabel, randomized, controlled trial. medRxiv Factors associated with hospitalization and critical illness among 4,103 patients with COVID-19 disease Hospitalization Rates and Characteristics of Patients Hospitalized with Laboratory-Confirmed Coronavirus Disease 2019-COVID-NET, 14 States COVID-19 and African Americans Pharmacologic Treatments for Coronavirus Disease 2019 (COVID-19): A Review Introduction to the analysis of survival data in the presence of competing risks Competing risks analyses: objectives and approaches How to handle mortality when investigating length of hospital stay and time to clinical stability A class of K-sample tests for comparing the cumulative incidence of a competing risk. The Annals of statistics A proportional hazards model for the subdistribution of a competing risk The cmprsk package. The Comprehensive R Archive network NYC Health COVID-19: Data An interactive web-based dashboard to track COVID-19 in real time kingscountybrooklynboroughnewyork,newyorkcountymanhattanboroughne wyork,queenscountyqueensboroughnewyork,richmondcountystatenislandboroughnewyor k/PST045218 ISTH interim guidance on recognition and management of coagulopathy in COVID-19 We would like to express our sincerest condolences to the patients and their families who underwent from the COVID-19 outbreak. We greatly appreciate the all medical staff who worked together to overcome the COVID-19 outbreak. We are grateful of research opportunity from Dr. Eric Nestler. We also thank the helpful comments from Dr. Emma Benn.