key: cord-0864002-hjih9vc3 authors: Gupta, Aashish; Kachur, Sergey M.; Tafur, Jose D.; Patel, Harsh K.; Timme, Divina O.; Shariati, Farnoosh; Rogers, Kristen D.; Morin, Daniel P.; Lavie, Carl J. title: Development and validation of a multivariable risk prediction model for COVID-19 mortality in the Southern United States date: 2021-09-17 journal: Mayo Clin Proc DOI: 10.1016/j.mayocp.2021.09.002 sha: 3efba3eda1a4592b11dbf9713c89d11a3edee985 doc_id: 864002 cord_uid: hjih9vc3 Objective To evaluate clinical characteristics of COVID-19 admitted patients in Southern United States and development as well as validation of a mortality risk prediction model. Patients and methods Southern Louisiana was an early hot-spot in the pandemic, which provided a large collection of clinical data on inpatients with COVID-19. We designed a risk stratification model to assess admitted COVID patients’ mortality risk. Data from 1673 consecutive patients diagnosed with COVID-19 infection and hospitalized between 03/01/2020 to 04/30/2020 was used to create an 11-factor mortality risk model based on baseline comorbidity, organ injury, and laboratory results. The risk model was validated using a subsequent cohort of 2067 consecutive hospitalized patients admitted between 06/01/2020 to 12/31/2020. Results The resultant model has an area under the curve of 0.783 (95% confidence interval 0.76-0.81), with an optimal sensitivity of 0.74 and specificity of 0.69 for predicting mortality. Validation of this model in a subsequent cohort of 2067 consecutively hospitalized patients yielded comparable prognostic performance. Conclusion We have developed an easy-to-use, robust model for systematically evaluating patients presenting to acute care settings with COVID infection. 
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes coronavirus disease 2019 (COVID-19), is the third coronavirus this century to cause severe illness in humans (after the more limited outbreaks of SARS-CoV and Middle East respiratory syndrome coronavirus in the last two decades). Case fatality rates among diagnosed infections have ranged from 0.1% in Singapore to 16% in Belgium. True infection fatality rates are believed to be much lower (1). Since February 2020, the COVID-19 pandemic has infected more than 183 million people around the world and resulted in more than 3.9 million deaths (2). Of these, approximately 33 million cases and 600,000 deaths have been in the United States, with a 1.8% case fatality ratio (2). Increasing evidence indicates that much of the mortality results from hyper-inflammation related to a cytokine release syndrome (or "cytokine storm") (5) (6) (7). Clinical data suggest that persons with diabetes, cardiovascular disease, or chronic lung disease are at higher risk (6) (7) (8), and there may be an associated pro-inflammatory genotype (9). In addition to demographic factors and cardiovascular risk factors, clear patterns have emerged in the serologic presentation of patients with COVID-19. These derangements span multiple organ systems and include markers of inflammation, immune regulation, the clotting cascade, and indicators of end-organ function (3, 9). Healthcare systems have been overwhelmed by a surge of hospital admissions due to the COVID-19 pandemic (11). Identifying patients at high risk of adverse outcomes at the time of presentation plays an important role in allocating limited resources (12). Various prediction models have been developed to risk-stratify patients with COVID-19. However, the majority of these risk prediction models have been found to be at high risk of bias (12, 13).
A combination of epidemiological factors had made urban Louisiana a nexus of early COVID-19 morbidity and mortality. There were 428,000 cases reported through February, 2021, with a case-fatality rate of 2.2% (15). We examined more than 3700 patients hospitalized for COVID-19 within the Ochsner Health network of hospitals between February 2020 and December 2020, to better understand the clinical impact of demographics, laboratory data, and medical therapies. We sought to develop a mortality risk assessment tool using patient-level data that can be applied at the time of presentation to acute care settings, to improve triage and to identify patients at high risk of adverse outcomes. All patients older than 18 years admitted to the Ochsner Health system hospitals with COVID-19 infection throughout Louisiana from 03/01/2020 through 04/30/2020 were enrolled into an observational cohort after approval of all protocols from an independent institutional review board. These patients represented the model derivation cohort. Following creation of the risk model, appropriate limited data were collected on all hospitalized patients with COVID-19 infection from 06/01/2020 through 12/31/2020, resulting in a validation cohort. We collected patients' demographics, medical history, presenting symptoms, medications, select inpatient therapies, labs, and clinical outcomes. Patients with positive COVID-19 infection who were treated on an outpatient basis were not included in this study. Clinical outcomes tracked included: Intensive care unit care, number of ventilator days, maximal number of pressors/inotropes, sequential organ failure assessment score, and significant clinical organ-specific events encompassing cardiac events (myocardial injury, reductions in contractility, and arrhythmias), renal injury, hepatic injury, thrombotic events, and death. 
Major organ dysfunction was defined as the presence of kidney injury, myocardial injury, hepatic injury, or respiratory injury requiring mechanical ventilation. Acute myocardial injury was considered present if troponin I levels were elevated above the upper limit of normal (ULN), with a 50% change in the subsequent level (either rise or fall) at an interval of 3-6 hours. For the model to work as a triage tool, troponin I should have been collected at first contact with acute care settings. Non-acute myocardial injury was defined as troponin elevation above the ULN, but with <50% change in subsequent values. Cardiogenic shock was defined as heart failure requiring inotropic or mechanical support. Renal injury was defined based on the Kidney Disease: Improving Global Outcomes (KDIGO) criteria for acute kidney injury (16). Hepatic injury was considered to be present in patients with elevation in aminotransferase levels >2 times the ULN or INR >1.5 in the absence of underlying liver disease (Table 1). Laboratory information was collected on admission and included inflammatory markers, D-dimer, lactate dehydrogenase, ferritin, lactic acid, renal function, and a complete blood count. Arterial and venous blood gas results were recorded when present. Maximum values of these markers during the admission were also recorded. Hospital days were calculated from the date of the urgent/emergency presentation that culminated in admission to an inpatient facility until the patient was discharged. Transfers for escalation of care were considered part of the index admission. Mortality was evaluated during the index hospitalization only. Mortality shortly after discharge from inpatient settings was not included in the analysis. Statistical analysis was performed using SPSS v27.0 (IBM, Armonk, NY, USA) and Stata v17.0. Continuous variables were analyzed using two-tailed Student's t-test or ANOVA, using α=0.05. Proportions were compared using χ² tests.
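The troponin-based definitions of acute and non-acute myocardial injury given earlier in this section can be illustrated with a minimal sketch. The function name, its arguments, and the return labels are hypothetical. The sketch also assumes that the first troponin value is the one compared against the ULN, that the change is measured relative to that first value, and that a change of exactly 50% counts as acute; the text does not specify these details.

```python
from typing import Optional


def classify_myocardial_injury(first: float, second: Optional[float], uln: float) -> str:
    """Classify myocardial injury from two troponin I values drawn 3-6 hours apart.

    Assumptions (not stated in the source): `first` is compared with the ULN,
    the relative change is computed against `first`, and >=50% counts as acute.
    """
    if first <= uln:
        return "none"            # troponin not elevated above the ULN
    if second is None:
        return "indeterminate"   # elevated, but no repeat value to assess change
    relative_change = abs(second - first) / first  # rise or fall both count
    return "acute" if relative_change >= 0.5 else "non-acute"


# Illustrative values only (arbitrary units, hypothetical ULN of 0.04)
print(classify_myocardial_injury(0.10, 0.20, uln=0.04))
print(classify_myocardial_injury(0.10, 0.12, uln=0.04))
```

Returning a separate "indeterminate" label for patients without a repeat troponin is a design choice of this sketch, not something described in the study.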
Multiple logistic regression models and Cox proportional hazards models were used for univariate and multivariate analyses of outcome predictors. Continuous predictor variables were converted to categories for easier application to the risk prediction model. Cutoffs were derived from receiver operating characteristic (ROC) curve analysis using Youden's index. Missing values for candidate variables were handled using multiple imputation with chained equations under a missing-at-random assumption. Patients admitted between 1 March 2020 and 30 April 2020 were included in the derivation cohort. The risk model was built using stepwise elimination of variables from a comprehensive multivariable model. Potential predictor variables were identified from a review of the literature and based on availability at first contact with acute care settings. As the first step, variables with greater than 50% missing values were removed (Supplemental Table). In the second step, univariate analysis was performed and variables with a significance value >0.1 were removed. Next, we checked for collinearity, and any factor with a variance inflation factor >1.5 was removed. In the final step, a Least Absolute Shrinkage and Selection Operator (LASSO) regression with 10-fold cross-validation was applied to minimize overfitting and further reduce potential collinearity of variables. LASSO regression was performed to fit models for all lambdas, and the one-standard-error rule was used to select lambda. The final regularized model, with λ = 0.017 selected by the one-standard-error rule, was chosen (Figure 1). Adjusted odds ratios were used to calculate the weights of variables by rounding to the nearest integer. All included variables were adjusted for age and gender, except age and gender themselves, which were adjusted for gender and age, respectively. A ROC curve was constructed to evaluate the discriminative power of the score.
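The one-standard-error rule mentioned above can be sketched as follows: among all candidate λ values, take the largest one whose mean cross-validated error lies within one standard error of the minimum. This is an illustration only and is not the authors' code. The function name and the numbers are hypothetical, and the sketch assumes the cross-validation means and standard errors have already been computed by the fitting software.

```python
def one_se_lambda(lambdas, cv_mean, cv_se):
    """Pick lambda by the one-standard-error rule.

    lambdas : candidate penalty values
    cv_mean : mean cross-validated error for each lambda
    cv_se   : standard error of that mean for each lambda
    """
    # Index of the lambda with the lowest mean cross-validated error
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    # Any lambda whose error is within one SE of the minimum is acceptable
    threshold = cv_mean[best] + cv_se[best]
    eligible = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    # The largest eligible lambda gives the most heavily penalized (sparsest) model
    return max(eligible)


# Made-up cross-validation summary, for illustration only
lambdas = [0.001, 0.01, 0.03, 0.1]
cv_mean = [0.500, 0.495, 0.498, 0.520]
cv_se = [0.010, 0.010, 0.010, 0.012]
print(one_se_lambda(lambdas, cv_mean, cv_se))
```

Preferring the larger penalty trades a small, statistically indistinguishable loss in cross-validated fit for a simpler model, which suits a bedside score with few variables.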
The optimal cutoff score was chosen using Youden's index. Validation of the model was performed using a separate cohort of patients admitted between 1 June 2020 and 31 December 2020. Although the validation cohort was checked for duplicates, we chose to omit patients admitted during May 2020 to minimize inadvertent overlap. The mean age of our patient cohort was 63±16 years. The population was evenly split between males (49.6%) and females (50.4%) and was predominantly African American (71%). In this patient population, 93% of individuals had at least one comorbidity; the most common diagnosis was hypertension (78%), but other conditions such as obesity (58%), diabetes (45%), and hyperlipidemia (44%) also affected a large proportion of hospitalized patients. Tobacco use was present in 38%, and coronary artery disease, chronic kidney disease, heart failure, and chronic obstructive pulmonary disease were each present in 10-20% of the population. The mean Sequential Organ Failure Assessment score on admission was 2.33±2.78 (17) (Table 2). The average length of hospitalization was 11±10 days, and in-hospital mortality was 24%. Of all admitted patients, 36% required intensive care. Major organ dysfunction was present in 42% of patients. Renal injury, myocardial injury, and lung injury requiring mechanical ventilation were the most common kinds of organ dysfunction, each present in 25-30% of the patient sample. Venous thrombosis and hepatic injury were each present in approximately 4% (Figure 2). Mortality among patients who required critical care was 48%, and their mean length of stay was 17±12 days. Mechanical ventilation was required by 26% of the patients. In patients requiring mechanical ventilation, the mortality rate and mean length of stay were 59.2% and 20±12 days, respectively. Among hospitalized patients, 6.6% required new initiation of dialysis; these patients had a mortality rate of 50% and a mean hospital stay of 18±12 days.
Of these patients, 73% required mechanical ventilation, mortality was 67.5%, and mean hospital length of stay was 20±12 days. Those with acute myocardial injury (i.e., troponin greater than the ULN and >50% variation) had a mortality of 40.0% and a mean length of stay of 16±13 days. Those with acute hepatic injury (aminotransferases >2x the ULN) had a mortality of 72.4% and a mean length of stay of 13±9 days. Variables associated with shorter time from admission to death were age ≥60, acute kidney injury at admission, and admission lab values of lactate >2 and procalcitonin >0.25. Significant mortality-based differences in baseline characteristics were present in most of the categories recorded, with the notable exception of race (Table 3a). Deceased patients were significantly more likely to be taking most of the medications surveyed, except for ibuprofen (Table 3b). Additionally, those who died were more likely to have been started on therapy with one of the three agents hypothesized to modify mortality during the early pandemic period (i.e., hydroxychloroquine, azithromycin, remdesivir) (17). Multivariable logistic regression modelling of the associations between patient characteristics and mortality confirmed the predictive value of several factors identified in previously modeled data (18). Significant factors associated with death included age >60 years, smoking, hypoxemia, thrombocytopenia, acute liver injury, acute myocardial injury, and elevated lactate or procalcitonin (Table 4). The final model, which we termed the ALPACA (AAALLPPPACA) score, is shown in Table 5. The maximal possible score in the model is 33, and the minimum score is 0. At a score of 0, sensitivity is >99% and specificity is <10%. At a score of 21, specificity reaches 100% and sensitivity is 5%. As shown in Figure 3a, a cutoff of 8.5 maximized Youden's index, giving 74% sensitivity and 69% specificity, with an area under the curve of 0.783 (95% confidence interval 0.76-0.81).
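Youden's index is J = sensitivity + specificity − 1; for the reported cutoff, J = 0.74 + 0.69 − 1 = 0.43. A minimal sketch of how such a cutoff can be located for an integer score is given below. It is not the authors' code: the function name and toy data are hypothetical, and it assumes that a patient is classed as high risk when the score exceeds the cutoff and that candidate cutoffs are midpoints between consecutive observed scores (which is how a half-integer cutoff such as 8.5 can arise).

```python
def youden_cutoff(scores, died):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    scores : risk score per patient
    died   : 1 if the patient died in hospital, 0 otherwise
    Assumes both outcomes occur and at least two distinct scores exist.
    """
    positives = sum(died)
    negatives = len(died) - positives
    distinct = sorted(set(scores))
    # Candidate cutoffs: midpoints between consecutive distinct scores
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for cutoff in candidates:
        # True positives: deaths with a score above the cutoff
        tp = sum(1 for s, d in zip(scores, died) if d and s > cutoff)
        # True negatives: survivors with a score at or below the cutoff
        tn = sum(1 for s, d in zip(scores, died) if not d and s <= cutoff)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Toy data, for illustration only (not study data)
scores = [1, 2, 3, 5, 6, 7, 9, 10]
died = [0, 0, 0, 0, 1, 0, 1, 1]
print(youden_cutoff(scores, died))
```

Youden's index weights sensitivity and specificity equally; a triage setting that values sensitivity more highly might reasonably choose a lower cutoff than the one this procedure returns.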
The validation cohort comprised 2067 patients (61.5 ± 17.5 years, 50% women) who were admitted with a positive COVID test between 06/01/2020 and 12/31/2020. The average length of stay was 7.4 ± 8.0 days and in-hospital mortality was 9.6%. Among the 95% of these patients for whom renal function data were available, 25.2% had renal injury, and in these individuals, mortality was 17.2%. Evidence of myocardial injury was present in 11.9%, but data were incomplete for over 50% of patients (i.e., troponins were only measured in 912 patients). Mortality in patients with cardiac injury was 29.3% and the mean length of hospitalization was 12.5 days. Major organ dysfunction was present in 47.1%. In comparison with the derivation cohort, this population was significantly younger, had higher rates of liver injury and lower rates of cardiac injury, and had significantly higher albumin and significantly lower mortality and length of stay (Table 6). The difference in mortality between the derivation and validation cohorts reflected national and international trends of decreasing mortality with COVID-19 illness (20, 21). This was likely due to a combination of factors, including a younger patient population, lower community prevalence, and the use of corticosteroids (22, 23). However, the populations' mean ALPACA score and rates of any organ injury were not significantly different, and pre-existing comorbidities were similar between the two cohorts. Understanding prognosis can be useful to clinicians for bed management, care delivery, and palliative discussions in a disease that has affected an enormous number of patients throughout the world. In examining patterns of illness between the early pandemic and subsequent waves of illness, several interesting patterns emerge. The first is that the mean age of the population is lower.
In numerous studies, the concern was that this indicated that younger patients were increasingly developing more severe infections with COVID-19 (26) . However, the fact that the prevalence of comorbidities was not significantly different also suggests that those who were ill enough for hospitalization were already suffering from chronic illnesses, and that younger healthy individuals may not have been disproportionately more affected in later stages of the pandemic. Another interesting observation from our data is that the prevalence of end-organ injury did not significantly decrease in the validation cohort, despite decreases in mortality and length of stay. One way to interpret this finding would be that severe infection of organ systems is not an age-dependent phenomenon. In addition, the lower mortality rate may indicate improvements in care that allowed patients to survive COVID-19 infection despite significant end-organ damage during infection. Our relatively simple prediction model, derived from a manually curated patient-level database, makes ALPACA unique in comparison to others developed over the course of the pandemic. Our independently derived risk factors for mortality, represented in ALPACA, are similar to those used in other COVID-19 risk calculators (18, 20) . Additionally, the excellent performance of this model for estimating mortality in the validation cohort, despite changes in age, length of stay, mortality, and other variables, suggests that ALPACA is a robust predictive method. The validation cohort included patients admitted both during "surge" and "non-surge" conditions in Louisiana, and our model performed well. A systematic review of COVID-19 prediction models found that all available prediction models were at high risk of bias (12) . A low event rate in the studies' validation cohorts was one of the major concerns in the majority of these prediction models (12) . 
Another study evaluated 22 models (including 17 models developed specifically for COVID-19) and found that no prognostic model offered higher net benefit than univariable predictors (specifically, age and admission oxygen saturation) (13). Moreover, 10 of these 17 models were developed in the Chinese population, and therefore may not be as robust in a different population. The ALPACA COVID-19 mortality risk stratification score appears effective and robust. However, our data are limited to a single geographic region in the Southeastern United States. Furthermore, though the number of patients examined was substantial, this risk model still falls far short of several other studies using much higher numbers of patients, with data sourced via automated algorithms. Another limitation of the study was that we chose not to include mortality shortly after discharge from acute care settings, in order to simplify the data collection process. This limits the ability of the model to predict mortality beyond hospital discharge. Lastly, the variation in data collection between the derivation and validation cohorts (specifically, a significantly reduced proportion of troponin testing in the validation cohort) may have introduced some bias. However, the fact that the model performed almost identically in both cohorts does suggest that it is robust. With the emergence of new SARS-CoV-2 variants and significant time left prior to widespread vaccination, there is still ample time for developing tools to better triage and manage hospitalized COVID-19 patients. With additional validation, the ALPACA score could be a potent tool to help manage hospital bed shortages, identify proper patient placement, and, it is hoped, help evaluate novel treatment strategies for seriously ill COVID-19 patients.
The ALPACA score, derived early in the COVID-19 pandemic and validated in late 2020, is a valuable tool for risk-stratifying COVID-19 infected patients for the endpoint of in-hospital mortality.
Acute myocardial injury: Troponin I above ULN with 50% change in subsequent level (rise or fall) checked at a 3-6 hr interval.
Chronic myocardial injury: Troponin I elevation above ULN with <50% change in subsequent levels.
Cardiogenic shock: Heart failure requiring inotropic or mechanical support.
Acute renal injury: Creatinine elevation 1.5 times baseline.
Acute hepatic injury: Aminotransferase levels >2 x ULN or INR >1.5 in absence of underlying liver disease.
Thrombotic events: New diagnosis of deep vein thrombosis or pulmonary embolism on imaging.
a INR, international normalized ratio; ULN, upper limit of normal.
A systematic review and meta-analysis of published research data on COVID-19 infection fatality rates
An interactive web-based dashboard to track COVID-19 in real time
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
COVID-19 cytokine storm: the interplay between inflammation and coagulation
Pathological inflammation in patients with COVID-19: a key role for monocytes and macrophages
Clinical Characteristics of Covid-19 in New York City
Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area
Baseline Characteristics and Outcomes of 1591 Patients Infected With SARS-CoV-2 Admitted to ICUs of the Lombardy Region
Association of TNFalpha G-308 a Promoter Polymorphism with the Course and Outcome of COVID-19 Patients
Cytokine release syndrome in severe COVID-19: interleukin-6 receptor antagonist tocilizumab may be the key to reduce mortality
Impact of Pre-Infection Left Ventricular Ejection Fraction on Outcomes in COVID-19 Infection: Ejection Fraction and COVID Outcomes. Current problems in cardiology
Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal
Systematic evaluation and external validation of 22 prognostic models among hospitalised adults with COVID-19: an observational cohort study
KDIGO clinical practice guidelines for acute kidney injury
The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. On behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine
Chloroquine and hydroxychloroquine in covid-19
Developing a COVID-19 mortality risk prediction model when individual-level data are not available
Variation in US Hospital Mortality Rates for Patients Admitted With COVID-19 During the First 6 Months of the Pandemic
Improving Survival of Critical Care Patients With Coronavirus Disease 2019 in England: A National Cohort Study
Association Between Administration of Systemic Corticosteroids and Mortality Among Critically Ill Patients With COVID-19: A Meta-analysis
Decreased COVID-19 Mortality-A Cause for Optimism
Assessing the Calibration of Dichotomous Outcome Models with the Calibration Belt
A General-purpose Nomogram Generator for Predictive Logistic Regression Models
Younger Adults Caught in COVID-19 Crosshairs as Demographics Shift
COVID-19 mortality risk assessment: An international multi-center study