key: cord-0997124-cofekuly
authors: Wu, Shangrong; Du, Zhiguo; Shen, Sanying; Zhang, Bo; Yang, Hong; Li, Xia; Cui, Wei; Chen, Fangxiong; Huang, Jin
title: Identification and validation of a novel clinical signature to predict the prognosis in confirmed COVID-19 patients
date: 2020-06-18
journal: Clin Infect Dis
DOI: 10.1093/cid/ciaa793
sha: 34c8d39c5c09b0cb79aaf0afeb599821cd6844c0
doc_id: 997124
cord_uid: cofekuly

BACKGROUND: This study aims to identify a prognostic biomarker to predict the disease prognosis and reduce the mortality rate of COVID-19, which has caused a worldwide pandemic.
METHODS: COVID-19 patients were randomly divided into training and test groups. Univariate and multivariate Cox regression analyses were performed to identify the disease prognosis signature, which was selected to establish a risk model in the training group. Furthermore, the disease prognosis signature of COVID-19 was validated in the test group.
RESULTS: The signature of COVID-19 was combined with five indicators, namely neutrophil count, lymphocyte count, procalcitonin, older age, and C-reactive protein. The signature stratified patients into high- and low-risk groups with significantly relevant disease prognosis (log-rank test, P<0.001) in the training group. The survival analysis indicated that the high-risk group displayed substantially lower survival probability than the low-risk group (log-rank test P<0.001). The area under ROC curve (AUC) showed that the signature of COVID-19 displayed the highest predictive accuracy regarding disease prognosis, which was 0.955 in the training group and 0.945 in the test group. The ROC analysis of both groups demonstrated that the predictive ability of the signature surpassed the use of each of the five indicators alone.
CONCLUSION: The signature of COVID-19 presents a novel predictor and prognostic biomarker for closely monitoring patients and providing timely treatment for those who are severely or critically ill.
A type of pneumonia with an unknown etiology, termed coronavirus disease 2019 (COVID-19), caused a rapidly spreading outbreak induced by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) [1] [2], and was declared a public health emergency of international concern on January 30, 2020, by the World Health Organization (WHO). China has enforced the most drastic of all classic public health measures to bring the epidemic under control, but the situation in other countries is not optimistic. Since late February 2020, new cases have been reported daily in other parts of the world. By March 25, 2020, the cumulative number of confirmed cases abroad passed the 340000 mark, with more than 15000 deaths, which far exceeded the cases in China. In addition to taking strict preventative measures against the epidemic to curb rapid large-scale outbreaks, the timely treatment of severely or critically ill patients plays a significant role in reducing COVID-19 fatalities. Previous studies on the emergence of this novel coronavirus and its clinical features suggested that older age, male gender, underlying comorbidities, elevated d-dimer at admission, and progressive radiographic deterioration on follow-up CT might be risk factors for patients infected with SARS-CoV-2 [3] [4]. Until now, no antiviral treatment or vaccine has proven effective against the coronavirus infection. Infected patients classified as being severely or critically ill may develop multiple organ malfunctions, including acute respiratory distress syndrome and acute cardiac injury [5] [6] [7], emphasizing an urgent need to establish a predictive model for monitoring a patient's risk of developing critical disease symptoms and reducing the mortality rate.
This study identifies a significant, independent COVID-19 signature in patients via multivariate Cox regression, which may serve as a foundation for accurate individual diagnosis and treatment of severely or critically ill patients.

This was a single-center retrospective study performed from January 27, 2020, to February 26, 2020, at Wuhan Fourth Hospital. A total of 270 patients with laboratory-confirmed SARS-CoV-2 infection were classified into two groups, namely moderately ill and severely or critically ill, according to the Guidance for Corona Virus Disease 2019 (6th edition) [8], as announced by the National Health Commission of China. The definitions of moderately ill, severely ill, and critically ill are shown in Table S1. The research team consisted of experienced respiratory physicians, radiologists, and laboratory physicians. Data that were missing from the available records, or that required clarification, were obtained through direct communication with the attending physician and other healthcare providers. Finally, the patients were randomly divided into training and test groups.

The patient characteristics were obtained from electronic medical records and included clinical features, signs and symptoms, comorbidities, imaging features of the chest, laboratory findings, treatments, and antiviral or anti-inflammatory drugs. This information was documented on a standardized record form.

Throat swab samples were collected from the patients suspected of being infected with COVID-19 for the extraction of SARS-CoV-2 RNA. The respiratory samples were then transferred into a collection tube containing 2 ml of cell lysates and vortexed for 30 s. The RNA was extracted from the samples using the appropriate kit (Liferiver, Shanghai, China).
The clinical laboratory investigation included a complete blood count and biochemical serum tests (including liver and kidney function), as well as the assessment of coagulation function and the myocardial enzyme spectrum.

A previously reported method was adopted for the construction of the signature model [9] [10] [11]. First, a univariate Cox regression analysis was used to determine which indicators were associated with disease prognosis, after which 25 significantly correlated indicators were identified (P-value < 0.05). Furthermore, a multivariate Cox regression analysis was employed to screen for the most powerful determinants and to construct a model consisting of five indicators (neutrophil count, lymphocyte count, procalcitonin, age, and C-reactive protein) to assess the risk of prognosis (P-value < 0.05, concordance index = 0.93, and the lowest AIC). This process allowed for the construction of a model capable of assessing the risk factors of prognosis according to a risk score equation.

The signature of COVID-19 selected above was used to construct a risk model, employing the median risk score as the cutoff value to divide the training and test patients into either high-risk or low-risk groups. Then, the predictive value of the signature in the test dataset was validated using survival analysis, as well as ROC analysis. All assessments were performed using the R project (https://cloud.r-project.org/) (version 3.5.1) with the pROC and disease prognosis packages downloaded from Bioconductor (https://bioconductor.org).

A total of 270 patients with confirmed COVID-19 participated in this study, of which 203 (75.2%) were moderately ill, and 67 (24.8%) were severely ill (Table 1).

The workflow used to derive the prognostic signature of COVID-19 is displayed in Figure 1. The training group (n = 210) was used to explore the association between disease prognosis and the indicators.
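The risk model described above can be illustrated with a minimal Python sketch. It assumes the usual Cox-style linear predictor, in which the risk score is the sum of each regression coefficient multiplied by its indicator value. The coefficients, patient values, and function names below are hypothetical placeholders, not the fitted values of this study, and reusing the training median as the cutoff for a test patient is likewise an assumption of the sketch.

```python
from statistics import median

# Hypothetical coefficients for illustration only; the fitted values
# from the study are not reproduced in this text.
COEFFICIENTS = {
    "neutrophil_count": 0.10,
    "lymphocyte_count": -0.80,
    "procalcitonin": 0.50,
    "age": 0.03,
    "c_reactive_protein": 0.01,
}


def risk_score(indicators):
    """Linear predictor: sum of coefficient * indicator value."""
    return sum(COEFFICIENTS[name] * indicators[name] for name in COEFFICIENTS)


def assign_groups(scores, cutoff):
    """Label a score 'high' if it exceeds the cutoff, otherwise 'low'."""
    return ["high" if s > cutoff else "low" for s in scores]


# Made-up training patients (not study data).
training = [
    {"neutrophil_count": 3.0, "lymphocyte_count": 1.5, "procalcitonin": 0.05,
     "age": 40, "c_reactive_protein": 5.0},
    {"neutrophil_count": 8.0, "lymphocyte_count": 0.6, "procalcitonin": 0.5,
     "age": 70, "c_reactive_protein": 80.0},
    {"neutrophil_count": 4.0, "lymphocyte_count": 1.2, "procalcitonin": 0.1,
     "age": 55, "c_reactive_protein": 20.0},
    {"neutrophil_count": 10.0, "lymphocyte_count": 0.4, "procalcitonin": 1.0,
     "age": 80, "c_reactive_protein": 120.0},
]

training_scores = [risk_score(p) for p in training]
cutoff = median(training_scores)  # median risk score used as the cutoff
training_groups = assign_groups(training_scores, cutoff)

# Assumption: the training cutoff is reused to classify a new (test) patient.
test_patient = {"neutrophil_count": 9.0, "lymphocyte_count": 0.5,
                "procalcitonin": 0.8, "age": 75, "c_reactive_protein": 100.0}
test_group = assign_groups([risk_score(test_patient)], cutoff)[0]
```

With these placeholder numbers, a negative coefficient for lymphocyte count lowers the score, which mirrors the protective role the text attributes to a high lymphocyte count.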
Univariate Cox regression analysis of the indicator data was initially performed, with the survival time and overall status as the dependent variables. Here, 25 indicators were identified that significantly correlated with the disease prognosis in the patients (P-value < 0.05, Figure 2, Table S2). Furthermore, a multivariate Cox regression analysis (Figure 3) was employed to screen for the most powerful prognostic determinants and to construct a model consisting of five indicators (neutrophil count, lymphocyte count, procalcitonin, age, and C-reactive protein) to assess the risk of the prognosis. The risk scores (Table S3) were determined from the combination of these five indicators, where RS is the risk score and ID is the indicator value. This analysis yielded a risk score of the selected signature of COVID-19 for each patient.

The median risk score was used to divide the training group into a low-risk group (n = 104) and a high-risk group (n = 106). The results of the survival analysis revealed that the high-risk group demonstrated significantly lower survival rates than the low-risk group (log-rank test P<0.001; Figure 4A). As the time after disease diagnosis increased, the survival probability of the high-risk group fell to 0.59, while the low-risk group was not faced with life-threatening risk.

The same disease prognosis risk score model was used to calculate the signature-based risk scores of the test group patients, validating the prediction power of the signature. Similarly, the test dataset was divided into two groups, namely a high-risk group (n = 12) and a low-risk group (n = 48). The two risk groups in the test dataset were compared using survival analysis (Figure 4B). Survival in the high-risk group of the test dataset was significantly lower than in the low-risk group (log-rank test P<0.001).
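The survival probabilities quoted for the risk groups are the kind of quantity a Kaplan-Meier estimate produces. The following is a minimal sketch of that estimator in plain Python, under the assumption that each patient contributes a follow-up time and an event flag; the function name and the toy data are illustrative and do not come from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per patient (for example, days since diagnosis)
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Deaths observed exactly at time t.
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy data (not study data): six patients, three observed deaths.
toy_times = [2, 5, 5, 8, 10, 14]
toy_events = [1, 1, 0, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Computing one such curve per risk group and comparing them with a log-rank test is the standard way to obtain the group comparison reported in Figure 4A and Figure 4B; the log-rank step is omitted here to keep the sketch short.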
The results indicated that when the disease had progressed for 14 days, the survival probability of the high-risk group was only 0.3, which was significantly lower than in the low-risk group (survival probability = 0.85).

ROC analysis was performed to test the prediction power of the signature of COVID-19, with a larger AUC indicating a better model for predicting the disease prognosis in COVID-19 patients. In the training group, the predictive ability of the five-indicator signature was high (AUC of the signature = 0.955, Figure 4C), further demonstrating that the signature in this study was a novel and highly accurate survival biomarker. A similar, highly accurate result was evident in the test group as well (AUC of the signature = 0.945, Figure 4D).

COVID-19 is a highly infectious disease characterized by a long incubation period and rapid onset, with no specific treatment method currently available. It is crucial to find signature factors for the contraction of COVID-19 [12]. Furthermore, a recent study found that cardiac troponin I (≥ 0.05 ng/mL) is also an important predictor for the prognosis of COVID-19 [13]. Here, multivariable Cox regression analysis was used to assess the independence of the signature, which had an AUC of 0.955 in the training group and 0.945 in the test group, indicating its potential as a powerful survival biomarker. The signature combining the five indicators (neutrophil count, lymphocyte count, procalcitonin, age, and C-reactive protein) was strongly associated with the physiological status of COVID-19 patients, such as inflammation and immune function. Therefore, the signature could be a more effective biomarker in a multi-dimensional model.

cases to severe acute respiratory distress syndrome (ARDS) and respiratory failure [14], as did severe pneumonia and acute heart injury [5].
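The AUC values reported in the ROC analysis can be read as the probability that a randomly chosen patient with a poor outcome receives a higher risk score than a randomly chosen patient with a good outcome. The sketch below computes the AUC directly from that pairwise definition in plain Python. It is not the pROC implementation used in the study, and the function name and toy data are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    scores : risk score per patient
    labels : 1 for a poor outcome (positive), 0 for a good outcome (negative)
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (not study data).
toy_scores = [0.2, 0.9, 0.4, 0.8, 0.3]
toy_labels = [0, 1, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
```

Applying the same function to the combined risk score and to each single indicator in turn would reproduce the style of comparison described in the abstract, where the signature is judged against each of the five indicators alone.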
Complement-mediated systemic inflammation may be an underlying mechanism for the pathogenic response to the SARS infection. Previous research found that complement-deficient mice displayed reduced pulmonary neutrophils and an attenuated pro-inflammatory response [15]. Neutrophils infiltrate the tissues infected with coronavirus, promoting the expression of pro-inflammatory cytokines and chemokines, which might induce extensive lung damage in SARS, MERS-CoV, and SARS-CoV-2 infection [5] [16] [17]. Furthermore, studies have revealed that patients with SARS who have a high neutrophil count on admission to the hospital are more likely to have a poor prognosis [18] [19]. A recent study found that patients with refractory COVID-19 exhibited higher neutrophil levels on admission [20], corroborating the findings of this study. This result may be closely related to the inflammation caused by neutrophils, which leads to tissue damage.

Both C-reactive protein and procalcitonin can reflect the inflammatory state of the body. Procalcitonin shows a certain correlation with microbial invasion and is one of the most promising biomarkers for the diagnosis of sepsis [21]. C-reactive protein can be induced by inflammation, playing a crucial role in activating the complement system and neutrophils, while promoting the secretion of IL-6, IL-1β, and TNF-α, which contribute to further inflammation [22]. Severely or critically ill patients with COVID-19 may develop sepsis, which is a significant contributor to the mortality rate. C-reactive protein and procalcitonin are reportedly helpful in the diagnosis of sepsis [23], while serum procalcitonin levels appear to correlate with the severity of the microbial attack. Elevated C-reactive protein and procalcitonin levels are more common in COVID-19 patients with heart injury, placing them at a higher risk of hospital death [24].
Moreover, high levels of C-reactive protein and procalcitonin exhibit a significant correlation with pulmonary inflammation [25] and are reportedly associated with patients who are severely ill with COVID-19 [26]. In a recent retrospective study involving COVID-19, the C-reactive protein and procalcitonin levels were higher in deceased patients [3] [27], which corresponds with the results of this study. The indicators of the prognostic factors help to identify the severity of the COVID-19 disease while showing that secondary bacterial infections cannot be ignored.

A high lymphocyte count is considered a protective factor for COVID-19 since severe lymphopenia was predictive of poor outcomes [28]. T-cells play a critical role in inhibiting the overactive innate immune response while maintaining immune homeostasis during SARS- prevent re-infection [29] [30]. A previous study indicated that the lymphocyte count is crucial during the early screening, diagnosis, and treatment of critically ill COVID-19 patients [31]. This study found that serious lymphopenia was more common in severely ill patients, indicating that SARS-CoV-2 might affect lymphocytes, while cell-mediated immunity might be associated with disease severity. Research indicated that the depletion of T-cells resulted in immune dysregulation, accompanied by increased inflammation, cytokine storms, and the aggravation of damaged tissue [31], which was consistent with the supposition of this study.

Older age has always been a risk factor for a series of diseases, and this research was no exception. Previous reports indicated that patients who died of SARS-CoV-2 infection had an older median age [32] [33]. As suggested in recent studies on a similar topic [3] [34], this study showed that the median age of severely or critically ill patients was older than that of moderately ill patients.
Therefore, it seems that the elderly may have a high likelihood of developing chronic underlying comorbidities (diabetes, hypertension, and heart disease) and are more susceptible to COVID-19 with a poor outcome [6], which could be attributed to the elderly often being physically fragile with weak immune systems. People in this age group experience immune senescence [35], accompanied by decreased immune defense functionality, a weakened ability for the proliferation and differentiation of T- and B-cells in the lymph nodes, reduced effector functionality, as well as poor coordination between innate immunity and acquired immune response, contributing to increased morbidity and mortality [36]. These results indicated that disease prognosis in the elderly requires careful attention and timely treatment.

The risk score of the selected signature of COVID-19 was calculated to further verify its ability as a prognostic biomarker in patients infected with the disease. The training group was divided into low-risk and high-risk groups according to the median risk score. The results showed that the high-risk group displayed a significantly lower survival rate than the low-risk group. Therefore, the risk model based on the signature combining the five indicators could be used to predict the disease prognosis and the survival rate, making full use of medical resources to provide critically ill patients with better treatment and reducing the mortality rate of COVID-19.

This study exhibited several limitations. First, since this was a retrospective, single-center sample study, potential biomarkers such as underlying diseases, which could predict the prognosis of COVID-19, were not included in the model. Therefore, a multi-center large sample study would be preferable for assessing the prognostic markers of COVID-19.
Second, although a retrospective study of moderately and severely or critically ill patients was performed to establish the signature combined with the clinical indicators, the laboratory data regarding cardiac troponin I and oxygen partial pressure, as well as BMI, were not available. These elements were therefore not included in the risk factor analysis, owing to the severity of the epidemic at that specific time and the shortage of medical resources. Notwithstanding these limitations, the consistent correlation of the signature with the overall survival rate in this study indicates that it is a dominant independent signature of COVID-19.

The signature of COVID-19 is an effective prognostic biomarker that can be used during the risk assessment of patients infected with the disease. It allows for close monitoring to provide timely treatment for severely or critically ill patients.

We thank all the medical staff and patients involved in the study.

We declare no competing interests.

Multivariate Cox regression analysis of the signature associated with disease prognosis.
Figure 4

Coronavirus Infections-More Than Just the Common Cold
The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health - The latest 2019 novel coronavirus outbreak in Wuhan, China
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study
Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: a descriptive study
Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
Clinical Characteristics of 138 Hospitalized Patients With 2019 Novel Coronavirus-Infected Pneumonia in Wuhan
National Health Committee of the People's Republic of China. Diagnosis and Treatment of New Coronavirus Pneumonia (Trial Version 6)
Identification of DNA methylation signature to predict prognosis in gastric adenocarcinoma
The lncRNAs RP1-261G23.7, RP11-69E11.4 and SATB2-AS1 are a novel clinical signature for predicting recurrent osteosarcoma
Protein-coding genes combined with long noncoding RNA as a novel transcriptome molecular staging model to predict the survival of patients with esophageal squamous cell carcinoma
Prediction for Progression Risk in Patients with COVID-19 Pneumonia: the CALL Score
Predictors of Mortality for Patients with COVID-19 Pneumonia Caused by SARS-CoV-2: A Prospective Cohort Study. The European respiratory journal
Severe acute respiratory syndrome vs. the Middle East respiratory syndrome. Current opinion in pulmonary medicine
Complement Activation Contributes to Severe Acute Respiratory Syndrome Coronavirus Pathogenesis
Plasma inflammatory cytokines and chemokines in severe acute respiratory syndrome
MERS-CoV infection in humans is associated with a pro-inflammatory Th1 and Th17 cytokine profile
A major outbreak of severe acute respiratory syndrome in Hong Kong. The New England journal of medicine
Severe acute respiratory distress syndrome (SARS): a critical care perspective. Critical care medicine
Clinical infectious diseases: an official publication of the Infectious Diseases Society of America
Procalcitonin as a diagnostic marker for sepsis: a systematic review and meta-analysis. The Lancet. Infectious diseases
C-reactive protein: an activator of innate immunity and a modulator of adaptive immunity
Effect of procalcitonin-guided antibiotic treatment on clinical outcomes in intensive care unit patients with infection and sepsis patients: a patient-level meta-analysis of randomized trials
Association of Cardiac Injury With Mortality in Hospitalized Patients With COVID-19 in Wuhan, China
Chest CT Findings in Patients With Coronavirus Disease 2019 and Its Relationship With Clinical Features. Investigative radiology
Clinical characteristics of 140 patients infected with SARS-CoV-2 in Wuhan
Clinical characteristics of 113 deceased patients with coronavirus disease 2019: retrospective study
Coronavirus Disease 2019 in elderly patients: characteristics and prognostic factors based on 4-week follow-up. The Journal of infection
Understanding the T cell immune response in SARS coronavirus infection
Cellular immune responses to severe acute respiratory syndrome coronavirus (SARS-CoV) infection in senescent BALB/c mice: CD4+ T cells are important in control of SARS-CoV infection
Clinical infectious diseases: an official publication of the Infectious Diseases Society of America

Data are shown as median (IQR), n (n/N%)