key: cord-314986-uhpe69k0 authors: Cai, Quan; Du, Si-Yao; Gao, Si; Huang, Guo-Liang; Zhang, Zheng; Li, Shu; Wang, Xin; Li, Pei-Ling; Lv, Peng; Hou, Gang; Zhang, Li-Na title: A model based on CT radiomic features for predicting RT-PCR becoming negative in coronavirus disease 2019 (COVID-19) patients date: 2020-10-20 journal: BMC Med Imaging DOI: 10.1186/s12880-020-00521-z sha: doc_id: 314986 cord_uid: uhpe69k0 BACKGROUND: Coronavirus disease 2019 (COVID-19) has emerged as a global pandemic. According to the diagnosis and treatment guidelines of China, negative reverse transcription-polymerase chain reaction (RT-PCR) is the key criterion for discharging COVID-19 patients. However, repeated RT-PCR tests lead to medical waste and prolonged hospital stays for COVID-19 patients during the recovery period. Our purpose is to assess a model based on chest computed tomography (CT) radiomic features and clinical characteristics to predict RT-PCR negativity during clinical treatment. METHODS: From February 10 to March 10, 2020, 203 mild COVID-19 patients in Fangcang Shelter Hospital were retrospectively included (training: n = 141; testing: n = 62), and clinical characteristics were collected. Lung abnormalities on chest CT images were segmented with a deep learning algorithm. CT quantitative features and radiomic features were automatically extracted. Clinical characteristics and CT quantitative features were compared between RT-PCR-negative and RT-PCR-positive groups. Univariate logistic regression and Spearman correlation analyses identified the strongest features associated with RT-PCR negativity, and a multivariate logistic regression model was established. The diagnostic performance was evaluated for both cohorts. RESULTS: The RT-PCR-negative group had a longer time interval from symptom onset to CT exams than the RT-PCR-positive group (median 23 vs. 16 days, p < 0.001). 
There was no significant difference in the other clinical characteristics or CT quantitative features. In addition to the time interval from symptom onset to CT exams, nine CT radiomic features were selected for the model. ROC curve analysis revealed AUCs of 0.811 and 0.812 for differentiating the RT-PCR-negative group, with sensitivity/specificity of 0.765/0.625 and 0.784/0.600 in the training and testing datasets, respectively. CONCLUSION: The model combining CT radiomic features and clinical data helped predict RT-PCR negativity during clinical treatment, indicating the proper time for RT-PCR retesting.

According to the diagnosis and treatment guidelines of China, negative RT-PCR is the key criterion for discharging COVID-19 patients. Clinical prediction of when RT-PCR becomes negative is critical for choosing the proper retesting time, preventing medical waste from repeated RT-PCR tests and unnecessarily prolonged hospital stays. Doctors need an objective and accurate method for predicting RT-PCR negativity during clinical treatment. Chest computed tomography (CT) can directly demonstrate lung lesions, and the CT manifestations of COVID-19 pneumonia have been reported in many studies [2-4]. Chest CT exams are useful as a supplement to RT-PCR tests in diagnosis [5-7] and in evaluating disease stage [2, 3, 8, 9] and severity [10-12]. Recently, deep learning techniques have been widely used in the detection and segmentation of COVID-19 lesions in chest CT images [13-16]. Based on a reliable segmentation method, the high-throughput and high-dimensional radiomic features on chest CT showed strong potential for predicting the true status of RT-PCR. We hypothesized that a model incorporating CT radiomic features and clinical characteristics can predict RT-PCR becoming negative. We collected the clinical data and chest CT features of mild COVID-19 patients in Fangcang Shelter Hospital in Wuhan, Hubei, aiming to establish a predictive model for RT-PCR becoming negative during the recovery period.
The study was approved by the institutional review board of the First Affiliated Hospital of China Medical University. Informed consent was waived due to the nature of the retrospective study. Between February 10, 2020, and March 10, 2020, the clinical data and CT images of COVID-19 patients at Fangcang Shelter Hospital in Hongshan Gymnasium, Wuhan, Hubei, were reviewed retrospectively. All cases were mild from the onset and during the course of hospitalization, as defined by no hypoxemia or respiratory distress (respiratory rate ≥ 30 breaths/min, requirement for oxygen treatment or mechanical ventilation, or SpO2 ≤ 93% on room air) [1] . Patients were included if they met the following criteria: (1) No abnormal clinical symptoms (fever and severe respiratory symptoms) for more than 3 days. (2) Underwent RT-PCR tests at least 3 times after abnormal clinical symptoms disappeared. (3) The first RT-PCR tests were performed between 3 and 5 days after abnormal clinical symptoms disappeared. (4) Underwent chest CT exams within 2 days after the first RT-PCR test. Patients with inconsistent results in the first two consecutive RT-PCR tests were excluded (Fig. 1a, b) . Novel coronavirus 2019-nCoV nucleic acid detection kit (fluorescence PCR method) (Sansure Biological Technology Co., Ltd., Changsha, China, Serial Number: 20150036) was used for RT-PCR tests. The enrolled patients were divided into two groups: RT-PCR-negative and RT-PCR-positive groups (Fig. 1a , b). Inclusion criteria for the RT-PCR-negative group were: (1) All RT-PCR tests were negative; (2) No worsening clinical symptoms during hospitalization and the 2-week isolation after discharge. Inclusion criteria for the RT-PCR-positive group: the first two RT-PCR tests were positive. We collected 20 available clinical characteristics, including general characteristics (age, gender, time interval from symptoms onset to CT exams), comorbidities, vital signs on the CT scan day and laboratory tests on admission. 
Comorbidities included diabetes, hypertension, cardiovascular disease, chronic obstructive pulmonary disease, chronic liver disease and cancer. Vital signs on the CT scan day included heart rate, systolic blood pressure, diastolic blood pressure, respiratory rate, and blood oxygen saturation. Laboratory tests include white blood cell count, neutrophil count, lymphocyte count, platelet count, hemoglobin and neutrophil/lymphocyte ratio (NLR) (NLR = neutrophil counts/lymphocyte counts). The first RT-PCR tests for all enrolled patients were performed between 3 and 5 days after abnormal clinical symptoms disappeared. Then, all patients underwent CT exams within 2 days after the first RT-PCR test. Chest CT scanning used a mobile cabin CT (CT-NeuVz Prime, Neusoft) with a single breath-hold in the supine position. The scan parameters are as follows: tube voltage of 120 kVp, tube current of 100-200 mA, detector collimation of 64 or 128 × 0.625 mm, field of view of 350 mm × 350 mm, and matrix size of 512 × 512. Imaging data were reconstructed using a medium sharp reconstruction algorithm with a slice thickness of 5 mm and an interval of 1 mm. CT image analysis was performed on a dedicated workstation-Lung intelligence Kit (LK) Version V2.1.1. R (GE Healthcare, China). The main processes included data import and preprocessing, lung lobe segmentation, lesion segmentation and feature extraction (Fig. 2) . Lung lobes were segmented with the purpose of improving the accuracy of lesion segmentation and calculating the proportion of lesions in each lung lobe. Before lung lobe and lesion segmentation, the images were resampled to voxel size 1 × 1 × 1 mm 3 , and a Gaussian filter was applied for denoising. Then, a fully automatic segmentation of three-dimensional lung lobes and lesions based on deep learning algorithms was performed. 
In cases of unsatisfactory lung lobe and lesion segmentation, two thoracic radiologists (with 5 and 15 years of experience, respectively) blinded to the clinical information and RT-PCR results manually adjusted the contour and resolved discrepancies by consensus. After segmentation, 86 CT quantitative parameters were automatically calculated: the statistical results of lung lobe and lesion (volume, volume percentage, pneumonia score, average density, standard deviation of density) and the component analysis of the lesion (partial solidity, solidity and total lesions) (Additional file 1: Supplementary Data 1). In addition, 120 radiomic features in 7 categories were automatically calculated: (1) first-order features (n = 19); (2) 2D and 3D shape features (n = 26); (3) gray level co-occurrence matrix features (n = 24); (4) gray level run length matrix features (n = 16); (5) gray level size zone matrix features (n = 16); (6) neighboring gray tone difference matrix features (n = 5); and (7) gray level dependence matrix features (n = 14). Detailed names and definitions of all 120 features can be found in Additional file 1: Supplementary Data 2.

Missing values were replaced by the median, and the data were standardized by the following formula: standardized value = (original value − average value) / standard deviation. The patients were randomly assigned at a 7:3 ratio to either the training cohort or the testing cohort. All patients in the training cohort were used to build the predictive model, while patients in the testing cohort were used to independently evaluate the model's performance. To obtain the strongest features that were significantly associated with negative RT-PCR results in the training cohort, we performed univariate logistic regression analysis, and features with a p value < 0.10 were used for subsequent analysis. Then, Spearman correlation analysis was used to remove features that were highly correlated with others; features with |r| > 0.9 were considered redundant.
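The preprocessing and redundancy filter described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code: the function names are hypothetical, the univariate p values are assumed to have been computed beforehand, the population standard deviation is used for standardization, and the rule for which member of a correlated pair to keep (the one with the smaller univariate p value) is an assumption, since the text does not state it.

```python
from statistics import mean, median, pstdev


def impute_and_standardize(values):
    """Replace missing entries (None) with the median, then z-score the column."""
    observed = [v for v in values if v is not None]
    med = median(observed)
    filled = [med if v is None else v for v in values]
    mu = mean(filled)
    sd = pstdev(filled)  # assumption: population SD; the paper does not specify
    return [(v - mu) / sd for v in filled]


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def select_features(features, p_values, p_cut=0.10, r_cut=0.9):
    """Keep features with univariate p < p_cut, then drop any feature whose
    |Spearman r| with an already kept feature exceeds r_cut."""
    candidates = sorted(
        (name for name in features if p_values[name] < p_cut),
        key=lambda name: p_values[name],  # assumption: prefer smaller p
    )
    kept = []
    for name in candidates:
        if all(abs(spearman(features[name], features[k])) <= r_cut for k in kept):
            kept.append(name)
    return kept


# Toy data with hypothetical feature names and p values.
toy_features = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],   # monotone in "a", so Spearman r = 1
    "c": [5, 1, 4, 2, 3],
    "d": [3, 3, 1, 2, 5],
}
toy_p = {"a": 0.01, "b": 0.02, "c": 0.05, "d": 0.50}
selected = select_features(toy_features, toy_p)
```

In the toy data, "d" fails the p value screen and "b" is dropped as redundant with "a", which mirrors the two-stage reduction reported in the Results (226 to 20 to 10 parameters).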
We constructed a multivariate logistic regression model to identify a strategy to best classify RT-PCR-negative patients in the training dataset. Radiomics scores (Rad-scores) were calculated for each patient through a linear combination of the selected features with their respective coefficients. The predictive performance was evaluated in terms of discrimination (receiver operating characteristic (ROC) curve), calibration (calibration curve) and clinical application (decision curve).

Categorical variables are presented as the number and percentage of the total. The normality of continuous variables was evaluated by using the Shapiro-Wilk test. Continuous variables are shown as the mean ± standard deviation when normally distributed, or otherwise as the median (25th percentile, 75th percentile). The differences in variables between subgroups were assessed by the t test or Mann-Whitney U test as appropriate. The chi-squared test was used to compare categorical variables. All statistical analyses for the present study were performed with R 3.5.1 and Python 3.5.6. A two-tailed p value < 0.05 indicated statistical significance.

The flow diagram summarizing the selection of the enrolled patients is shown in Fig. 1b. For the 203 patients included in our study, the average number of RT-PCR tests was 6 ± 3, ranging from 3 to 12 during hospitalization. Of these patients, 122 (60.1%) were categorized in the RT-PCR-negative group, and 81 (39.9%) were categorized in the RT-PCR-positive group. Figure 3 shows CT images for cases in the RT-PCR-negative and RT-PCR-positive groups. Clinical information of the training and the testing cohorts is shown in Table 1. The RT-PCR-negative group had a longer time interval from symptom onset to CT exams than the RT-PCR-positive group (median 23 vs. 16 days for the total patients, p < 0.001). There was no significant difference in the other clinical characteristics.
The CT quantitative features are summarized in Additional file 1: Supplementary Data 1, and none of them differed between the two groups. A total of 226 characteristics were collected from each patient: 20 clinical characteristics, 86 quantitative features and 120 radiomic features. After the univariate logistic regression analysis was performed, 20/226 parameters were retained. Then, 10 features that were highly correlated (|r| > 0.9) with other features were removed as redundant based on the Spearman correlation analysis. Ultimately, 10/20 parameters (Table 2) were retained to build the model. The statistical summary of the multivariate logistic regression model is shown in Table 2. The time interval from symptom onset to CT exams and original_firstorder_Minimum had the highest odds ratio (OR) values (OR = 2.84 and 2.10, respectively) among all parameters. Figure 4 shows the Rad-score for each patient in the training and testing datasets. ROC curves of the model (Fig. 5) showed an area under the curve (AUC) of 0.811 with a sensitivity of 76.5%, specificity of 62.5% and accuracy of 70.9% in the training dataset and an AUC of 0.812 with a sensitivity of 78.4%, specificity of 60.0% and accuracy of 71.0% in the testing dataset. The calibration curve of Rad-scores for the differentiation of the RT-PCR-negative group demonstrated good consistency between prediction and observation in the training and testing cohorts (Fig. 6). The decision curve analysis showed that the model provided added net benefit within a certain range of threshold probabilities (above 0.20 in the training dataset and between 0.15 and 0.82 in the testing dataset; Fig. 7).

We demonstrated the usefulness of CT radiomic features for predicting RT-PCR negativity and established a predictive model based on CT radiomic features combined with clinical data in COVID-19 patients during the recovery period.
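As a reading aid for the quantities reported in the Results, the following minimal Python sketch shows how a Rad-score, threshold-based sensitivity, specificity and accuracy, and decision-curve net benefit are conventionally computed. It is an illustration under stated assumptions, not the authors' implementation: the function names, the intercept of 0.0, the two-predictor toy model and the toy scores are hypothetical. Only the two odds ratios (2.84 and 2.10) come from the text, converted to coefficients with the standard identity coefficient = ln(OR). Label 1 denotes the RT-PCR-negative group, the class the model is built to identify.

```python
import math


def rad_score(intercept, coefficients, features):
    """Linear predictor of a logistic model: intercept + sum(beta_i * x_i)."""
    return intercept + sum(beta * features[name] for name, beta in coefficients.items())


def predicted_probability(score):
    """Logistic transform of a Rad-score into a probability for label 1."""
    return 1.0 / (1.0 + math.exp(-score))


def threshold_metrics(scores, labels, cutoff):
    """Confusion counts and derived metrics when score >= cutoff predicts label 1."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= cutoff else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def net_benefit(tp, fp, n, threshold_probability):
    """Decision-curve net benefit: TP/n - FP/n * pt / (1 - pt)."""
    weight = threshold_probability / (1.0 - threshold_probability)
    return tp / n - (fp / n) * weight


# Hypothetical two-predictor model on standardized features; the real model
# has ten predictors and an intercept that are given only in Table 2.
coefficients = {
    "time_interval_z": math.log(2.84),       # OR = 2.84 from the text
    "firstorder_minimum_z": math.log(2.10),  # OR = 2.10 from the text
}
patient = {"time_interval_z": 1.0, "firstorder_minimum_z": 0.0}
score = rad_score(0.0, coefficients, patient)
probability = predicted_probability(score)

# Toy scores and labels, for illustration only.
metrics = threshold_metrics([2.0, 1.0, -0.5, -1.0, 0.3], [1, 1, 1, 0, 0], cutoff=0.0)
benefit = net_benefit(metrics["tp"], metrics["fp"], n=5, threshold_probability=0.5)
```

Because the features are standardized, an OR of 2.84 means that each one-standard-deviation increase in the time interval multiplies the odds of RT-PCR negativity by 2.84, with the other predictors held fixed.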
With AUCs of 0.811 and 0.812 for the training and testing datasets, respectively, we expect the model to help doctors effectively predict RT-PCR negativity during clinical treatment. The unsatisfactory sensitivity of RT-PCR detection is a major concern [5, 6, 17, 18]. To reduce the possibility of false negative RT-PCR results in our study, we included patients with repeated RT-PCR tests (average number: 6; range 3-12) during hospitalization. Only patients with consistent results on consecutive RT-PCR tests were included to ensure a true negative or positive RT-PCR status for the corresponding CT. A 2-week isolation after discharge was further used to rule out false negative RT-PCR results. Accurate lesion segmentation is the key to feature extraction and model construction. Colombi et al. [19] divided the lung parenchyma into upper, middle and lower zones in severe COVID-19 patients. They found that quantification of well-aerated lung parenchyma predicted adverse outcome. In the present study, we used automatic pneumonia segmentation software based on a deep learning algorithm. It detected the respiratory tract and lung lesions based on the actual segmentation of the lung lobes, so more comprehensive and complicated quantitative parameters and radiomic features were evaluated for model construction. Recently, deep learning algorithms have been widely used in the detection of COVID-19 lesions in chest CT images [13-16]. Most studies [13-15] applied them to chest CT images in the early stage of the disease course for diagnosis and differential diagnosis, while there are few studies regarding chest CT images of COVID-19 patients during the recovery period. We analyzed chest CT images after the abnormal clinical symptoms disappeared, and proposed a combination model of radiomic features and clinical data to predict RT-PCR negativity. Radiomic features played important roles in the model.
Among the 10 parameters in the model, 9 were CT radiomic features. The top five radiomic features are original_firstorder_Minimum, original_gldm_Small Dependence Low Gray Level Emphasis, original_glszm_Large Area High Gray Level Emphasis, original_firstorder_10Percentile, and original_shape_Sphericity (Table 2). These indicators represent the internal heterogeneity of lesions in morphology, density, texture and distribution, thus indicating disease severity. The time interval from symptom onset was the only clinical parameter selected in the model, and it had the strongest association with the RT-PCR-negative group (OR = 2.84). As expected, the longer the disease course, the more patients had negative RT-PCR results. We also analyzed the chest CT quantitative parameters, but none of them were included in the model. Increased numbers, extents, and densities of ground-glass opacities (GGOs) [20] and consolidations [21] represent progression in COVID-19 patients, as does the transformation of GGOs into consolidation [8]. Decreased sizes, extents, and degrees of such lesions could indicate improvement [21-25]. In our study, the recovering patients who had a negative RT-PCR result were expected to show smaller lesion volumes and lower CT values, but the quantitative parameters were not sensitive enough to capture these changes. The high-throughput and high-dimensional radiomic features could reflect more detailed changes inside the lesions than the CT quantitative parameters. No laboratory tests were included in the model. Neutrophils and lymphocytes are the main hematological indicators reflecting systemic inflammation.
Lymphocytopenia occurred in more than 80% of critically ill patients [26], while in an almost mild study population, [27]. Elevated baseline neutrophils in mild cases were not common, and only 6.3% of non-severe patients showed increases in the study by Zhang et al. [28]. Neutrophils also did not increase over the disease course for patients with mild disease and survivors [22, 29]. The patients included in our study were mild COVID-19 patients from Fangcang Shelter Hospital. Most laboratory tests were normal or slightly exceeded normal limits, and we did not find a significant difference in lymphocytes and neutrophils between the RT-PCR-negative and RT-PCR-positive groups.

*Normal range: Heart rate, 60-100 beats/minute; Systolic blood pressure, 90-140 mmHg; Diastolic blood pressure, 60-90 mmHg; Respiratory rate, 12-20 breaths/minute; Blood oxygen saturation, 95-100%; White blood cell count, 3.50-9.50 × 10^9/L; Neutrophil count, 1.80-6.30 × 10^9/L; Lymphocyte count, 1.10-3.20 × 10^9/L; Hemoglobin, 130-175 g/L; Platelet count, 125-350 × 10^9/L.

This study has several limitations. First, this retrospective study involved only mild COVID-19 cases, so the model cannot be applied to severe and critical cases. In addition, some laboratory tests, such as erythrocyte sedimentation rate and C-reactive protein, were not performed for the mild COVID-19 patients in Fangcang Shelter Hospital. Second, this is a single-center study, and multi-center data should be used for further verification. Moreover, we only built one model type and lacked comparative analysis with other model types, including decision trees, random forests and support vector machines. Finally, we did not provide a biological interpretation of the radiomic features. We are fully aware of the need for further exploration of these conclusions in subsequent studies.
In conclusion, the established model based on CT radiomic features and clinical data could help doctors predict RT-PCR negativity during clinical treatment, indicating the proper time for RT-PCR retesting.

Supplementary information accompanies this paper at https://doi.org/10.1186/s12880-020-00521-z.

Fig. 7 Decision curve analysis (DCA) in the training and testing datasets. The y-axis represents the net benefit (calculated by subtracting the weighted proportion of false-positive patients from the proportion of true-positive patients, where the weight is the relative harm of forgoing treatment compared with the negative consequences of unnecessary treatment). The red solid line indicates the model. The black solid line indicates the hypothesis that all patients were treated by one scheme (for example, assuming that all patients were in the RT-PCR-negative group). The black dotted line represents the hypothesis that all patients were treated by another scheme (for example, assuming that all patients were in the RT-PCR-positive group). The model shows added net benefit if the probability thresholds in the training and testing datasets are more than 0.20 and between 0.15 and 0.82, respectively.

1. National Health Commission of the People's Republic of China. COVID-19's diagnosis and treatment plan
2. Chest CT manifestations of new coronavirus disease 2019 (COVID-19): a pictorial review
3. Chest computed tomography findings of COVID-19 pneumonia: pictorial essay with literature review
4. Imaging features of the initial chest thin-section CT scans from 110 patients after admission with suspected or confirmed diagnosis of COVID-19
5. Correlation of chest CT and RT-PCR testing in Coronavirus Disease 2019 (COVID-19) in China: a report of 1014 cases
6. Sensitivity of chest CT for COVID-19: comparison to RT-PCR
7. Chest CT for typical 2019-nCoV pneumonia: relationship to negative RT-PCR testing
8. Time course of lung changes on chest CT during recovery from 2019 novel Coronavirus (COVID-19) pneumonia
9. Chest CT findings in coronavirus disease-19 (COVID-19): relationship to duration of infection
10. The clinical and chest CT features associated with severe and critical COVID-19 pneumonia
11. Chest CT severity score: an imaging tool for assessing severe COVID-19
12. CT image visual quantitative evaluation and clinical classification of coronavirus disease
13. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy
14. Rapid AI development cycle for the coronavirus (COVID-19) pandemic: initial results for automated detection and patient monitoring using deep learning CT image analysis
15. Classification of COVID-19 patients from chest CT images using multi-objective differential evolution-based convolutional neural networks
16. Lung infection quantification of COVID-19 in CT images with deep learning
17. Essentials for radiologists on COVID-19: an update-radiology scientific expert panel
18. Evaluating the accuracy of different respiratory specimens in the laboratory diagnosis and monitoring the viral shedding of 2019-nCoV infections
19. Well-aerated lung on admitting chest CT to predict adverse outcome in COVID-19 pneumonia
20. CT imaging features of 2019 novel Coronavirus (2019-nCoV)
21. Emerging 2019 novel Coronavirus (2019-nCoV) pneumonia
22. Clinical characteristics of 138 hospitalized patients with 2019 novel Coronavirus-infected pneumonia in Wuhan
23. Evolution of CT manifestations in a patient recovered from 2019 novel Coronavirus (2019-nCoV) pneumonia in Wuhan
24. Pre- and posttreatment chest CT findings: 2019 novel Coronavirus (2019-nCoV) pneumonia
25. Longitudinal CT findings in COVID-19 pneumonia: case presenting organizing pneumonia pattern
26. Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study
27. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
28. Analysis of clinical characteristics and laboratory findings of 95 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a retrospective analysis
29. Longitudinal characteristics of lymphocyte responses and cytokine profiles in the peripheral blood of SARS-CoV-2 infected patients

We thank Yan Guo of GE Healthcare for her help regarding statistics. We also thank American Journal Experts for providing language editing services.

LZ guaranteed the integrity of the whole study and edited the manuscript. LZ and SL designed the study. SD, SG and GH1 researched literature. Experimental studies and data analysis were performed by QC, PL1, ZZ, PL2 and GH2. SD, SG, GH1 and XW participated in statistical analysis. SD and QC participated in manuscript preparation. All authors read and approved the final manuscript.

This study was supported by National Financial Appropriation Research Project (Grant Number 2017YFC1309100) and National Scientific Foundation of China (Grant Number 81971695). The funding body had no role in the design of the study, collection, analysis, and interpretation of data, or in writing the manuscript.

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
The study was approved by the institutional review board of the First Affiliated Hospital of China Medical University. Informed consent was waived due to the nature of the retrospective study. Not applicable. The authors declare that they have no competing interests. Received: 15 June 2020 Accepted: 14 October 2020