key: cord-213974-rtltf11w authors: Lensink, Keegan; Laradji, Issam; Law, Marco; Barbano, Paolo Emilio; Nicolaou, Savvas; Parker, William; Haber, Eldad title: Segmentation of Pulmonary Opacification in Chest CT Scans of COVID-19 Patients date: 2020-07-07 journal: nan DOI: nan sha: doc_id: 213974 cord_uid: rtltf11w The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has rapidly spread into a global pandemic. A form of pneumonia, presenting as opacities within a patient's lungs, is the most common presentation associated with this virus, and great attention has gone into how these changes relate to patient morbidity and mortality. In this work we provide open source models for the segmentation of patterns of pulmonary opacification on chest Computed Tomography (CT) scans which have been correlated with various stages and severities of infection. We have collected 663 chest CT scans of COVID-19 patients from healthcare centers around the world, and created pixel-wise segmentation labels for nearly 25,000 slices that segment 6 different patterns of pulmonary opacification. We provide open source implementations and pre-trained weights for multiple segmentation models trained on our dataset. Our best model achieves an opacity Intersection-Over-Union score of 0.76 on our test set, demonstrates successful domain adaptation, and predicts the volume of opacification within 1.7% of expert radiologists. Additionally, we present an analysis of the inter-observer variability inherent to this task, and propose methods for appropriate probabilistic approaches. Healthcare systems around the world are overwhelmed and facing shortages of the essential equipment necessary to manage the symptoms of this disease. Rapid screening is necessary to diagnose the disease and slow its spread, and effective tools are essential for prognostication in order to efficiently allocate resources to those who need them most. While RT-PCR has emerged as the standard screening protocol for COVID-19 in many countries, the test has been shown to have high false negative rates due to its relatively low sensitivity, despite high specificity [28]. Recent work has shown that the analysis of chest CT scans by trained radiologists increases diagnostic sensitivity [1]. This is because the virus attacks and inhibits the alveoli of the lung, which fill with fluid in response, causing various forms of opacification visible on Computed Tomography (CT) scans. Due to the increase in density, these areas present on CT scans as regions of increased attenuation with preserved bronchial and vascular markings, known as ground glass opacity (GGO). When the accumulation of fluid progresses to the point of obscuring the bronchial and vascular markings, the pattern is known as consolidation. In addition to providing complementary diagnostic properties, the analysis of CT scans has great potential value for the prognostication of patients with COVID-19. The percentage of well-aerated lung (WAL) has emerged as a predictive metric for determining the prognosis of patients with confirmed COVID-19, including admission to the ICU and death [6]. The percentage of WAL is often quantified by visually estimating the volume of opacification relative to healthy lung, which is a time-consuming process, or roughly estimated automatically from attenuation values within the lung.
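As a concrete illustration of the automatic route, the percent WAL reduces to a ratio of mask volumes once lung and opacity segmentations are available. The sketch below is our own minimal example, not the authors' pipeline; the function and argument names are hypothetical, and it assumes both masks are defined on the same voxel grid, so the voxel size cancels out of the ratio.

```python
import numpy as np

def percent_wal(lung_mask: np.ndarray, opacity_mask: np.ndarray) -> float:
    """Percent well-aerated lung from two boolean masks of identical shape.

    Voxel counts stand in for volumes; with a uniform voxel grid the
    spacing cancels out of the ratio.
    """
    lung_voxels = lung_mask.sum()
    if lung_voxels == 0:
        raise ValueError("empty lung mask")
    opaque_voxels = np.logical_and(opacity_mask, lung_mask).sum()
    return 100.0 * (1.0 - opaque_voxels / lung_voxels)
```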
In addition to the percent of WAL, which does not account for the various forms of opacification, expert interpretation of CT scans can provide insight into the severity of the infection by identifying various patterns of opacification (see Table 1). The prevalence of these patterns, and the severity of the infection, also correlate with different stages of the disease [15, 24]. Therefore, automatic quantification of both the percent WAL and the opacification composition could enable efficient estimation of the stage of the disease, and even provide a glimpse of the risk of poor outcomes. Standard diagnosis requires experienced radiologists and is highly time-consuming. Thus, there is a need to develop machine learning techniques that deal with the problem in a quantitative way to support the radiological team. The task is difficult for a number of reasons, and there is a limited amount of available work focused on this particular problem. First, privacy restrictions and labelling costs lead to a lack of available public data. As a result, there is only one small public dataset [20] known to us at the time of writing this manuscript. For this reason, we are unable to publicly release the collected dataset at this time. Second, as we show in this work, the segmentation process has a subjective component. Indeed, the collected data is heavily influenced by the instructions provided to the annotators, and it is therefore likely problematic to combine datasets collected under different labelling regimes. Third, the cost of acquiring pixel-level segmentation labels is prohibitive. Compared to common computer vision tasks such as the annotation of street scenes, this task is non-trivial and requires careful attention by highly skilled expert annotators. In this study, we found that pixel-level annotation required an average of 60 minutes per scan, leading to a total of roughly 660 highly trained person-hours to acquire the dataset. From a machine learning point of view, the second point deserves special attention. As we show next, the segmentation task is especially difficult due to the subjectivity of the labels and the low inter-class variance. The regions of opacification are, by definition, hazy and without clearly defined borders. As such, an increased level of inter-observer variability is expected. In addition, the patterns that we wish to differentiate have low inter-class variability. While the patterns described in Table 1 are technically independent, they are not mutually exclusive in a given area, and distinguishing these complex patterns is very time-consuming and difficult for even expert radiologists. Given the challenges mentioned above, the goal of this work is to provide open source models for the segmentation of patterns of pulmonary opacification, which have been correlated with various stages and severities of COVID-19 pneumonia. While the development of open source models has been hindered by the lack of publicly available data, we hope that by releasing our models and pretrained weights, we can enable healthcare centers and researchers around the world to develop tools for the effective diagnosis and prognostication of COVID-19 on CT scans. In addition to our models, we hope to enable researchers by discussing the insights gained from our work on this difficult task, particularly those related to the incorporation of uncertainty and the high inter-observer variability between annotators.
We provide an open source software implementation 1, with a training procedure using a small public dataset, and an online visualization tool 2 to easily view predictions on our private dataset. Our contributions are as follows: 1. We have collected 663 chest CT scans of patients with COVID-19 pneumonia from healthcare centers around the world, and created pixel-wise segmentation labels for nearly 25,000 slices that segment 6 different forms of pulmonary opacification that have been correlated with stages and severity of COVID-19. 2. We provide open source implementations and pretrained weights for multiple segmentation models trained on our dataset, and show that these models adapt to domains that are withheld from the training set. We hope that by making this work publicly available we will ease the burden of development on healthcare centers around the world that have limited access to data. In this section we discuss the work that is most relevant to this paper. We start with semantic segmentation of CT scans for general medical problems, followed by semantic segmentation for COVID-19. Deep learning-based methods have been widely applied in medical image analysis to combat COVID-19 [10, 13, 23], for example to detect patients infected with COVID-19 via radiological imaging. COVID-Net [22] was proposed to detect COVID-19 cases from chest radiography or X-ray images. An anomaly detection model [19] was designed to assist radiologists in analyzing the vast amounts of chest X-ray images. For CT imaging, a location-attention oriented model was employed to calculate the infection probability of COVID-19. A weakly-supervised deep learning-based software system was developed in [26] using 3D CT volumes to detect COVID-19. Although plenty of AI systems have been proposed to provide assistance in COVID-19 diagnostics for clinical practice, there are only a few related works [8], and AI has not yet been shown to have a significant impact on clinical outcomes. Semantic segmentation of CT scans has been widely used for diagnosing lung diseases. Diagnosis is often based on segmenting different organs and lesions from chest CT slices, which can provide essential information for doctors to identify underlying disease processes. Many methods exist that perform lung nodule segmentation. Early algorithms used SVMs to extract features and detect nodule segmentations [13]. Later, algorithms based on deep learning emerged [10], including central focused CNNs [23] and GAN-synthesized data used to improve the training of a discriminative model for pathological lung segmentation. The latest methods segment lung tumors from CT slices using two deep networks that add multiple residual streams of varying resolutions, or multi-task learning of joint classification and segmentation. Semantic segmentation for COVID-19: While COVID-19 is a recent phenomenon, several methods have been proposed to analyze infected regions of COVID-19 in the lungs. [8] proposed a semi-supervised learning algorithm for automatic COVID-19 lung infection segmentation from CT scans; their algorithm leverages attention to enhance representations. Similarly, [27] proposed to use spatial and channel attention to enhance representations, and [5] augment UNet [17] with ResNeXt [25] blocks and attention to improve its efficacy. Although these methods are accurate, their computational cost can be prohibitive.
[3] similarly propose the segmentation of tomographic patterns from chest CT using a large annotated dataset, and compute measures of the severity of infection. We have worked with health centers around the world to retrospectively collect 663 CT scans of patients suspected of having COVID-19. The dataset is composed of scans from health centers in Canada, Italy, South Korea, Iran, and Saudi Arabia, where each respective health center's research ethics board approved use of the dataset. The dataset represents a relatively equal distribution of sex, with 321 female scans and 324 male scans. While all scans had a slice thickness of 1mm, the slice spacing varied depending on the healthcare centers' protocols for storing scans. We collected 112 scans with 1mm spacing, 84 scans with 5mm spacing, and 449 scans with 10mm spacing. CT scans are 3D volumes, where the x, y, and z axes are commonly used to refer to the anterior-posterior, lateral, and distal-proximal axes respectively. All studies were re-sampled prior to our collection to have an x and y resolution of 512 × 512, with a varying number of slices depending on the slice spacing and the length of the distal-proximal axis. Studies with 10mm spacing generally have 30-60 slices, whereas studies with 1mm slice spacing have 150-300 slices. General and sub-specialist radiologists (medical doctors trained in medical imaging) determined the most clinically relevant classes for annotation of the data. In order to aid prognosis and diagnosis, we segmented 6 different patterns of pulmonary opacification seen in COVID-19 pneumonia, as outlined in Table 1.

[Table 2: number of labelled slices (scans) in each split: 3,865 (116) validation, 16,208 (463) training, 24,975 (663) total, with 84 test scans.]

These patterns are composed of GGO, varying degrees of lobular septal thickening (the lobule being an anatomic unit of the lung), and/or consolidation, and are commonly used by radiologists to differentiate between different stages and severities of infection. Each tomographic pattern is defined and identified by its distinguishing spatial characteristics as outlined in [9]. The aim of this project is to automate the segmentation of these 6 patterns, as the information can aid radiologists in diagnosis and in evaluating patient prognosis in terms of admission to the ICU, the need for mechanical ventilation, and death. In addition to the patterns of pulmonary opacification we also annotated pleural effusion and lymphadenopathy, both of which are non-pulmonary findings that relate to the infection. The annotation team was composed of a number of practicing staff and resident radiologists at Vancouver General Hospital, as well as numerous medical students at the University of British Columbia. The medical students collected lung segmentation labels, while the radiologists and residents collected segmentation labels for the patterns of opacification. In order to compute the percent WAL we first segmented the total lung volumes and then the opacities within the lung. A team of 12 expert radiologists and residents used an online annotation tool to segment the patterns of pulmonary opacification. The radiologists were trained on how to use the software and instructed to annotate every slice in the 10mm spaced studies, and a portion of roughly 50 representative slices within the 5mm and 1mm spaced studies. As a result, the thinly spaced studies are partially labelled, as some slices, and some regions within slices, were left unlabelled. The dataset was split on the scan level in order to ensure that all slices from a scan were contained in the same set.
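Scan-level splitting is easy to get wrong when the unit of storage is the slice. The sketch below is our own illustration of the procedure described here and in the next paragraph (expert-selected test scans, then a random 20/80 validation/training split of the remaining scans); the record structure and `scan_id` field are hypothetical.

```python
import random
from collections import defaultdict

def split_by_scan(slices, test_ids, val_frac=0.2, seed=0):
    """Split slice records into train/val/test sets at the scan level,
    so that no scan contributes slices to more than one set."""
    by_scan = defaultdict(list)
    for s in slices:
        by_scan[s.scan_id].append(s)

    # Expert-selected test scans are taken first.
    test = [s for sid in test_ids for s in by_scan.get(sid, [])]
    # The remaining scans are shuffled and split 20/80 into val/train.
    remaining = sorted(sid for sid in by_scan if sid not in test_ids)
    random.Random(seed).shuffle(remaining)
    n_val = int(round(val_frac * len(remaining)))
    val = [s for sid in remaining[:n_val] for s in by_scan[sid]]
    train = [s for sid in remaining[n_val:] for s in by_scan[sid]]
    return train, val, test
```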
Due to the relatively small number of scans in the test set (84/663), it is composed of scans that were selected by an expert radiologist in order to ensure a clinically representative sample. These scans were selected such that the test set included a variety of presentations, including healthy lungs as well as lungs that contained opacities at varying stages of the disease. In order to test the effectiveness of domain adaptation, we withheld all studies from two locations, Italy and Vancouver, for inclusion in the test set. Following selection of the test set, the validation and training sets were randomly selected as 20% and 80% of the remaining studies, respectively. The final number of scans in each split from each region is presented in Table 2. The labelling process for the test set differed slightly from that of the training set in order to increase the quality of the annotations. The instructions for the annotators stayed the same to limit bias; however, we ensured that every slice of each scan in the test set was labelled, in order to compute a more accurate estimate of the ground truth percent WAL. In contrast, for the training set we prioritized getting slices labelled from more scans in order to increase the diversity of the training set, so we instructed the annotation team to label fewer slices in each study. This process allowed us to efficiently collect a high quality test set while balancing diversity in the training set. Standard computer vision applications and datasets, such as CamVid [2], contain very little variance when annotated by different annotators, because it is rather trivial to identify simple objects such as trees, houses, etc. However, this is not generally the case when working with medical datasets. For the problem at hand there is significant inter-observer variability, similar to other studies in medical imaging [12], because the segmentation task is non-trivial even for an expert. In an attempt to quantify this difference we performed a study in which 12 different experts annotated the same 43 slices, allowing us to compare and quantify the inter-observer variability present in our dataset. A visual comparison of the different annotations is presented in Figures 2 and 4. Qualitatively, we see that while there are differences between the radiologists, they are all generally focused on the same regions of the lung. We observe that there is large variability in the borders of the opacity, and even more so in the type of opacity. As we discuss next, these differences require attention when training a network. To see why traditional metrics may be insufficient for this problem, we compute the intersection over union (IOU), a standard metric in semantic segmentation [7]. The results are presented in the form of a matrix in Figure 3.

Figure 2. Comparison of the labels generated by each radiologist (columns) for 5 consecutive slices (rows) from the same study.

Figure 3. Inter-observer opacity IOU comparisons. Figure 3a shows a pair-wise distance matrix between all 12 radiologists, and Figure 3b shows the distance between each radiologist and the average prediction. We see that, by a large margin, each radiologist is more similar to the average prediction than to any of their peers.

As can be seen in Figure 3, the IOU of each radiologist relative to their peers is rather poor. An AI system with a similar IOU would typically be considered inoperable.
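The quantities behind Figure 3 amount to pairwise IOUs over binary opacity masks, plus each annotator's IOU against an averaged label. A minimal sketch of that computation follows; it is our illustration, and binarizing the average label by majority vote is our assumption about how the "average prediction" was formed.

```python
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """IOU between two boolean masks of the same shape."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

def interobserver_iou(masks: np.ndarray):
    """masks: boolean array of shape (n_annotators, n_slices, H, W).
    Returns the pairwise IOU matrix and each annotator's IOU against
    the averaged label."""
    n = masks.shape[0]
    pairwise = np.ones((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            pairwise[i, j] = pairwise[j, i] = iou(masks[i], masks[j])
    avg = masks.mean(axis=0) >= 0.5  # majority vote (our assumption)
    vs_avg = np.array([iou(m, avg) for m in masks])
    return pairwise, vs_avg
```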
The main reason that IOU is an inappropriate measure is that it assumes hard borders, whereas the problem at hand does not present such borders. Furthermore, since the decision on the opacification type is highly subjective, results tend to yield even lower IOUs. Thus, in the next section we discuss techniques to deal with the uncertainty in the data. By definition, GGO and consolidation are hazy, cloud-like opacities that do not have clear boundaries due to their physical structure; they are therefore better represented by a continuous probability distribution than by a discrete one. Unfortunately, commonly available annotation tools do not make it easy to incorporate uncertainty into the labelling, nor is it time-efficient to ask annotators to label in such a manner. In cases such as this, where segmentation is non-trivial and the object does not have a clear border, disagreement between hard labels should not simply be attributed to annotation error. Instead, we view each label as a sample from the ground truth, which is an unknown underlying continuous probability distribution. This can be modeled as $S_{\text{obs}} = \bar{S} + n(x, c)$, where $S_{\text{obs}}$ is the observed segmentation obtained by the radiologist, $\bar{S}$ is the average segmentation, and $n(x, c)$ is a noise model that depends on the location $x$, e.g. how close the region is to the edge of the object, and on the class of the object, $c$. The noise model is clearly correlated across classes, as some classes are more correlated with others because they present in a more similar way. It is also correlated in space, as pixels that are close to the boundary can depend on each other. While a comprehensive treatment of the problem is beyond the scope of this paper, we have been experimenting with non-parametric as well as simple parametric noise models. In this paper we present a simple parametric approach that uses only the first and second moments of the data, estimated from the uncertainty study. To this end, we assume that each data point (pixel) represents a measurement from a Gaussian statistic. We then use the KL-divergence to compare the probability obtained by the network to the probability parameterized by the Gaussian distribution. This approach takes into consideration the low-order statistics present in the data. We are currently collecting more data that compares different annotations of the same slice in order to develop a more comprehensive model that will allow us to train with noise models that are closer to realistic ones. In this section we describe the various deep learning approaches we have taken to solve the problem. Although the problem is 3D by nature, in this early phase of development we have focused on developing 2D methods that segment each axial slice independently. While this is clearly suboptimal, as the z axis contains relevant information, we decided to first focus on 2D methods to set a baseline for all future approaches. Training of 3D models is ongoing as new data arrives, and results will be presented in the future. A lung window of −1000 HU to 350 HU is applied to each slice, followed by normalization of the pixel values in each slice using the mean, −653.2 HU, and standard deviation, 628.5 HU, computed across the entire training set. In this initial work, we have grouped the 6 patterns into 3 clinically relevant groupings, as outlined in Table 1. These groupings are created in order to increase the inter-class variation while maintaining clinical relevance.
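The windowing and normalization step is straightforward to express in code. A minimal sketch using the stated window and training-set statistics follows; the function name is our own.

```python
import numpy as np

LUNG_WINDOW = (-1000.0, 350.0)          # HU, as stated above
TRAIN_MEAN, TRAIN_STD = -653.2, 628.5   # HU, computed over the training set

def preprocess_slice(hu_slice: np.ndarray) -> np.ndarray:
    """Clip a CT slice (in Hounsfield units) to the lung window, then
    normalize with the training-set mean and standard deviation."""
    windowed = np.clip(hu_slice, *LUNG_WINDOW)
    return (windowed - TRAIN_MEAN) / TRAIN_STD
```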
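The parametric noise model discussed above can be approximated in a few lines. The sketch below is our simplification, not the authors' exact formulation: per-pixel soft targets are built from the first moment (mean) of several annotators' hard labels, and the network's predicted distribution is compared to the target with a per-class-weighted KL divergence.

```python
import torch
import torch.nn.functional as F

def soft_targets(annotations: torch.Tensor) -> torch.Tensor:
    """annotations: (n_annotators, n_classes, H, W) one-hot masks.
    Returns a per-pixel class distribution (n_classes, H, W): the
    empirical mean segmentation, i.e. the first moment of the labels."""
    return annotations.float().mean(dim=0)

def kl_segmentation_loss(logits: torch.Tensor, target_probs: torch.Tensor,
                         class_weights: torch.Tensor) -> torch.Tensor:
    """KL(target || prediction), weighted per class.
    logits: (B, C, H, W); target_probs: (B, C, H, W); class_weights: (C,)."""
    log_pred = F.log_softmax(logits, dim=1)
    eps = 1e-8
    kl = target_probs * (torch.log(target_probs + eps) - log_pred)
    kl = kl * class_weights.view(1, -1, 1, 1)
    return kl.sum(dim=1).mean()
```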
In this initial phase we focus on training three popular 2D segmentation networks, which are a natural first step for this problem and do not require any special care given the wide variety of slice spacings in the dataset. We train a UNet [18], a DeepLabv3Plus with a ResNet50 backbone [4], and a PSPNet with an InceptionResNetv2 [21] backbone. In order to compare models we select a variety of common metrics. First, we use the Intersection-over-Union to evaluate the accuracy of the segmentation for each class, as $\mathrm{IOU} = \frac{TP}{TP + FP + FN}$. Although, as we have previously demonstrated, this metric does not capture the properties we desire, it is commonly used, and we compute it in order to compare with the known literature. In addition, we combine the probabilities for each opacity group together to compute the Opacity IOU, which we use to evaluate the ability of the model to distinguish between healthy lung and opacification. In order to more sensitively determine the accuracy of the model at computing the percent WAL, we compute the ratio of the predicted volume of opacification to the ground truth volume, a metric we call the Relative Volume (RV). This metric is computed over the entire test set as $\mathrm{RV} = \frac{\overline{TP} + \overline{FP}}{\overline{TP} + \overline{FN}}$, where $\overline{TP}$, $\overline{FP}$, and $\overline{FN}$ are quantities integrated over the entire test set. This metric is much less sensitive than the IOU, as it integrates the opacities; however, it is more sensitive than comparing the percent WAL directly, because it does not depend on the lung volume, which is often much larger than the volume of opacification. Therefore, unlike the IOU, even if individual boundaries do not match, the overall relative volume should still be correct. Furthermore, while the IOU of an individual structure has no real clinical value, the RV has significant clinical value, and therefore estimating it correctly is a much more desirable goal. In our training we use the ADAM optimizer [14] with a batch size of 64 to minimize the weighted KL divergence loss for 30 epochs, using a learning rate of $10^{-1}$ that is decayed by a factor of 10 every 10 epochs. The loss is weighted using the complement of the probability of each class, computed from the training set. The model selected for evaluation on the test set is chosen using the binary opacity IOU on the validation set.
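This training configuration maps directly onto standard PyTorch components. A minimal sketch under the stated hyper-parameters follows; `model`, `train_loader`, and `class_weights` are assumed to be defined elsewhere, and `kl_segmentation_loss` is the simplified loss sketched in the previous section.

```python
import torch

# model, train_loader, class_weights, and kl_segmentation_loss are assumed
# to be defined elsewhere (see the loss sketch above).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-1)  # lr as stated above
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(30):
    for images, target_probs in train_loader:
        optimizer.zero_grad()
        loss = kl_segmentation_loss(model(images), target_probs, class_weights)
        loss.backward()
        optimizer.step()
    scheduler.step()
```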
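Both evaluation metrics reduce to confusion counts. The sketch below (our illustration; the function names are hypothetical) computes the per-class IOU for a single label map and the RV from counts accumulated over the whole test set, mirroring the integrated quantities in the RV definition above.

```python
import numpy as np

def class_iou(pred: np.ndarray, gt: np.ndarray, cls: int) -> float:
    """IOU for one class, given integer label maps of the same shape."""
    p, g = pred == cls, gt == cls
    union = np.logical_or(p, g).sum()
    return np.logical_and(p, g).sum() / union if union else 1.0

def relative_volume(preds, gts, opacity_classes) -> float:
    """RV = (sum TP + sum FP) / (sum TP + sum FN), with the counts
    integrated over the entire test set rather than per scan."""
    tp = fp = fn = 0
    for pred, gt in zip(preds, gts):
        p = np.isin(pred, opacity_classes)
        g = np.isin(gt, opacity_classes)
        tp += np.logical_and(p, g).sum()
        fp += np.logical_and(p, ~g).sum()
        fn += np.logical_and(~p, g).sum()
    return (tp + fp) / (tp + fn)
```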
We visualize selected outputs of the model in Figure 5, where we see that the qualitative results are visually correct. In the first row we present a successful segmentation of pure GGO. While there are some variations in the border compared to the "ground truth" as estimated by a radiologist, all regions of opacification have been segmented by the model. It is important to note that, unlike other problems in computer vision where the ground truth is correct, it is impossible to know whether the radiologist is better than our trained model given only one label. Further studies are needed in order to obtain a full quantitative assessment of the results. In the second row we see a prediction on a slice where the "ground truth" contains a combination of group 2 and group 3. This presentation is typical where regions of GGO have progressed in severity and inter/intra-lobular lines have formed, a defining characteristic of group 3. Again, we see some disagreement between the prediction and the "ground truth" in terms of the exact border; however, the model has correctly identified each region of opacification. In qualitative terms, the prediction is equally as viable as the ground truth, and the difference between our segmentation and the "ground truth" is similar to the differences between segmentations by different radiologists. In the third row we show the effect of partial volume artifacts, which we consider to be the most common cause of false positives. In this case the diaphragm causes an opaque region in the anterior portion of the right lung; it is also common to see partial volume artifacts at the apex of the lungs. Since a radiologist would rule this out by viewing adjacent slices and recognizing the start of the diaphragm, a potential solution would be to include information from additional slices, as in 3D approaches. In the bottom row we see the successful segmentation of consolidation in the posterior basal segment of the right lower lung lobe. We would like to highlight the region of opacification that was predicted to be group 3 (purple) but labelled as group 4 (brown). In consultation with a second radiologist we have confirmed that the prediction can be seen as more correct than the label. The likely cause of this was previously discussed, where we show that more specific options are needed by the annotators: since the region of opacification was mainly group 4, it is likely that the annotator simply labelled the entire region as such. In order to compare the different deep learning models, quantitative results are presented in Table 3. The UNet proved to be the best segmentation model in terms of opacity IOU, while the other models performed similarly, yet slightly worse. No single model stands out as having the best mIOU over the three opacity groups, and all scores are relatively low. Given the relatively high opacity IOU, this could be attributed to significant confusion between opacity groups. Despite varied IOU scores, the models all predicted clinically relevant relative volumes of opacity, with the UNet and PSPNet over-predicting by 1.7% and 4.4% respectively, and the DeepLabv3Plus under-predicting by 3.3%.

Figure 5. Predictions from the UNet model on four slices from the test set. The first column shows the CT scan, the second column shows the ground truth annotation, and the last column shows the model predictions. (d) In this example we see that the model is more specific than our annotator, and was able to differentiate a region of crazy paving from a larger region of consolidation.

Despite the challenging task of segmenting pulmonary opacification on CT scans, these initial promising results already indicate that deep learning approaches can provide value in clinical situations. All models were able to achieve relative volume ratios within ±5% of the ground truth, enabling clinically relevant automatic estimation of the percent WAL. We show qualitatively that our model is able to determine the pattern of opacification, which could provide timely and valuable clinical information to healthcare centers, as these patterns are associated with the severity of the infection. We do not see the large differences between model architectures observed in [8], suggesting that the quantity and diversity of data are more important than the network architecture, as proposed in [11].
Success on the test set, which contains scans from regions that were held out of the training set, shows that the models are able to successfully transfer knowledge across domains. This is an important finding, as the sharing of pre-trained models may allow the community to circumvent the lack of publicly available data due to privacy restrictions. The best model achieves an opacity IOU of 0.758 on the test set, which we note is slightly higher than the human-level performance we found when comparing each radiologist to the average prediction in Section 2. Given the analysis of the inter-observer variability, we believe these models have approached the upper bound of accuracy for our labels, without including higher order statistics or better data. Thus, given a test set of similarly collected labels, we are not able to reliably evaluate quantitative performance using simple statistics beyond this point. Indeed, we see that comparisons of qualitatively ideal annotations from expert annotators yield a wide range of IOU scores. While it is noteworthy that the model's opacity IOU is higher than the values we see between radiologists and the average label, the human-level analysis comes from only one scan, so we do not believe it is possible to determine whether the model has surpassed human ability yet. The relatively poor quantitative performance in differentiating between patterns of opacification is likely due to a number of factors. First, in Section 2 we see large disagreements between experts on the specific pattern of opacification in a given region, as shown in Figure 4. While it is possible that annotator error is a factor, we believe that this could largely be rectified through an improved annotation procedure. Patterns can mix together like a mosaic within a single segmented region, and therefore should not be characterized by a single number. A more appropriate description would be given by a partial volume or probability; however, creating such labels is difficult with the software tools commonly used for annotation. Thus, currently, our labels do not always account for the variations in the type of pattern across regions, making the labels inherently noisy, which affects both training and evaluation of the model. In future work we plan to adjust the labelling procedure to provide annotators with more specific tools and instructions, such that we are able to obtain more accurate labels. Lastly, even with the groupings described in Table 1, there is low inter-class variability between the patterns of opacification, meaning that even given excellent annotations the segmentation task is likely challenging. This application demonstrates the need to further develop techniques that allow the use of noisy labels with a non-trivial noise model; a simple example is the use of covariance information, which is non-trivial to incorporate when using Stochastic Gradient Descent. In this paper we have described our collection and annotation of CT scans of patients with COVID-19 pneumonia for the segmentation of patterns of pulmonary opacification. We provide results using three popular 2D segmentation networks which show that we are able to accurately compute the relative volume of opacity, and estimate the composition of three clinically relevant pattern groups that have been linked to patient outcome.
We show that our models are able to adapt to new domains by withholding two regions from training, which provides valuable insight into the possibility of a community-driven approach that is not hindered by the lack of publicly available data or by restrictive privacy policies. We provide an analysis of the inter-observer variability present in this non-trivial task, and conclude that, due to the physical structure of opacification and the low inter-class variability, improved success in this task requires the adoption of soft labelling techniques and probabilistic models. To this end, we propose improved annotation procedures and noise modelling techniques that allow for future work using continuous probability distributions, as opposed to the discrete distributions more commonly used in computer vision.

References

[1] Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: a report of 1014 cases.
[2] Semantic object classes in video: A high-definition ground truth database. Pattern Recognition Letters.
[3] Quantification of tomographic patterns associated with COVID-19 from chest CT.
[4] Rethinking atrous convolution for semantic image segmentation.
[5] Residual attention U-Net for automated multi-class segmentation of COVID-19 chest CT images.
[6] Well-aerated lung on admitting chest CT to predict adverse outcome in COVID-19 pneumonia.
[7] The PASCAL Visual Object Classes (VOC) challenge. IJCV.
[8] Inf-Net: Automatic COVID-19 lung infection segmentation from CT images.
[9] Fleischner Society: Glossary of terms for thoracic imaging.
[10] Deep learning techniques for medical image segmentation: Achievements and challenges.
[11] Automatic lung segmentation in routine imaging is a data diversity problem, not a methodology problem.
[12] Deep learning with noisy labels: exploring techniques and remedies in medical image analysis.
[13] Lung nodule segmentation and recognition using SVM classifier and active contour modeling: A complete intelligent system.
[14] Adam: A method for stochastic optimization.
[15] Coronavirus disease (COVID-19): Spectrum of CT findings and temporal progression of the disease.
[16] WHO coronavirus disease dashboard.
[17] U-Net: Convolutional networks for biomedical image segmentation.
[18] U-Net: Convolutional networks for biomedical image segmentation.
[19] Unsupervised anomaly detection with generative adversarial networks to guide marker discovery.
[20] COVID-19 CT segmentation dataset.
[21] Inception-v4, Inception-ResNet and the impact of residual connections on learning.
[22] COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images.
[23] Central focused convolutional neural networks: Developing a data-driven model for lung nodule segmentation.
[24] Temporal changes of CT findings in 90 patients with COVID-19 pneumonia: A longitudinal study.
[25] Aggregated residual transformations for deep neural networks.
[26] Deep learning-based detection for COVID-19 from chest CT using weak label. medRxiv.
[27] An automatic COVID-19 CT segmentation based on U-Net with attention mechanism.
[28] Coronavirus disease 2019 (COVID-19): a perspective from China.

Acknowledgments

We thank Brian Lee and Duncan Ferguson for their support in collecting and creating the dataset, the Vancouver Coastal Health Research Institute, and the doctors, medical students, and healthcare centers who contributed to the creation of the dataset. For a comprehensive list please see our acknowledgments page. K.L. and E.H. are supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).