key: cord-0335706-0mrsa3m6
authors: Boussel, Loic; Bartoli, Jean-Michel; Adnane, Samy; Meder, Jean-François; Malléa, Patrick; Clech, Jeremy; Zins, Marc; Bérégi, Jean-Paul
title: French Imaging Database Against Coronavirus (FIDAC): A large COVID-19 multi-center chest CT database
date: 2022-05-26
journal: nan
DOI: 10.1016/j.diii.2022.05.006
sha: 0044d0fea160ce66e9053c700088b119bce59eb8
doc_id: 335706
cord_uid: 0mrsa3m6

Purpose During the first wave of the COVID-19 pandemic, the French Society of Radiology and the French College of Radiology, in partnership with NEHS Digital, set up a system to collect chest computed tomography (CT) examinations with clinical, virological and radiological metadata from patients clinically suspected of COVID-19 pneumonia. This allowed the constitution of an anonymized multicenter database, named FIDAC (French Imaging Database Against Coronavirus). The aim of this report was to describe the content of this public database.

Materials and methods Twenty-two French radiology centers participated in the data collection. The data collected were chest CT examinations in DICOM format associated with the following metadata: patient age and sex, originating facility identifier, originating facility region, time from symptom onset to CT examination, indication for CT examination, reverse transcription-polymerase chain reaction (RT-PCR) results and normalized CT report performed by a senior radiologist. All the data were anonymized and sent through a NEHS Digital system to a centralized data center.

Results A total of 5944 patients were included from the 22 centers, aggregated into 8 regions, with a mean number of patients of 743 ± 603.3 [SD] per region (range: 102-1577 patients). The reason for the CT examination and the normalized CT report were provided for all patients. RT-PCR results were provided in 5574 patients (93.77%) with a positive test in 44.6% of patients.
Conclusion The FIDAC project allowed the creation of a large database of chest CT images and metadata that is available, under conditions, in open access through the CERF-SFR website.

Since the onset of the COVID-19 epidemic, chest CT has proven to be an effective tool for disease diagnosis [1, 2, 3, 4, 5, 6], patient referral [7, 8], and disease prognosis [9, 10]. Multiple solutions based on artificial intelligence have been developed to establish the diagnosis and to quantify lung involvement in order to estimate the patient's risk of developing a severe form of the disease or of dying [5, 11, 12]. However, not all of these solutions have been tested on independent datasets, which hinders the interpretation of their results. Moreover, most of these solutions are not integrated into the clinical radiology workflow, either because they are published by companies that are not yet well established in the radiology landscape, or because they are open source solutions whose code is difficult and costly to deploy in current practice. Validation databases collected from a large panel of imaging centers are therefore needed to compare the solutions objectively and to guide users in choosing among them.

In this context, the French Society of Radiology (SFR) and the French College of Radiology (CERF), in partnership with NEHS (Nouvelle Entreprise Humaine en Santé) Digital, set up a system to collect chest CT scans, with clinical, virological and radiological metadata, from patients clinically suspected of presenting with COVID-19 pneumonia during the first wave of the epidemic. This allowed the constitution of an anonymized multicenter database, named FIDAC (French Imaging Database Against Coronavirus), available to the scientific community. The purpose of this report was to describe the implementation and the content of this public database.
The announcement of the start of the data collection was made by the CERF and the SFR to all French radiology structures on April 15, 2020. The inclusion criteria were a chest CT scan performed in a patient with suspected COVID-19 lung involvement at initial admission or during follow-up and/or a reverse transcription polymerase chain reaction (RT-PCR) proven COVID-19 infection and/or radiological signs of COVID-19 on the chest CT. The clinical suspicion was based on the presence of one or a combination of the following signs: fever or chills, cough, shortness of breath or difficulty breathing, fatigue, muscle or body aches, headache, anosmia, or ear, nose and throat symptoms. Exclusion criteria were age under 18 years or a chest CT ordered outside the context of COVID-19. Data collection was performed from April 1 to August 5, 2020.

The data collected were chest CT images in DICOM format associated with the following metadata: patient age and sex, originating facility identifier, originating facility region (if more than 100 patients, otherwise "other"), time from symptom onset to CT examination (<3 days, 3-7 days, 8-21 days, and >21 days), indication for CT examination, final RT-PCR test result (positive, negative or not performed) and the normalized CT report. Indication for CT examination was labelled as "low symptoms COVID-19" when the patient presented with one or a combination of the clinical signs described above without the need for oxygen therapy, "suspicion of COVID-19 in a oxygen dependent patient" when the patient presented with one or a combination of the clinical signs described above with the need for oxygen therapy, "follow-up" and "other".

The normalized CT report was termed "classic/probable COVID-19" when presenting with peripheral, bilateral ground-glass opacities (GGO) or multifocal GGO of rounded morphology ± consolidation or crazy paving, reversed halo sign or sub-pleural bands of consolidation.
It was classified as "intermediate for COVID-19" when presenting with multifocal, diffuse, peripheral, or unilateral GGO ± consolidation lacking a specific distribution and non-rounded or non-peripheral, with only a few very small GGO with a non-rounded and non-peripheral distribution, or with atypical findings (large pleural effusion, marked lymph node enlargement or bronchiolitis pattern). It was classified as "non-COVID" when demonstrating another pathology and "normal" when no pulmonary disease was detected by the radiologist. This classification was performed by a senior radiologist from the local center.

The transfer of CT images and metadata was performed either using the Nexus platform (NEHS Digital), with a dedicated workflow adapted to the study, for the centers owning a Nexus system, or through a dedicated web portal developed by NEHS Digital for the other centers.

This study complies with the General Data Protection Regulation (GDPR) and the requirements of the French regulatory authority (Commission Nationale de l'Informatique et des Libertés, CNIL): anonymization was applied before the data were sent out of the radiology structures. Direct identifiers (e.g., name and birthdate) were removed, unique identifiers (e.g., Patient Id or Image Id) were randomly regenerated, and some metadata were generalized (e.g., radiology structures were aggregated into regions). During the data collection, several privacy impact assessments were performed to evaluate the risk of re-identification by individualization, by correlation or by inference. Those evaluations led to the removal of some additional metadata such as the manufacturer and the CT model name. Therefore, all the images and the metadata were anonymized in compliance with the CNIL (MR004). All patients were informed about the study. To ease the use of the database, the metadata were included in the DICOM header of the CT images as private tags (Table 1).
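The anonymization and generalization steps described above can be illustrated with a minimal Python sketch. It is not the FIDAC pipeline, which this report does not publish: the record layout, the field names (`patient_id`, `patient_name`, `birth_date`, `manufacturer`, `ct_model`, `onset_delay_days`) and the use of random UUIDs are all assumptions made for illustration. Only the rules themselves (removal of direct identifiers, random regeneration of identifiers, regions kept only above 100 patients, four onset-delay categories) come from the text.

```python
import uuid
from collections import defaultdict

# Hypothetical field names; the report does not describe the actual record layout.
DIRECT_IDENTIFIERS = {"patient_name", "birth_date"}
REMOVED_METADATA = {"manufacturer", "ct_model"}
MIN_REGION_SIZE = 100  # a region keeps its name only with more than 100 patients


def onset_delay_category(days):
    """Map a delay in whole days to the four categories listed in the text."""
    if days is None:
        return None  # assumption: a missing delay stays missing
    if days < 3:
        return "<3 days"
    if days <= 7:
        return "3-7 days"
    if days <= 21:
        return "8-21 days"
    return ">21 days"


def anonymize(records):
    """Return anonymized copies of a list of record dictionaries."""
    # Count distinct patients per region to decide which regions are kept.
    patients_per_region = defaultdict(set)
    for rec in records:
        patients_per_region[rec["region"]].add(rec["patient_id"])

    new_ids = {}
    anonymized = []
    for rec in records:
        # Drop direct identifiers and the metadata removed after risk assessment.
        clean = {k: v for k, v in rec.items()
                 if k not in DIRECT_IDENTIFIERS | REMOVED_METADATA}
        # Regenerate the identifier randomly, consistently for a given patient.
        if rec["patient_id"] not in new_ids:
            new_ids[rec["patient_id"]] = uuid.uuid4().hex
        clean["patient_id"] = new_ids[rec["patient_id"]]
        # Generalize small regions to "other".
        if len(patients_per_region[rec["region"]]) <= MIN_REGION_SIZE:
            clean["region"] = "other"
        # Replace the exact delay with its category.
        clean["onset_delay"] = onset_delay_category(
            clean.pop("onset_delay_days", None))
        anonymized.append(clean)
    return anonymized
```

Calling `anonymize(records)` on a list of dictionaries returns new dictionaries and leaves the input untouched. In this sketch the mapping from original to regenerated identifiers lives only inside the function call, so it is discarded once the function returns.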
Furthermore, the patient's age tag (0x10,0x1010) was replaced by the patient's age range.

The FIDAC project allowed the creation of a French multicenter anonymized database of chest CT scans performed in patients with suspected or confirmed COVID-19. Access to this public database can be requested from the SFR on the CERF-SFR website [13, 14, 15, 16, 17]. Furthermore, the rate of completion of the metadata of major interest, including RT-PCR results, patients' age categories and normalized CT report, is high, so these labels can be applied to a large proportion of the CT database.

Another important feature of the FIDAC database is the use of the 16-bit DICOM format for the CT images. This makes the review of the images easier for radiologists, as DICOM is a standard for the storage of medical images and many viewers are available to display these images [18]. Furthermore, this 16-bit format preserves the dynamic range of the images, as pixel values range from −1000 to several thousand Hounsfield units. Finally, the metadata are included in the DICOM header in order to make the database more compact and to avoid consistency errors.

This large collection of data was made possible, in a very short time, by the pre-existing interconnection network of radiology departments based on the Nexus solution (NEHS Digital). For centers that did not have Nexus at the time of the study, NEHS Digital set up a web interface to upload files. Although this latter system was efficient, the transfer was greatly simplified by the use of Nexus, which underlines the value of such interconnection systems for multicenter imaging studies on a national scale.

The FIDAC database has several limitations. First, some of the metadata are unbalanced. Indeed, there is an overrepresentation of the northern and eastern regions of France, which faced a larger number of COVID-19 patients during the first month of the pandemic [19].
Furthermore, older people are overrepresented in the dataset. Indeed, younger people were less likely to develop a severe form of COVID-19 and thus to require a CT examination [20, 21, 22]. Another limitation is the relatively low level of completion of some of the metadata, including the delay between the onset of symptoms and the CT examination. This is related to the lack of information when the CT was performed in elderly patients in an emergency setting. Finally, data collection was not carried out consecutively in each center and no biological data were recorded. This limits the ability to perform advanced statistics on the population included and to draw conclusions on the effectiveness of CT for the diagnosis of COVID-19.

In conclusion, the FIDAC project, developed by the French Society of Radiology and the French College of Radiology, allowed the creation of a large database of chest CT images and metadata throughout France during the first wave of the COVID-19 pandemic. This database is available, under conditions, in open access through the CERF-SFR website.

Radiological Society of North America Expert Consensus Document on reporting chest CT findings related to COVID-19: endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA
COVID-19 pneumonia: a review of typical CT findings and differential diagnosis
Radiology indispensable for tracking COVID-19
Radiology, COVID-19, and the next pandemic
Imaging of COVID-19: an update of current evidences
COVID-19 after 18 months: where do we stand?
Efficacy of chest CT for COVID-19 pneumonia diagnosis in France
Chest CT for rapid triage of patients in multiple emergency departments during COVID-19 epidemic: experience report from a large French university hospital
Early prediction of disease progression in COVID-19 pneumonia patients with chest CT and clinical characteristics
COVID-19: a qualitative chest CT model to identify severe form of the disease
Integrating deep learning CT-scan model, biological and clinical variables to predict severity of COVID-19 patients
Chest CT in COVID-19 pneumonia: a review of current knowledge
Artificial intelligence for COVID-19: rapid review
Modified SEIR and AI prediction of the epidemics trend of COVID-19 in China under public health interventions
A classifier prediction model to predict the status of Coronavirus COVID-19 patients in South Korea
The RSNA International COVID-19 Open Radiology Database (RICORD)
COVID-19 pneumonia: the fight must go on
A survey of DICOM viewer software to integrate clinical research and medical imaging
Clinical and virological data of the first cases of COVID-19 in Europe: a case series
Clinical characteristics of patients with severe pneumonia caused by the SARS-CoV-2 in Wuhan, China
Clinical characteristics of 140 patients infected with SARS-CoV-2 in Wuhan

Table 1 (private DICOM tags; only fragments are recoverable)
... COVID 19) Region's name
(0C19.1010:NEHS-COVID 19) Indication for CT examination
(0C19.1011:NEHS-COVID 19) Time between onset of symptoms and CT examination

The authors declare that the work described has been carried out in accordance with the Declaration of Helsinki of the World Medical Association revised in 2013 for experiments involving humans. The authors declare that this report does not contain any personal information that could lead to the identification.
We acknowledge Alexandre Fernandez and Brieuc Tredan from NEHS Digital for their help in data collection. We acknowledge, for their participation in the creation of the database, the radiologists from the following centers: CHU de Strasbourg, CHU de Nîmes, CHU de Clermont-Ferrand,