key: cord-0684112-52nw9gxq authors: Martin, Alistair; Nateqi, Jama; Gruarin, Stefanie; Munsch, Nicolas; Abdarahmane, Isselmou; Knapp, Bernhard title: An artificial intelligence-based first-line defence against COVID-19: digitally screening citizens for risks via a chatbot date: 2020-03-27 journal: bioRxiv DOI: 10.1101/2020.03.25.008805 sha: 71b51f6332abed381f138bd9d7269fbd4c394beb doc_id: 684112 cord_uid: 52nw9gxq To combat the pandemic of the coronavirus disease (COVID-19), numerous governments have established phone hotlines to prescreen potential cases. These hotlines have struggled with the volume of callers, leading to wait times of hours or, even, an inability to contact health authorities. Symptoma is a symptom-to-disease digital health assistant that can differentiate more than 20,000 diseases with an accuracy of more than 90%. We tested the accuracy of Symptoma to identify COVID-19 using a set of diverse clinical cases combined with case reports of COVID-19. We showed that Symptoma can accurately distinguish COVID-19 in 96.32% of clinical cases. When considering only COVID-19 symptoms and risk factors, Symptoma identified 100% of those infected when presented with only three signs. Lastly, we showed that Symptoma’s accuracy far exceeds that of simple “yes-no” questionnaires widely available online. In summary, Symptoma provides unparalleled accuracy in systematically identifying cases of COVID-19 while also considering over 20,000 other diseases. Furthermore, Symptoma allows free text input, furthered with disease-specific follow up questions, in 36 languages. Combined, these results and accessibility give Symptoma the potential to be a key tool in the global fight against COVID-19. The Symptoma predictor is freely available online at https://www.symptoma.com. Currently, the world is facing an unprecedented health crisis caused by the novel coronavirus disease . To track and hopefully curb this pandemic, large-scale laboratory testing for COVID-19 is becoming widespread. However, capacities are far from being able to test whole populations. Therefore many countries have established phone hotlines to pre-screen persons who are unsure about their COVID-19 infection status. Only after talking to an operator and being identified as a potential case, often due to being symptomatic, will laboratory testing occur. However, these hotlines are severely overrun worldwide, leading to hour-long waiting periods and, even, disconnected lines. This ultimately leads to many COVID-19 cases going undiagnosed. One solution to the overwhelming number of calls inundating hotlines is to pre-screen them using computer-based approaches. These methods can be grouped into two categories. Firstly, a large number of simple yes/no online questionnaires are available. These questionnaires lead straight to the point but are limited in their informative value. They do not provide a deep understanding of a patient's health situation, they do not allow for the consideration of additional symptoms, they do not allow the generation of additional data for analysis, and, critically, they are often language-and/or country-specific. The alternative to these questionnaires is general-purpose symptom checkers which allow a user to list or select their symptoms before being provided with potential diagnoses. Several have already been developed over recent years (benchmarked in Nateqi et al.) 1 . However, most of these symptom checkers are highly restricted in the number of diseases taken into account as building up the underlying databases is fundamentally cost-intensive and slow. Furthermore, language ambiguities are hard to overcome. This inevitably leads to small disease databases where users can only choose from a limited list of pre-defined symptoms, lowering their viability when a new disease emerges. Recently, we showed that Symptoma, a symptom-to-disease digital health assistant, significantly outperforms other symptom checkers in general diagnosis 1 . This was also confirmed by an independent study 2 . In the following, we present the accuracy of Symptoma with regards to systematically identifying cases of COVID-19 while also considering over 20,000 other diseases. Symptoma classifies nearly all 30 COVID-19 case descriptions correctly as COVID-19 cases (96.6% sensitivity), failing only when presented with a case containing no defining symptoms of COVID-19 (Fever, Fatigue, Dizziness, Constipation, Rhonchi, Tachypnea, and Bilateral pneumonia). Achieving 100% sensitivity is however easy e.g. by constructing a test that simply classifies every case as COVID-19. To address this issue we also tested how well Symptoma performs on cases of non-COVID-19 patients. For this purpose, we use 1,112 British Medical Journal (BMJ) cases that stretch over 84 fields of medicine (see Methods). Of these 1,112 cases, only 41 are classified as potential COVID-19 cases by Symptoma, with only seven of these ranking COVID-19 higher than the correct diagnoses. These seven cases relate to diseases that present similarly to COVID-19, however, have far lower incidence rates and, therefore, are deemed less likely, e.g. Severe Acute Respiratory Syndrome (SARS-CoV) or the Avian influenza A (H5N1) virus infection (bird flu). The results are summarized in Table 1 . Identifying patients presenting with COVID-19 both quickly and efficiently is of utmost importance to digital diagnoses. However, achieving both speed and accuracy simultaneously is remarkably difficult. Short, and therefore quick, questionnaires will typically have low specificity, while conversely, long questionnaires lack efficiency and speed, often containing many questions not pertinent to any given patient. Symptoma's free text search allows quick, efficient, and complex queries of symptom's without constraint to a predefined list of symptoms. Only reported COVID-19 symptoms are considered. Points are jittered vertically for clarity only. To highlight this with regards to COVID-19, we show in Figure 1 , the search rank of queries containing various numbers of symptoms known to be present in those infected with COVID-19 (see Methods). Key symptoms, such as suffering a fever or living in an area with a high incidence of COVID-19, leads to COVID-19 suggested within the top 30 search results immediately. This threshold is passed by 75%, 98.5%, and 100% of one, two, and three symptom queries respectively. At three symptoms, 99.1% of the possible combinations are returned within the top 10 results, and with four symptoms, all queries return COVID-19 within the top 10. These results highlight the speed with which a correct diagnosis can be observed, even when minimal symptoms are entered into the query. Below we compare Symptoma to simple COVID-19 symptom checkers. These applications aim to determine, using a limited set of symptoms as input, the likelihood that a user is suffering from COVID-19 in comparison to influenza, common cold and/or hay fever. Based on literature-derived symptom frequencies for each disease (see Table S1 ), we have implemented four different methods: vector calculus-based distance in space between case presentation and symptom frequency (SF-DIST), distance normalised by the standard deviation (SF-SD), distance based on principal component analysis (SF-PCA) and the cosine similarity (SF-COS). These methods are described in detail within the Methods. To evaluate the performance of these approaches in comparison to Symptoma, we classified the combined COVID-19 and BMJ cases of Table 1 that have at least one COVID-19 symptom (n=394) with all four simple approaches. A case is classified as COVID-19 positive if the probability of COVID-19 is at least 5% higher than the probability for influenza, common cold or hay fever. As Symptoma weights COVID-19 against more than 20,000 diseases we use the definition as stated within the Methods. The results, summarised in Figure 2 , show that Symptoma performs considerably better than the simplistic approaches, achieving an accuracy of 91.1%, 19% higher than any other method (SF-COS achieved an accuracy of 72.1%). Similarly, a margin of 0.34 sensitivity and 0.27 specificity was found between Symptoma and the other methods. The former, the sensitivity, is especially notable, as the other approaches consider only COVID-19, common cold, flu, and hay fever as a potential diagnosis. Above, we compared the symptom-to-disease search engine Symptoma to alternative methods on both COVID-19 and BMJderived decoy cases. We showed that Symptoma is superior to these alternative approaches in multiple aspects. First, Symptoma can correctly identify cases of COVID-19, not only against a few diseases but against more than 20,000 other potential causes. This far exceeds the 6,000 differential diagnoses provided by the second-largest symptom checker "Isabel Healthcare" 3 . Secondly, we showed that Symptoma needs only a few symptoms to highlight cases of COVID-19. Crucially, these can be inputted as free text that is semantically understood. For example, if one enters "Tiramisu", "Food Poisoning" is returned as one of the top results. In contrast, other COVID-19 questionnaires and symptom checkers require that patients only select from a predefined list of symptoms or are constrained to fixed questions. Furthermore, Symptoma is localised into 36 languages, allowing a standardised approach on disease predictions globally. Lastly, we found that Symptoma performs better than simple 3/5 questionnaire-based approached in identifying COVID-19 cases, even though these alternative approaches only considered three alternative diagnoses (COVID-19, influenza, common cold, and hay fever). Moreover, as Symptoma is constantly mining the newest literature, it keeps up-to-date with the latest knowledge and alters its predictions accordingly, e.g. the recent reports on anosmia in COVID-19 patients 4, 5 . Similarly, by tracking cases of Covid-19 entered into Symptoma, due to the free-text function, new symptoms related to Covid-19 may be highlighted. On these grounds, we believe that Symptoma is a highly valuable tool in the global COVID-19 crisis. To show the performance of Symptoma for COVID-19 we analysed a total of 1142 medical test cases. The different sets and sources of these cases are described below. A total of 1112 cases were sourced from the British Medical Journal (BMJ) and transcribed by a medical clinician into sets of symptoms, both negative and positive, alongside other risk factors, the patient's age and sex when available 6, 7 . The cases cover a diverse range of causes, including patients suffering rib fractures, rabies, or metastatic cancer. The number of symptoms and keywords per case ranges from one to 33 (median eight) including complex terms such as "right true vocal cord is immobile". A set of 30 case reports were derived from the current literature (e.g., [8] [9] [10] [11] ). For each case, a list of symptoms and risk factors the patient presented with is given, alongside their age and sex where available. We make use of the World Health Organisation (WHO) COVID-19 symptom list to construct example queries from those infected with COVID-19 12 . The ten most frequent symptoms are combined with both "contact with someone infected with COVID-19" and "visiting/living in a COVID-19 risk area", to give 12 possible symptoms and/or risk factors. All possible combinations of these are then taken as potential COVID-19 cases yielding a total number of 4096 artificial cases. For any given set of symptoms, many possible causes could give rise to that specific presentation. We count a prediction as a true positive if the true cause is listed within the top 30 results returned by Symptoma. Given the possible 20,000 causes contained within Symptoma, this is the top 0.15%. Focussing on COVID-19, we can generate the following classification: We developed four alternative methods to give the likelihood that a given patient has either COVID-19, influenza, common cold or hay fever. Underlying each were the frequencies with which various symptoms presented for these diseases (Table S1 ). To determine the probability of each of the four diseases, we represented each patient case in a 10-dimensional space, where each axis represents a different symptom. A value of one corresponds to exhibiting the symptom, zero means the case does not have the symptom, and 0.5 means that they are unsure. This is fundamentally equivalent to many of the simple questionnaire-based approached to COVID-19 identification. In the most simplistic approach (SF-DIST), we calculated the distance in space between the patient and each of the four diseases, each of which can also be seen as a point in the 10-dimensional symptom space. Normalisation yields the respective probabilities. In the second approach, the same procedure is used, but the distance in space is normalised by the respective standard deviation of each symptom across all diseases (SF-SD). In the third approach, each symptom frequency is weighted by the first principal component of a matrix consisting of all symptoms across all diseases (SF-PCA). Lastly, we interpreted the points as vectors and calculated the cosine similarity between the cases and diseases (SF-COS). From symptom to diagnosis-symptom checkers re-evaluated : Are symptom checkers finally sufficient and accurate to use? an update from the ent perspective Comparison of symptom checkers Anosmia, hyposmia, and dysgeusia symptoms of coronavirus disease Sixty seconds on Clinical characteristics and therapeutic procedure for four cases with 2019 novel coronavirus pneumonia receiving combined chinese and western medicine treatment First cases of coronavirus disease 2019 (covid-19) in france: surveillance, investigations and control measures Pathological findings of covid-19 associated with acute respiratory distress syndrome. The Lancet respiratory medicine Viral load kinetics of SARS-CoV-2 infection in first two patients in korea Report of the WHO-china joint mission on coronavirus disease 2019 Study design: BK, AM, JN. Data compilation: SG, NM, IA. Data analysis: AM, BK, NM, IA. Writing the manuscript: BK, AM. Revising the manuscript critically: AM, BK, JN, SG. All authors are employees of Symptoma GmbH. JN holds shares of Symptoma. Results can be freely reproduced using the web interface https://www.symptoma.com/ This study has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 830017.