key: cord-1009717-ye6k6cf8 authors: Coma, Ermengol; Mora, Nuria; Prats-Uribe, Albert; Fina, Francesc; Prieto-Alhambra, Daniel; Medina-Peralta, Manuel title: Excess cases of influenza suggest an earlier start to the coronavirus epidemic in Spain than official figures tell us: an analysis of primary care electronic medical records from over 6 million people from Catalonia date: 2020-04-14 journal: nan DOI: 10.1101/2020.04.09.20056259 sha: a2d1730bcfc816f1a5f524e7974cbaaa83e58d0b doc_id: 1009717 cord_uid: ye6k6cf8
Objectives: There is uncertainty about when the first cases of COVID-19 appeared in Spain, as asymptomatic patients can transmit the virus. We aimed to determine whether influenza diagnoses masked early COVID-19 cases and, if so, estimate numbers of undetected COVID-19 cases in a large database of primary-care records covering >6 million people in Catalonia.
Design: Time-series study of influenza and COVID-19 cases, using all influenza seasons from autumn-winter 2010-2011 to autumn-winter 2019-2020.
Setting: Primary care, Catalonia, Spain.
Participants: People registered in one of the contributing primary-care practices, covering >6 million people and >85% of the population.
Main outcome measures: Weekly new cases of influenza and COVID-19 diagnosed in primary care.
Analyses: Daily counts of both cases were computed using the total cases recorded over the previous 7 days to avoid weekly effects on recording practice. Epidemic curves were characterised for the 2010-2011 to 2019-2020 influenza seasons. Influenza seasons with a similar epidemic curve and peak case number as the 2019-2020 season were used to model predictions for 2019-2020. ARIMA models were fitted to the included influenza seasons, overall and stratified by age, to estimate expected case numbers. Daily excess influenza cases were defined as the number of observed minus expected cases.
Results: Four influenza season curves (2011-2012, 2012-2013, 2013-2014, and 2016-2017) were used to estimate the number of expected cases of influenza in 2019-2020. Between 4 February 2020 and 20 March 2020, 8,017 (95% CI: 1,841 to 14,718) excess influenza cases were identified. This excess was highest in the 15-64 age group.
Conclusions: COVID-19 cases may have been present in the Catalan population when the first imported case was reported on 25 February 2020. COVID-19 carriers may have been misclassified as influenza diagnoses in primary care, boosting community transmission before public health measures were taken. In future, the surveillance of excess influenza cases using widely available primary-care electronic medical records could help detect new outbreaks of COVID-19 or other influenza-like illness-causing pathogens. Earlier detection would allow public health responses to be initiated earlier than during the current crisis.

A new infectious disease, now named COVID-19, was identified by Chinese authorities on 7 January 2020 as the cause of an outbreak of pneumonia in Wuhan. [1] Caused by SARS-CoV-2, COVID-19 is asymptomatic in around 1-3% of patients, according to the WHO mission report. [2] Most patients present mild influenza-like symptoms, including fever, dry cough, fatigue, sore throat, dyspnoea, headache, and myalgia. [2, 3] Around 20% of symptomatic cases present severe forms of disease that require hospital admission. [1] Older people, men, and those with multiple comorbidities appear more likely to suffer more serious types of COVID-19. [2] [3] [4] [5] [6] Conversely, children seem to have a similar probability of infection, but milder and often asymptomatic forms of the disease. [7]
Cases of COVID-19 have grown exponentially and have been reported all over the world. The first three cases in Europe were reported in France on 24 January 2020.
[8] The first imported COVID-19 case in Spain was dated 31 January 2020 in La Gomera, and the first in Catalonia was reported about a month later, on 25 February 2020. The total number of confirmed cases in Catalonia then increased exponentially, with 715 cumulative cases reported by 14 March 2020 and a striking 4,203 by 20 March 2020. Despite these official figures, it is uncertain whether SARS-CoV-2 was circulating in the community before the first official cases. It is difficult to believe, for example, that this airborne infection did not cross the uncontrolled borders between Catalonia and the neighbouring country to its north for a whole month. Some have thus speculated that undetected COVID-19 cases may have been categorised as influenza before the first official case was reported in Spain. [9]
Catalonia is fortunate to have a reliable system for influenza surveillance in place. A network of 60 sentinel general practitioners covering 1% of the total population reports daily cases of influenza-like illness (ILI) and takes samples for differential diagnosis and confirmation of influenza infections in the region. [10] A specialised hospital-based system takes samples from severe hospitalised flu cases. [10] A community-based surveillance system called Diagnosticat also extracts counts of ILI diagnoses from a network of GP health records in real time, covering 85% of the population. [11] This last approach allows us to examine trends with granularity and to stratify analyses by age and other factors.
As the first cases of SARS-CoV-2 appeared in Catalonia during the influenza epidemic season and the disease shares some symptomatology with influenza, we hypothesised that SARS-CoV-2 could have been circulating in the community before the first confirmed case, resulting in an excess of influenza diagnoses.
We aimed to estimate the number of excess influenza cases in Catalonia, globally and by age, and to examine its relationship with the number of clinically diagnosed COVID-19 cases.

The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license. https://doi.org/10.1101/2020.04.09.20056259 doi: medRxiv preprint

Methods
We conducted a time-series study of influenza and COVID-19 cases. We extracted data from primary-care electronic medical records covering about 85% of the population of Catalonia, around 6 million people. The study period included all influenza seasons from autumn-winter 2010-2011 to autumn-winter 2019-2020.
The key study outcomes were diagnoses of influenza and COVID-19. Daily frequencies of influenza cases recorded in primary care were obtained from electronic medical records, as is routinely done for the Diagnosticat database. [12] Diagnosticat is a website that reports in real time all influenza diagnoses recorded by all general practitioners working at any of the primary-care centres run by the Institut Català de la Salut (ICS). ICS is the main primary-care health service provider in Catalonia and covers about 85% of practices in the region, which all use the same electronic medical record software, ECAP. [13] Diagnosticat includes all clinical influenza diagnosis codes (ICD-10 codes in Supplementary Table 1) and has been updated from ECAP daily since 2010. It presents the frequency of daily influenza cases and the weekly incidence rates per $10^5$ population, a unit that allows diagnoses to be compared between territories independently of the number of inhabitants. Influenza data on Diagnosticat have been shown to accurately represent those in a gold-standard source, the dataset of influenza infection reports from the sentinel network.
[11] Numbers of COVID-19 clinical diagnoses were extracted and aggregated using the same data source and methods as for influenza diagnoses. Clinical diagnoses of COVID-19 have been recorded in ECAP since 27 February 2020, when bespoke codes were introduced (Supplementary Table 1). Since 15 March 2020, Catalan policies have advocated for cases to be defined based on symptoms alone, with serological or PCR confirmation required only when patients are admitted to hospital or are healthcare staff. [14]

Statistical analysis
Daily counts of influenza and COVID-19 cases were computed based on the frequency of cases recorded in the previous 7-day period to avoid weekly effects on recording practice. All influenza seasons in the study period (2010-2011 to 2019-2020) were analysed separately to characterise annual epidemic curves for seasonal influenza. Influenza seasons with a visually similar epidemic curve and similar peak case number to that of the 2019-2020 season were selected to model predictions for 2019-2020. Autoregressive integrated moving average (ARIMA) models [15] were fitted to the seasons included in the analysis for the whole population and for three age groups: paediatric patients (under 15), adults (15-64 years), and elderly (over 64 years old).
From the fitted time series, the expected speed of decrease in the number of weekly influenza cases for the 2019-2020 influenza season was calculated for each day after the peak. The expected speed of decrease was defined as the difference between the number of influenza diagnoses predicted for the current day $t$ and for the previous day $t-1$, divided by the number of diagnoses predicted for the previous day $t-1$: $V_t = \mathrm{cases}_t / \mathrm{cases}_{t-1} - 1$.
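The two preprocessing steps described above, trailing 7-day counts and the day-on-day speed of decrease, can be illustrated with a minimal Python sketch. It is not the study's code (the analyses were run in R): the function names and input numbers are hypothetical, and the window is assumed to include the current day plus the six preceding days.

```python
from typing import List


def rolling_7day_counts(daily_new_cases: List[int], window: int = 7) -> List[int]:
    """Total cases over a trailing window (assumed: current day + 6 previous days).

    Days without a full window of history are skipped, so the result has
    len(daily_new_cases) - window + 1 entries.
    """
    return [
        sum(daily_new_cases[i - window + 1 : i + 1])
        for i in range(window - 1, len(daily_new_cases))
    ]


def speed_of_decrease(predicted: List[float]) -> List[float]:
    """Relative day-on-day change, cases_t / cases_{t-1} - 1, for t >= 1."""
    return [predicted[t] / predicted[t - 1] - 1 for t in range(1, len(predicted))]


# Hypothetical inputs, for illustration only (not study data).
daily = [10, 12, 9, 11, 13, 8, 7, 6, 5, 4]
counts = rolling_7day_counts(daily)               # [70, 66, 59, 54]
speeds = speed_of_decrease([100.0, 90.0, 81.0])   # each value is about -0.1
```

A negative speed means the predicted count is falling; a curve that loses 10% per day, as in the toy input, gives a constant speed of about -0.1.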
Expected influenza cases were calculated using the sequence $G_t = G_0 \prod_{k=1}^{t} (1 + V_k)$, where $G_t$ was the expected number of influenza cases in period $t$, $G_0$ was the number of cases at the peak, and $V_k$ was the speed of decrease at day $k$. The expected influenza cases for each day of the 2019-2020 season were calculated from the day of the season peak to 20 March 2020, the day of the data extraction. Excess influenza cases were defined as the number of observed minus expected cases, estimated daily as above. All analyses were performed in R, version 3.5.1. [16]

Results
Previous influenza epidemic curves
ARIMA models were fitted using the included seasons. Supplementary Table 2 shows the full modelling process and the fitted parameters.
In Catalonia, the 2019-2020 influenza epidemic reached its peak on 4 February 2020, with 12,066 cases in the previous 7 days. Figure 2 shows the evolution of the season compared with past seasons, centred on the day of the peak. By eye, the downward trend after the peak initially looks very similar to that of the previous seasons. However, 20 days after the peak, the curve starts to flatten and the decline slows. This abnormal pattern in the descending part of the curve differs from the pattern in the previous seasons.
Figure 3 shows the observed and estimated numbers of weekly new influenza cases (with 95% CI) after the peak of the 2019-2020 influenza season.
Expected case numbers were predicted using the selected previous influenza seasons. In the whole population, observed cases were always somewhat greater than expected after the seasonal influenza peak. The difference was statistically significant for 23 days between 4 February 2020 and 20 March 2020. Most of these days fell after 8 March 2020, when the difference between observed and expected cases increased significantly and observed cases remained above the 95% CI band for expected cases for 2 weeks.
There was a greater difference between observed and expected cases among people aged 15-64 years than in both the total population and other age groups, with 25 total days of significant difference. The observed and expected cases diverged earlier than for the total population, separating around 26 February 2020 and remaining significantly different for the rest of the study period.
Observed and expected cases were generally similar in those older than 64 years until 6 March 2020. Observed cases then quickly rose above expected cases, with the difference becoming significant on 11 March 2020 and remaining so for 9 days, until 19 March 2020.
The shape of the observed cases curve for people younger than 15 years was similar to that for people aged 15-64 years. However, the difference between observed and expected cases was significant for only 11 days, between 6 March 2020 and 16 March 2020.
We estimated 8,017 excess influenza cases (95% CI 1,841 to 14,718) between 4 February 2020 and 20 March 2020. This excess is presented stratified by age in Table 2.
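The quantities reported above (expected cases, daily excess, and days of significant difference) can be sketched in Python as follows. This is only an illustration under stated assumptions: the expected curve is assumed to be built by compounding the daily speed of decrease from the peak value, a day is assumed to count as significant when the observed count lies above the upper 95% bound for expected cases, and every name and number in the block is hypothetical rather than taken from the study.

```python
from typing import List, Sequence


def expected_cases(peak_cases: float, speeds: Sequence[float]) -> List[float]:
    """Compound daily speeds of decrease from the peak (assumed form):
    G_t = G_{t-1} * (1 + V_t), starting from G_0 = peak_cases."""
    expected = []
    current = peak_cases
    for v in speeds:
        current *= 1 + v
        expected.append(current)
    return expected


def excess_cases(observed: Sequence[float], expected: Sequence[float]) -> List[float]:
    """Daily excess, defined as observed minus expected cases."""
    return [o - e for o, e in zip(observed, expected)]


def significant_days(observed: Sequence[float], upper_bound: Sequence[float]) -> int:
    """Count days on which observed cases exceed the upper 95% bound (assumed rule)."""
    return sum(o > u for o, u in zip(observed, upper_bound))


# Hypothetical values, for illustration only (not study data).
expected = expected_cases(1000.0, [-0.5, -0.5, -0.5])        # [500.0, 250.0, 125.0]
observed = [520.0, 300.0, 200.0]
excess = excess_cases(observed, expected)                     # [20.0, 50.0, 75.0]
n_sig = significant_days(observed, [550.0, 280.0, 150.0])     # 2
```

Because each daily value in the study is itself a trailing 7-day total, the sketch reports the excess day by day and does not sum it across days.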
[...] March 2020, after the peak of the seasonal influenza epidemic, and the percentage of all influenza cases in that period that they make up, overall and by age group.
[...] Catalonia, Spain, as number of cases in the previous 7-day period, from the peak of the 2019-2020 seasonal influenza epidemic (4 February 2020).

In mid-February 2020, we observed an unusually high number of influenza cases, larger than expected, in the daily published data. In Catalonia, the 2019-2020 seasonal influenza epidemic reached its peak on 4 February 2020. Based on previous years' data, influenza diagnoses were expected to decrease rapidly over the following weeks. However, the number of influenza diagnoses instead remained stable, which was counterintuitive and inconsistent with data from past influenza seasons. This increase in observed influenza diagnoses over those expected, here named "excess influenza," correlates over time with the observed number of COVID-19 cases. Excess influenza cases could be used in future for the early detection of competing outbreaks.
Using four of the previous nine influenza seasons as a benchmark, we detected 8,017 excess influenza cases between 4 February 2020 and 20 March 2020. This excess was higher in people aged 15-64 years, with over 20% more cases than expected. The excess started to decrease after 15 March 2020. Worryingly, these results suggest that SARS-CoV-2 could have already been circulating in the Catalan population when the first imported case was reported on 25 February 2020.
People infected with COVID-19 may have been masked under ILI diagnoses in primary care, allowing continuing community transmission of COVID-19 before public health measures were taken.
To our knowledge, this is the first study attempting to quantify the start of the COVID-19 epidemic in Spain by comparing the number of reported ILI cases with the expected figures based on previous influenza seasons. The excess influenza cases metric could be useful for monitoring future outbreaks of COVID-19 and other competing viral epidemics.
Our study has several limitations. We used ecological data and modelled them using data from previous seasons, therefore assuming a direct causal link between excess influenza cases and the COVID-19 pandemic. Although we lack confirmatory tests or antigenic data for the estimated excess influenza cases, our results agree with a recent study that tested all influenza samples in Los Angeles for SARS-CoV-2, finding 2.2-10.7% of the tested samples positive for the pathogen. [17] The observed excess influenza cases could have been due to a panic effect, in which the current coronavirus infodemic, a rapid spread of misinformation, has encouraged people to consult healthcare professionals more frequently and for milder symptoms than usual. However, our data showed that the number of influenza diagnoses dropped drastically and COVID-19 diagnoses increased after 15 March 2020. New COVID-19 guidelines released in Spain on 15 March 2020 recommended testing only hospital-admitted patients and healthcare staff and encouraged GPs to diagnose COVID-19 clinically without PCR confirmation. [14] At least some of the excess ILI cases were thus likely to have actually been COVID-19 cases.
Our study also has strengths.
The data used were of good quality, as demonstrated in many previous publications, [18] [19] [20] [21] [22] [23] [24] were obtained directly from primary-care records, and have been validated against gold-standard sentinel systems. This existing database covers over 85% of the population of Catalonia, which allowed us to rapidly detect excess influenza cases across the whole population and in different age groups.
In conclusion, the full extent of the SARS-CoV-2 pandemic is still unknown. The confirmed number of cases may be just the tip of the iceberg, due to the lack of testing of patients presenting mild COVID-19 symptoms. We need comprehensive, well-designed seroprevalence studies to know how many people have been infected. The surveillance of excess influenza cases using widely available primary-care electronic medical records could help detect new outbreaks of COVID-19 and other ILI-causing pathogens, supporting early testing and public health responses.

Contributors: All authors contributed to the design of the study, the interpretation of the results, and reviewed the manuscript. EC and NM had access to the data, performed the statistical analysis, and acted as guarantors. EC, NM, AP-U, and DP-A wrote the first draft of the manuscript. MM-P and DP-A are joint senior authors. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.
PPI statement: This research was done without patient involvement. Patients were not invited to comment on the study design and were not consulted to develop patient-relevant outcomes or interpret the results. Patients were not invited to contribute to the writing or editing of this document for readability or accuracy.
Closas P, Coma E, Méndez L. Sequential detection of influenza epidemics by the
Novel coronavirus disease 2019 (COVID-19) pandemic: increased transmission in the EU/EEA and the UK - sixth update
Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19)
Epidemiologic Features and Clinical Course of Patients Infected with SARS-CoV-2 in Singapore
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study
Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet Published Online First
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
Epidemiology and Transmission of COVID-19 in Shenzhen China: Analysis of 391 cases and 1,286 of their close contacts
First cases of coronavirus disease 2019 (COVID-19) in France: surveillance, investigations and control measures
Sentinelles syndromic and viral surveillance group D. Excess cases of Influenza like illnesses in France synchronous with COVID19 invasion
Assessment of two complementary influenza surveillance systems: Sentinel primary care influenza-like illness versus severe hospitalized laboratory-confirmed influenza using the moving epidemic method
Seguiment de la grip a Catalunya
Base de datos SIDIAP: La historia clínica informatizada de Atención Primaria como fuente de información para la investigación epidemiológica
Guia d'actuació enfront de casos d'infecció pel nou coronavirus SARS-CoV-2
Time series analysis: Forecasting and control: Fourth edition
R software: Version 3.5.1. R Found Stat Comput Published Online First
Community Prevalence of SARS-CoV-2 Among Patients With Influenzalike Illnesses Presenting to a Los Angeles Medical Center
Validation of cancer diagnoses in electronic health records: Results from the information system for research in primary care (SIDIAP) in northeast Spain
Epidemiology of dementia: prevalence and incidence estimates using validated electronic health records from primary care
How well can electronic health records from primary care identify Alzheimer's disease cases?
Validity for use in research on vascular diseases of the SIDIAP (Information System for the Development of Research in Primary Care): the EMMA study
The descriptive epidemiology of rheumatoid arthritis in Catalonia: a retrospective study using routinely collected data
Ankylosing spondylitis is associated with an increased risk of vertebral and nonvertebral clinical fractures: A population-based cohort study

The authors acknowledge English language editing by Dr Jennifer A de Beyer of the Centre for Statistics in Medicine, University of Oxford.
Data on influenza case counts for each season and age group are publicly available in Diagnosticat, in the SeGrip section: https://www.ics.gencat.cat/sisap/grip/principal. COVID-19 case counts are available on reasonable request.
All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: Dr Prieto-Alhambra reports grants and other from AMGEN; grants, non-financial support and other from UCB Biopharma; grants from Les Laboratoires Servier, outside the submitted work; and Janssen, on behalf of IMI-funded EHDEN and EMIF consortiums, and Synapse Management Partners have supported training programmes organised by DPA's department and open for external participants. APU reports grants from Fundacion Alfonso Martin Escudero and the Medical Research Council. The authors declare no other relationships or activities that could appear to have influenced the submitted work.
Lead authors affirm that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned have been explained.
No ethical approval was required. Analyses were conducted only on de-identified and aggregated data available in the public domain.