key: cord-0758613-33xzcodv authors: Cazelles, B.; Champagne, C.; Nguyen Van Yen, B.; Comiskey, C.; Vergu, E.; Roche, B. title: A mechanistic and data-driven reconstruction of the time-varying reproduction number: Application to the COVID-19 epidemic date: 2021-02-08 journal: nan DOI: 10.1101/2021.02.04.21251167 sha: e3e2a7728d22be2970e575764bc34d5c61caef6d doc_id: 758613 cord_uid: 33xzcodv The effective reproduction number Reff is a critical epidemiological parameter that characterizes the transmissibility of a pathogen. However, this parameter is difficult to estimate in the presence of silent transmission and/or significant temporal variation in case reporting. This variation can occur due to the lack of timely or appropriate testing, public health interventions and/or changes in human behavior during an epidemic. This is exactly the situation we are confronted with during this COVID-19 pandemic. In this work, we propose to estimate Reff for the SARS-CoV-2 (the etiological agent of the COVID-19), based on a model of its propagation considering a time-varying transmission rate. This rate is modeled by a Brownian diffusion process embedded in a stochastic model. The model is then fitted with Bayesian inference (PMCMC) using multiple well-documented hospital datasets from several regions in France and in Ireland. This mechanistic modeling framework enables us to reconstruct the temporal evolution of the transmission rate of the COVID-19 based only on the available data, without any specific hypothesis on its evolution. This approach allows us to follow both the course of the COVID-19 epidemic and the temporal evolution of its Reff(t). In this way, we can appropriately assess the effect of the mitigation strategies implemented to control the epidemic waves in France and in Ireland. In the last months of 2019, clustered pneumonia cases were described in China [1] . 
The etiological agent of this new disease, a betacoronavirus, was identified in January and named SARS-CoV-2. This new coronavirus disease (COVID-19) spread rapidly worldwide, causing millions of cases, killing hundreds of thousands of people and causing socio-economic damage. Until vaccination campaigns are widely implemented, the expansion of COVID-19 continues to threaten to overwhelm the healthcare systems of many countries, despite a wide range of public health strategies using different non-pharmaceutical interventions (NPI). In the early stages of each new epidemic, one of the first steps in designing a control strategy is to estimate pathogen transmissibility in order to provide information on its potential to spread in the population. This is crucial to understand the likely trajectory of the new epidemic and the level of intervention that is needed to control it. Among the various indicators that quantify this transmissibility, the most commonly used is the reproduction number, which measures how many more people on average can be infected by one infected individual. In the initial phase of the epidemic, when the entire population is susceptible, this quantity is referred to as R 0 , the basic reproduction number, and is defined as the average number of secondary cases caused by one infected individual in an entirely susceptible population [2] [3]. As the epidemic develops and the number of infected individuals increases (and the number of susceptible individuals decreases), the effective reproduction number R eff , characterizing the transmission potential according to the immunological state of the host population, is used. It can be estimated as a function of time: the instantaneous effective reproduction number R eff (t) quantifies the number of secondary infections caused by an infected person at a specific time-point of an epidemic. The epidemic is able to spread when R eff (t)>1 and is under control when R eff (t)<1.
R eff (t) can be used to monitor changes in transmission in near real time and, for instance, the effect of control or mitigation measures. Classically, R eff (t) is estimated by the ratio of the number of new infections generated at time t to the total infectiousness of individuals in the infected state at time t. This latter quantity is defined based on the generation time distribution (the time between infection and transmission) or on the serial interval (the time between the onset of symptoms of a primary case and the onset of symptoms of secondary cases) [4] [5]. For the COVID-19 epidemic, given the data needed to estimate the reproductive numbers, caution is urged when interpreting the values obtained and the short-term fluctuations in these estimates: both data quality and data uncertainty must be taken into account [6] [7] [8] [9]. In a recent study comparing different methods, O'Driscoll et al [6] concluded that there are many important biases in the R eff (t) estimates and that these can easily lead to erroneous conclusions about changes in transmissibility during an epidemic. These biases are mainly due to the uncertainty in incidence data that can arise from both the transmission characteristics of this virus (asymptomatic and pre-symptomatic transmission) and the quality and preparedness of the public health system. For COVID-19, it has been shown that the number of observed confirmed cases significantly underestimates the actual number of infections [10] [11]. For instance, during the initial rapid growth phase of the COVID-19 epidemic, the number of confirmed cases underestimated the actual number of infections by a factor of 50 to 100 [10]. In France, it has been estimated that the detection rate increased from 7% in mid-May to 40% by the end of June, compared to well below 5% at the beginning of the epidemic [12].
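The classical ratio described above (new infections at time t divided by the total infectiousness of earlier cases) can be illustrated with a minimal sketch. The function name `instantaneous_r`, the generation-time weights `w` and the incidence series are hypothetical values chosen for illustration; this is a bare point estimate, not the windowed Bayesian estimator implemented in packages such as EpiEstim.

```python
def instantaneous_r(incidence, w):
    """Point estimate R(t) = I_t / sum_{s>=1} I_{t-s} * w_s.

    incidence: list of daily new infections.
    w: generation-time weights for lags 1..len(w) days (should sum to 1).
    Returns None where the total infectiousness is zero.
    """
    estimates = []
    for t in range(len(incidence)):
        # Total infectiousness: past incidence weighted by the generation time.
        lam = sum(incidence[t - s] * w[s - 1]
                  for s in range(1, min(t, len(w)) + 1))
        estimates.append(incidence[t] / lam if lam > 0 else None)
    return estimates


# Hypothetical generation-time weights for lags of 1 to 4 days.
w = [0.2, 0.4, 0.3, 0.1]
# Synthetic incidence: doubling each day, then a plateau.
incidence = [1, 2, 4, 8, 16, 32, 32, 32, 32, 32, 32]
r = instantaneous_r(incidence, w)
```

In this toy series the estimate is well above 1 during the doubling phase and returns to 1 once incidence has been flat for longer than the longest lag in `w`.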
In addition, these biases can be amplified by the combination of the high proportion of asymptomatic cases [13] and low health-seeking behaviors. Although the estimation of the reproductive number is robust to underreporting [14], this is only true if the reporting rate (among other characteristics) is constant over the observation period. This was not the case for the COVID-19 epidemic, mainly due to fluctuations in testing capacity and in the availability of information. Incidence data are not the only source of uncertainty: the generation time and the serial interval are also uncertain [15] [16]. Other studies have highlighted these issues. Gostic et al [7] quantified the effects of data characteristics on R eff (t) estimates: reporting process; imperfect observation of cases; missing observations of recent infections; estimation of the generation interval. Moreover, Pitzer et al [8] showed that biases in R eff (t) are amplified when reporting delays fluctuate due to the availability and changing practices of testing. These two studies concluded that changes in diagnostic testing and reporting processes must be monitored and taken into account when interpreting estimates of the reproductive number of COVID-19. Nevertheless, these changes are extremely difficult to quantify. In these contexts, as testing capacity and reporting delays evolve, the use of hospital admission or death data may be preferable for inferring reproduction numbers [9, 17]. However, the delays from infection to hospitalization and/or death are also uncertain and, overall, it is difficult to incorporate the uncertainty related to all of these delays [9, 17, 18]. In addition to the classical methods, a complementary approach consists in inferring changes in transmission using mathematical models, and computing R eff based on its proportionality with the transmission rates [19] [20] [21] [22] [23] [24].
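The point about underreporting can be checked numerically: in a ratio-type estimator a constant reporting rate cancels between numerator and denominator, whereas a change in the reporting rate produces a spurious jump. The sketch below uses a hypothetical helper `renewal_ratio`, synthetic infections growing by 10% per day, and assumed reporting rates of 5% and 20%; none of these values come from the study.

```python
def renewal_ratio(cases, w):
    """Ratio of cases at day t to the weighted sum of cases on earlier days."""
    out = []
    for t in range(len(w), len(cases)):
        lam = sum(cases[t - s] * w[s - 1] for s in range(1, len(w) + 1))
        out.append(cases[t] / lam)
    return out


w = [0.2, 0.4, 0.3, 0.1]                              # hypothetical weights
infections = [100.0 * 1.1 ** t for t in range(20)]    # true infections

# Constant 5% reporting rate over the whole period.
constant_rho = [0.05 * x for x in infections]
# Reporting rate jumps from 5% to 20% on day 10 (e.g. expanded testing).
changing_rho = [(0.05 if t < 10 else 0.20) * x
                for t, x in enumerate(infections)]

r_true = renewal_ratio(infections, w)
r_const = renewal_ratio(constant_rho, w)
r_change = renewal_ratio(changing_rho, w)
```

Here `r_const` coincides with `r_true`, while `r_change` is inflated roughly fourfold on the day the reporting rate changes and only returns to the true value once all lagged days share the new rate.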
The time-variation of R eff is computed indirectly by simply fitting the model to different time periods (before or after the lockdown in the simplest cases) or by using exponential decay models [21] , but see Lemaitre et al [20] . To overcome all these numerous weaknesses in the data available for computing R eff , we propose using a framework that has been already implemented to tackle non-stationarity in epidemiology [25] [26] [27] . This framework uses diffusion models driven by Brownian motion to model time-varying key epidemiological parameters embedded in a stochastic state-space framework coupled with Bayesian inference methods. The advantages of this approach are the possibilities of (i) considering the mechanisms of the pathogen transmission and then its particularities (asymptomatic), (ii) using multiple datasets (incidence and hospital data), (iii) accounting for all the uncertainty associated with the data used and especially (iv) following the evolution over time of some of the model parameters. The main advantage of this approach is that it is data-driven. Indeed, in this framework, the reconstruction of the time-varying parameters is done only under the weak assumption that they follow a basic stochastic process and it is estimated solely based on the available observations [27]. Applied to COVID-19, this framework makes it possible to monitor the evolution of disease transmission over time under non-stationary conditions such as those that prevailed during this epidemic. We have chosen to illustrate our approach with data from certain regions of France and Ireland. Our framework is based on three main components: a stochastic epidemiological model embedded in a state-space framework, a diffusion process for each time-varying parameter and a Bayesian inference technique based on adaptive PMCMC (see Supporting Information). The main advantage of the state-space framework is to explicitly consider the observation process. 
This makes it possible to take into account the unknowns and uncertainty in the partial observation of the disease. The proposed epidemiological model accounts for the transmission characteristics of COVID-19 and the features of the data. Figure 1 illustrates the model used, the compartments and the flows between them. The temporal variation in the transmission rate β(t) was modeled by making the assumption that it is not driven by mechanistic terms but evolves randomly. We consider that β(t) follows a continuous diffusion process, $d\log\beta(t) = \nu\,dB(t)$ (1), where ν is the volatility of the Brownian process (dB) to be estimated. Intuitively, the higher the value of ν, the larger the changes in β(t). The logarithmic transformation avoids negative values, which have no biological meaning. Based on the SEIR model structure that accounts for the asymptomatic states (see eqs S2), R eff can be computed from β(t) and the following quantities: 1/γ, the infection duration; τ A , the fraction of asymptomatic individuals in the population; (1-τ A ), the proportion of symptomatic infectious individuals; and q i , the reduction in the transmissibility of some infected (I 2 ) and asymptomatics (A i ) (see Fig. 1).
Our estimations of R eff were compared, based on data from Ireland, with those obtained with two other methods. The first method, proposed by Cori et al [14] and frequently used in recent studies analyzing COVID-19 data, relies on the number of new infections generated during a given time period and the serial interval distribution and is implemented in the EpiEstim package. In the second method, new infections and hence R eff are generated using a simple discrete SIR model fitted with Kalman-filtering tools [28].
In the first wave of the COVID-19 epidemic the number of cases reported was very low and associated with large uncertainties [10, 12].
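As an illustration of the diffusion assumption on β(t), the following sketch simulates a Brownian motion on the log scale with an Euler-Maruyama step. The function names, the step size and all numerical values are hypothetical. The helper `r_eff` uses an assumed closed form, β/γ · [(1-τ A)(1+q 1)/2 + τ A·q 2] · S/N, which is what two equal-length infectious stages would give; it is an assumption made for this example, not the expression used in the paper.

```python
import math
import random


def simulate_beta(beta0, nu, n_steps, dt=1.0, seed=0):
    """Simulate beta(t) with d log(beta) = nu * dB (Euler-Maruyama)."""
    rng = random.Random(seed)
    log_beta = math.log(beta0)
    path = [beta0]
    for _ in range(n_steps):
        # Brownian increment over dt has standard deviation sqrt(dt).
        log_beta += nu * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(math.exp(log_beta))  # exponentiating keeps beta > 0
    return path


def r_eff(beta, gamma, tau_a, q1, q2, s_frac):
    """ASSUMED form of R_eff (illustration only, not taken from the paper).

    beta: transmission rate; 1/gamma: infection duration;
    tau_a: asymptomatic fraction; q1, q2: transmissibility reductions of the
    second symptomatic stage and of asymptomatics; s_frac: S(t)/N.
    """
    symptomatic = (1.0 - tau_a) * (1.0 + q1) / 2.0
    asymptomatic = tau_a * q2
    return beta / gamma * (symptomatic + asymptomatic) * s_frac


# Hypothetical values: beta0 = 0.6 per day, volatility nu = 0.1.
path = simulate_beta(beta0=0.6, nu=0.1, n_steps=100)
r_path = [r_eff(b, gamma=0.2, tau_a=0.4, q1=0.5, q2=0.5, s_frac=0.95)
          for b in path]
```

Larger values of `nu` give rougher trajectories, and the simulated β(t) stays strictly positive because the random walk acts on log β(t).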
This was due, on the one hand, to the testing capacity (RT-PCR laboratory capacity) that was limited and varied greatly during this epidemic and, on the other hand, to the features of this new virus, such as transmission before symptom onset and substantial asymptomatic transmission, which resulted in a low fraction of infected people attending health facilities for testing. This suggests that hospitalization data were likely to be the most accurate COVID-19 related data [9, 17]. Thus, we focused on multiple hospital datasets in France and in Ireland. We used incidence data to avoid all the shortcomings associated with the use of cumulative data (see [ ]). Since these multiple hospital datasets were only available after the implementation of the first mitigation measures in France and in Ireland, and since our aim was to estimate R eff before, during and after the NPI measures, we also used reported incidence data for the first period.
The temporal evolution of both the transmission rate β(t) and R eff (t) (Fig. 2A) and the fit of the model to the observed data are displayed in Fig. 2. Fig. 3 shows the time evolution of R eff (t) (Fig. 3A) and the dynamics of the model's unobserved compartments. Another important characteristic of this epidemic is the fact that the peaks of daily hospital admissions and daily ICU admissions are concomitant (Figs 2G-2H and Figs S3, S6, S9, S12, S15, S18). This feature has been incorporated in the model by allowing hospitalization or admission to ICU for each stage of the symptomatic infection (Fig. 1 and eqs S2). The capacity of our framework to describe the different types of data, some of which are characterized by large noise (within the Paris region for instance), is also illustrated in Fig. 2. The main advantage of this framework consists in its ability to reconstruct the time variation of the transmission rate β(t) (Fig. 2A). Based on the estimation of β(t), we can compute the time-variation of R eff (t) (Fig.
2A) and thus reconstruct the observed dynamics of the COVID-19 epidemic. Figure 3 provides the results for each component of the epidemic dynamic model for the Paris region. Our main result is described in Fig. 4, which displays the time evolution of R eff in the five French regions considered and in Ireland. Our estimates of the initial value of R eff lie in the range [3.0, 3.5], in agreement with other estimates [11, 30]. The peak of R eff (t) just before the start date of the first mitigation measures is presumably an effect of the model accommodating diverging trends between reported case data and hospital data. Then one can note a decrease of about 80% in R eff between 1 March and 1 May in all the regions considered (Fig. 4 and Table 1). The reduction in transmission following the second lockdown was smaller, between 40% and 55% in France but less than 25% in Ireland. In both cases, given the timing of the decline compared to the timing of the NPIs, these sharp decreases seem to be the result of the implementation of the mitigation measures. We can also notice that there is a larger uncertainty at the end of the observation period. As the dynamics of the model are mainly driven by the hospitalization data, the latter determine the transmission rate (and hence R eff ) during the interval [t-δt, t], with δt of the order of the average delay between infection and hospitalization, around 2-3 weeks (Figs 4A-4B). Fig. 4 also illustrates the robustness but also the sensitivity of our method. Firstly, Fig. 4A shows that similar medians and CI are found whether or not hospital discharge data are used during the inference process. Secondly, Figs 4A-4B highlight the sensitivity to asymptomatic transmission: when asymptomatic transmission increases (i.e. large q A value), lower values are inferred for β(t), hence leading to lower R eff (the opposite happens when q A decreases).
medRxiv preprint doi: https://doi.org/10.1101/2021.02.04.21251167; this version posted February 8, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
* using hospital discharge data; ** not using hospital discharge data; *** 01-07-2020 for Ireland.
Figs 3C-3D show that the number of asymptomatic infectious individuals is of the same order of magnitude as the number of symptomatic infectious individuals but with a larger uncertainty due to the lack of information in the data. Indeed, the data used contain very little information on asymptomatics and we observe identical prior and posterior distributions for the rate of asymptomatics, τ A (Figs S1, S5, S8). It can also be noted that the number of removed individuals is small and that the seroprevalence calculated as of 15 May for France or 1 July for Ireland was very low but in full agreement with the published results of the seroprevalence surveys (Tables 1, S3). These estimated seroprevalences show that herd immunity is far from being reached even after the second waves. The comparison of our estimations of R eff with those of two other methods is summarized in Fig. 5. It is difficult to compare the absolute values of the estimates obtained with the 3 methods because the true values are unknown, but we can limit ourselves to comparing the trends given by these methods. The trajectories of R eff over time computed by the other methods fall within the range of our 95% CI, which is large because of uncertainties in the transmission rate, asymptomatic transmission and the different delays needed to describe hospital data. We can also note that the 95% CI of R eff obtained with the EpiEstim method is very narrow and its width is smaller than the variability of fluctuations in its median.
The main differences between the three estimates relate to the time-lags of the effects of the lockdown, the peaks of R eff and the date of crossing the threshold of 1. These lags range from one week to more than one month and do not correspond to the early or late lags of the date of crossing 1, when compared with the estimations provided by our method. A final important point concerns the observed incidence. We have used the incidence data until 22 March (Fig. 2B, black points; Figs S6, S9, S12, S15, S18) in the inference process, leading to a median of the posterior distribution of the reporting rate of 2.7% (95% CI: 1.9%-4.1%) for the Paris region (Table S2). The plotted dynamics of the estimated observed incidence use this value for the whole trajectory. The values of the reporting rate for the other regions are in Table S2. The comparison of the observed incidence, when available, with the simulated incidence (Figs 2, S3, S6, S9, S12, S15, S18) clearly illustrates that the reporting rate has greatly evolved during the course of the epidemic. It is relatively easy to see that during the first wave the observed peak of incidence comes after the peak of hospitalization, whereas for the second wave the opposite happens, in agreement with what can be expected. Moreover, if we compare the intensity of the observed incidence waves, we can see that the second observed wave is 5 to 10 times greater than the first one, whereas model-based simulations suggest that they have similar magnitudes. This difference cannot only be explained by the fact that, as the frequency of testing increases, it is also increasingly likely that some of the people who tested positive are asymptomatic, whereas in the model the people tested are considered symptomatic. In any case, it appears crucial to be able to take into account a reporting delay that seems relatively large for the COVID-19 epidemic using models for now-casting [31-32] or a detailed observation model [18].
For each epidemic, it is always essential to estimate pathogen transmissibility in order to provide information on its potential diffusion in the population. This is commonly done using the effective reproduction number R eff (t). Its time evolution allows us to follow, almost in real time, the course of the epidemic, and to get information regarding the potential effects of the mitigation measures taken. Here we propose a mechanistic approach based on time-varying parameters embedded in a stochastic model coupled with Bayesian inference. This enables us to reconstruct the temporal evolution of the transmission rate of COVID-19 on the basis of the available data alone, without any specific hypothesis on this evolution. With this approach we describe both the temporal evolution of the COVID-19 epidemic and its R eff (t). Thus, we can quantify the effect of mitigation measures on the epidemic waves in France and in Ireland. It is important to note that this methodology overcomes many of the biases associated with estimates of R eff (t) obtained with more traditional methods in the case of COVID-19 [6-7]. Indeed, there are many biases related to under-reporting of infectious cases, uncertainties in the generation time and in the serial interval, and also to the importance of silent transmission. Moreover, these biases are amplified by the fact that reporting delays have fluctuated widely over the course of the epidemic [8]. Within our framework, the estimates of R eff (t) represent the values required to reproduce the well-documented hospital data.
Such hospital data are clearly of a higher quality than the reported number of cases [17] [18]. Moreover, this estimation accounts for transmission mechanisms, and the different delays describe the process of transmission (even asymptomatic transmission) and the different processes evolving in the hospital (Fig. 1). Using our approach, by visualizing the evolution of R eff (t) we can follow the course of the COVID-19 epidemic in five French regions and in Ireland. Therefore, we can quantify the effect of the mitigation measures during and between epidemic waves. For the first lockdown we estimated a decrease of around 80% in transmission, whereas during the second lockdown, which was less restrictive, the decrease was between 45% and 55% in French regions and less than 25% in Ireland (Table 1). This may reflect the nature of the mitigation measures introduced in both countries for the second wave. In Ireland, the mitigation measures introduced in the second wave were less restrictive than in the first, and this may explain the smaller reduction in the effective reproductive number. We also found other interesting results, such as a significant, high correlation between the trend of mobility and transmission between the epidemic waves (see Fig. S21 and [32]), highlighting the importance of following the evolution of mobility when relaxing mitigation measures in order to anticipate the future spread of SARS-CoV-2. We have also compared the trend of our estimations of R eff with those of two other methods from the literature [14, 28] on data from Ireland (Fig. 5). The main difference between our estimations and the two others consists in asynchronous peaks of R eff . For instance, the peak of R eff occurs at the beginning of August for the two other methods, while our estimates suggest a peak in mid-September (Fig. 5). These differences may be explained by the fact that the two other methods computed their estimations based on the new cases only.
The peak of R eff in early August could be explained by the increase in testing during the vacation period, while our estimates peaked later due to the increase in hospitalization generated by higher numbers of new infections in early September when the economy restarted. We therefore believe that our estimates based on admissions to hospital and ICU and on deaths were more consistent with the peak of the second wave of hospitalization that arrived in late October in Ireland (Figs S6-S7). This convinces us of the value of our framework in presenting a more coherent picture of the course of this epidemic. Clearly, the weakness of our approach is the dependence of the R eff (t) estimations on the relevance of the model used and on the accuracy and completeness of the available data. In our case, inference was based on hospital data that are clearly of a higher quality and accuracy compared to the observed number of infected cases. Moreover, despite its relative simplicity, the model incorporating a time-varying transmission rate is able to accurately describe the multiple hospital datasets, which included daily hospital admissions for COVID-19, daily ICU admissions, daily deaths in hospital and daily hospital discharges, as well as the number of beds used each day both in hospital and in ICU. Furthermore, our model can be partially validated by the fact that our predicted seroprevalences in the French regions and in Ireland are in complete agreement with the results published from seroprevalence surveys in these settings (see Tables 1, S3).
To highlight our model results, we can see that asymptomatic infectious individuals are as numerous as symptomatic ones, but are characterized by a larger uncertainty, due to the lack of information in the data (Figs 3C-3D and figures in the Supporting Information). This is in agreement with recent papers [11, 24, 34, 35], which emphasize that the growth of the COVID-19 epidemic is driven by silent infections. This has also been highlighted in Ireland, where it has been estimated that in the second epidemic wave the ratio of silent infections to known reported cases was approximately 1:1 [36]. It is indeed interesting to note that our model estimates for asymptomatic cases in Ireland lead to similar ratios but with a large 95% CI (see Fig. S7). Our study is not without limitations. The model used here is, like all complex SEIR models developed for COVID-19, non-identifiable, which means that it is likely that several solutions exist and we only present one of the most likely ones. This point is very often overlooked, but see [37]. One limitation is the use of the classical homogeneous mixing assumption, in which all individuals are assumed to interact uniformly and which ignores heterogeneity between groups by sex, age or geographical region. However, these kinds of data are not easily available and, where mixing patterns among age groups are available at the individual level in contact tracing databases, they are only accessible following extensive ethical reviews. Another weakness is that the model does not include an age structure that would allow age-based predictions. In any case, including an age structure and a mixing matrix appears insufficient, and heterogeneity of contacts is important (see Britton et al, 2020).
Nevertheless, in our opinion, these limitations are more than balanced by the fact that we take into account the non-stationarity of the epidemic data and that our results are mainly driven by hospital data, which are more accurate and timely than the number of infected cases. As our main objective was to infer the global R eff , and not to explore age-specific mitigation strategies, simplification of the age structure appears justified. The corroboration of our Irish findings on the proportion of asymptomatic infections with those of others provides further evidence of this [36]. As demonstrated previously, modeling with time-varying parameters is an interesting framework for modeling the time evolution of an epidemic even if the knowledge about disease transmission is either incomplete or uncertain [27]. Indeed, a large part of all the unknowns is put in the time-varying parameters, described by a diffusion process but driven by the observed data. This is exactly what we have been confronted with during the COVID-19 pandemic, as the data are uncertain, as are the transmission mechanisms of SARS-CoV-2. We therefore proposed to model the spread of this disease using a stochastic model with a time-varying transmission rate inferred using multiple well-documented hospital datasets. The knowledge of the transmission rate makes it possible to easily calculate R eff (t), which is a key parameter of the epidemic, in order to monitor the effects of public health policies on the course of the COVID-19 epidemic. We believe, therefore, that this framework could be particularly useful for analyzing this new phase of worrying expansion and for helping to refine possible future mitigation measures after the second wave, pending the imminent launch of the vaccines.
Fig. 1. [...] (1), σ the incubation rate, γ the recovery rate, 1/κ the average hospitalization period, 1/δ the average time spent in ICU, τ A the fraction of asymptomatics, τ H the fraction of infectious hospitalized, τ I the fraction of ICU admission, τ D the death rate, q 1 and q 2 the reduction in the transmissibility of I 2 and A i , q I the reduction in the fraction of people admitted to ICU and q D the reduction in the death rate.
We have used a state-space framework defined by two sets of equations: the first set describes the propagation of the disease in the population and the second corresponds to the observation process. This allows consideration of unknowns and uncertainty both in the epidemiological process and in the observation of epidemic dynamics (equations S1). The first equation of (S1) corresponds to the epidemiological model, with x(t) representing the state variables (for instance, S(t) the susceptibles, E(t) the infected non-infectious, I(t) the infectious and R(t) the removed for the classical SEIR model) and θ(t) the epidemiological parameters, some of which are time-dependent. The second equation is associated with the observational process, defined by the probabilistic distribution f with parameters depending on a functional of some of the epidemiological model coordinates, h(x(t)), and on some observational and/or measurement error, φ(t). y(t) corresponds to the observed epidemic dynamics (partial and noisy observations of x(t)) and u(t) is the process noise describing intrinsic or environmental stochasticity other than the observational noise included in f. In our applications, h(x(t)) will represent the daily new cases.
Our objective was to propose a simple but relevant model that incorporates the available information to examine the dynamics of COVID-19 epidemics before, during and after the lockdown. Our model is an extended stochastic SEIR model also accounting for asymptomatic transmission and the hospital course of the hospitalized infectious individuals. It is similar to other models that have already been proposed to model and forecast the COVID-19 epidemic ( ). It includes the following variables: the susceptibles S, the infected non-infectious E, the symptomatic infectious I, the asymptomatic infectious A, the removed people R, and the hospital-related variables: hospitalized people H, people in intensive care unit ICU, cured people G, and deaths at hospital D. We have also introduced Erlang-distributed stage durations (with a shape parameter equal to 2) in the E, I, A and H compartments, to relax the mathematically convenient but unrealistic assumption of exponential stage durations. For the sake of simplicity, the model is presented through the differential equations of its deterministic version (equations S2), whereas its stochastic version was used in this study. The main originality of this model is the time-varying transmission rate β(t) that follows a Brownian diffusion process, equation (1) in the main text. The other parameters are: σ the incubation rate, γ the recovery rate, 1/κ the average hospitalization period, 1/δ the average time spent in ICU, τ A the fraction of asymptomatics, τ H the fraction of infectious hospitalized, τ I the fraction of ICU admission, τ D the death rate, q 1 and q 2 the reduction in transmissibility of I 2 and asymptomatics (A 1 +A 2 ), respectively.
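The effect of replacing an exponential stage by an Erlang-distributed stage with shape 2 can be seen in a small simulation: the stage is split into two sub-stages, each left at twice the original rate, which keeps the mean duration but halves the variance. The rate value and the helper name `erlang_duration` below are illustrative assumptions, not parameters of the study.

```python
import random
import statistics


def erlang_duration(rate, shape, rng):
    """Duration of a stage split into `shape` exponential sub-stages.

    Each sub-stage is left at rate shape * rate, so the mean total
    duration remains 1 / rate.
    """
    return sum(rng.expovariate(shape * rate) for _ in range(shape))


rng = random.Random(1)
sigma = 0.25  # hypothetical rate, i.e. a mean stage duration of 4 days

exp_samples = [rng.expovariate(sigma) for _ in range(20000)]
erl_samples = [erlang_duration(sigma, 2, rng) for _ in range(20000)]

mean_exp = statistics.fmean(exp_samples)
mean_erl = statistics.fmean(erl_samples)
var_exp = statistics.pvariance(exp_samples)
var_erl = statistics.pvariance(erl_samples)
```

Both sample means are close to 4 days, while the Erlang variance is about half the exponential one, so very short and very long stage durations become less likely.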
As the peaks of hospitalized individuals and of individuals admitted to ICU are concomitant, we consider that a small fraction, q_I·τ_I, of the infectious with severe symptoms enter the ICU directly. Even if the majority of deaths occur in the ICU, a fraction, q_D·τ_D, can occur in the hospital, outside intensive care. Thus, q_I and q_D are the reductions in ICU admission and in death rate, respectively. Most of the parameters are estimated in our inference process and the remaining ones are set to plausible values, in agreement with the literature (see Table S1).

Our inference approach is mainly based on data on new entries in several model compartments. The observed variables are the following: C_I is the incidence of symptomatic cases, C_H the daily admissions to hospital, C_ICU the daily new admissions to ICU, C_D the daily number of new hospital deaths and C_G the daily new discharges from hospital.

Our model trajectories are related to the data using a negative binomial observation model, which is commonly used (Bretó et al, 2009). The observed incidences, C_k,obs(t), are assumed to be drawn from a negative binomial distribution with mean ρ_k·C_k(t) and variance ρ_k·C_k(t) + ϕ_k·(ρ_k·C_k(t))^2. For each k, C_k(t) represents the number of events occurring during day t simulated by the model, ρ_k ∈ [0,1] is an observation rate (or reporting rate) and ϕ_k is an over-dispersion parameter. Current hospital data, H_obs(t) (corresponding to H_1 + H_2 + ICU) and ICU_obs(t) (corresponding to ICU), are also available and are therefore used to fit the model. We assume that these variables X_k,obs follow a discretized normal distribution with mean equal to the value X_k predicted by the model (equations S2) times an observation rate ρ_k (see Table S1), and with fixed standard deviation proportional to its estimated highest value. All variables are assumed to be independent conditional on the underlying epidemiological model, and therefore the model likelihood is the product of the likelihoods of all observed variables.

Equations (S1-S3) are considered in a stochastic framework and solved with the Euler-Maruyama algorithm (Kloeden and Platen, 1999) implemented in the SSM platform (Dureau et al, 2013b). As the likelihood of our stochastic model is intractable, it was estimated with a particle filtering method (Sequential Monte Carlo, SMC). For a given set of parameters, the SMC algorithm sequentially reconstructs the trajectory of the state variables and of the time-varying parameters, and allows the associated likelihood to be computed. First, the distribution of the initial conditions of the system is approximated by a sample of particles. Then, at each iteration, the particles are projected according to the epidemiological and observation models up to the next observation point. A weight is associated with each particle, reflecting the quality of its prediction compared to the observation, and the total likelihood is updated. A resampling step using the weights is performed before the next iteration, in order to discard the trajectories associated with low-weight particles. In order to estimate the parameters of the system, the particle filter is embedded in a Markov Chain Monte Carlo framework, leading to the PMCMC algorithm (Andrieu et al, 2010).
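The negative binomial observation model and the particle filtering steps described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the SSM implementation: the latent incidence follows a simple log-scale random walk standing in for the epidemic model, resampling is multinomial, and all names, default values and the example counts are hypothetical.

```python
import math
import random

def nb_logpmf(y, mean, phi):
    """Log-probability of count y under a negative binomial with the
    given mean and variance mean + phi * mean**2 (phi > 0)."""
    n = 1.0 / phi              # "size" parameter
    p = n / (n + mean)         # success probability
    return (math.lgamma(y + n) - math.lgamma(n) - math.lgamma(y + 1)
            + n * math.log(p) + y * math.log1p(-p))

def particle_filter_loglik(obs, n_particles=500, rho=0.8, phi=0.1,
                           c0=20.0, growth_sd=0.1, seed=0):
    """Bootstrap particle filter for a toy latent incidence following a
    log-scale random walk (a stand-in for the epidemic model)."""
    rng = random.Random(seed)
    # 1. sample particles from the initial distribution (assumed log-normal)
    particles = [c0 * math.exp(rng.gauss(0.0, 0.5))
                 for _ in range(n_particles)]
    loglik = 0.0
    for y in obs:
        # 2. propagate each particle to the next observation time
        particles = [c * math.exp(rng.gauss(0.0, growth_sd))
                     for c in particles]
        # 3. weight each particle by the observation density
        logw = [nb_logpmf(y, rho * c, phi) for c in particles]
        max_logw = max(logw)
        weights = [math.exp(lw - max_logw) for lw in logw]
        # 4. update the likelihood with the mean weight (log-sum-exp trick)
        loglik += max_logw + math.log(sum(weights) / n_particles)
        # 5. resample proportionally to the weights (multinomial)
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return loglik

observations = [15, 18, 22, 21, 30, 34, 41]  # made-up daily counts
print(particle_filter_loglik(observations))
```

With size 1/ϕ and success probability (1/ϕ)/(1/ϕ + mean), the negative binomial has exactly the mean and variance stated in the text, which is why `nb_logpmf` takes the mean and the over-dispersion parameter directly.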
More precisely, the likelihood estimated by the SMC method is used within a Metropolis-Hastings scheme (particle marginal Metropolis-Hastings) (Andrieu et al, 2010). The proposal distribution is a Gaussian whose covariance matrix is adapted following the framework described in Dureau et al (2013b). The starting point of the MCMC chain is initialized using optimal values obtained from a simplex algorithm adapted for stochastic systems (KSimplex), run on a large number of parameter sets. Then, a pre-adaptation of the proposal covariance matrix is performed with Kalman MCMC (KMCMC). The main idea behind these algorithmic improvements is to rely on less computationally costly algorithms in order to facilitate the exploration of the parameter space. As our model is not Gaussian, we approximate the likelihood using the extended Kalman filter both in the simplex algorithm (KSimplex) (Dureau et al, 2013b) and in the MCMC (KMCMC) step (Dureau et al, 2013a). Then, the adaptive PMCMC is run on the output of the KMCMC step.

Table S1. Definition of the different parameters and their priors for the Ile-de-France region, Ireland and four other French regions: Provence Alpes Côte d'Azur (PACA), Occitanie (OC), Nouvelle-Aquitaine (NA) and Auvergne Rhône Alpes (ARA). U stands for uniform distribution and tN for truncated normal distribution (tN[mean, std, lower limit, upper limit]).
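The particle marginal Metropolis-Hastings scheme described in this section can be illustrated with the following Python sketch. It makes several simplifying assumptions that are not in the source: a single parameter, a fixed (non-adaptive) Gaussian random-walk proposal, no KSimplex or KMCMC initialization, and a toy noisy likelihood estimator standing in for the particle filter. All names and values are hypothetical.

```python
import math
import random

def pmmh(loglik_estimator, log_prior, theta0, n_iter=3000,
         proposal_sd=0.3, seed=0):
    """Pseudo-marginal Metropolis-Hastings with a Gaussian random-walk
    proposal. `loglik_estimator(theta, rng)` must return the log of an
    unbiased likelihood estimate (e.g. from a particle filter)."""
    rng = random.Random(seed)
    theta = theta0
    log_post = loglik_estimator(theta, rng) + log_prior(theta)
    chain, accepted = [], 0
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, proposal_sd)
        lp = log_prior(proposal)
        if lp > -math.inf:
            lp += loglik_estimator(proposal, rng)
            # accept with probability min(1, posterior ratio); the stored
            # estimate for the current state is reused, never recomputed
            if math.log(1.0 - rng.random()) < lp - log_post:
                theta, log_post = proposal, lp
                accepted += 1
        chain.append(theta)
    return chain, accepted / n_iter

# Toy target: unknown mean of N(theta, 1) data with a U[-10, 10] prior.
data_rng = random.Random(42)
data = [data_rng.gauss(2.0, 1.0) for _ in range(20)]

def noisy_loglik(theta, rng, noise_sd=0.5):
    exact = sum(-0.5 * (y - theta) ** 2 - 0.5 * math.log(2 * math.pi)
                for y in data)
    # multiplicative noise with expectation 1 keeps the estimate unbiased
    return exact + rng.gauss(-0.5 * noise_sd ** 2, noise_sd)

def log_prior(theta):
    return -math.log(20.0) if -10.0 <= theta <= 10.0 else -math.inf

chain, acc_rate = pmmh(noisy_loglik, log_prior, theta0=0.0)
posterior_mean = sum(chain[500:]) / len(chain[500:])
```

The key design point is that the likelihood estimate attached to the current state is kept until a proposal is accepted; this is what allows a noisy but unbiased estimator to be used in place of the exact likelihood.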
Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia
The concept of Ro in epidemic theory
The construction of next-generation matrices for compartmental epidemic models
How generation intervals shape the relationship between growth rates and reproductive numbers
On the relationship between serial interval, infectiousness profile and generation time
The impact of changes in diagnostic testing practices on estimates of COVID-19 transmission in the United States
Evaluating the use of the reproduction number as an epidemiological tool, using spatio-temporal trends of the Covid-19 outbreak in England
Severe underestimation of COVID-19 case numbers: effect of epidemic growth rate and test restrictions
Quantifying Asymptomatic Infection and Transmission of COVID-19 in New York City using Observed Cases, Serology and Testing Capacity
Underdetection of COVID-19 cases in France in the exit phase following lockdown
Prevalence of Asymptomatic SARS-CoV-2 Infection: A Narrative Review
A new framework and software to estimate time-varying reproduction numbers during epidemics
Serial interval of SARS-CoV-2 was shortened over time by nonpharmaceutical interventions
Estimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data
Estimating the time-varying reproduction number of SARS-CoV-2 using national and subnational case counts
Tracking disease outbreaks from sparse data with Bayesian inference
Reconstruction of the full transmission dynamics of COVID-19 in Wuhan
Assessing the impact of non-pharmaceutical interventions on SARS-CoV-2 transmission in Switzerland
Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions
Time Series Analysis via Mechanistic Models
& Lydie, N. Seroprevalence of SARS-CoV-2 among adults in three regions of France following the lockdown and associated risk factors: a multicohort study
Impact of lockdown on COVID-19 epidemic in Île-de-France and possible exit strategies
SSM: Inference for time series analysis with State Space Models

We thank Una Ni Mhaoldhomhnaigh, who extracted and managed the data during the lockdown periods and edited some of the previous versions of the manuscript.

Funding
B. C. and B. R. are partially supported by a grant ANR Flash Covid-19 from the "Agence Nationale de la Recherche" (DigEpi).