key: cord-0779901-mrwn5lvo authors: Raimúndez, Elba; Dudkin, Erika; Vanhoefer, Jakob; Alamoudi, Emad; Merkt, Simon; Fuhrmann, Lara; Bai, Fan; Hasenauer, Jan title: COVID-19 outbreak in Wuhan demonstrates the limitations of publicly available case numbers for epidemiological modelling. date: 2020-04-22 journal: nan DOI: 10.1101/2020.04.19.20071597 sha: 7812b2516941dcc3cdb591f771db1196f2527add doc_id: 779901 cord_uid: mrwn5lvo Epidemiological models are widely used to analyse the spread of diseases such as the global COVID-19 pandemic caused by SARS-CoV-2. However, all models are based on simplifying assumptions and on sparse data. This limits the reliability of parameter estimates and predictions. In this manuscript, we demonstrate the relevance of these limitations by performing a study of the COVID-19 outbreak in Wuhan, China. We perform parameter estimation, uncertainty analysis and model selection for a range of established epidemiological models. Amongst others, we employ Markov chain Monte Carlo sampling, parameter and prediction profile calculation algorithms. Our results show that parameter estimates and predictions obtained for several established models on the basis of reported case numbers can be subject to substantial uncertainty. More importantly, estimates were often unrealistic and the confidence / credibility intervals did not cover plausible values of critical parameters obtained using different approaches. These findings suggest, amongst others, that several models are over-simplistic and that the reported case numbers provide often insufficient information. Epidemiological models are essential tools in public health as they facilitate assessments and forecasts of the spread of infectious diseases. This has been for instance demonstrated for influenza [1] , dengue [2] , Ebola [3] , Zika [4] , and -most recently -COVID-19 [5, 6] . These assessments and forecasts are the basis for political decision making [7] and therefore of vital importance. The spectrum of mathematical modelling approaches in epidemiology ranges from relatively simple ordinary differential equation (ODE) models [8] [9] [10] , partial differential equation (PDE) models [11, 12] , stochastic differential equation (SDE) models [13] [14] [15] , continuous-time discrete-state Markov chain (CTMC) models [15] [16] [17] , to complex agent-based models [18, 19] . While ODE, PDE and SDE models provide descriptions at the population level, agent-based models are centered around formulations of the properties and dynamics of individuals. Some models explicitly account for space (usually in terms of countries, regions and/or cities) to capture spreading. Furthermore, models for the infection spread are usually combined with models of testing and reporting strategy to link them to the observed case number [20] . The choice of the modelling approach depends on the purpose of the study, the availability of information about the underlying disease and population, and the amount and quality of experimental data. Yet, all these models rely on estimates of parameter values. The parameters of epidemiological models are mostly inferred using frequentist and Bayesian parameter estimation methods. Frequentist methods often rely on parameter optimisation for obtaining point estimates and profile likelihoods for uncertainty analysis [21] . Bayesian methods exploit sampling strategies such as Markov chain Monte Carlo (MCMC) methods [20, 22] or variational inference [23] . Also flexible emulator based methods based, e.g. on Gaussian process, have been applied [24] . For applications in which competing hypotheses are available, the parameter estimation is often complemented by model selection. Established model selection measures include the Akaike Information Criterion (AIC) [25] , the Bayesian Information Criterion (BIC) [26] , or Bayes factors [27] . Our study complements other modelling efforts [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] by considering multiple established model types, observables, parameter estimation and model selection approaches. To recapitulate the situation in the beginning of the pandemic, we limit the use of prior knowledge to a minimum. This highlights challenges, e.g. the limited information content of case numbers and the dependence on proper model topology, but also opportunities for quantitative modelling in epidemiology. For this study, we considered the case numbers reported by the Hubei Province Health Commission and Wuhan Municipal Health Commission [44, 45] . These case numbers were particularly relevant for the analysis of the early transmission dynamics and the planning of interventions. Accordingly, these data were the basis of several modelling studies on the dynamics of COVID-19 epidemic (see e.g. [34, [38] [39] [40] [46] [47] [48] [49] ). Here, we used the time interval from January 9 to February 9, as afterwards the definition of a positive test changed [50] , which limits the comparability. The Chinese Center for Disease Control and Prevention provides time-resolved information on: • Reported number of infected individuals: y I (t) • Reported number of recovered individuals: y R (t) • Reported number of deceased individuals: y D (t) • Reported cumulative number of infected individuals: y T (t) = y I (t) + y R (t) + y D (t) The reported number of deceased individuals is probably most accurate, yet the overall reliability of the measurement and the distribution of the errors is unknown. As in the literature different combinations of these observables are used for model parameterisation, we consider here the following fitting scenarios: As different studies considered different aspects of the data, we first asked which scenario is most suited to determine the parameters of the infection process. To address this question, we employed a deterministic Susceptible-Exposed-Infectious-Recovered-Deceased (SEIRD) model [51] . This compartmental model describes the size of the susceptible (S), exposed (E), infectious (I), recovered (R) and deceased (D) subgroups ( Figure 1A) . The time-dependence of the subgroup sizes is governed by the ODEs: in which β is the average number of contacts per person per time which result in an infection, κ is the rate at which exposed individuals become infectious, γ is the rate at which infectious individuals recover, δ is the rate at which infectious individuals decease, and N = S(t) + E(t) + I(t) + R(t) + D(t) is the overall population size. Note that the inverse of the rate κ is the average incubation time T E = κ −1 . The initial conditions for the different state variables are given by N S , N E , N I , N R and N D . The initial conditions are usually non-zero and have to be inferred along with the unknown model parameters. Following previous work [52] , we applied the simplifying assumption that infectious individuals are observed. For all observable scenarios we performed a maximum likelihood estimation assuming normally as well as log-normally distributed measurement noise with unknown standard deviations (see Materials and Methods). The multi-start local optimisations converged (Supplementary Figure S1A ) and the simulations with the maximum likelihood estimates achieved a good agreement with the observed data ( Figure This confirms the findings of other research groups showing that the SEIRD model is sufficient to describe the dynamics of the COVID-19 outbreak in Wuhan. The comparably low noise levels inferred for the number of deceased individuals confirms our expectation that these observations are most reliable. Model selection based on AIC and BIC indicated a strong support for log-normally distributed measurement noise (Supplementary Figure S1B ). For an in-depth analysis of the impact of the choice of observables, we performed uncertainty analysis using frequentist and Bayesian methods ( Figure 2 ). Profile likelihood calculation and MCMC sampling revealed that O3 provides improved parameter identifiability and decreased parameter uncertainties compared to O1 and O2 (Figure 2A-C) . This was to be expected as O3 uses three observables (I, R and D) while O1 and O2 use two observables (and a subset 4 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020 is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020. . consistent with the estimates reported by the WHO which indicate a median incubation time of 5-6 days [53] . Similar inconsistencies are observed for the basic reproduction number. Not only the Bayesian parameter estimates are off, but also the maximum likelihood estimates. However, in contrast to narrow 99%-credibility intervals computed from MCMC samples, the 99%-confidence intervals derived from profile likelihoods are broad and cover realistic values. This indicates that estimates derived from case numbers can be unrealistic and their reliability should be assessed using different approaches. As the information encoded in the number of reported cases during the initial infection appears insufficient, we complemented it in the following with a log-normal prior for the incubation period as specified in the Materials and Methods section. The parameters of the prior are derived from the work of [54] , which is based on the infections among travellers from Wuhan. As O3 with this prior achieves plausible estimates, we considered for the remainder this setup ( Figure 3A ). The SEIRD model with the prior on the incubation period provides a reasonable description of the case numbers reported for Wuhan ( Figure 3A ). Yet, this widely used model disregards the observation that patients are asymptomatic [55] . These patients can be infectious but are more difficult to detect. To study the impact of asymptomatic patients, we consider besides the basic SEIRD model (M1) also two alternative epidemiological models: • M2: A SAIRD model considering asymptomatic individuals (A) with transmission rate ζ and symptomatic individuals (I) with transmission rate β. The asymptomatic individuals are assumed to become symptomatic. • M3: A SAIRD model similar to M2, which allows for the direct recovery of asymptomatic patients. The recovered asymptomatic patients (R ) are assumed to remain unreported. We assume that the reported number of infected individuals corresponds to the number of symptomatic patients (I). Accordingly, the reported number of recovered individuals is assumed to count only previously symptomatic individuals. As a further model extension, we consider the possibility of waning immunity. A recent study suggested that the infection with SARS-CoV-2 does not necessarily induce a long-lived antibody response [56] . This has, to the best of our knowledge, not yet been evaluated using model-based approaches. To address this, we consider: • M4: A SAIRD model similar to M2, which allows recovered individuals (R) to become susceptible (S) with rate ρ. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. . The parameters of models M1-M4 were estimated using multi-start local optimisation. The simulations of models M1 to M4 for the respective maximum a posterior estimate show a reasonable agreement with the data (Figure 3 ). Interestingly, while the simulations for M1, M2 and M4 are similar, the simulation for M3 shows an early saturation ( Figure 3C ). The reason is that the initial number of unobserved recovered patients (which can also be interpreted as immune patients) is estimated to be very high, which does not appear to be plausible. As the fitting results for all models were highly reproducible ( Figure 4A ), we evaluated the AIC and BIC for model selection ( Figure 4B ,C). As all models achieved a relatively similar fit, the complexity penalisation has a critical impact and the ranking differs for AIC and BIC, suggesting that the differences are minor. Only M4 was consistently not supported by the available data. The model selection revealed again the limited information content of the case numbers as there is clear evidence for the relevance of asymptomatic cases. To assess the uncertainty of the parameter estimates and predictions, we computed the profile likelihoods and performed MCMC sampling ( Figure 5A -D). The results indicate that the parameter uncertainties for M2-M4 are larger than for M1, but that most parameters are well determined. The profile likelihoods yield overall more conservative estimates than the sampling. The predictions of the state variables based on the sampling suggest low uncertainties of all model states ( Figure 5E -H). In particular for M3 this appears unrealistic as there are so far no reports about a large number of immune individuals ( Figure 5G ). A key question in many recent studies is how much interventions such as compulsory masks and social contact restrictions, as well as the rising public awareness impact the transmission rate of SARS-CoV-2. To study how well this question can be assessed based on early case report data, we considered three simple scenarios: • No change of the transmission rate. • Discrete change in the transmission rate due to the compulsory masks introduced by the government in Wuhan on January 22 and substantially increasing contract restrictions. • Continuous change in the transmission rate due to rising public awareness and a broad spectrum of interventions. These scenarios are illustrated in Figure 6 and a detailed mathematical description is provided in the Materials and Methods section. 8 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. 9 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. As the result of the analysis of the infection dynamics carried out in the previous section was inconclusive, we considered model M1 to M4. For all 12 combinations of model structures and intervention effects, we performed parameter estimation and uncertainty analysis. The assessment of the model selection criteria provided support for a discrete change in the transmission rate on January 22 ( Figure 7A ). The resulting model provides an accurate description of the data and suggests that the transmission rate dropped by around 71% ( Figure 7B ). The uncertainty estimates for the decrease (∆) depend heavily on the analysis approach. While MCMC sampling yields a 99% credibility interval from 37.9 to 76.4%, the profiles suggests a much broader regime ( Figure 7C ). Accordingly, the reported case number for the early outbreak were not sufficient for an accurate assessment of all model parameters. Despite the parameter uncertainties, the states seem to be relatively well determined ( Figure 7D ). A S R D I S A R D I A S R D R' I (t) (t) E S R D I (t) A B C D (t) (t) (t) (t) M1 M2 M3 M4 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. Time Time Time Pandemics pose a global challenge and show the importance of model-based forecasting. Forecasts influence the political decision-making process and have a significant impact on our society. Minimising model uncertainties and properly evaluating them is therefore crucial. Yet, many publications are still only using reported case number and/or omit an identifiability and uncertainty analysis [33-35, 39, 57-60] . Here, we demonstrated that both aspects are problematic. Our analysis of the COVID-19 outbreak in Wuhan demonstrates that the parameterisation of epidemiological models can result in incorrect parameter estimates and predictions. Surprisingly, even Bayesian uncertainty analysis using MCMC sampling resulted in an underestimation of the indeterminacy and provided inaccurate predictions. In our opinion there are two reasons for this: 1. The models considered here and used in various other publications are too simple. They neglect for instance the stochastic nature of the process [61, 62] , the heterogeneity of the population (e.g. the age structure [63] ), time-dependent testing and reporting protocols [50] , and particularities of the process (e.g. a large number of asymptomatic cases [64] ). As the parameter estimates depend on the model characteristics, oversimplifications can result in biased estimates and predictions. 2. The case report data provide only limited information about the process, in particular the distribution of inter-event times are difficult of reconstruct. 12 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. 13 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. . Besides parameter estimation, the aforementioned limitations of case numbers were observed in the model selection process. The data did, for instance, not allow to unravel that a large number of the asymptomatic cases is not detected. We observed that detailed prior information is required if merely case numbers are employed for parameter estimation. While literature-based priors are used in many manuscripts [65] , we hypothesise that it would be better to use information about individual cases for parameter estimation and model selection. In particular the date of the onset of symptoms, the date of the positive test, and the date of recovery/death for individuals is highly relevant. These data are being collected and analysed [54, [66] [67] [68] , but should in the future be shared much earlier. Furthermore, randomised testing would be required, ideally using antibody tests to determine the fraction of completely asymptomatic patients. Such studies are usually not possible during an initial phase of a pandemic but are now on the way [69] . This study does not offer new insights into the COVID-19 pandemic. However, it pinpoints important pitfalls and showcases the relevance of the underlying assumptions and the available data. Furthermore, it demonstrates that even a proper uncertainty analysis using state-of-the-art frequentist or Bayesian approaches does not ensure that the true parameters and dynamics are captured within the uncertainty bounds. While we demonstrate this aspect here for deterministic compartmental models -which are the basis of many modelling studies for COVID-19 and beyond -it certainly holds also for other modelling approaches. Similarly, model-free studies are based on assumptions, data and statistical models, rendering them subject to at least the same limitations. The study is based on official reports on the total number of cases and the numbers of infected individuals, recovered individuals and deceased individuals. From January 11 to 20, the reports were made available by the Wuhan Municipal Health Commission [45] . Afterwards, the reports were organised by the Health Commission of Hubei Province [44] . The complete data sets are listed in Table 1 . The exact population size in the city of Wuhan in the period under consideration is not precisely known due to Chinese New Year. In this study we assume a population size of 9 million which was mentioned by the mayor of Wuhan, Zhou Xianwang [70] . 14 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. . is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. S susceptible E exposed but not infectious A asymptomatic and infectious I symptomatic and infectious R recovered with previously symptomatic progression R recovered with previously asymptomatic progression D deceased We considered four different deterministic compartmental models for the description of the transmission process. The state variables of the models are the number of individuals with particular characteristics and the notations can be found in Table 2 . The models allow for various processes which result in the transitions of individuals between compartments (see Figure 3) . A description of the rates is provided in Table 3 . For the time-dependence of the transmission rates β and ζ, β(t) = β 0 · g(t) and ζ(t) = β 0 · ξ 0 · g(t), we considered three scenarios: The function g(t) describes the reduction of the transmission rates compared to baseline at t = 0, with g(0) = 1 for all scenarios. The parameters for the discrete change are the time point T C and the relative reduction ∆, and for the continuous change we have the decay rate k. The parameter ξ 0 = ζ 0 β 0 denotes the relative difference between the transmission rates of the symptomatic individuals (β(t)) and asymptomatic individuals (ζ(t)). The ODEs governing the dynamics of the different compartmental models are provided in Table 5 . We decided to initialise the model on January 9 since which the virus was detected [71] , defining t = 0. As on January 22 the wearing of masks became mandatory, we set for the scenario of a discrete reduction of β for T C = 13 days. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020 The state variables of the model were linked to the observed case numbers using observation functions. The observation functions for the total number of cases (y T ) as well as for the number of infected (y I ), recovered (y R ) and deceased (y D ) people are provided in Table 5 . As the reported numbers are subject to unknown measurement noise, we considered two error models: • Additive normally distributed measurement noise: • Multiplicative log-normally distributed measurement noise: The observation time points t k , k = 1, . . . , 32, are the days listed in Table 1 , and the measurements (as indicated with the superscript m) are the respective case numbers. The distribution parameters σ T , σ I , σ R and σ D were considered as unknown. In the following the parameters of the transition rates, the total and initial number of people in different compartments, and the parameters of the noise distribution are inferred. A comprehensive list of all model parameters and implemented constraints is provided in Table 4 . The boundaries of the search space were chosen very conservatively to indicate the initial knowledge gap about SARS-CoV-2 and COVID-19. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. . is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. . Table 5 : Mathematical models. ODEs for the transmission dynamics, initial conditions and observations functions are defined for all considered model structures. As some models consider only a subset of the state variables, some rows are empty. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint To infer the unknown model parameters we used frequentist and Bayesian approaches. These approaches considered the conditional probability p(D|θ) of the data D given the parameters θ, also known as likelihood. For additive normally distributed measurement noise the likelihood function is and for multiplicative log-normally distributed measurement it is The set of considered observables is encoded by I and differs for the scenarios O1 to O3: I O1 = {T, D}, I O2 = {I, D}, and I O3 = {I, R, D}. In addition to the case reports, we also incorporated knowledge available before the parameter estimation. For Bayesian approaches this was done by defining a prior distributions. These prior distributions are mostly loguniform with conservative upper and lower bounds, meaning that the distribution over the log-transformed parameter values is flat. For κ we include in parts of the study information about the incubation period [54] , given by a log-normal prior: , withκ = (6.4[day]) −1 and σ κ = 0.13. For the frequentist approaches the available estimate of κ is treated as a data point. Remark: For the parameter estimation we consider the log-transformed parameter values. For the log-transformed parameters, the log-uniform prior become effectively a uniform prior. This renders the frequentist and the Bayesian approaches comparable, namely, the maximum likelihood and the maximum a posteriori estimates coincide. To determine the maximum likelihood and the maximum a posteriori estimates, we minimised the negative log-likelihood function and negative log-posterior function, respectively. As these optimisation problems are non-linear and non-convex, we used multi-start local optimisation. The starting points for the local optimisations were generated using latin hypercube sampling. Local optimisation was performed using the interior point algorithm implemented in the MATLAB function lsqnonlin.m, which exploits the least-squares like structure of the optimisation problems. To facilitate convergence, we computed the gradients 20 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. . of the residuals via forward sensitivity equations. The convergence of the global optimisation was assessed using waterfall plots. For each of the 18 considered combinations of compartment model, noise model, observable scenario and intervention scenario, we performed 200 local optimisations. For all combinations at least 15 runs converged to the observed global optimum. This suggested that the results are highly reliable. To evaluate the (frequentist) parameter and prediction confidence intervals we used profile likelihoods [72, 73] . The profile likelihoods were computed using the MATLAB toolbox Data2Dynamics [74] . This toolbox implements optimisation-based methods with adaptive step-size selection as well as fast integration-based methods [75] . To ensure the robustness of the results, the consistency of the outcomes was checked and confirmed. The profile likelihoods were used to derive the (finite sample) confidence intervals. For a significance level α, the bounds of the confidence interval were determined as the smallest and largest parameter values for which the profile likelihood stays above the threshold defined by the αth-percentile of the χ 2 -distribution [76] . We used the χ 2 -distribution with one degree of freedom which yields the so called point-wise confidence intervals. To evaluate the (Bayesian) parameter and prediction credibility intervals we used Markov chain Monte Carlo sampling [77] . The parameter posterior distribution was sampled using the Adaptive Metropolis algorithm implemented in the MATLAB toolbox PESTO [78] . The methods are self-tuning and provided good convergence properties. Convergence after burnin was assessed using the Geweke test [79] . The parameter samples were used to generate samples for the model states and observables. The samples of parameters and predictions were used to derive the credibility intervals. For a credibility level α, the bounds of the credibility interval were determined as the 100α/2and the 100(1 − α/2)-percentile of the respective sample. This procedure yields the so called equal-tailed interval. We considered competing hypotheses on the dynamics of the infection process, the effect of the intervention and the noise distribution. Each of the resulting models was assessed using 21 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. . https://doi.org/10.1101/2020.04. 19.20071597 doi: medRxiv preprint the Akaike information criterion (AIC), AIC j = −2 log p(D|θ (j) ) + 2n (j) θ , and the Bayesian information criterion (BIC), in which j is the model index,θ (j) is the maximum likelihood estimate for the j-th model, and n θ (j) is the number of parameter of the j-th model. The number of independent data points is denoted by n D . AIC and BIC account for the likelihood of the data and penalise model complexity. Low AIC and BIC values are favorable. We consider a difference 10 between AIC/BIC values of different models as substantial [80] . For analysis and visualisation we also evaluated the AIC and BIC weights [80] . The AIC weight for the j-th model is defined as and provides the weight of evidence in favor of the j-th model being the actual best model in terms of the Kullback-Leibler Information (assuming that the true model is in the considered set). The BIC weight for the j-th model is defined as , with BIC min := min j BIC j , and provides an approximation to the Bayesian posterior probability of the j-th model. AIC and BIC weights are between 0 and 1, and a high value indicates a strong support. The model formulation, parameter estimation and profile likelihoods were performed in the MATLAB toolbox Data2Dynamics (https://github.com/Data2Dynamics/d2d) [74] . Outliers in the computed prediction profiles arising from calculation error were corrected subsequently. The calculation of parameter confidence intervals and MCMC sampling was carried out using the MATLAB toolboxes PESTO (https://github.com/ICB-DCM/PESTO) [78] and AMICI (https://github.com/ICB-DCM/AMICI) [81, 82] . For numerical integration Data2Dynamics and AMICI rely on the SUNDIALS solver CVODES [83] . The complete implementation (including the respective version of the used toolboxes) and data are available on ZENODO (https://doi.org/10.5281/zenodo.3757227). This includes the MATLAB code as well as the the specification of the parameter estimation problems as PEtab files [84] (with the model in SBML format [85] ). . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020. . No competing interests. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 22, 2020 24 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020. . Figure S2 : Parameter correlation for model O1. The scatter plot matrix for the MCMC samples are depicted. The maximum a posterior estimate (MAP) and the profile likelihoods with respect to the parameter in the x-axis and y-axis are indicated. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020. . https://doi.org/10.1101/2020.04. 19.20071597 doi: medRxiv preprint Figure S3 : Parameter correlation for model O2. The scatter plot matrix for the MCMC samples are depicted. The maximum a posterior estimate (MAP) and the profile likelihoods with respect to the parameter in the x-axis and y-axis are indicated. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020. . https://doi.org/10.1101/2020.04. 19.20071597 doi: medRxiv preprint Figure S4 : Parameter correlation for model O3. The scatter plot matrix for the MCMC samples are depicted. The maximum a posterior estimate (MAP) and the profile likelihoods with respect to the parameter in the x-axis and y-axis are indicated. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020. . https://doi.org/10.1101/2020.04. 19.20071597 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020. . https://doi.org/10.1101/2020.04. 19.20071597 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020. . https://doi.org/10.1101/2020.04. 19.20071597 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020. . https://doi.org/10.1101/2020.04. 19.20071597 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020. . https://doi.org/10.1101/2020.04. 19.20071597 doi: medRxiv preprint Forecasting Influenza Epidemics in Hong Kong Challenges in Real-Time Prediction of Infectious Disease: A Case Study of Dengue in Thailand Inference and Forecast of the Current West African Ebola Outbreak in Guinea, Sierra Leone and Liberia Using Phenomenological Models to Characterize Transmissibility and Forecast Patterns and Final Burden of Zika Epidemics Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand Risk Assessment of Novel Coronavirus COVID-19 Outbreaks Outside China Assessing the Use of Influenza Forecasts and Epidemiological Modeling in Public Health Decision Making in the United States A Contribution to the Mathematical Theory of Epidemics The Mathematics of Infectious Diseases Epidemic Models The SIR epidemic model from a PDE point of view Partial Differential Equations of an Epidemic Model with Spatial Diffusion Stochastic Modelling of the Spatial Spread of Influenza in Germany Stochastic Epidemic Modeling An Introduction to Stochastic Processes with Applications to Biology Stochastic epidemic models: A survey Stochastic Models for Epidemics Growing Artificial Societies: Social Science from the Bottom Up Agent-Based Models in Empirical Social Research Bayesian modeling to unmask and predict influenza A/H1N1pdm dynamics in London Statistical Estimation of Parameters in a Disease Transmission Model: Analysis of a Cryptosporidium Outbreak Bayesian parameter inference for dynamic infectious disease modelling: Rotavirus in Germany Contemporary Statistical Inference for Infectious Disease Models Using Stan Bayesian Emulation and Calibration of a Dynamic Epidemic Model for A/H1N1 Influenza Information theory and an extension of the maximum likelihood principle Estimating the dimension of a model Bayes factors Return of the coronavirus: 2019-nCoV World Health Organization. Novel Coronavirus (2019-nCoV) situation report WHO Director-General's opening remarks at the media briefing on COVID-19 Beware of asymptomatic transmission: Study on 2019-nCoV prevention and control measures based on extended SEIR model A mathematical model for simulating the phase-based transmissibility of a novel coronavirus Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2) Breaking down of the healthcare system: Mathematical modelling for controlling the novel coronavirus (2019-nCoV) outbreak in Wuhan Novel coronavirus 2019-nCoV: Early estimation of epidemiological parameters and epidemic predictions Interventions to mitigate early spread of SARS-CoV-2 in Singapore: A modelling study Potential impact of seasonal forcing on a SARS-CoV-2 pandemic Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia Preliminary Estimation of the Basic Reproduction Number of Novel Coronavirus (2019-nCoV) in China, from 2019 to 2020: A Data-Driven Analysis in the Early Phase of the Outbreak An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China Predicting the cumulative number of cases for the COVID-19 epidemic in China from early data. medRxiv A pragmatic model to forecast the COVID-19 epidemic in different countries and allowing for daily updates Dynamic Modeling to Identify Mitigation Strategies for Covid-19 Pandemic. medRxiv Wuhan Municipal Health Commission Estimating the unreported number of novel coronavirus (2019-nCoV) cases in China in the first half of January 2020: a data-driven modelling analysis of the early outbreak Phase-adjusted estimation of the number of coronavirus disease 2019 cases in Wuhan Real-time forecasts of the COVID-19 epidemic in China from Epidemic analysis of COVID-19 in China by dynamical modeling Mathematical Structures of Epidemic Systems Fitting dynamic models to epidemic outbreaks with quantified uncertainty: a primer for parameter uncertainty, identifiability, and forecasts Novel Coronavirus (2019-nCoV) situation report -30 Incubation period of 2019 novel coronavirus (2019-nCoV) infections among travellers from Wuhan, China European Centre for Disease Prevention and Control. Outbreak of novel coronavirus disease 2019 (COVID-19): increased transmission globally -fifth update Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China A first study on the impact of current and future control measures on the spread of COVID-19 in Germany Covid-19 epidemic in Malaysia: Impact of lock-down on infection dynamics Purely data-driven exploration of COVID-19 pandemic after three months of the outbreak. medRxiv A discrete stochastic model of the COVID-19 outbreak: Forecast and control Modeling targeted layered containment of an influenza pandemic in the United States Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19) Estimate of the development of the epidemic reproduction number Rt from Coronavirus SARS-CoV-2 case data and implications for political measures based on prognostics The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study Virological assessment of hospitalized patients with COVID-2019 072: Gemeinsam gegen COVID-19: Münchner Tropeninstitut beginnt Stichprobenanalyse zur Verbreitung der Corona-Pandemie und zur Wirksamkeit von Gegenmaßnahmen Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood Profile likelihood in systems biology Data2Dynamics: a modeling environment tailored to parameter estimation in dynamical systems Fast integration-based prediction bands for ordinary differential equation models Teaching about approximate confidence regions based on maximum likelihood estimation Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems PESTO: Parameter EStimation TOolbox Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments Model selection and multimodel inference: A practical information-theoretic approach Scalable parameter estimation for genome-scale biochemical reaction networks Parameter estimation for dynamical systems with discrete events and logical operations CVODES: The sensitivity-enabled ODE solver in SUN-DIALS PEtab -interoperable specification of parameter estimation problems in systems biology The systems biology markup language (SBML): A medium for representation and exchange of biochemical network models