key: cord-0611895-jomeywqr
authors: Massonis, Gemma; Banga, Julio R.; Villaverde, Alejandro F.
title: Structural Identifiability and Observability of Compartmental Models of the COVID-19 Pandemic
date: 2020-06-25
sha: 486bbfe04ae8cbd95d31eee1bdb1c4b2149ddc64
doc_id: 611895
cord_uid: jomeywqr
The recent coronavirus disease (COVID-19) outbreak has dramatically increased the public awareness and appreciation of the utility of dynamic models. At the same time, the dissemination of contradictory model predictions has highlighted their limitations. If some parameters and/or state variables of a model cannot be determined from output measurements, its ability to yield correct insights -- as well as the possibility of controlling the system -- may be compromised. Epidemic dynamics are commonly analysed using compartmental models, and many variations of such models have been used for analysing and predicting the evolution of the COVID-19 pandemic. In this paper we survey the different models proposed in the literature, assembling a list of 36 model structures and assessing their ability to provide reliable information. We address the problem using the control theoretic concepts of structural identifiability and observability. Since some parameters can vary during the course of an epidemic, we consider both the constant and time-varying parameter assumptions. We analyse the structural identifiability and observability of all of the models, considering all plausible choices of outputs and time-varying parameters, which leads us to analyse 255 different model versions. We classify the models according to their structural identifiability and observability under the different assumptions and discuss the implications of the results. We also illustrate with an example several alternative ways of remedying the lack of observability of a model.
Our analyses provide guidelines for choosing the most informative model for each purpose, taking into account the available knowledge and measurements.
The current coronavirus disease pandemic, caused by the SARS-CoV-2 virus, continues to wreak unparalleled havoc across the world. Public health authorities can use mathematical models to answer critical questions related to the dynamics of an epidemic (severity and time course of infected people), its impact on the healthcare system, and the design and effectiveness of different interventions [1-4]. Mathematical modeling of infectious diseases has a long history [5, 6]. Modeling efforts are particularly important in the context of COVID-19 because its dynamics can be particularly complex and counter-intuitive due to the uncertainty in the transmission mechanisms, possible seasonal variation in both susceptibility and transmission, and their variation within subpopulations [7]. The media has given extensive coverage to analyses and forecasts using COVID-19 models, with increased attention to cases of conflicting conclusions, giving the impression that epidemiological models are unreliable or flawed. However, a closer look reveals that these modeling studies were following different approaches, handling uncertainty differently, and ultimately addressing different questions on different time-scales [8]. Broadly speaking, data-driven models (using statistical regression or machine learning) can be used for short-term forecasts (one or a few weeks). Mechanistic models based on assumptions about transmission and immunity try to mimic how the virus spreads, and can be used to formalize current knowledge and explore long-term outcomes of the pandemic and the effectiveness of different interventions. However, the accuracy of mechanistic models is constrained by the uncertainties in our knowledge, which creates uncertainties in model parameters and even in the model structure [8].
Further, the uncertainty in the COVID-19 data and the exponential spread of the virus amplify the uncertainty in the predictions. Predictability studies [9] seek the characterization of the fundamental limits to outbreak prediction and their impact on decision-making. Despite the vast literature on mathematical epidemiology in general, and modeling of COVID-19 in particular, comparatively few authors have considered the predictability of infectious disease outbreaks [9, 10] . Uncertainty quantification [11] is an interconnected concept that is also key for the reliability of a model, and that has received similarly scant attention [12, 13] . In addition to predictability and uncertainty quantification approaches, identifiability is a related property whose absence can severely limit the usefulness of a mechanistic model [14] . A model is identifiable if we can determine the values of its parameters from knowledge of its inputs and outputs. Likewise, the related control theoretic property of observability describes if we can infer the model states from knowledge of its inputs and outputs. If a model is non-identifiable (or non-observable) different sets of parameters (or states) can produce the same predictions or fit to data. The implications can be enormous: in the context of the COVID-19 outbreak in Wuhan, non-identifiability in model calibrations was identified as the main reason for wide variations in model predictions [15] . Reliable models can be used in combination with optimization and optimal control methods to find the best intervention strategies, such as lock-downs with minimum economic impact [16, 17] . Further, they can be used to explore the feasibility of model-based real-time control of the pandemic [18, 19] . However, using calibrated models with non-identifiability or non-observability issues can result in bad or even dangerous intervention and control strategies. It is common to distinguish between structural and practical identifiability. 
Structural non-identifiability may be due to the model and measurement (input-output) structure. Practical non-identifiability is due to lack of information in the considered data-sets. Non-identifiability results in incorrect parameter estimates and bad uncertainty quantification [14, 20], i.e. a misleading calibrated model which should not be used to analyze epidemiological data, test hypotheses, or design interventions. The structural identifiability of several epidemic mechanistic models has been studied e.g. in [21-26]. Other recent studies have mostly focused on practical identifiability, such as [14, 20, 27-29].
In this paper we assess the structural identifiability and observability of a large set of COVID-19 mechanistic models described by deterministic ordinary differential equations, derived by different authors using the compartmental modeling framework [30]. Compartmental models are widely used in epidemiology because they are tractable and powerful despite their simplicity. We collect 36 different compartmental models, of which we consider several variations, making up a total of 255 different model versions. Our aim is to characterize their ability to provide insights about their unknown parameters (i.e. their structural identifiability) and unmeasured states (i.e. their observability). To this end we adopt a differential geometry approach that considers structural identifiability as a particular case of nonlinear observability, which allows both properties to be analysed jointly. We define the relevant concepts and describe the methods used in Section 2. Then we provide an overview of the different types of compartmental models found in the literature in Section 3. We analyse their structural identifiability and observability and discuss the results in Section 4, where we also show different ways of remedying lack of observability using an illustrative model. Finally, we conclude our study with some key remarks in Section 5.
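To make the notion of structural non-identifiability discussed above concrete, the following minimal Python sketch simulates a basic SIR model twice with different parameter values and compares the outputs. Everything specific in it is an assumption made for illustration and is not taken from any of the surveyed models: mass-action incidence βSI, a hypothetical measurement y = K·I with an unknown scaling constant K, the numerical values, and the function name `simulate_output`. Under these assumptions, scaling S and I by a factor c while dividing β and K by c leaves the output unchanged, so neither β nor K can be recovered from y alone.

```python
import numpy as np

def simulate_output(S0, I0, beta, gamma, K, t_end=30.0, dt=0.01):
    """Integrate S' = -beta*S*I, I' = beta*S*I - gamma*I with a fixed-step
    RK4 scheme and return the sampled output y = K*I (assumed measurement)."""
    def rhs(x):
        S, I = x
        return np.array([-beta * S * I, beta * S * I - gamma * I])

    x = np.array([S0, I0], dtype=float)
    steps = int(round(t_end / dt))
    y = np.empty(steps + 1)
    y[0] = K * x[1]
    for n in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x = x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        y[n + 1] = K * x[1]
    return y

# Two different "worlds": the second is the first one transformed by the
# scaling (S, I, beta, K) -> (c*S, c*I, beta/c, K/c); gamma is untouched.
c = 4.0
y_a = simulate_output(S0=0.99, I0=0.01, beta=0.5, gamma=0.1, K=0.3)
y_b = simulate_output(S0=c * 0.99, I0=c * 0.01, beta=0.5 / c, gamma=0.1, K=0.3 / c)

# Largest difference between the two output trajectories.
max_gap = float(np.max(np.abs(y_a - y_b)))
print("largest output difference:", max_gap)
```

The two parameter sets differ by a factor of four in β and K, yet the printed difference is at the level of floating-point round-off. No amount of data of this type could tell the two apart, which is what distinguishes a structural problem from a practical one.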
We consider models defined by systems of ordinary differential equations with the following notation:
$$M: \quad \dot{x}(t) = f\left(x(t), \theta, u(t), w(t)\right) \tag{1}$$
$$y(t) = h\left(x(t), \theta, u(t), w(t)\right) \tag{2}$$
where $f$ and $h$ are analytical (generally nonlinear) functions of the states $x(t) \in \mathbb{R}^{n_x}$, known inputs $u(t) \in \mathbb{R}^{n_u}$, unknown constant parameters $\theta \in \mathbb{R}^{n_\theta}$, and unknown inputs or time-varying parameters $w(t) \in \mathbb{R}^{n_w}$. The output $y(t) \in \mathbb{R}^{n_y}$ represents the measurable functions of model variables. The expressions (1-2) are sufficiently general to represent a wide range of model structures, of which compartmental models are a particular case.
Definition 1 (Structurally locally identifiable [31]). A parameter $\theta_i$ of model $M$ is structurally locally identifiable (s.l.i.) if for almost any parameter vector $\theta^* \in \mathbb{R}^{n_\theta}$ there is a neighbourhood $N(\theta^*)$ in which the following relationship holds:
$$\hat{\theta} \in N(\theta^*) \ \text{and} \ y(t, \hat{\theta}) = y(t, \theta^*) \ \Rightarrow \ \hat{\theta}_i = \theta_i^*$$
Otherwise, $\theta_i$ is structurally unidentifiable (s.u.). If all model parameters are s.l.i. the model is s.l.i. If there is at least one s.u. parameter, the model is s.u.
Likewise, a state $x_i(\tau)$ is observable if it can be distinguished from any other states in a neighbourhood from observations of the model output $y(t)$ and input $u(t)$ in the interval $t_0 \le \tau \le t \le t_f$, for a finite $t_f$. Otherwise, $x_i(\tau)$ is unobservable. A model is called observable if all its states are observable. We also say that $M$ is invertible if it is possible to infer its unknown inputs $w(t)$, and we say that $w(t)$ is reconstructible in this case.
Structural identifiability can be seen as a particular case of observability [32-34], by augmenting the state vector with the unknown parameters $\theta$, which are now considered as state variables with zero dynamics, $\tilde{x} = (x^T, \theta^T)^T$. The reconstructibility of unknown inputs $w(t)$, which is also known as input observability, can also be cast in a similar way, although in this case their derivatives may be nonzero.
To this end, let us augment the state vector further with $w$ as additional states, as well as their derivatives up to some non-negative integer $l$:
$$\tilde{x} = \left(x^T, \theta^T, w^T, \dot{w}^T, \ldots, w^{(l)T}\right)^T \tag{4}$$
The $l$-augmented dynamics is:
$$\tilde{f}(\tilde{x}, u) = \left(f(x, \theta, u, w)^T, 0_{1 \times n_\theta}, \dot{w}^T, \ddot{w}^T, \ldots, w^{(l+1)T}\right)^T \tag{5}$$
leading to the $l$-augmented system:
$$\dot{\tilde{x}}(t) = \tilde{f}\left(\tilde{x}(t), u(t)\right), \qquad y(t) = h\left(\tilde{x}(t), u(t)\right)$$
Remark 1 (Unknown inputs, disturbances, or time-varying parameters). In Section 4, when reporting the results of the structural identifiability and observability analyses, we will explicitly consider some parameters as time-varying. In the model structure defined in equations (1-2) the unknown parameter vector $\theta$ is assumed to be constant. To consider an unknown parameter as time-varying we include it in the "unknown input" vector $w(t)$. Thus, changing the consideration of a parameter from constant to time-varying entails removing it from $\theta$ and including it in $w(t)$. The elements of $w(t)$ can be interpreted as unmeasured disturbances or inputs of unknown magnitude or, equivalently, as time-varying parameters. Regardless of the interpretation, they are assumed to change smoothly, i.e. they are infinitely differentiable functions of time. For the analysis of some models it is necessary, or at least convenient, to introduce the mild assumption that the derivatives of $w(t)$ vanish for a certain non-negative integer $s$ (possibly $s = +\infty$), i.e. $w^{(s)}(t) \neq 0$ and $w^{(i)}(t) = 0$ for all $i > s$. This assumption is equivalent to assuming that the disturbances are polynomial functions of time, with maximum degree equal to $s$ [35].
Definition 2 (Full Input-State-Parameter Observability, FISPO [35]). Let us consider a model $M$ given by (1-2). We augment its state vector as $z(t) = \left(x(t)^T, \theta^T, w(t)^T\right)^T$ (4), which leads to its augmented form (5). We say that $M$ has the FISPO property if, for every $t_0 \in I$, every model unknown $z_i(t_0)$ can be inferred from $y(t)$ and $u(t)$ in a finite time interval $[t_0, t_f] \subset I$.
Thus, $M$ is FISPO if, for every $z(t_0)$ and for almost any vector $z^*(t_0)$, there is a neighbourhood $N(z^*(t_0))$ such that, for all $\hat{z}(t_0) \in N(z^*(t_0))$, the following property is fulfilled:
$$y\left(t, \hat{z}(t_0)\right) = y\left(t, z^*(t_0)\right) \ \Rightarrow \ \hat{z}_i(t_0) = z_i^*(t_0), \qquad 1 \le i \le n_x + n_\theta + n_w$$
In this paper we analyse input, state, and parameter observability (that is, the FISPO property defined above) using a differential geometry framework. Such analyses are structural and local. By structural we refer to properties that are entirely determined by the model equations; thus we do not consider possible deficiencies due to insufficient or noise-corrupted data. By local we refer to the ability to distinguish between neighbouring states (similarly, parameters or unmeasured inputs), even though they may not be distinguishable from other distant states. This is usually sufficient, since in most (although not all, see e.g. [36]) applications local observability entails global observability. This specific type of observability has sometimes been called local weak observability [37].
This approach assesses structural identifiability and observability by calculating the rank of a matrix that is constructed with Lie derivatives. The corresponding definitions are as follows (in the remainder of this section we omit the dependency on time to simplify the notation):
Definition 3 (Extended Lie derivative [38]). Consider the system $M$ (1-2) with augmented state vector (4) and augmented dynamics (5). Assuming that the inputs $u$ are analytical functions, the extended Lie derivative of the output along $\tilde{f} = \tilde{f}(\cdot, u)$ is:
$$L_{\tilde{f}} h(\tilde{x}, u) = \frac{\partial h(\tilde{x}, u)}{\partial \tilde{x}} \tilde{f}(\tilde{x}, u) + \sum_{j=0}^{\infty} \frac{\partial h(\tilde{x}, u)}{\partial u^{(j)}} u^{(j+1)}$$
The zero-order derivative is $L^0_{\tilde{f}} h = h$, and the $i$-order extended Lie derivatives can be recursively calculated as:
$$L^i_{\tilde{f}} h(\tilde{x}, u) = \frac{\partial L^{i-1}_{\tilde{f}} h(\tilde{x}, u)}{\partial \tilde{x}} \tilde{f}(\tilde{x}, u) + \sum_{j=0}^{\infty} \frac{\partial L^{i-1}_{\tilde{f}} h(\tilde{x}, u)}{\partial u^{(j)}} u^{(j+1)}$$
Definition 4 (Observability-identifiability matrix [35]).
The observability-identifiability matrix of the system $M$ (1-2) with augmented state vector (4), augmented dynamics (5), and analytical inputs $u$ is the following $m n_{\tilde{x}} \times n_{\tilde{x}}$ matrix, where $m = n_y$ is the number of outputs:
$$O_I(\tilde{x}, u) = \begin{pmatrix} \frac{\partial}{\partial \tilde{x}} h(\tilde{x}, u) \\ \frac{\partial}{\partial \tilde{x}} L_{\tilde{f}} h(\tilde{x}, u) \\ \frac{\partial}{\partial \tilde{x}} L^2_{\tilde{f}} h(\tilde{x}, u) \\ \vdots \\ \frac{\partial}{\partial \tilde{x}} L^{n_{\tilde{x}}-1}_{\tilde{f}} h(\tilde{x}, u) \end{pmatrix}$$
The FISPO property of $M$ can be analysed by calculating the rank of the observability-identifiability matrix:
Theorem 1 (Observability-identifiability condition, OIC [38]). If the observability-identifiability matrix of a model $M$ satisfies $\operatorname{rank}\left(O_I(\tilde{x}_0, u)\right) = n_{\tilde{x}} = n_x + n_\theta + n_w$, with $\tilde{x}_0$ being a (possibly generic) point in the augmented state space, then the system is structurally locally observable and structurally locally identifiable.
In this paper we generally check the OIC of Theorem 1 using STRIKE-GOLDD, an open source MATLAB toolbox [39]. Alternatively, for some models we use the Maple code ObservabilityTest, which implements a procedure that avoids the symbolic calculation of the Lie derivatives and is hence computationally efficient [33]. A number of other software tools are available, including GenSSI2 [40] in MATLAB, IdentifiabilityAnalysis in Mathematica [38], DAISY in REDUCE [41], SIAN in Maple [42], and the web app COMBOS [43]. It should be taken into account that in the present work we are interested in assessing structural identifiability and observability both with constant and continuous time-varying model parameters (or equivalently, with unknown inputs), as explained in Remark 1. Ideally, the method of choice should provide a convenient way of analysing models with this type of parameters (inputs). It is always possible to perform this type of analysis by assuming that the time dependency of the parameters is of a particular form, e.g. a polynomial function of a certain maximum degree.
In this article we review compartmental models, which are one of the most widely used families of models in epidemiology.
They divide the population into homogeneous compartments, each of which corresponds to a state variable that quantifies the number of individuals that are at a certain disease stage. The dynamics of these compartments are governed by ordinary differential equations, usually with unknown parameters that describe the rates at which individuals move among different stages of disease. The basic compartmental model used for describing a transmissible disease is the SIR model, in which the population is divided into three classes:
• Susceptible: individuals who have no immunity and may become infected if exposed.
• Infected and infectious: an exposed individual becomes infected after contracting the disease. Since an infected individual has the ability to transmit the disease, he/she is also infectious.
• Recovered: individuals who are immune to the disease and do not affect its transmission.
Another class of models, called SEIR, includes an additional compartment to account for the existence of a latent period after the transmission:
• Exposed: individuals who have contracted the disease but are not yet infectious.
These idealized models differ from reality. Contact tracing, screening, or changes in habits are some of the aspects that are not considered in basic SIR or SEIR models, but are important for evaluating the effects of an intervention. Furthermore, it is not only important to enrich the information about the behaviour of the population; the characteristics of the disease must also be taken into account. These additional details can be incorporated into the model as new parameters, functions, or extra compartments. Compartments such as asymptomatic, quarantined, isolated, and hospitalized have been widely used in COVID-19 models. From 29 articles, most of which are very recent [10, 15, , we have collected 36 models. Depending on whether they have an exposed compartment or not, they can be broadly classified as belonging to the SIR or SEIR families.
However, most of these models include additional compartments.
Susceptible individuals become infected with an incidence of:
$$\beta S(t) I(t)$$
where β = pc is the transmission rate, c is the contact rate, and p is the probability that a contact with a susceptible individual results in a transmission [6]. Individuals who recover leave the infectious class at rate γ, where 1/γ is the average infectious period. The set of differential equations describing the basic SIR model is given by:
$$\dot{S}(t) = -\beta S(t) I(t), \qquad \dot{I}(t) = \beta S(t) I(t) - \gamma I(t), \qquad \dot{R}(t) = \gamma I(t)$$
As mentioned above, compartmental models can be extended to consider further details. We have found models that incorporate the following features: asymptomatic individuals, births and deaths, delay-time, lock-down, quarantine, isolation, social distancing, and screening. Figure 1 shows a classification of the SIR models reviewed in this article, and Table 1 lists them along with their equations. Multiple output choices have been considered in the study of the structural identifiability and observability of some models. In such cases the observations are listed in the Output column.
Figure 1: Classification of SIR models. Each block represents a model structure. The basic, three-compartment SIR model structure is on top of the tree. Every additional block is labeled with the additional feature that it contains with respect to its parent block. The darkness of the shade indicates the number of additional features with respect to the basic SIR model.
[Table 1: SIR models reviewed, with columns Parameters, Output, ICS, Input, and Equations.]
Individuals in the SEIR model are divided into four compartments: Susceptible (S), Exposed (E), Infected (I) and Recovered (R). Compared to the SIR models, the additional compartment E allows for a more accurate description of diseases in which the incubation period and the latent period (i.e. the time that elapses before an infected individual becomes infectious) do not coincide. This is why SEIR models are in principle best suited to epidemics with a long incubation period such as COVID-19 [50].
Susceptible individuals move to the exposed class at a rate βI(t), where β is the transmission rate parameter. Exposed individuals become infected at rate κ, where 1/κ is the average latent period. Infected individuals recover at rate γ, where 1/γ is the average infectious period. Thus, the set of differential equations describing the basic SEIR model is:
$$\dot{S}(t) = -\beta S(t) I(t), \qquad \dot{E}(t) = \beta S(t) I(t) - \kappa E(t), \qquad \dot{I}(t) = \kappa E(t) - \gamma I(t), \qquad \dot{R}(t) = \gamma I(t)$$
Existing extensions of SEIR models may incorporate some of the following features: asymptomatic individuals, births and deaths, hospitalization, quarantine, isolation, social distancing, screening and lock-down. Figure 2 shows a classification of the models found in the literature; Table 2 lists them along with their equations.
[Table 2: SEIR models reviewed. Only a truncated fragment of one row survives: model from [51], states S, L, E, I, Q, R.]
We analysed the structural identifiability and observability of the 17 SIR model structures (a total of 98 model versions considering the different output configurations and time-varying parameter assumptions) and 19 SEIR models (with a total of 157 model versions) listed in Tables 1 and 2. The detailed results for each model are given in Appendix A, which reports the structural identifiability of each parameter and the observability of each state, for every model version. In the remainder of this section we provide an overview of the main results.
The general patterns regarding state observability are as follows. The recovered state (R) is almost never observable unless it is directly measured (D.M.) as output; the only exceptions are two SEIR models, 31 and 38, for which R is observable under the assumption of time-varying parameters. The susceptible state (S), in contrast, is observable in roughly two thirds of the models (SIR: 65/98, SEIR: 103/157); this is also true for the exposed state (E) in the SEIR models. The infected state (I) is included in most studies among the outputs, either directly (D.M.) or indirectly measured (as part of a parameterized measurement function).
When it is not considered in this way, its observability is generally similar to that of S (in 18/157 model versions I is not an output and it is observable in 13/18). The transmission and recovery rates (β, γ) are the two parameters common to all SIR models. The transmission rate is identifiable in 59/98 model versions, and γ in 51/98 and its derivatives in 12/98. SEIR models have a third parameter in common, the latent period (κ). It is identifiable in most of the models (145/157), as well as the recovery rate (111/157). The transmission rate is identifiable in 101/157 model versions, but it is not identifiable in any SEIR model version that accounts for social distancing (numbers 34 and 61); we found no clear pattern in the other models. The transmission rate β, the recovery rate γ, and in SEIR models the latent period κ, can vary during an epidemic as a result of changes in the population's behaviour [57, 70] , the introduction of new drugs or new medical equipment [57] , or the reduction of the period duration as a result of high temperatures [71] . To account for such variations, the present study has considered both the constant and the time-varying cases, by including the corresponding variables either in the constant parameter vector θ or in the unknown input vector w(t), respectively, as described in Remark 1. Changing a parameter from constant to time-varying naturally influences structural identifiability and observability. This effect is graphically summarized in Figures 3-7 , which represent classes of models in tree form and classify them according to their observability. Each model is shaded with a color, according to the observability of the parameter studied. Some models include different rates for different population groups: for example, they may consider two different transmission rates for symptomatic and asymptomatic individuals. 
For those models, each rate may have different observability properties when considered as a time-varying parameter; in such cases the model is depicted between two color blocks (see for example the SIR 20 model in Figure 3). Changing β from a constant to a time-varying parameter (or equivalently an unknown input) changes neither its observability nor that of the other variables in SIR models. This is not the case with the recovery rate γ, for which a somewhat counter-intuitive result may be obtained: by changing γ from a constant to a continuous function of time with at least one non-zero derivative, the model can become more observable and identifiable, despite the fact that γ is then an unknown function. An example of this is the SIR model 15: if γ is constant the model has only one identifiable parameter, τ, and no observable states; if γ is time-varying with at least one non-zero derivative, two parameters become identifiable (β, µ), two states become observable (I, S), and γ itself is observable. In the other models, when γ is neither identifiable as a constant nor observable as an unknown input, its successive derivatives are observable. For the SEIR models, the consideration of the β parameter as an unknown input function follows a similar trend to that of the SIR models, with the exception of model 38, which gains both observability and identifiability and becomes FISPO. Considering the recovery rate γ (Fig. 7) or the latent period κ (Fig. 6) individually as time-varying parameters generally leads to greater observability, except for model 31 (1). As an example, in model 39(2) one of the unknown inputs becomes observable, three states become observable (S, E, I), and three parameters become identifiable (γ, µ i , β); another example is model 16(2), in which both its input and three states (S, E, I) become observable and two parameters (µ, β) become identifiable.
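The recasting of a parameter as an unknown input, described in Remark 1 and used throughout the results above, can be sketched with a small symbolic computation. The sketch below is not the STRIKE-GOLDD implementation and does not reproduce any of the surveyed models; it assumes a basic SIR model with mass-action incidence, output y = I, constant β, and a time-varying γ(t) with exactly one non-zero derivative (so γ and its first derivative `gamma_d1` are appended to the state vector). The recovered compartment is left out because it does not influence the output. The helper `oi_matrix` is a hypothetical name, and since there are no known inputs the extended Lie derivative reduces to the ordinary one.

```python
import sympy as sp

def oi_matrix(f, h, z):
    """Stack the gradients (w.r.t. z) of the Lie derivatives
    L^0 h, ..., L^(n-1) h of the output h along the vector field f."""
    n = len(z)
    lie = sp.Matrix(h)                       # L^0 h = h
    blocks = []
    for _ in range(n):
        grad = lie.jacobian(z)               # d(L^k h)/dz
        blocks.append(grad)
        lie = (grad * f).applyfunc(sp.expand)  # L^(k+1) h
    return sp.Matrix.vstack(*blocks)

S, I, beta, gamma, gamma_d1 = sp.symbols("S I beta gamma gamma_d1")

# Augmented state: model states, constant parameter, unknown input and
# its first derivative (assumption: the second derivative of gamma is zero).
z = sp.Matrix([S, I, beta, gamma, gamma_d1])
f = sp.Matrix([
    -beta * S * I,              # dS/dt
    beta * S * I - gamma * I,   # dI/dt
    0,                          # beta is constant
    gamma_d1,                   # d(gamma)/dt
    0,                          # d(gamma_d1)/dt = 0 by assumption
])
h = sp.Matrix([I])              # measured output y = I

O = oi_matrix(f, h, z)

# Rank at one specific point of the augmented state space.
point = {S: 2, I: 1, beta: 1, gamma: 1, gamma_d1: 1}
rank_at_point = O.subs(point).rank()
print(O.shape, rank_at_point)
```

A rank equal to the number of augmented states (five here) at the chosen point indicates that, under the stated assumptions, this small example is FISPO: S, β, γ(t) and its derivative can all be inferred from I. Evaluating the rank at a numeric point keeps the computation cheap; a rank deficiency found this way would need to be confirmed at other points or symbolically.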
Besides the transmission rate, latent period, and recovery rate, other rates (screening, disease-related deaths, and isolation) have also been considered as time-varying parameters in some studies. The observability of most models is not modified if these parameters are allowed to change in time; the exceptions are 8 models which gain observability. An example is the SEIR model 41(1), which has seven parameters, seven states, and one output. Assuming constant parameters, five of them are structurally identifiable (κ, α, β, γ 1 , γ 2 ) and two are unidentifiable (q, ρ), while there are three observable states (I, J, C) and four unobservable states (S, E, A, R) [28]. However, when the parameter ρ (which describes the proportion of exposed/latent individuals who become clinically infectious) is considered time-varying, all parameters become identifiable (including ρ) and six states become observable (all except R, which is never observable unless it can be directly measured, as we have already mentioned).
The fact that allowing an unknown quantity to change in time can improve its observability, and also the observability of other variables in a model, may seem paradoxical. An intuitive explanation can be obtained from the study of the symmetry in the model structure. The existence of Lie symmetries amounts to the possibility of transforming parameters and state variables while leaving the output unchanged, i.e. their existence amounts to lack of structural identifiability and/or observability [72]. The STRIKE-GOLDD toolbox used in this paper includes procedures for finding Lie symmetries [73]. Let us use the SIR 15 model as an example. This model has five parameters (τ, β, ρ, µ, d), of which only τ is identifiable if assumed constant. The model contains the following symmetry: where is the parameter of the Lie group of transformations.
Thus, there is a symmetry between ρ and µ that makes them unidentifiable: changes in one parameter can be compensated by changes in the other one. However, if ρ is time-varying and µ is constant, the latter cannot compensate the changes of the former, and the symmetry is broken. Indeed, if ρ is considered time-varying the model gains identifiability (not only µ, but also τ and β become identifiable) and observability (S, I and ρ become observable). Let us now illustrate how the results of this study may be applied in a realistic scenario. We use as an example the model SIR 26, which has 6 states (S, I, R, A, Q, J) and 16 parameters (d 1 , d 2 , d 3 , d 4 , d 5 , d 6 , k 1 , k 2 , λ, γ 1 , γ 2 , a , q , j , µ 1 , µ 2 ); its equations are shown in Table 1 . This model includes the following additional features with respect to the basic SIR model: birth/death, asymptomatic individuals (A), quarantine (Q), and isolation (J). In its original publication two states were measured (Q, J). With these two states as outputs the model has five identifiable parameters (d 1 , d 5 , q , k 2 , µ 1 ) and two observable states (A, I); thus, there are two unobservable states (S, R) and ten unidentifiable parameters. If we are interested in estimating e.g. the number of susceptible individuals (S), this model would not be appropriate. How should we proceed in that scenario? One way of improving observability could be by including more outputs (option 1). For example, since there is a separate class for asymptomatic individuals (A), the infected compartment (I) considers only individuals with symptoms, and we could assume that they can be detected. By including 'I' in the output set, the structural identifiability and observability of the model improves: six more parameters are identifiable (λ, a , j , d 4 , k 1 , µ 2 ) and the state in which we are interested (S) becomes observable. However, including more outputs is not always realistic. 
Another possibility would then be to reduce the complexity of the model by decreasing the number of additional features (option 2). For example, leaving out the asymptomatic compartment leads to the following model: The output of the model is the same, Q, J. In this case, the model has eight identifiable parameters (λ, q , j , d 1 , d 5 , µ 1 , µ 2 , k 2 ) and two observable states (S, I). A third possibility is to simplify the parametrization of the model (option 3). This model considers a different death rate for every compartment (d i , i = 1, ..., 6). With some loss of generality, we could consider a specific death rate for infected individuals, d I = d 2 , and a general death rate d for all non-infected and asymptomatic individuals. This reduction of the number of parameters improves the observability of the model: the only unidentifiable parameters are d 2 , γ 1 , and k 1 , and the only non-observable state is R. Thus, this option also makes S observable.
Our analyses have shown that a fraction of the models found in the literature have unidentifiable parameters. Key parameters such as the transmission rate (β), the recovery rate (γ), and the latent period (κ) are structurally identifiable in most, but not all, models. The transmission and recovery rates are identifiable in roughly two thirds of the models, and the latent period in almost all (> 90%) of them. Likewise, the states corresponding to the number of susceptible (S) and exposed (E) individuals are non-observable in roughly one third of the model versions analysed in this paper. The number of infected individuals (I) can usually be directly measured, but it is non-observable in one third of the model versions in which it is not measured. The situation is worse for the number of recovered individuals (R), which is almost never observable unless it is directly measured. Many models include other states in addition to S, E, I, and R, which are not always observable either.
The transmission rate and other parameters may vary during the course of an epidemic, as a result of a number of factors such as changes in public policy, population behaviour, or environmental conditions. To account for these variations, in the present study we have considered both the constant and the time-varying parameter case. Somewhat unexpectedly, we found that allowing for variability in an unknown parameter often improves the observability and/or identifiability of the model. This phenomenon might be explained by the contribution of this variability to the removal of symmetries in the model structure. Structural identifiability and observability depend on which states or functions are measured. The lack of these properties may in principle be surmounted by choosing the right set of outputs [74] , but the required measurements are not always possible to perform in practice. Epidemiological models are a clear example of this; limitations such as lack of testing or the existence of asymptomatic individuals usually make it impossible to have measurements of all states. An alternative to measuring more states is to use a model with fewer compartments and/or a simpler parameterization, thus decreasing the number of states and/or parameters. Reducing the model dimension in this way may achieve observability and identifiability. Even when it is not possible (or practical) to avoid non-observability or non-identifiability by any means, the model may still be useful, as long as it is only used to infer its observable states or identifiable parameters. For example, we may be interested in determining the transmission rate β but not the number of recovered individuals R; in such case it is fine to use a model in which β is identifiable even if R is not observable. 
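The situation just described, in which β is identifiable although R is not observable, can be reproduced on the smallest possible example. The following sketch is an illustration under stated assumptions rather than a reproduction of the analyses reported in this paper: it takes a basic SIR model with mass-action incidence and a single output y = I, builds the observability-identifiability matrix with SymPy, and then applies a column-removal test (a variable is flagged when deleting its column leaves the rank unchanged). The helper name `oi_matrix` and the numerical evaluation point are hypothetical choices.

```python
import sympy as sp

def oi_matrix(f, h, z):
    """Stack the gradients (w.r.t. z) of the Lie derivatives
    L^0 h, ..., L^(n-1) h of the output h along the vector field f."""
    n = len(z)
    lie = sp.Matrix(h)
    blocks = []
    for _ in range(n):
        grad = lie.jacobian(z)
        blocks.append(grad)
        lie = (grad * f).applyfunc(sp.expand)
    return sp.Matrix.vstack(*blocks)

S, I, R, beta, gamma = sp.symbols("S I R beta gamma")

# States augmented with the constant parameters (zero dynamics).
z = sp.Matrix([S, I, R, beta, gamma])
f = sp.Matrix([
    -beta * S * I,              # dS/dt
    beta * S * I - gamma * I,   # dI/dt
    gamma * I,                  # dR/dt
    0,                          # beta constant
    0,                          # gamma constant
])
h = sp.Matrix([I])              # assumed measurement: infected individuals

# Evaluate at an arbitrary point with positive rational coordinates.
point = {S: sp.Rational(7, 10), I: sp.Rational(1, 5), R: sp.Rational(1, 10),
         beta: sp.Rational(3, 2), gamma: sp.Rational(1, 3)}
O_num = oi_matrix(f, h, z).subs(point)
rank_full = O_num.rank()

# Column-removal test: if dropping a column does not lower the rank,
# the corresponding state or parameter cannot be inferred from the output.
all_rows = list(range(O_num.rows))
not_inferable = []
for j, var in enumerate(z):
    kept_cols = [k for k in range(O_num.cols) if k != j]
    if O_num.extract(all_rows, kept_cols).rank() == rank_full:
        not_inferable.append(str(var))

print("rank:", rank_full, "of", len(z))
print("not observable / identifiable:", not_inferable)
```

In this example the rank is one short of the number of unknowns and only R is flagged, so the model remains usable for estimating β, γ and S from measurements of I, in line with the argument above.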
Of course, this means that, to ensure that a model is properly used, it is necessary to characterize its identifiability and observability in detail, to know whether the quantity of interest is observable/identifiable. The contribution of this work has been to provide such a detailed analysis of the structural identifiability and observability of a large set of compartmental models of COVID-19 presented in the recent literature. The results of our analyses can be used to avoid the pitfalls caused by non-identifiability and non-observability. By classifying the existing models according to these properties, and arranging them in a structured way as a function of the compartments that they include, our study has answered the following question: given the sets of existing models and available measurements, which model is appropriate for inferring the value of some particular parameters, and/or for predicting the time course of the states of interest?

The tables included in the following pages report the results of the observability and structural identifiability analyses of all the model variants considered in this paper. Each block of rows represents one of the following assumptions:

• All parameters considered constant (i.e. as is usually the case in the original publications).
• Transmission rate β considered time-varying.
• Latent period κ considered time-varying (only in SEIR models; SIR models do not have this parameter).
• Recovery rate γ considered time-varying.
• All parameters considered time-varying.

Within each block, each row provides detailed information about identifiable and non-identifiable parameters, observable and non-observable states, directly measured (D.M.) states, observable and unobservable unknown inputs (and time-varying parameters), known inputs, and the number of derivatives of the unknown inputs (and time-varying parameters) assumed to be non-zero (nnDerW). The suffix d followed by a number n denotes the n-th derivative of an unknown function (e.g.
β_d1 is the first derivative of the time-varying parameter β). The blank blocks in the tables of SEIR models 38 and 8 indicate that the corresponding time-varying case is already considered in the original formulation of the model. The SIR models 29 and 30 have only been studied in their original form, i.e. without considering time-varying parameters, because these models do not contain the common parameters of the SIR models; instead they use the R_0 constant.

[Table fragment; column alignment lost in extraction. Models: 13, 20. Outputs: h=I; h=KI; h=I, R, Q; h=Q; h=X; h=D, R, T. Identifiable: β, γ; γ; k0; β, δ, η, ξ, , ρ. Non-identifiable: K, β; β, γ, δ; k, β, α; α, γ, ε, C, λ, σ, κ, θ, μ, τ.]

Opinion: Mathematical models: A key tool for outbreak response
An introduction to mathematical modeling of infectious diseases
How simulation modelling can help reduce the impact of COVID-19
Special report: The simulations driving the world's response to COVID-19
Mathematical Epidemiology
An introduction to mathematical epidemiology
Modeling infectious disease dynamics
Wrong but useful: what COVID-19 epidemiologic models can and cannot tell us
On the predictability of infectious disease outbreaks
Predictability: Can the turning point and end of an expanding epidemic be precisely forecast?
Sensitivity analysis for uncertainty quantification in mathematical models
Asymptotic estimates of SARS-CoV-2 infection counts and their sensitivity to stochastic perturbation
COVID-19 outbreak in Wuhan demonstrates the limitations of publicly available case numbers for epidemiological modelling
Fitting dynamic models to epidemic outbreaks with quantified uncertainty: a primer for parameter uncertainty, identifiability, and forecasts
Why is it difficult to accurately predict the COVID-19 epidemic?
A simple planning problem for COVID-19 lockdown
A multi-risk SIR model with optimally targeted lockdown
Can the COVID-19 epidemic be controlled on the basis of daily test reports?
Practical unidentifiability of a simple vector-borne disease model: Implications for parameter estimation and intervention assessment
The structural identifiability of a general epidemic (SIR) model with seasonal forcing
The structural identifiability of the susceptible infected recovered model with seasonal forcing
The structural identifiability of susceptible-infective-recovered type epidemic models with incomplete immunity and birth targeted vaccination
Identifiability and estimation of multiple transmission pathways in cholera and waterborne disease
Integrating measures of viral prevalence and seroprevalence: a mechanistic modelling approach to explaining cohort patterns of human papillomavirus in women in the USA
Population modeling of early COVID-19 epidemic dynamics in French regions and estimation of the lockdown impact on infection rate
Structural and practical identifiability analysis of outbreak models
Assessing parameter identifiability in compartmental dynamic models using a computational approach: application to infectious disease transmission models
Influencing public health policy with data-informed mathematical models of infectious diseases: Recent developments and new challenges
Compartmental Models in Epidemiology
Dynamic systems biology modeling and simulation
New results for identifiability of nonlinear systems
A probabilistic algorithm to test local algebraic observability in polynomial time
Observability and structural identifiability of nonlinear biological systems
Full observability and estimation of unknown inputs, states, and parameters of nonlinear biological models
Local identifiability analysis of nonlinear ODE models: how to determine all candidate solutions
Nonlinear controllability and observability
An efficient method for structural identifiability analysis of large dynamic systems
Structural identifiability of dynamic systems biology models
GenSSI 2.0: multi-experiment structural identifiability analysis of SBML models
A new version of DAISY to test structural identifiability of biological models
SIAN: software for structural identifiability analysis of ODE models
On finding and using identifiable parameter combinations in nonlinear dynamic systems biology models and COMBOS: A novel web implementation
Total variation regularization for compartmental epidemic models with time-varying dynamics
Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China
Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy
A simple SIR model with a large set of asymptomatic infectives
Fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the SARS-CoV-2 epidemic
COVID-19 outbreak in Wuhan demonstrates the limitations of publicly available case numbers for epidemiological modelling
A feedback SIR (fSIR) model highlights advantages and limitations of infection-based social distancing
Construction of compartmental models for COVID-19 with quarantine, lockdown and vaccine interventions
Models of SEIRS epidemic dynamics with extensions, including network-structured populations, testing, contact tracing, and social distancing
A modified SEIR model to predict the COVID-19 outbreak in Spain and Italy: simulating control scenarios and multi-scale epidemics
Social distancing to slow the coronavirus
SEIAR model with asymptomatic cohort and consequences to efficiency of quarantine government measures in COVID-19 epidemic
Research about the optimal strategies for prevention and control of varicella outbreak in a school in a central city of China: based on an SEIR dynamic model
Epidemic analysis of COVID-19 in China by dynamical modeling
Mathematical modeling of epidemic diseases
To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic
SEIR transmission dynamics model of 2019 nCoV coronavirus with considering the weak infectious ability and changes in latency duration
Healthcare impact of COVID-19 epidemic in India: A stochastic mathematical model
Modeling the control of COVID-19: impact of policy interventions and meteorological factors
Modelling the transmission dynamics of COVID-19 in six high burden countries
Mathematical model of transmission dynamics with mitigation and health measures for SARS-CoV-2 infection in European countries
A novel COVID-19 epidemiological model with explicit susceptible and asymptomatic isolation compartments reveals unexpected consequences of timing social distancing
A mathematical model of epidemics with screening and variable infectivity
Dynamic models for the analysis of epidemic spreads
Effects of quarantine in six endemic models for infectious diseases
Introduction to SEIR models
A time-dependent SIR model for COVID-19 with undetectable infected persons
A periodic SEIRS epidemic model with a time-dependent latent period
Structural identifiability analysis via symmetries of differential equations
Finding and breaking Lie symmetries: Implications for structural identifiability and observability in biological modelling
Minimal output sets for identifiability