key: cord-0514884-9wd5x7r1
authors: Sameni, Reza
title: Mathematical Modeling of Epidemic Diseases; A Case Study of the COVID-19 Coronavirus
date: 2020-03-25
journal: nan
DOI: nan
sha: 88c1931de9f586dcd9fbe6f34feb39047ead67fc
doc_id: 514884
cord_uid: 9wd5x7r1

The outbreak of the Coronavirus COVID-19 has taken the lives of several thousands worldwide and locked-out many countries and regions, with yet unpredictable global consequences. In this research we study the epidemic patterns of this virus, from a mathematical modeling perspective. The study is based on endemic extensions of the well-known susceptible-infected-recovered (SIR) family of compartmental models. It is shown how social measures such as distancing, regional lock-downs, quarantine and global public health vigilance, influence the model parameters, which can eventually change the mortality rates and active contaminated cases over time, in the real world. As with all mathematical models, the predictive ability of the model is limited by the accuracy of the available data and to the so-called level of abstraction used for modeling the problem. In order to provide the broader audience of researchers a better understanding of spreading patterns of epidemic diseases, a short introduction on biological systems modeling is also presented and the Matlab source codes for the simulations are provided online.

Since the outbreak of the Coronavirus COVID-19 in January 2020, the virus has affected most countries and taken the lives of several thousands of people worldwide. By March 2020, the World Health Organization (WHO) declared the situation a pandemic, the first of its kind in our generation. To date, many countries and regions have been locked-down and applied strict social distancing measures to stop the virus propagation. From a strategic and healthcare management perspective, the propagation pattern of the disease and the prediction of its spread over time is of great importance, to save lives and to minimize the social and economic consequences of the disease. Within the scientific community, the problem of interest has been studied in various communities including mathematical epidemiology [1] , biological systems modeling [2] , [3] , signal processing [4] and control engineering [5] .

In this study, epidemic outbreaks are studied from an interdisciplinary perspective, by using an extension of the susceptible-exposed-infected-recovered (SEIR) model [1] , which is a mathematical compartmental model based on the average behavior of a population under study. The objective is to provide researchers a better understanding of the significance of mathematical modeling for epidemic diseases. It is * R. Sameni shown by simulation, how social measures such as distancing, regional lock-downs and public health vigilance, can influence the model parameters, which in turns change the mortality rates and active contaminated cases over time.

It should be highlighted that mathematical models applied to real-world systems (social, biological, economical, etc.) are only valid under their assumptions and hypothesis. Therefore, this research-and similar ones-that address epidemic patterns, do not convey direct clinical information and dangers for the public, but should rather be used by healthcare strategists for better planning and decision making. Hence, the study of this work is only recommended for researchers familiar with the strength points and limitations of mathematical modeling of biological systems. The Matlab codes required for reproducing the results of this research are also online available in the Git repository of the open-source electrophysiological toolbox (OSET) [6] .

In Section II, a brief introduction to mathematical modeling of biological systems is presented, to highlight the scope of the present study and to open perspectives for the interested researchers, who may be less familiar with the context. The proposed model for the outspread of the Coronavirus is presented in Section III. The article is concluded with some general remarks and future perspectives.

A model is an entity that resembles a system or object in certain aspects, but is easier to work with as compared to the original system. Models are used for the 1) identification and better understanding of systems, 2) simulation of a system's behavior, 3) prediction of its future behavior, and ultimately 4) system control. Apparently, from item 1 to 4, the problem becomes more difficult and although the ultimate objective is to harness or control a system, this objective is not necessarily achievable. While modeling is the first and most important step in this path, it is highly challenging and nontrivial. The various issues that one faces in this regard, include:

• Models are not unique and different models can co-exist for a single system.

• A model is only a slice of reality and all models have a scope, outside of which, they are invalid.

• Modeling can be done in different levels of abstraction, which corresponds to the level of simplification and the specific aspects of the system that are considered by the model.

Example 1. The response of global stock markets with numerous economic, political, industrial, social and psychological factors, to a high impact news can in cases be modeled with a second-order differential equation, with a step-like overdamped behavior that reaches its steady state after a while. Or in medicine, the response of the human body-with more that thirty-seven trillion cells-to medication can in many cases be "resonably" modeled with a first order differential equation.

While various types of models are used for biological systems, we are commonly interested in mathematical models [7] , as they permit the prediction and possible control of biological systems. In choosing among different available models, the widely accepted principle is the model parsimony, which simply means that "a model should be as simple as possible and as complex as necessary!". The model parsimony, is also an important factor for estimating the unknown model parameters using real data. A more accurate model with fewer number of parameters is evidently preferred over a less accurate and more complex model. But how should one select between a more accurate complex model and a less accurate simpler one? Measures such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the minimum description length (MDL), address the balance between the number of observations and the model unknown parameters to select between competing models with variable number of parameters and different levels of accuracy [8] , [9] . Finally, the physical interpretability of the model parameters and the ability to estimate the parameters such that the model matches real-world data, is what makes the whole modeling framework meaningful.

Differential (difference) equations arise in many modeling problems. The major application of these equations is when the rate of change of a variable is related to other variables, as it is so in most physical and biological systems. Many powerful mathematical tools exist for the analysis and (numerical) solution of models based on differential equations. Despite their vast applications, differential equations are difficult to conceive and interpret without visualization. In this context, compartmental models are used as a visual means of representing differential equations of dynamic systems. A compartment is an abstract entity representing the quantity of interest (volume, number, density, etc.). Depending on the level of abstraction, each of the variables of interest (equivalent to system states in dynamic systems) are represented by a single compartment, conceptually represented by a box. Each compartment is assumed to be internally homogeneous, which implies that all entities assumed inside the compartment are indistinguishable. For example, depending on the model complexity selected for modeling a certain epidemic disease, men and women at risk can be assumed to conform a single compartment, or may alternatively be considered as different compartments. A similar partitioning may be considered for different age groups, ethnicities, countries, etc., at a cost of a more complex (less parsimonious) model with additional states and parameters to be identified. Apparently, the available realworld data may be insufficient for the parameter identification of a more detailed (complex) model.

The compartments interact with one another through a set of rate equations, visually represented by arrows between the compartments. Therefore, compartmental models can be converted to a set of first order linear or nonlinear equations (and vice versa), by writing the net flow into a compartment. Compartmental modeling is also known as mass transport [10] , or mass action [11] , in other contexts. More technically, a compartmental model is a weighted directed graph representation of a dynamic system. Each compartment corresponds to a node of the graph and the linking arrows are the graph edges. From this perspective, for an n compartment system, the compartment variables can be considered as state variables denoted in vector form as x(t) = [x 1 (t), . . . , x n (t)] T . The compartmental model provides a graphical representation of the state-space model:

where f (·) is the state dynamics function corresponding to the compartmental model graph (which can be possibly timevariant and nonlinear), u(t) = [u 1 (t), . . . , u p (t)] T represents deterministic or stochastic external system inputs, y(t) = [y 1 (t), . . . , y m (t)] T is the vector of observable model variables considered as outputs (the measurements), g(·) if the function that maps the state variables to the observations (measurements), and v(t) = [v 1 (t), . . . , v m (t)] T is the vector of measurement inaccuracies, considered as observation noise.

Researchers familiar with estimation theory, have already guessed that the state-space form of (1), implies that one may eventually be able to estimate and predict the compartment variables from noisy measurements, using state-space estimation techniques, such as the Kalman or extended Kalman filter [12] . With this background, the basic steps of compartmental modeling are: 1) Identifying the quantities of interest as distinct compartments and selecting a variable for each quantity as a function of time. These variables are the state variables of the resulting state-space equations. 2) Linking the compartments with arrows indicating the rate of quantity flow from each compartment to another (visually denoted over the arrows connecting the compartments). 3) Writing the corresponding first-order (linear or nonlinear) differential equations for each of the state variables of the model. In writing the equations from the graph representation, the edge weights multiplied by the state variable of their start node are added to (subtracted from) the rate change equation of the end node (start node). External inputs can be considered to be originated from an external node with value 1. Example 2. A three compartment model corresponding to the following set of equations is shown in Fig. 1 .

which can be put in the matrix form of (1). Due to the statedependency of the rate flow between x and z, the model is nonlinear. It is also an open system, since the sum of rate changes is non-zero, i.e., there is net flow in and out of the whole system (due to λ and ρ).

In order to model the propagation of epidemic diseases in a population, certain disease-and population-specific assumptions are required. The most common assumptions in this context include:

• The diseases are contagious and transfer via contact. • A disease may or may not be mortal. • There may be births during the period of study, and the birth may (or may not) be congenitally transferred from the mother to the baby.

• The disease can have an exposure period, during which the contaminants carry and spread the disease, but do not have visible symptoms.

• Catching the disease may or may not result in short-term or long-term immunity. Depending on the case, the recovered patients can again become susceptible to the disease.

• Interventions such as medication, vaccination, lock-down, quarantine and social distancing can change the pattern of propagation. 1) The susceptible-infected-recovered model: A basic model used for modeling epidemic diseases without lifetime immunity is known as the susceptible-infected-recovered (SIR) model [1] . In this model, the total population of N individuals exposed to an epidemic disease at each time instant t is divided into three groups (each represented by a compartment): the susceptible group fraction denoted by s(t), the infected group fraction denoted by i(t), and the recovered group fraction denoted by r(t) (the compartment variables are in fact the fraction of each group's population divided by N ). Accordingly, the system is closed and we have

A compartmental model for the propagation of the disease is shown in Fig. 2 . The compartmental representation of Fig. 2 is equivalent to the following set of differential equations:

Accordingly, moving from the susceptible group to the infected group takes place at a rate that is proportional to the population of the infected and susceptible groups, with parameter α. At the same time, infected individuals are assumed to recover at a constant rate of β. Finally, considering that the disease is not assumed to result in lifetime immunity of the subjects, the recovered individuals again return to the susceptible group at a fixed rate of γ. From (4), it is evident that

which is in accordance with (3) and the fact that the system is assumed to be closed (no births or deaths have been considered). Assuming initial conditions for each group, the set of nonlinear equations (4) can be (numerically) solved to find the evolution of the population of each compartment over time. The numerical solution of a basic (non-endemic) SIR model is shown in Fig. 3 , with and without lifetime immunity. The time-step for numerical discretizing of the differential equations of this simulation has been chosen to be ∆=0.1 of a day. Notice how the outbreak of a disease that does not cause lifetime immunity (such as a typical flu), can result in a constant rate of illness throughout time, after its transient period. For widespread epidemic diseases, the healthcare strategists are interested in the slopes of s(t), i(t) and r(t), rather than the total number of infected individuals (as it is currently the case for the COVID-19 Coronavirus). The prolongation of the disease spread provides the better management of healthcare resources such as hospitalization, medication, healthcare personnel, etc. For later reference, it is interesting to study the fixed-point of the SIR model (whereṡ(t) =i(t) =ṙ(t) = 0). Equating the left sides of (4) with zero, it can be algebraically shown that if α, γ = 0 (the non-immunizing case), the SIR model has only two fixed-points:

. The first fixed-point corresponds to the lack of any infected cases, and the second corresponds to a persistent disease in the population, as illustrated in Fig. 3(b) . This situation is only reachable if β < α, i.e., when the infection rate is greater than the recovery rate. We can also verify whether or not the fixed-points are stable. Various methods can be used for this purpose. Perhaps, the most tangible approach is based on perturbation theory. Simply stated, one can add small perturbations to the fixed-points of the system and check whether or not the perturbations are compensated by the system's dynamics by pushing the state vector back to its fixed-point. Accordingly, the first fixedpoint in (6) can be perturbed to:

where 0 < ǫ ≪ 1 is a small perturbation (e.g., equivalent to a single case of disease outbreak in a large population). Now replacing the perturbed point in (4) and neglecting second and higher order terms containing ǫ, we obtain:

As a result, the first fixed-point is unstable, since due to the sign of the derivatives of the perturbed system, the system's dynamics drives the state vector away from the fixed-point (since the population of the susceptible group has a negative derivative). However, depending on whether α > β or not, the outbreak may or may not result in an increase in the infected population. Simply put, if the infection rate is greater than the recovery rate (α > β) the disease would lead into an outspread; but if the recovery rate is faster than the infection rate (α < β) the percentage of the infected population will remain close to zero. In either case, for a non-endemic non-immunizing disease, all individuals that become infected recover after a while and move to the recovered group and again go back to the susceptible group at a rate of γ. Note that a SIR model with a non-zero infected population fraction in steady-state, indicates that there is a constant flow between the compartments, i.e., people are constantly contaminated, recovered and again become susceptible to the disease. Perturbing the second fixed-point results in

In this case, depending on whether (αI 0 − β) > 0 or not, the fixed-point may be stable or unstable.

2) The endemic SIR model: An endemic version of the SIR model with rates of birth µ * and with different death rates from the susceptible (µ s ), infected (µ i ) and recovered (µ r ) groups is shown in Fig. 4 . This system is no longer closed and its state equations can be written as follows:

With this background, in the sequel, we propose a compartmental model for the epidemiology of the COVID-19 Coronavirus.

Many infectious diseases are characterized by an incubation period between exposure and the outbreak of clinical symptoms. Subjects exposed to the infection are much more dangerous for the public as compared to the subjects showing clinical symptoms. The condition becomes more and more dangerous, with the increase of the incubation rate. A wellknown case is the HIV virus in its clinical latency stage. The experience of the COVID-19 shows that a two-week incubation period can spread a virus worldwide and almost at any level of the society. Remember that any two of us are only six handshakes apart! 1 For this reason, an additional compartment is added between the susceptibility and infection stages of the SIR model, which accounts for the symptomless exposed subjects. Moreover, since we are also interested in minimizing the mortality rate of the disease, a termination compartment is dedicated to the passed-away population. The variables of the model are therefore: 1) s(t): The susceptible population fraction (the number of individuals in danger of being infected, divided by the total population). 2) e(t): The exposed population fraction (the number of individuals exposed to the virus but without having symptoms, divided by the total population).

The infected population fraction (the number of infected individuals with symptoms, divided by the total population). 4) r(t): The recovered population fraction (the number of recovered individuals, divided by the total population). 5) p(t): The number of individuals that pass away due to the disease, divided by the total population).

Keeping in mind that 

In (12), similar to the classical SIR model, the interpretation of the nonlinear terms including s(t)e(t) and s(t)i(t) is that the rate of exposure to the virus is proportional the population of both the susceptible and exposed/infected subjects. Note that the constraint (11) gives us an excess degree of freedom that will be later used for simplifying the analysis by eliminating one of the state variables.

The simplifying assumptions behind the proposed model are:

1) Natural birth and natural deaths have been neglected.

Therefore, other parameters leading to changes in the population are not considered. Neglecting the birth rate is also supported by the current findings that babies are not susceptible to this virus and to the best of our knowledge, no congenital transmissions of the virus from mothers to fetuses have been reported. 2) In the current study, we do not distinguish between male and female subjects; although the current global toll of the virus suggests that men have been more vulnerable to the virus than women. 3) Age ranges have not been considered; although we known that higher aged subjects are more vulnerable to the virus. 4) Moreover, in this primary version, we have not yet considered the possibility of vaccination. 5) Geopolitical factors such as distance, country borders and continental differences have also been ignored. But considering that different countries have adopted customized counter measures against the virus spread, the model parameters are fitted over country-level data.

Having formed the model, we now explain its parameters and their relationship with real-world factors and clinical protocols. Note that all the model parameters have the dimension of inverse time, to balance the left and right side dimensions of (12) . Note also that the studied model is essentially based on an exponential law assumption. Therefore, we can find a rule of thumb for selecting the model parameters based on clinical facts and protocols. For example, let us consider the exposure to infection period, from the second equation in (12) , temporarily discarding the interactions with the other compartments in the system. We can write this part of the model as follows:

Due to the exponential law, this equation implies that the population of the exposed cases drops 63%, 86%, 95%, 98%, 99%, and 99.75% from its initial value, after κ −1 , 2κ −1 , 3κ −1 , 4κ −1 , 5κ −1 , and 6κ −1 time units, respectively. The latter indicates that after 6κ −1 time units only 25 cases out of 10,000 would still be exposed but without symptoms. Having this in mind, we can reverse engineer the proper order of the model parameters from the quarantine protocols advised by clinicians, for a certain infectious disease. Assume that we are dealing with an extremely contagious disease for which the healthcare decision makers have agreed on the above noted 99.75% percentage as the target infection drop-out threshold, and advised 14 days of quarantine for the whole population.

In that case, we can select κ = 6/14 = 0.43 (inverse days) in our model. Apparently, there is a lot of simplifications in this discussion; the age range, the subject-specific body immune system features, the severeness of the virus and many more factors have been neglected. But it gives an idea about how the parameters can be tuned in practice, up to a reasonable order of magnitude. With this background, we now explain the interpretation of each parameter of the model. 1) κ: The rate at which symptoms appear in exposed cases, resulting in transition from the exposed to the infected population. The selection of this parameter is according to the exponential law detailed above. We should add that wide screening policies adopted by certain countries (such as Germany, in the case of the Coronavirus), are external factors that can significantly accelerate the identification of the infected cases. In this case, screening is a factor that increases κ. 2) α i : The contagion factor between the infected and susceptible populations, which is related to the contagiousness of the virus and social factors such as personal hygiene, population density and level of human interactions. In order to find the range (or order of magnitude) of this parameter, we can start with the contagion factors of more known viruses, such as flu and influenza, which are more or less influenced by the same spreading factors. 3) α e : The contagion factor between the exposed and susceptible populations. This parameter is logically far greater than α i , since in ordinary conditions (before lock-downs and quarantine), people rarely avoid contact with a symptomless individual; nor does the individual itself avoid interaction with others. 4) γ: The reinfection rate, or the rate of returning from the recovered group to the susceptible group. This happens for the cases that the body does not develop lifetime immunity after recovery, or the virus itself starts to mutate over time. This parameter is the inverse of the immunity rate of the virus. It is currently too early to comment about the immunity characteristics of the Coronavirus 2 . Although at least one case of reinfection soon after recovery has been reported, preliminary research have suggested short-term immunity of up to four months. 5) β: The recovery rate of the infected cases. By considering the fourth equation in (12), we can denote the change in the number of hospitalized recoveries (or under control in any form, e.g., under home-care) by r h , resulting in

where ∆ is the time unit of approximation (for example 1 day). Therefore, the parameter β can be approximated by dividing the daily recovery count of the population under study, by the total infected cases in the same day. In the real world, apart from the body strength of the infected subject in resisting against the virus, this parameter depends on the healthcare infrastructure of a country (hospitalization facilities, availability of medication, number of intensive care units, etc.). 6) ρ: The recovery rate of the exposed cases (the cases that are exposed, but recover without any symptoms). This parameter is not directly measurable from pure observations and requires lab-based experiments. However, we logically expect this parameter to be of the same order or greater than the parameter β (the recovery rate of the infected population with symptoms). 7) µ: The mortality rate of the infected cases. By approximating the last equation in (12) by

where ∆ is the time unit of approximation (for example 1 day), the parameter µ can be approximated by dividing the daily death toll by the total infected cases in the same day. As with β, the mortality of the virus itself, the immune system of the subjects, and the medical infrastructure are important factors that influence the parameter. 8) e 0 : The initial exposed population (seed).

By studying the above factors, we can see that the only parameters of the model that can be changed in the shortterm (before the development of long-term solutions such as vaccination, medication, improvement of hospitalization facilities, etc.), is to reduce the infection rates by minimizing human contacts (social distancing), or to apply public screening. These are the two policies, which have been adopted worldwide.

As with the basic SIR model presented in Example II-C1, the fixed-point(s) of the model can be sought by letting the left hand sides of all equations in (12) equal to zero. Accordingly, assuming that all the model parameters are nonzero, the only fixed-point is the no-disease case (i(t) = e(t) = r(t) = 0): (s * (t), e * (t), i * (t), r * (t), p * (t)) = (1 − p 0 , 0, 0, 0, p 0 ) (13) where 0 ≤ p 0 ≤ 1 is the steady-state total death fraction. The stability of this fixed-point can be addressed by perturbing the fixed-point with a minor exposure ǫ (which can correspond to a single new exposed case in the real world):

(s(t), e(t), i(t), r(t), p(t)) = (1 − p 0 − ǫ, ǫ, 0, 0, p 0 ) (14) Putting this point in the state dynamics (12), we find:

which is unstable, i.e., the system's dynamics drives it away from the fixed-point in the direction of reducing the healthy cases, resulting in further infection. A more rigorous study of the system stability conditions is presented in the following sections using eigenanalysis.

Let us study the model during its primary rising phase, when the infection toll is much smaller than the total population. For instance, suppose that a country has 100,000 of exposed or infected cases, which is indeed significant for any country, as it is far beyond the available number of intensive care unit beds of even the most developed countries 3 . But for a country with 100 million population, such an exposure/infection toll is only 0.1% of the total population. Therefore, during the primary phases of the disease spread, the model can be simplified by assuming that the susceptible population is almost constant (s(t) ≈ 1) and ds(t)/dt ≈ 0, regardless of the other parameters of the model. Under this assumption, (12) is simplified to the linear set of equations:

3 See for example: https://link.springer.com/article/10.1007/s00134-012-2627-8/tables/2 Defining x(t) = (e(t), i(t), r(t), p(t)) T , (16) can be written in matrix form:

d dt

where A is the 4×4 state matrix on the right hand side of (16) . The characteristic function of this linear system is:

. Therefore the system's eigenvalues are:

which are all real-valued. Moreover, it is straightforward to check that λ 3 > δ > λ 4 . The eigenvectors corresponding to each eigenvalue are:

The general form of the solution of the compartmental variables is a summation of exponential terms with the above exponential rates and eivenvectors:

Specifically, after some algebraic simplifications, we can calculate the infected and exposed populations as follows:

(22) From the last equation in (12) , it is clear that the death toll will not stop before i(t) = 0. Also from (22), we can see that since λ 3 is the dominant eigenvalue, the steady-state behavior and whether or not i(t) and e(t) diverge from or converge to zero, depends on the sign of λ 3 . The necessary and sufficient condition for the linearized system's stability (stopping the death doll) is λ 3 < 0, which simplifies to κα i + δ(β + µ) < 0, or:

A sufficient condition that guarantees this property is when α i = α e = 0. The condition α i = 0 implies that the susceptible group avoids contact with the infected ones. However, the second condition (α e = 0) is difficult to fulfill in the realworld, since the exposed group do not have any symptoms. This is why social distancing is required to enforce α e ≈ 0 and to permit all the exposed subjects to move to the infected group without infecting new individuals, after which the symptomless group can be all considered clear of the disease. Another practical case is when α i ≈ 0 (healthy people avoid contact with the infected) and κ + ρ > α e (the rate of recovery of the exposed or the appearance of their symptoms is faster than the rate of new exposures). This condition is fulfilled by social distancing and lock-down (isolation of even the symptom-less cases for a certain period). However, if none of the above conditions are fulfilled and λ 3 > 0, the number of exposed and infected cases increases exponentially at a rate of λ 3 . In this case, with fixed system parameters, the infection rate rises exponentially up to a point at which the linear approximation does no longer hold. This practically translates into:

When an exponential outbreak of infection occurs (λ 3 > 0), the system is unstable and without enforcing temporary lock-downs, social distancing and quarantine of the infected cases (resulting in the model parameter changes), the exponential increase in the number of infected subjects continues to a point where a significant percentage of the population is infected.

Considering that the death toll p(t) is composed of the same exponential terms as the infected cases in (22), the above result is indeed disturbing.

It is also interesting to observe from (22 ) that the population of the different compartments of the model is only linearly proportional to the initial exposed population size e 0 4 . Therefore, for a large population (at the level of a populated city or country), the initial infected seed size is not as important as the other model parameters that influence the exponential behavior of the model (such as the social contact rates). Therefore:

Result 2. The initial seed size is not the most critical parameter for epidemic management. Regions with smaller initial seeds of infected/exposed cases may end up with a higher infected and death toll depending on their infection rates, defined by factors such as humancontact rate and personal hygiene.

The ability to estimate the future trend of the epidemic pattern is extremely important from the strategic perspective. This requires the accurate estimation of the compartmental model variables from reported statistics of the virus spread. The trend estimation requirements are addressed in the sequel for both the linearized and general form of the model. As noted before, these results can be used to design an estimation scheme, based on for example a Kalman filter, for estimation and prediction of the current and future trends, in presence of inaccurate infection tolls.

1) The linearized case (low fraction of infection): Among the different state variables of the linearized compartmental model (in the low fraction of infection case), all except e(t) 4 Note that the COVID-19 is believed to have started from a single case. are directly measurable. The measurements can be formulated in matrix form as follows:

where I(t), R(t), and P (t) are the reported infection, recovery and death tolls divided by the total population. The variables can be considered as noisy measurements for the system's "true" infection, recovery and death tolls. The evident sources of noises include: unavailable information regrading the exact population, intentional and unintentional misreported values, mis-classified reasons of death (especially for the elderly or subjects suffering from multiple health issues), and the marginal cases that may be unknown or misclassified for the healthcare system. Equation (24) can be written in more compact form as follows:

where x(t) = [e(t), i(t), r(t), p(t)] T is the reduced statevector. Although the variable e(t) is not directly measurable from the available public data, one may seek whether of not this variable can be indirectly estimated from the other measurements (assuming that the other model parameters are known). Here, we can use the notion of observability from system theory [13] , [14] . Without going into the theoretical details, for the model matrix pair (A, C) the observability matrix is defined as follows:

If O k has rank m (4 in our model), all the four state variables of the linearized model (16) are observable at its output, which means that they can be estimated from the observations in finite time. It is straightforward to numerically calculate (26) for arbitrary choices of the model parameters and to check that none of its four columns are linearly dependent 5 . Therefore, all the state variables, including e(t), are observable and may be estimated using state estimation techniques such as the Kalman filter (as the optimal linear estimator).

2) The general case: For the original nonlinear compartmental model (12) , the variations in s(t) is no longer negligible; but the constraint (11) on the system being closed still holds. Therefore, we can reduce the model order by replacing s(t) = 1 − e(t) − i(t) − r(t) − p(t), which simplifies the compartmental model as follows: 

with the following non-zero entries:

which has some time-dependent elements. Due to the nonevident form of the observability matrix of this case, the proof of observability of the matrix pair (A(t), C) in its general case is cumbersome. However, it is simple to check this property numerically for arbitrary values of the model parameters. We have tested this property for our later shown simulated results. Therefore, we can state the following result in both the general and linear approximated case:

Result 3. Although the number of exposed cases of the population is not directly measurable, if the model parameters are known (or accurately estimated), the number of exposed cases can also be estimated from the other observations.

The peaks of the infected group population, and its potential repetition in time, is important from the strategic viewpoint [15] . These points correspond to local or global extremums of i(t), which mathematically correspond to where di(t)/dt = 0 in (12), i.e., where i(t) = κe(t)/(β + µ). It can be shown that this leads to a reduced order set of nonlinear dynamic equations, which can be solved for the remaining variables [s(t), e(t), r(t), p(t)] T . The simulations demonstrated in the sequel, show that the infected population can have multiple local peaks over time, with recurrent behaviors, proving that: This behavior has been observed in previous pandemics, such as the 1918 pandemic influenza, known as the Spanish flu, where three pandemic waves of infection have been observed within an interval of a few months (cf. https://en.wikipedia. org/wiki/Spanish flu). A very insightful mathematical study of sustained oscillations of similar compartmental models was previously presented in [16] .

We have carried-out several simulations with different sets of parameters, resulting in the different scenarios explained in previous sections.

Example 3. The first scenario is illustrated in Fig. 6 , where we have considered a single virus outbreak in a 84 million population, with constant parameters α e = 0.65, α i = 0.005, κ = 0.05, ρ = 0.08, β = 0.1, µ = 0.02 and γ = 0, simulated over one year. The assumption γ = 0 implies that the disease has been assumed to develop lifetime immunity, therefore there is no return from the recovered to the susceptible compartment.

Example 4. The second scenario is illustrated in Fig. 7 . All the model parameters are identical to Example 3, except for the re-susceptibility rate (loss of immunity) that is now γ = 0.001. It is interesting to see that in this scenario, the model has recurrent decaying peaks over time, similar to the aforementioned 1918 Spanish flu.

Example 5. The next scenario is illustrated in Fig. 8 . This scenario corresponds to the case where a one month lockdown is applied by the government to identify the exposed and infected cases. The parameters of the model before quarantine are α e = 0.6, α i = 0.005, κ = 0.05, ρ = 0.08, β = 0.1, µ = 0.02 and γ = 0.001. In the 30th day after the initial outbreak, the one month lock-down is applied, during which α e and α i are reduced to 0.1 and 0.001, respectively, while keeping the other parameters unchanged. After one month, α i remains 0.001 (people keep distance with the infected ones), but α e is increased to 0.4 (much more than the quarantine period, but two-third of the original social contact factor). We can see that with this scenario, which corresponds to an insufficient quarantine period, the population of the infected and exposed peaks have decreased; but the quarantine does not significantly change the mortality toll after five months. But why? Because, after the quarantine period, there is still a minor fraction of the population that is exposed, and this very small seed can re-initiate the virus spread. We therefore come to the following result:

Result 5. Imposing quarantines is effective in delaying and reducing the infection population peaks; but is insufficient in the long term. Social distancing and other measures should remain for a long period after the initial quarantine, to make the number of contaminated subjects equal to "zero." Based on the simulations of this section, we can propose a protocol that can be effective in practice:

Algorithm 1 Proposed protocol for effective epidemic disease control 1: Apply maximum lock-down of the entire population and quarantine of the infected cases, as soon as possible. 2: Estimate the number of exposed subjects and the level of confidence of the estimate. 3: Keep the maximal lock-down until as many of all the number of estimated exposed subjects are identified at a very high level of confidence. 4: Switch from lock-down to strict social distancing and personal hygiene protocols. 5: Switch to normal life only when no new cases have been reported for a long period.

In this study we considered some of the properties of epidemic diseases from a mathematical and signal processing perspective, by using a compartmental model for the propagation of the epidemic disease. It was shown that the model is not stable around its fixed-point (the no-infection case). We should indeed be proactively worried about the instability of our societies to such epidemic outbreaks; since a mortal virus with an initial seed as small as a single subject worldwide can trigger the avalanche of pandemic waves, killing many people worldwide, as the COVID-19 Coronavirus has shown. Note however that from the modeling and dynamics perspective, there is nothing specific to the Coronavirus, apart from its specific parameters. In fact, humans have always been living in such a metastable condition and researchers from all domains need to synergize to think a way through this condition.

The research will be continued and extended from various aspects in the future versions. Specifically, by fitting the model over real data, the prediction of infection and mortality rates under quarantine and vaccination, and the study of the recurrent pattern of the epidemic disease over time. Non-exponential law infection distributions have also been considered in the literature [17] . This can be an interesting track of research for the study of epidemic diseases, including the family of Coronaviruses.

This article has been drafted in March 2020, in the midst of the COVID-19 Coronavirus outbreak. The author would like to sincerely thank Professor Christian Jutten, Emeritus Professor of Université Grenoble Alpes, Grenoble, France, for his insightful and motivating comments throughout this study.

Mathematical models in population biology and epidemiology

Modeling Biological Systems: Principles and Applications

A Course in Mathematical Biology: Quantitative Modeling with Mathematical and Computational Methods, ser. Monographs on Mathematical Modeling and Computation

Comparison of filtering methods for the modeling and retrospective forecasting of influenza epidemics

Control Applications for Biomedical Engineering Systems

The Open-Source Electrophysiological Toolbox (OSET), version 3.14

Multimodel inference: understanding aic and bic in model selection

Modified aic and mdl model selection criteria for short data records

Mathematical and Computer Modeling of Physiological Systems

An Introduction to Mathematical Modelling in Molecular Systems Biology

Theory and Practice Using Matlab

Linear Systems

Stability and stabilization: an introduction

Final and peak epidemic sizes for seir models with quarantine and isolation

Oscillations in sirs model with distributed delays

Epidemiological models with nonexponentially distributed disease stages and applications to disease control