key: cord-0720570-lzws23pv
authors: Bokharaie, V.
title: A Study on the Effects of Containment Policies and Vaccination on the Spread of SARS-CoV-2
date: 2020-10-20
journal: nan
DOI: 10.1101/2020.10.16.20213835
sha: e7ebb27cb8162f0fa0a172f0835d96a711b02031
doc_id: 720570
cord_uid: lzws23pv

This paper presents a method to predict the spread of the SARS-CoV-2 in a population with a known age-structure, and then, to quantify the effects of various containment policies, including those policies that affect each age-group differently. The model itself is a compartmental model in which each compartment is divided into a number of age-groups. The parameter of the model are estimated using an optimisation scheme and some known results from the theory of monotone systems such that the model output agrees with some collected data on the spread of SARS-CoV-2. To highlight the strengths of this framework, a few case studies are presented in which different populations are subjected to different containment strategies. They include cases in which the containment policies switch between scenarios with different levels of severity. Then a case study on herd immunity due to vaccination is presented. And then it is shown how we can use this framework to optimality distribute a limited number of vaccine units in a given population to maximise their impact and lower the total number of infectious individuals.

This manuscript presents a framework to model the spread of SARS-CoV-2 in a population with a known age-structure. The model itself is a compartmental model in which each compartment is divided into a number of age-groups. And a data-driven approach is presented to adapt the parameters of this model to the available data on the spread of SARS-CoV-2. The mathematical framework is adopted from [11] and [4] , although they only consider the SIS case, while here I have used the SEIR model. The SEIR models are of course not new, but the main contribution of this work is to use an optimisation scheme to tune its parameters to the available data on the spread of SARS-CoV-2. The framework is general, as will be explained, although in this manuscript the data collected from early stages of the spread of the virus in Wuhan is used. With more detailed and comprehensive data, the outputs of the model can be more accurate. One advantage of the adopted framework is that it allows us to estimate the parameters based on data collected from one population in any other population with a known age-structure. Hence, using the data collected from a population in China will not limit us to model the spread of virus only in China.

There is also a publicly available Python library called MiTepid sim [8] which was specifically developed as a part of this study. All the figures in this manuscript are generated using MiTepid sim. It can be used to simulate the spread of the SARS-CoV-2 in each age group starting from any given initial condition and under any defined containment policy. Also, another library called MiTepid opt [7] accompanies this work which is used to implement the optimisation scheme presented in this manuscript and is also publicly available.

The structure of the paper is as follows. In Section 2, I will explain the outline of the model and the optimisation schemes, without getting into the details. The mathematical details are explained in Appendices A.1 and A.2. Then various case studies are presented in Section 3. They include a study on how changing variables of the model affect the trajectory of the Infectious population. And then studying the effects of various suppression strategies, including those which might affect each age group differently. Then, long-term containment plans. These are the types of containment policies which we 11th, as shown in Table 5 in Appendix A.2. Up until two weeks before this date, Chinese authorities did not impose any meaningful containment strategies and the virus was spreading in an uncontained population. Hence we can assume these numbers are the results of the uncontained spread of the virus. Using these two sets of information, and using the optimisation scheme as detailed in Appendix A.2, we can estimate for the contact rates. One detail of the optimisation scheme which deserves to be highlighted is that it relies on the ratio of the number of reported cases in each age-group, not the absolute values. By doing so, we can avoid two potential issues that might arise in using such data. One is the fact that the number of confirmed cases in each age-group is an unknown fraction of actual Infectious numbers at each time. Another issue is the deliberate or non-deliberate errors that creep in while reporting these numbers. Relying on the ratios of the reported Infected numbers in the optimisation scheme makes it less sensitive to both of these issues, although it does not completely eliminate them. Using more data points as inputs to the optimisation scheme is one possible way to overcome these issues even more.

All the mathematical details of both the model and the optimisation schemes are explained in detail in Appendices A.1 and A.2. The important thing to keep in mind is that although the optimisation scheme uses data collected from China in a certain time-period and in an uncontained population, but the values we obtain can be adapted to any population with a known age-structure and under any containment policy, if we know how that policy affects contact between various age-groups, or how it will change the Basic Reproduction Number (defined in Section A. 1.2) . The model has some other advantages and also some disadvantages which are discussed in Section 4. Even Considering the disadvantages, it can serve as a good first step to quantify the effects of various containment policies in different countries. The actual values of the contact rates are accompanied with MiTepid sim [8] , a python library which can be used to simulate the model used in this paper or any other stratified compartmental model of interest.

Using the mathematical framework explained in Section 2 and detailed in Appendices A.1 and A.2, we can answer various questions about how SARS-CoV-2 spreads in any population with a known age-structure. We can quantify the effects of various containment policies on the spread of the virus even if those policies affect age-groups differently and if the policies vary in time. It can be used to study the effects of partially vaccinating the population, how wide-spread the vaccination should be to reach herd immunity and how to optimally distribute limited vaccine resources in a population. In this section we will see how we can find answers to all these questions.

Let's consider the case in which COVID-19 is spread uncontained, i.e. when people in the population interact with each other as in normal times, with no external or self-imposed restrictions in the interactions. Figures 1.1 and 1.2 show the Infectious and Removed population in each age-group in Germany. As mentioned in the previous section, the output of the model is the evolution of the trajectories of E i , I i and R i for i = 1, · · · , n. We can then add up these values to obtain the aggregate trajectories of E, I and R compartments. That is how the Figures 1.3 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. each age-group is calculated, and then the aggregate is calculated based on the population distribution for that country. As can be seen, the predicted peak in the number of Infectious people and the eventual ratio of the Removed population varies considerably among different countries. That can be explained based on differences in the population distribution in these countries. For example, in Iran, 66.8% of the population are under 40 years old while that ratio is 39.8% in Italy [3] . The model predicts that countries with an older population would be affected worse if they let the virus spread uncontained, in agreement with for example [12] when comparing mortality rates in the USA and UK.

It should be noted that the model is not directly concerned with the mortality rate or the number of those who might need respiratory or intensive care in hospitals. Such information can be inferred from the number of Infectious and Removed compartments in each country, based on the available data or estimates on mortality rate and also the ratio of the Infectious and that need intensive care. Dealing with that is out of the scope of this manuscript, but interested reader can look, for examples, at [17, 15, 16, 20] .

To solve any system of Ordinary Differential Equations (ODEs), apart from the equations themselves, we should also define the initial conditions, which in our case means the initial ratios of Infectious, Inhibited and Removed populations in each age group. In all the figures presented in this manuscript, I have assumed 1 in 100,000 in each age group is Infectious at time t = 0, and the Removed population is 0. We should be cautious in using a system of continuous ODEs for numbers lower than that. It should be noted that the peak values in Figures 1.3 and 1.4 are barely affected by the choice of initial conditions. But that is not true for the time it takes to reach the peak values. Hence, in order to predict the day in which the number of Infectious reaches the peak value, we should have a reasonable estimate of the initial conditions. But given that the number of confirmed cases are reported on a daily basis in many countries, and that there are suggestion on how to infer the total number of infected based on the reported number in each country, finding the right initial conditions for each country is a problem that can be solved.

We can also run simulations assuming different values of Basic Reproduction Number for uncontained populations, not just R 0 = 2.95 to see how that might change the trajectories. To clarify, I have assumed R 0 for uncontained population can be any value in [2.05, 3 .95] range with steps of 0.1. We can then estimate contact rates with each assume value for R 0 and then calculate the trajectories. As can be seen in Figures 1.5 and 1.6, higher R 0 leads to higher peaks in both I and R.

Also, I have run the same procedure varying T I , the average time each individual remains Infectious. [5, 14] days range. Increasing T I increases the peak value in I, but delays the time it takes for R to reach its maximum value. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. .

The advantage of having a stratified model is that we can quantitatively evaluate the effects of various containment strategies that affect different age-groups differently. Table 2 has listed some of the more common policies and how they are defined. The column under Policy Description defines to what extent the contacts of age-groups are assumed to change under each policy. These values are chosen intuitively, but any other definitions and any other policies can be easily defined in mitepid sim package that is developed as a part of this study [8] .

As can be seen in Figures 2.1 and 2. 2, all the containment policies decrease both the peak instantaneous Infectious ratio and eventual Removed ratio, which is to be expected. The most effective policy among the ones listed in the Table 2 is Lock-down. It is worth noting that although in Lock-down policy the contact are limited to half of Social Distancing policy, the decrease in the peak Infectious value is much less noticeable. Table 2 also shows the basic reproduction number, R 0 , for each policies. As can be seen in Figure  2 .1, and as it is well-known in epidemiology, when R 0 < 1, the disease starts to disappear from the population. But as importantly, bringing the ratio of the Infectious population down to a small enough ratio of the total population can take months, even if we impose a total lock-down strategy (the exact time depends on the ratio of Infectious when we start the policy). And maintaining such a strategy might not be feasible in many countries with a more fragile economy. In the next section, we will see what happens if we switch between different strategies with different degrees of severity as a long-term plan. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. 

So far, we have seen what happens if we leave the population uncontained and let the virus spread freely. It can lead to, for example in the case of Germany, around 92% of the population being eventually infected, which can lead to hundreds of thousands of death. Then we saw that some suppression strategies can be hugely effective in containing the spread of the virus and stop such humanitarian disasters. But a complete and long-lasting lock-down strategy, even if possible, can have a huge economic cost, apart from its psychobiological toll on the members of the society. A reasonable compromise is to switch between containment strategies of various degrees of strictness, depending on the observed trend in the number of confirmed cases. What results is what is known in control theory as a switched system. Implementing that using this framework is quite easy. We start with the first set of parameters, contact rates, at time t = 0, run the simulations until time t = t 1 , and use the final state of the previous ODE as the initial conditions for the new one. All of this can be easily implemented in MiTepid sim software package [8] .

What follows in this section is simply a few examples of how such strategies can or cannot be useful. This by no means is meant to be considered a comprehensive list of effective mitigation strategies but merely examples to highlight the capabilities of this framework. All the example discussed in the following is based on the population distribution of Germany. Let's start with a case in which we switch between uncontained case and Social-Distancing, as defined in Table 2 . It means the case in which contacts between children and adults is brought down to 20% of uncontained case, and for elderly to 25%. When for uncontained R 0 = 2.95, in Social Distancing it is decreased to R 0 = 0.63. We assume uncontained case for the first 30 days, starting from the case in which 1 in 100,000 is Infectious. As can be seen in Figures 3.1 and 3.2 the policy manages to initially contain the virus, but as the number of Infectious during uncontained phases increase, the exponential growth causes the number of Infectious to reach its peak value at 150 days and then start to decrease. Although this policy is obviously not able to contain the number of Infectious in a manageable level, we can still see some benefits in it. The peak value of Infectious is now less than what it was in uncontained case, 9.6% vs 14.3%. Also, the time to reach the peak value has increased from around 80 days to 150 days. But more importantly, the eventual recovered ratio, i.e. the total number of individuals who were infected at some stage, is decreased from around 92% to around 62%. That means although this policy will eventually lead to, most probably, overload of the health system resources, but it can limit the mortality rate significantly. Now let's repeat the same scenario, but now we, after the first 30 days of uncontained policy, we replace uncontained policy with the case in which R 0 = 1.5. Given R 0 = 2.95 for uncontained population, this means contacts between members of the population is assumed to be cut to half. But reaching the same level of restrictions as we have defined under the Social-distancing policy has proved very difficult (at least in Germany and many other countries.). So, lets look at a more realistic scenario, in which after the first 30 days of uncontained population, we switch between cases in which R 0 = 1.50 and R 0 = 0.9, which is not far from what has already happened in Germany. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. 

In Section 3.1 we saw that even in the uncontained case, not all of the individuals in the population need to be infected for the spread of the virus to be stopped. As can be seen in Table 1 and Figure 1 .4, in an uncontained population in which SARS-CoV-2 is spread freely, when the total number of recovered reaches a certain value, the disease dies out, a phenomenon which is called Natural Herd Immunity. The exact value depends on the dynamics of the virus and also age-structure of the population in case of SARS-CoV-2. But the price we pay for natural herd immunity is catastrophic. For example, in Germany, natural herd immunity is reached when around 92% of the population is eventually infected, a parameter that I call Herd Immunity Threshold (HIT). And with a mortality rate of 0.6% that amounts to around 460 thousand deaths.

We have already seen in the previous section that imposing any kind of containment policy, can reduce the HIT significantly. Now let's assume a vaccine for SARS-CoV-2 is produced and is approved by health authorities. Let's see how does that changes HIT. In particular, we want to know what ratio of the population should be vaccinated to reach herd immunity? Answering that question is very easy using the framework we have in place. There is no need to define a new compartment for the vaccinated people. We can assume the vaccinated population as the initial conditions for the Recovered compartment and then see how the trajectories of Infectious and Recovered evolve. Table 3 shows the eventual recovered and maximum Infectious ratios under different initial ratio of vaccinated population, assuming that after vaccination, a containment strategy is in place such that R0 = 1.50. Remember that R0 = 2.95 in an uncontained population, hence R = 1.50 means interactions in the populations are reduced to around half of that in an uncontained case. As can be seen in Table 3,   9 . CC-BY 4.0 International license It is made available under a perpetuity.

is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. when we inoculate around 30% of the population, we have already reached the herd immunity level. If we leave the population uncontained, we need to vaccinate around 65% of the population to reach HIT. Please note that these values are for a population with the same age-structure as Germany, for countries with a younger population, HIT is lower. But in these figures, we have assumed the same ratio in each age-group is vaccinated. In other words we have ignored an important capability of the model, the ability to impose different conditions on different age-groups. So, let's say we have enough vaccine units to only inoculate 15% of the population of Germany. We can turn this problem into an optimisation problem, in which we keep overall number of vaccinated population at 15% while minimising the eventual Recovered population if we vaccinate different ratios of each age-group. Table 4 show the difference that non-uniform vaccination in age-groups can have. And Figures 4.5 and 4.6 show the evolution of trajectories corresponding to uniform and optimised cases. Obviously, it is not really realistic that we vaccinate only 15% of the population and then none. This simple example is merely used to highlight this feature of this framework. The code used to run this optimisation approach is also available in mitepid opt [7] package. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. 

The method presented in this manuscript can be used to predict the spread of SARS-CoV-2 in each agegroup of a population with a known age-structure. Not just the aggregate values, but also the evolution of the number of Infectious and Recovered individuals in each age-group. It was shown how we can use various containment strategies and how to quantify their effects. Strategies which might be time-varying and might affect each age group independently. The main advantage of this framework is that it allows us to estimate the contact rates of the model without a detailed knowledge of how and to what extent different age-groups in a population interact with each other, or how the virus affects each age-group. The contact rates are estimated based on the available data on the spread of COVID-19. The model itself is a set of nonlinear Ordinary Differential Equations (ODEs), hence simulating various containment policies in different time frames does need any special computational power.

To better estimate the contact rates, any other available data can be used as the input to the optimisation scheme. We need only two points in time with a known number of reported infected cases in each age groups while the Basic Reproduction Number, R 0 , remains constant and known in between the two time-points. Given the ever-changing containment polices imposed on various populations in the 11 . CC-BY 4.0 International license It is made available under a perpetuity.

is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. . past few months, finding such population with a constant and known R 0 is difficult. That is the reasons that I have relied on the data collected from Wuhan, China, in the early weeks of the spread of the virus when no containment policy was imposed on the population. As already mentioned, if we obtain such estimation from data collected in China, we can easily adapt them to any other population of any country/region/city with a known age-structure.

In spite of all its strength, the methodology has some shortcomings that should kept in mind while interpreting the results. The method assumes the virus affects individuals (of the same age) in different countries similarly. If it is the case that the differences in genetic backgrounds or vaccination histories in different countries can affect the infection rate, such differences would be lost while transforming the contact rates from one population to another. Also, it is an implicit assumption in the model that the general interactions between people in different countries and societies are in general similar to each other. To clarify the point, if, for example, people older than 70 years in one country live with or have more contacts with the next generations, and in another country, they live in isolation or in nursing homes, then differences in contact rates caused by such social and cultural differences would also be lost. But if we estimate the contact rates based on two separate set of data collected in two different countries, and then compare the resulting normalised contact rates, we can discover such minute differences in how different age groups in the two populations interact with each other which is a useful corollary of this method.

The method can be extended in various directions to make the results even more useful. We can, for example, divide each age-group into sub-groups based on their vulnerability to the virus, or based on the relative amount of interactions with other individuals in the population. An obvious choice is people who work from home or are unemployed and people who have to go to their offices. Or people in 60-69 age range who are retired or those who still work. Even a rough estimate of the relative numbers of these sub-groups in each age group can give us more insight into more effective ways to contain the spread of the virus with less social and economic impact.

Given the fact that nowadays the population structure in different cities/regions in almost every country is known, we can use the model to describe the spread of the virus for each region or city, and then, assuming in and out-flow traffic to each city is known, consider them as exogenous inputs to our switched system. That would allow us to consider time-lags that might exist in the spread of COVID-19 to different parts of a country. But even as it is, this model and its estimated parameters can be a useful tool for different policy-makers in different countries in particular those countries with limited computational resources or people with expertise in mathematical epidemiology. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. . https://doi.org/10.1101/2020.10.16.20213835 doi: medRxiv preprint

To understand the model discussed in this section, it is enough to know some basic concepts in Ordinary Differential Equations (ODEs) and linear algebra. Knowing the following notations and definitions can be helpful.

R is the field of real numbers and R + is The set of non-negative real numbers. R n is The space of column vectors of size n of real numbers and R n×n is The space of n × n matrices of real numbers. I use x i to represent The ith entry of the vector x in R n , for i ∈ {1, · · · , n}. Please note that x 0 is a vector in R n that usually represents initial condition. Notation a ij is used for (i, j) entry of the matrix A. D = diag (x) is an n × n diagonal matrix in which d ii = x i for all i. A −1 is The inverse of the matrix A. I is the identity matrix of proper dimensions and 0 is the zero matrix of proper dimensions. σ(A) is the set of all eigenvalues (spectrum) of the matrix A. ρ(A) is the spectral radius of the matrix A, i.e. the maximum of the absolute values of all eigenvalues. µ(A) is The spectral abscissa of the matrix A.

A B means a ij > b ij , for all i, j ∈ {1, · · · , n}. It should not be mistaken with Positive Definite (PD) matrices. A > B means a ij ≥ b ij , for all i, j ∈ {1, · · · , n} and A = B and A ≥ B means a ij ≥ b ij , for all i, j ∈ {1, · · · , n}. R n + is The positive orthant of R n , given by {x ∈ R n : x ≥ 0}.

Knowing the following basic definitions can help in understanding the text better.

The matrix A is irreducible if and only if for every non-empty proper subset K of N := {1, · · · , n}, there exists an i ∈ K, j ∈ N \ K such that a ij = 0. When A is not irreducible, it is reducible.

For any subset U of R n , a point x 0 is called an interior point of U if there is an open ball around x 0 which is wholly contained in U. The set of all interior points of U is called the interior of U and is denoted by int (U).

Consider a continuous-time nonlinear systems of the form:

where f : D → R n is a nonlinear vector field on a subset D of R n and x 0 ∈ D is called the initial condition.

The SIS model, although not used in the main text, is presented here, both in the interest of completeness and to provide a theoretical basis for the SIR/SEIR discussions. The formulation presented in this section is adopted from [11] and [4] . In SIS model, the population of interest is first divided into two compartments S, Susceptibles, and I, Infectious. Each compartment is sub-divided into n groups. These groups can represent different age groups, different health conditions, professions, etc. In this manuscript, I have divided the population in each compartment into n = 9 age-groups defined as 0-10, 10-20, ..., 70-80 and 80+.

Let I i (t) and S i (t) be the number of Infectious and Susceptibles at time t in group i for i = 1, · · · , n, respectively. Also, let N i (t) = S i (t) + I i (t) be the total population of group i. The total population of each group is assumed to be constant; formally, N i (t) = N i . This does not oversimplify the model, especially when the total population is significantly greater than the number of dead and newborn. But even if that assumption is not deemed realistic for a population, the formulation stated below can still be used as we will shortly see.

Here, β ij , the contact rate between groups i and j, denotes the rate at which Susceptibles in group i are infected by Infectious in group j for i, j = 1, · · · , n. Further, γ i , the transfer rate, is the rate at which an infectious individual in the group i leaves the Infectious compartment and join the Susceptible compartment. We also consider birth and death in the population, but as discussed, we set the birth and death rates in each age-group to be the same value µ i to keep the total population in each group constant. Using the mass-action law, the basic SIS model is then described as follows [11] :

Since the population of each group is constant, it is sufficient to know I i (t). If we set x i (t) = I i (t)/N i andβ i,j = β i,j N j /N i and α i = γ i + µ i , we obtain the following differential equation:

for all i = 1, · · · , n. Based on the definition, x ∈ B n where B n := {x ∈ R n + : x ≤ 1}. We can write the differential equation (3) in compact form as:

where D = −diag (α i ) and B = (β ij ) > 0. As a reminder, here >represents an element-wise inequality, as defined in Section A.1.1, and does not refer to Positive Definiteness.

The following properties of (4) are easy to check. Interested reader can look at [4] for proofs.

(

x with D and B defined as above is C 1 in R n , therefore, the solution for every initial condition in R n exists and is unique for all t ≥ 0.

(ii) The origin is an equilibrium point of (4). This equilibrium is referred to as the disease-free equilibrium (DFE) of the system (4).

(iii) System (4) may have an equilibrium in int (R n + ) (also referred to as an endemic equilibrium). Conditions for existence of endemic equilibrium for the system (4) depends on parameter R 0 , explained below.

One important parameter in mathematical epidemiologically is the basic reproduction number, R 0 . There are different definitions for the basic reproduction number. Probably the most common definition is as follows.

The basic reproduction number is the expected number of secondary cases produced, in a completely susceptible population, by a typical infective individual during its entire period of Infectiousness [10] .

For the SIS model (4), following [11] , it can be proved that R 0 = ρ(−D −1 B). The reproduction number can be used to characterise the existence and stability of the equilibria of (4). As shown in [11, Theorem 2.3] , the disease-free equilibrium, i.e. the origin, is a globally asymptotically stable equilibrium of the system (4) if and only if R 0 < 1 (if matrix B is irreducible). And the endemic equilibrium, an equilibrium in int (R n + ), is globally asymptotically stable if and only if R 0 > 1. In other words, the necessary and sufficient condition to eradicate a disease for a population is to satisfy R 0 < 1.

The SIR model is quite similar to SIS, with a minor difference, namely, those who are cured, join the Removed, R, population, not S. Hence, the formulation for an SIR model is as follows:

15 . CC-BY 4.0 International license It is made available under a perpetuity.

is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. . https://doi.org/10.1101/2020. 10.16.20213835 doi: medRxiv preprint Again, assuming N i (t) = S i (t) + I i (t) + R i (t) is constant, similar to what was done in the previous section, if we set x i (t) = I i (t)/N i and y i (t) = R i (t)/N i andβ i,j = β i,j N j /N i and α i = γ i + µ i , µ i = 0, we obtain the following differential equation:

∀i = 1, · · · , n. In compact from, (6) can be written as follows:

where D = −diag (α i ) and B = (β ij ) > 0 and Γ = diag (γ i ) for i = 1, · · · , n.

The system (7) has the following properties.

x and g(y) = Γx with D, B and Γ defined as above are C 1 in R n , therefore, the solution for every initial condition in R n exists and is unique for all t ≥ 0.

(ii) To calculate the equilibria of the system we set f (x) = 0 and g(x) = 0. One equilibrium is the origin, Disease-Free equilibrium (DFE), and the other one, corresponds to the case in which the disease has swept through the population and a significant ratio of the population (and not necessarily all the population) belongs to Recovered compartment and I=0.

(iii) Basic Reproduction Number for (7), can be calculated using the same formula R 0 = ρ(−D −1 B), when all R i and I i are close to 0.

Property (iii) follows from the discussion in [18, Section 3] and the fact that this equation is derived from the model linearised around the origin. This also means that as the more Infectious ratio increases, the effective R 0 becomes less and less than ρ(−D −1 B).

The SEIR model is an extension of SIR in which the susceptibles enter a latent, E, compartment, before becoming Infectious. The formulation of an SEIR model can be stated as follows:

Again, we assume N i (t) = S i (t) + I i (t) + R i (t) is constant. Now we define x i (t) = I i (t)/N i and y i (t) = R i (t)/N i and z i = E i /N i andβ i,j = β i,j N j /N i and α i = γ i + µ i , µ i = 0, we obtain the following differential equation:

∀i = 1, · · · , n. In compact from, (9) can be written as follows:

where D = −diag (α i ) and B = (β ij ) > 0 and Γ = diag (γ i ) and Ξ = diag ( i ) for i = 1, · · · , n.

The same as in 7, the origin is also an equilibrium for 10. And using the same arguments as presented in the previous Section, we can use R 0 = ρ(−D −1 B) as a reasonable estimate for Basic Reproduction Number, R 0 when the trajectory is around the origin. When the trajectory moves away from the origin, ρ(−D −1 B) over-estimates the actual value of R 0 . is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. 

In order to solve ordinary differential equations (ODEs) (4) or (7) or or (10), we need to have a reliable estimate ofβ i,j and γ i for all i, j. Parameters and σ i are easy to estimate. If for example the average duration that individuals in a group i are Infectious is 5 days, then γ i = 1/5 = 0.20, given that we have chosen one day to be the unit of time. Estimatingβ i,j , on the other hand, is very difficult, and this section explains how we can estimate contact rates based on real-world data on the spread of COVID-19.

Column C2 in Table 5 shows the distribution of confirmed cases of COVID-19 in different age groups in China as of Feb. 11th [1] . We normalise these numbers to the relative distribution of each age group in the general population. By doing so, we can compare the differences in how different age groups are affected by the virus. We can do so by dividing values in Column C2 to those of Column C1. We can further divide the resulting numbers to the smallest of them, which happens to be the first row. Then we obtain the last column of Table 5 , which shows the normalised relative distribution of infective people in the Chinese population as of Feb. 11th.

The optimisation scheme aims to find the contact rates such that at a given time t g , the ratio of the values of the states in Systems (7) or (4) matches the values reported in the last column of Table 5 . At the same time, we should keep the basic reproduction number R 0 equal to its estimated value for the spread of COVID-19 in an unconstrained population. There are various estimates for R 0 for COVID-19, some are listed in [2] . Most estimates fall in [1.5, 3.5] range. I have chosen R 0 = 2.95 as reported in [19] .

Hence, the optimisation problem we need to solve is as follows: finding matrix B such that the following two conditions are satisfied:

(i) For a given dialogical matrix D and scalar R 0 = 2.95, R 0 = ρ(−D −1 B) (ii) The relative values of the states of system (4) or (7) at a given time t g and initial condition x 0 satisfy the values of the last column of Table 5 .

But how to choose t g and x 0 ? For that, I have relied on reports that the spread of the virus has probably started in Wuhan city, in late November [21] . Hence I have set t g = 75 days (meaning the spread of the virus has initiated 75 days before Feb 11th). I have also assumed 75 days before our target date of Feb. 11th, 2020, 1 in 100, 000 of the population are infected. That seems like a reasonable assumption, although we cannot be really sure. To overcome this uncertainty, adding more data points to the optimisation scheme, rather than only two data points as used here, can help. Now that all the required parameters are set, we can solve the optimisation problem to find a B opt . In order to solve the optimisation problem, I have used sqp algorithm in globalsearch function in Global Optimisation Toolbox in Matlab c . Optimisation is done in two steps, in the first step, initial values for matrix B are chosen randomly from a uniform distribution. When the optimisation algorithm converges to a solution, the optimisation procedure is repeated, this time with the optimum value obtained in the first step as the initial values.

The objective function in the optimisation scheme is the weighted sum of two terms. One is the 2norm of the difference between the ratio of trajectories of the set of ODEs at time t g and the desired ratios extracted from Table 5 . The second term is the difference between ρ(−D −1 B) and the desired basic reproduction number of R 0 = 2.95. The weights are tuned manually such that no term is obscured by the other one during the optimisation steps.

To implements the optimisation scheme, I have used an in-house package called MiTepid opt which is publicly available [7] . The optimisation algorithm runs in 15-20 minutes on 40 hyper-threaded CPUs of type Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz.

Note A.2.1 It should be noted that the values obtained from the optimisation schemes areβ i,j which are usable only for population distribution in China. But using the relationship β i,j =β i,j N i /N j , as defined in Section A.1.4, we can obtain normalised values that can be used for all population densities, where N i , N j are the population ratios in age groups i and j. For each other target population we can use the equationβ i,j = β i,j N j /N i to calculate theβ i,j in (10) and then solve the ODEs to obtain the evolution of the trajectories.

Note A.2.2 This methodology can be applied to any uncontained or contained population if the effective basic reproduction number, R 0 is known. But given the difficulty in estimating R 0 in a contained population, and the fact that most countries in the world now have a form of containment policy in place, the data collected in early stages of the spread of the virus in China seems to be the best available option for the optimisation scheme.

This Section briefly explains the optimisation scheme used in Section 3.4 to optimise the distribution of a limited number of vaccine units in the population. The optimisation algorithm used for this problem is the exact same global optimisation method used in Section A.2 and I will not repeat its details. But the objective function is obviously different. In this case, the objective function we aim to minimise is a weighted sum of two terms. The main term, is the final aggregate value of the Recovered compartment. As a reminder, the value of compartment R at each time represents the total number of those who were infected at any time in the past and are not Infectious any more. The second term is devised to make sure the total number of vaccine units remains constant. Simply, turning a constrained optimisation problem into an unconstrained optimisation problem to save computational time.

As already mentioned in Section 1, the SEIR model seems to be a suitable choice for the spread of SARS-CoV-2 in a population, compared to SIR [14] . But as a thought experiment, in this section we can have a look at how the virus spreads in a population if instead of 4.6 days in Exposed period and 5 days in Infectious period, there is no latent period and 9.6 days of Infectious period. Table 6 show the eventual Recovered and maximum Infectious ratio of the same countries as in Table 1 . Comparing the two tables we can see that the eventual Recovered Ratio (which is the total number of people who are infected at some stage) is less in SIR case, but the peak of instantaneous Infectious has decreased. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. 

. CC-BY 4.0 International license It is made available under a perpetuity.

is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

The copyright holder for this this version posted October 20, 2020. . https://doi.org/10.1101/2020. 10.16.20213835 doi: medRxiv preprint 

Breakdown of 44,672 sample patients infected with novel coronavirus COVID-19 in china as of february 11

MIDAS 2019 novel coronavirus repository

Population pyramids of the world

Stability criteria for sis epidemiological models under switching policies

A serological assay to detect sars-cov-2 seroconversion in humans

Reinfection could not occur in sars-cov-2 infected rhesus macaques

MiTepid opt: An optimisation package to estimate the contact rates of a stratifeid compartmental epidemielogical model

MiTepid sim: A python-based library to simulate the spread of COVID19 in a stratified population

A stratified model to quantify the effects of containment policies on the spread of covid-19

On the definition and the computation of the basic reproduction ratio R 0 in models for infectious diseases in heterogeneous populations

Epidemiological models and Lyapunov functions

Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand

Modelling the covid-19 epidemic and implementation of population-wide interventions in italy

Projecting the transmission dynamics of sars-cov-2 through the postpandemic period

Age-specific mortality and immunity patterns of sars-cov-2 infection in 45 countries

Sars-cov-2 infection fatality risk in a nationwide seroepidemiological study

Serology-informed estimates of sars-cov-2 infection fatality risk in geneva, switzerland

Reproduction numbers and subthreshold endemic equilibria for compartmental models of disease transmission

Estimating the basic reproduction number of covid-19 in wuhan, china

Antibody prevalence for sars-cov-2 in england following first peak of the pandemic

Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention