key: cord-0940266-zrmhrhft
authors: Reyné, Bastien; Saby, Nicolas; Sofonea, Mircea T.
title: Principles of mathematical epidemiology and compartmental modelling application to COVID-19
date: 2021-12-28
journal: Anaesth Crit Care Pain Med
DOI: 10.1016/j.accpm.2021.101017
sha: 2c05bc4efff0c1640799d009d55d39fb12f5b7d3
doc_id: 940266
cord_uid: zrmhrhft

Modelling is a rational simplification of a phenomenon, a formalisation that focuses on the parts considered essential to generate the observed patterns. In the case of infectious diseases, the mathematical equations on which models of their spread are based share a common backbone corresponding to transmission and recovery events, while the vast diversity of details depends on the pathogen, host population, prevention, and treatments considered [3]. The choice of the starting assumptions and of the formalism essentially depends on the initial question, but may also be guided by the kind of data available to calibrate the parameters.

Most models designed to capture the spread of an epidemic in a given population are said to be compartmental. These models derive from the seminal work of Kermack and McKendrick known as the SIR model, where the population is divided into three compartments: susceptible ($S$), infected ($I$), and removed ($R$) [4]. The model follows the change through time in the proportion of the population belonging to each compartment by numerically reproducing the course of an infectious disease epidemic: an infectious individual transmits the disease to a susceptible individual, who becomes infectious and transmits the disease in turn (perpetuating the epidemic) before recovering from (or dying of) the disease, after which the individual is assumed to make no further contribution to its spread.
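For reference, the transmission and recovery events described above are commonly written as a system of three differential equations. The block below is a sketch in the standard textbook form, not a transcription of a specific equation from the cited works. The notation is assumed: $\beta$ denotes a transmission rate, $\gamma$ a recovery (removal) rate, and $S$, $I$, $R$ the proportions of the population in each compartment.

```latex
\begin{align}
\frac{dS}{dt} &= -\beta S I, \\
\frac{dI}{dt} &= \beta S I - \gamma I, \\
\frac{dR}{dt} &= \gamma I.
\end{align}
```

The three right-hand sides sum to zero, so $S + I + R$ remains constant through time: individuals only move between compartments.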
Advanced and current models are designed with a larger number of connected compartments to better reflect the clinical and epidemiological features and outcomes of the disease, as well as to fit a particular question and the data on which dynamical inference can be performed. For instance, one of the main questions addressed by models during the first wave of the SARS-CoV-2 epidemic in France (in March 2020) was how many people at most would be simultaneously hospitalised due to COVID-19, in order to prevent or at least anticipate hospital overload. In such a case, a compartment for hospitalised infected individuals is introduced. As a first approach, we could imagine a model with one compartment for infected individuals who develop a mild or asymptomatic form of the disease and another for infected individuals who develop a severe form and later require hospitalisation before recovering or dying, as represented by the illustrative model presented in Figure 1. This simple model can then be improved by adding new compartments and amended as new knowledge or data become available (e.g., by specifying an incubation period).

Different mathematical formalisms exist, such as discrete-time modelling (see [5]), systems of partial differential equations (e.g., [6]), or even individual-based stochastic simulations [7], but the most common remains the system of ordinary differential equations (ODE) [3], that is, a set of conditions on the derivatives of unknown functions. These functions are simply the number of individuals in each compartment through time. Instead of trying to estimate directly, at each time step, the proportion of the population in each compartment, which would be very tedious, we simply estimate the number of individuals who move from one compartment to another. Thus, we need to know only two things: 1. the state of the epidemic at a given time $t$; 2.
what happens during the following small time interval $dt$. The first step is usually straightforward, since we may simply consider the beginning of the epidemic, when every individual but one is susceptible. The second step, however, is the mathematical translation of the disease dynamics. For instance, in the model presented above, the number of susceptible individuals at time $t + dt$ is given by the number of susceptible individuals at time $t$, minus the individuals who got infected between $t$ and $t + dt$. Assuming that the probability of infection is proportional to the current number of contagious individuals:

$$S(t + dt) = S(t) - \beta \, S(t) \, I(t) \, dt,$$

where $\beta$ is the transmission rate, which represents, for each susceptible individual, the probability of being infected per unit of time and per currently contagious individual in the population. Hence, we need to know the transmission rate $\beta$, which is often estimated from incidence data using external statistical procedures (e.g., reproduction number estimation as in [8]). We also need to know the current total number of infected individuals $I(t)$, that is, the sum of the mild or asymptomatic infections and the severe ones, which implies that the different compartments must be followed simultaneously.

This idea generalises to every compartment: at each time step, we determine the inflows and outflows of each compartment. For a given compartment, inflows correspond to a certain number of individuals per unit of time leaving other compartments, while departures occur at a constant rate, meaning that at each time step a constant proportion of the individuals in the compartment leaves it. Note that in the simplest models, the force of infection (the rate at which susceptible individuals become infected) is the only time-varying rate (as a function of prevalence) and therefore captures the whole non-linearity of the dynamics.
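The bookkeeping described above (a known state at a given time, plus the flows during a small time interval) can be sketched as an explicit Euler scheme. The code below is only an illustration for a model with mild, severe, and hospitalised compartments. All function names, parameter names, and numerical values are hypothetical, compartments are expressed as proportions of the population, and the small time step is an arbitrary choice.

```python
def simulate_epidemic(beta, p_severe, gamma_mild, eta_severe, rho_hospital,
                      initial_infected=1e-6, days=200.0, dt=0.01):
    """Explicit Euler stepping of an illustrative five-compartment model.

    State variables are proportions of the population:
    s (susceptible), i_mild, i_severe, h (hospitalised), r (removed).
    """
    s, i_mild, i_severe, h, r = 1.0 - initial_infected, initial_infected, 0.0, 0.0, 0.0
    trajectory = [(0.0, s, i_mild, i_severe, h, r)]
    for step in range(1, round(days / dt) + 1):
        # Flows during the small interval dt (individuals changing compartment)
        new_infections = beta * s * (i_mild + i_severe) * dt
        mild_recoveries = gamma_mild * i_mild * dt
        admissions = eta_severe * i_severe * dt
        discharges_or_deaths = rho_hospital * h * dt
        # Update each compartment: previous state + inflows - outflows
        s -= new_infections
        i_mild += (1.0 - p_severe) * new_infections - mild_recoveries
        i_severe += p_severe * new_infections - admissions
        h += admissions - discharges_or_deaths
        r += mild_recoveries + discharges_or_deaths
        trajectory.append((step * dt, s, i_mild, i_severe, h, r))
    return trajectory


# Hypothetical parameter values (rates per day), chosen only for illustration
trajectory = simulate_epidemic(beta=0.5, p_severe=0.05, gamma_mild=0.2,
                               eta_severe=0.2, rho_hospital=0.1)
peak_time, peak_hospital = max(((row[0], row[4]) for row in trajectory),
                               key=lambda pair: pair[1])
print(f"Peak hospital occupancy: {peak_hospital:.4f} of the population on day {peak_time:.0f}")
```

Only the infection flow depends on the state of another compartment. Every other flow is a constant proportion of the compartment it leaves, which is the constant-rate assumption stated above. In practice, a dedicated ODE solver would usually replace the hand-written loop.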
In more sophisticated models (including, e.g., weather effects, variant replacement, public health interventions such as transient social distancing or vaccination programmes, or immune waning), otherwise constant rates explicitly depend on time, which greatly increases the richness of the dynamics.

As for any quantitative approach, such models rely on simplifying assumptions, which constitute inherent limitations. The main ones are the lack of spatial structure (all encounters have the same probability of occurring) and of host heterogeneity (interindividual differences are smoothed out). Despite their unrealistic nature, these assumptions have proved to provide robust and conservative estimates in the early stages of an epidemic [9]. On a longer timescale, however, increasing the number of compartments to introduce a form of spatial structure and/or individual heterogeneity is both common and straightforward [3]. This might be, e.g., an age structure to account for age-differentiated severity, as for COVID-19, or a gender structure for modelling sexually transmitted infections. However, adding a structure requires further knowledge (literature, expertise, and data) to be implemented.

Another caveat, specific to the ODE formalism presented here, concerns the rate-based departures from the compartments. They imply that the time an individual still has to spend in a compartment does not depend on the time already spent there (which is rarely true). A workaround consists in chaining several compartments for a given clinical-epidemiological stage in order to shape the probability of an individual moving to the next stage according to the time already spent, as explained in Figure 2. For instance, [10] developed a model to estimate hospital occupancy in France in 2020. This model, shown in Figure 3, is structured by the age of the individuals.
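The effect of chaining compartments on residency times can be illustrated numerically. The sketch below assumes that a stage with a given mean duration is split into k identical sub-compartments, each left at the constant rate k divided by the mean duration. Under that assumption, the total time spent in the stage follows an Erlang distribution, which is a standard result and not a description of the specific model of [10]. The function names and the ten-day mean duration are hypothetical.

```python
import math


def prob_still_in_stage(t, mean_duration, k):
    """Probability of still being in a stage t time units after entering it.

    The stage is modelled as k chained sub-compartments, each left at the
    constant rate k / mean_duration, so the total time spent in the stage
    is Erlang-distributed with the requested mean (k = 1 is the plain
    exponential case).
    """
    x = k * t / mean_duration
    return math.exp(-x) * sum(x**n / math.factorial(n) for n in range(k))


def prob_leaving_soon(time_already_spent, window, mean_duration, k):
    """Probability of leaving within `window`, given the time already spent."""
    still_in = prob_still_in_stage(time_already_spent, mean_duration, k)
    later = prob_still_in_stage(time_already_spent + window, mean_duration, k)
    return 1.0 - later / still_in


# Hypothetical stage lasting 10 days on average: chance of leaving within one day
for k in (1, 2, 5):
    just_entered = prob_leaving_soon(0.0, 1.0, 10.0, k)
    after_ten_days = prob_leaving_soon(10.0, 1.0, 10.0, k)
    print(k, round(just_entered, 4), round(after_ten_days, 4))
```

With a single compartment (k = 1), the two probabilities are identical, which is the memoryless property. With chained compartments, an individual who has just entered the stage is less likely to leave within a day than one who has already spent ten days in it.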
The difference in the number of compartments between Figure 1 and Figure 3 is mainly due to correcting the residency-time memory problem mentioned above.

From models, we can also retrieve the basic reproduction number, $\mathcal{R}_0$, that is, the average number of secondary infections caused by an index case [11]. This key epidemiological descriptor not only quantifies the contagiousness of the disease but also relates to the epidemic risk (what is the probability that an outbreak occurs?), the herd immunity threshold (what is the minimum vaccine coverage needed to prevent any further outbreak?), and the attack rate (what proportion of individuals is eventually infected in the absence of intervention?). It can be intuitively seen as $\mathcal{R}_0$ = number of contacts per day × probability of transmission per contact × infection duration (in days). Its precise derivation, however, depends on the model considered. In a simple SIR model, the $I$ compartment satisfies

$$\frac{dI}{dt} = \beta \, S(t) \, I(t) - \gamma \, I(t),$$

where $\beta$ is the transmission rate and $\gamma$ is the recovery rate. In this setting, an outbreak occurs whenever the number of infected individuals initially increases. From a mathematical point of view, this corresponds to a positive derivative of the prevalence at $t = 0$, $(I(dt) - I(0))/dt > 0$, which, using the previous equation, can be rewritten as

$$\frac{\beta \, S(0)}{\gamma} > 1,$$

the left-hand side being the basic reproduction number $\mathcal{R}_0$ of this model. As immunity builds up in the population (and assuming neither immune waning nor new variants), the average number of secondary infections per case eventually decreases, and the epidemic dynamics are then described by the real-time analogue of $\mathcal{R}_0$, namely the temporal reproduction number $\mathcal{R}_t$.

One of the major criticisms levelled against these models is that their predictions would be wrong. It is worth noting that such mechanistic models are usually not aimed (and should not be aimed) at predicting the future, but at simulating the most likely trends given a set of assumptions, such as a pre-established contact rate in the population.
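To make the role of the basic reproduction number discussed above more concrete, the sketch below computes it from its intuitive product form and derives two of the related quantities. It relies on assumptions that go beyond the text: a simple SIR model with homogeneous mixing, fully protective immunity, and an initially almost entirely susceptible population. Under these assumptions, the herd immunity threshold is 1 - 1/R0 and the final attack rate z solves z = 1 - exp(-R0 z). The function names and the contact values are hypothetical.

```python
import math


def basic_reproduction_number(contacts_per_day, transmission_prob, infectious_days):
    """Intuitive product form: contacts x transmission probability x duration."""
    return contacts_per_day * transmission_prob * infectious_days


def herd_immunity_threshold(r0):
    """Minimum immune fraction for which each case causes fewer than one new case."""
    return max(0.0, 1.0 - 1.0 / r0)


def final_attack_rate(r0, iterations=100):
    """Solve z = 1 - exp(-r0 * z) by bisection (simple SIR, no intervention)."""
    if r0 <= 1.0:
        return 0.0  # no large outbreak expected below the threshold
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if mid - 1.0 + math.exp(-r0 * mid) < 0.0:
            low = mid   # still to the left of the non-zero root
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical values: 10 contacts per day, 5% transmission per contact, 6 infectious days
r0 = basic_reproduction_number(contacts_per_day=10, transmission_prob=0.05, infectious_days=6)
print(r0, herd_immunity_threshold(r0), final_attack_rate(r0))
```

These closed-form relations hold only for the simplest model. With additional compartments, age structure, or time-varying rates, the corresponding quantities are generally obtained from the model itself rather than from these formulas.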
In a one-year retrospective analysis, [12] showed that projections from such models can help anticipate COVID-19 critical care bed occupancy for more than a month, on average. However, mechanistic models perform poorly in the two weeks that follow a steep change in the transmission pattern when no analogous period has occurred before, e.g., at the first implementation of a national curfew, the efficiency of which is not yet known and requires consolidated testing data to be assessed. Given their simplifying nature and elementary mathematical formulation, compartmental models thus represent a trade-off between flexibility, robustness, and accuracy, which explains their central role in the monitoring and control of epidemics.

Figure 1. Susceptible individuals ($S$ compartment) become infected after being in contact with infected individuals. They can develop an asymptomatic or mild form of the disease before recovering, or they can become severely infected. In the latter case, they transmit the disease like the other infected individuals and then end up in hospital before recovering or dying.

Figure 2. In a classical and simple SIR model, the time an individual remains in a given compartment, like the infected compartment ($I$) in Model 1, follows an exponential distribution, which is memoryless, meaning that the time still to be spent in the compartment does not depend on the time already spent there, which is unrealistic. A workaround consists in chaining compartments, such as $I_1$ and $I_2$ in Model 2, thus creating some heterogeneity and adding memory. For instance, this makes it possible to specify that an individual who has just entered a compartment has a very low probability of clearing the disease instantly whereas, on the contrary, an individual who has already spent significant time infected has a higher probability of clearing it.

Figure 3. Model developed by [10] to estimate conventional hospital bed occupancy and ICU bed occupancy.
In this model, individuals can be either susceptible, exposed, infected but not hospitalised, hospitalised in conventional beds, hospitalised in ICU, or removed.

[1] Special report: The simulations driving the world's response to COVID-19
[2] Opinion: What models can and cannot tell us about COVID-19
[3] Modeling Infectious Diseases in Humans and Animals
[4] A contribution to the mathematical theory of epidemics
[5] Memory is key in capturing COVID-19 epidemiological dynamics
[6] Age-structured nonpharmaceutical interventions for optimal control of COVID-19 epidemic
[7] Emerging dynamics from high-resolution spatial numerical epidemics
[8] Improved inference of time-varying reproduction numbers during infectious disease outbreaks
[9] Inferring R0 in emerging epidemics - the effect of common population structure is small
[10] Estimating the burden of SARS-CoV-2 in France
[11] Infectious diseases of humans: dynamics and control
[12] Anticipating COVID-19 intensive care unit capacity strain: A look back at epidemiological projections in France

We thank the ETE modelling team for discussion, as well as the University of Montpellier, the CNRS, and the IRD for their logistical support.

We declare no competing interest.

Along with the start of the SARS-CoV-2 pandemic, mathematical epidemiology quickly reached a new and broader audience in early 2020 [1]. While epidemiological models were sometimes recognised as useful tools for decision support, their relevance was also challenged, notably among decision-makers, but also by health workers and the public. Although their scope and limitations must be acknowledged [2], an understanding of the modelling principles and main assumptions by non-specialists appears essential for both transdisciplinary improvement and better use of their results.