key: cord-0329810-2h8joq19 authors: Lazebnik, Teddy; Bunimovich-Mendrazitsky, Svetlana title: Generic Approach For Mathematical Model of Multi-Strain Pandemics date: 2021-11-16 journal: bioRxiv DOI: 10.1101/2021.11.16.468823 sha: 23efd591ad9c1c6c93420a15fbd801a8c92fa34c doc_id: 329810 cord_uid: 2h8joq19 Pandemics with multi-strain have become a major concern. We introduce a new model for assessing the connection between multi-strain pandemic and the mortality rate, basic reproduction number, and the maximum of infected individuals. The proposed model provides a general mathematical approach for representing multi-strain pandemics, generalizing for an arbitrary number of strains. We show the proposed model fits well with epidemiological historical data world health over a long period. From a theoretical point of view, we show that the increasing number of strains increases logarithmically the maximum number of infected individuals and the mean mortality rate. Moreover, the mean basic reproduction number is statistically identical to the single, most aggressive strain pandemic for multi-strain pandemics. 1 Introduction and Related Work 1 Humanity has experienced multiple types of disasters over the centuries [1] [2] [3] [4] . One of 2 them is (local and global) pandemics that cause significant mortality [5] . Moreover, 3 recent studies show that the occurrence rate of new pandemics has increased in the last 4 century, resulting in an increased number of pandemics and their influence [6] . Some of 5 these pandemics exert a global influence such as HIV/AIDS that killed 680 thousand 6 individuals only in 2020 according to the World Health Organization (WHO) 1 or the 7 COVID-19 pandemic that killed 4.5 million individuals and infected around 440 million 8 individuals worldwide during its first 18-months [7] . As a result, the need for 9 policymakers to be able to control a pandemic spread is becoming more relevant by the 10 day [8] . promising outcome. Nonetheless, the evaluation is limited to a small size (up to a few 48 hundred individuals), and the generalization to larger populations can be less accurate 49 due to the increased chance that a strain occurs during the pandemic and changes its 50 dynamics [26] . 51 Similarly, Bunimovich-Mendrazitsky and Stone [27] proposed a two-age group, adult 52 and children, for the Polio pandemic spread. Using the model in [27] , the extraordinary 53 jump in the number of paralytic polio cases that emerged at the beginning of the 20th 54 century can be explained. The model does not take into consideration some strains of 55 Polio [28] which results in an increased divergence from the actual dynamics over time. 56 In addition, one of the main extensions of the SIR model is the SIRD (D-Dead) 57 model, as this model takes is able to represent the reinfection process and the death of 58 individuals due to the pandemic [29] [30] [31] . This model better represents the 59 biological-clinical dynamics in human populations as the long-term immunity memory is 60 reduced over time making the individual susceptible again [32, 33] . We based our model 61 on this extension as it allows reinfection in several strains of the original strain. The mentioned models and other models that extend the SIR model can fairly fit 63 and predict the course of a pandemic spread [34, 35] . However, the models are not fitted 64 to capture sharp changes in the dynamics due to the pandemic modifications. One assume clinical-epidemiological dynamics that hold only for a subset of pathogens with 72 cross-immunity of less than 0.4 [36] . In a similar manner, Dang et al. [37] developed a shortcoming, from an analytical point of view, due to its stochastic and chaotic nature. 84 Moreover, the usage of multi-strain models that are used for specific pathogens is not 85 restricted to influenza. Marquioni and de Aguiar [38] proposed a model where a consideration compared to the other case [38] . Likewise, Khayar and Allali [39] 90 proposed a SEIR (E-exposed) model for the COVID-19 pandemic with two strains. The 91 authors analyzed the influence of the delay between exposure and becoming infectious 92 on several epidemiological properties. Furthermore, they proposed an extension to the 93 model (in the appendix of the manuscript) for multi-strain dynamics. In their model, an 94 individual can be infected only once and develop immunity to all strains [39] . In our 95 model, we relax this assumption, allowing individuals to be infected once by each strain. 96 Comparably, Gubar et al. [40] proposed an extended SIR model with two strains with 97 different infection and recovery rates. The authors considered a group of latent 98 individuals who are already infected but do not have any clinical symptoms. In this research, we developed an extension of the SIRD-based model which allows 115 an arbitrary number of strains |M | that originated from a single strain and is generic for 116 any type of pathogen. The model allows each strain to have its unique epidemiological 117 properties. In addition, we developed a computer simulation that provides an in silico 118 tool for evaluating several epidemiological properties such as the mortality rate, max 119 infections, and average basic reproduction number of a pandemic. The proposed model 120 allows for a more accurate investigation of the epidemiological dynamics while keeping 121 the data required to use the model relatively low. This paper is organized as follows: In Section 2, we introduce our multi-strain 123 epidemiological model. Based on the model, we present a numerical analysis of three 124 epidemiological properties as a function of the number of strains (|M |). In Section 3, we 125 present the implementation of the model for the case of two strains (|M | = 2) and 126 provide an analytical analysis of the stable equilibria states of the model and a basic 127 reproduction number analysis. Afterward, we show the ability of the model to fit 128 historical epidemiological data known to have two strains. In Section 4, we discuss the 129 main advantages and limitations of the model and propose future work. where i ∈ M is the index of a strain and J ∈ P (M ) is the set of strains an individual is the dynamical amount of individuals that recovered from a 166 group of strains J ∈ P (M ) over time. It is affected by the following two terms. First, for each strain i ∈ J, an individual who has recovered from group J\{i} of strains and 168 is infected with strain i, recovers at rate γ J\{i},i with probability of ϕ J\{i},i . Second, 169 individuals infected by strain i with rate β J,i . These individuals can be infected by any 170 individual with a strain i who has recovered from any group K of strains, so that i ̸ ∈ K. 171 (3) In Eq. (4), dD(t) dt is the dynamical amount of dead individuals over time. For each 172 strain i, and for each group J\{i}, infected individuals that do not recover are dying at 173 rate γ J\{i},i with the complete probability (1 − ϕ J\{i},i ). The dynamics of Eqs. (2-4) are summarized in Eq. (5) . The initial conditions of Eq. (5) are defined for the beginning of a pandemic as follows: 176 Based on the proposed model, and since for all the cases where |M | > 2 it is extremely 178 hard (or even impossible) to obtain an analytical result, we evaluated three important 179 epidemiological properties to see the influence of the number of strains on the pandemic 180 spread: mean basic reproduction number, mortality rate, and a maximum number of 181 the infectious. Formally, these properties can be defined as follows. First, the mean basic reproduction number is the mean basic reproduction number 183 overtime during the course of the pandemic. Therefore, it takes the form: Second, the mortality rate is defined as the number of deaths due to the pandemic 185 divided by the number of infections at some period of time. If not stated otherwise, we 186 assume the mortality rate refers to the entire duration of the pandemic. Hence, it takes 187 the form: . Finally, the maximum number of infectious refers to the cumulative number of 189 infections that occur during the pandemic. Thus, it takes the form: . In addition, we define the most aggressive strain using the following metric: a strain 191 k is considered more aggressive than strain l if and only if: properties associated with cross-immunity between strains. In addition, we assume the 202 population size is 10 million individuals to approximate (in order of magnitude) a 203 European metropolitan area. The simulation begins with one person getting infected by 204 each strain. In addition, it is assumed that no individuals have recovered or died as a 205 result of the pandemic. Formally, the initial conditions take the form: Moreover, due to the stochastic nature of the simulation originating in the large ranges 207 of values allocated to the model parameters, the simulation is repeated 1000 times, and 208 the mean ± standard division is presented. Using this generation we compute the 209 connection between |M | and the mean basic reproduction number, max infected 210 individuals, and mean mortality rate. The mean basic reproduction number (E[R 0 ]) has been evaluated for each simulation, 212 divided into two cases: the case where strain has unique parameter values and the case 213 The fitting function was obtained with a coefficient of determination R 2 = 0.79. The mean mortality rate as a function of the number of strains has been computed 224 and presented in Fig. 4 The fitting function was obtained with a coefficient of determination R 2 = 0.89. materials, Section S1. A schematic transition between disease stages of an individual for 238 the case of |M | = 2 is shown in Fig. 5 . Thus, the model for two strains is described by 239 eight equations as follows. The initial conditions of Eq. (9) are defined for the beginning of a pandemic as 241 November 11, 2021 9/20 follows: We use a model that does not allow temporary cross-immunity and without increased 244 susceptibility to the second infection. Therefore, Eq. (12) takes the form: From Eq. (12), the pandemic-free equilibria is obtained where because in this state, there are no more infected individuals which means all strains 254 have gone extinct. Therefore, the equilibria states take the form: According to [49] , this set of states (Eq. (13) The basic reproduction number, R 0 , is defined as the expected number of secondary 270 cases produced by a single (typical) infection in a completely susceptible population [50] . 271 In the case of a SIR-based model, the basic reproduction number indicates an epidemic 272 outbreak if R 0 > 1 or not if R 0 < 1. To find the basic reproduction number for two strains, we use the Next Generation 274 Matrix (NGM) approach [51] . First, we compute the new infections matrix Afterward, we compute the transfers of infections from one compartment to another (15) Now, R 0 is the dominant eigenvalue of the matrix [51] . which is obtained from the root of the representative polynomial: (17) Using the Matlab's (version 2021b) symbolic programming, one is able to obtain the R 0 . 280 Just find the roots of the polynomial shown in Eq. (17) and take the biggest one. This approach cannot be generalized for more than two strains |M | > 2 as the NGM 282 will be of size k × k where k = Σ |M | i=1 n x . Namely, the size of the NGM is larger than 283 four and according to Galois theory [52] and based on Abel-Ruffini theorem [53] , the 284 roots of the representative polynomial of the NGM cannot be obtained using radicals. This means one cannot analytically find the eigenvalues of the NGM which are used to 286 obtain R 0 . The proposed epidemiological model parameter for the case |M | = 2 is obtained by 294 fitting the proposed model onto the historical data from April 1 (2020) to December 1 295 (2020) of UK by the WHO [7] , using the fourth-order Runge-Kutta [47] and gradient 296 descent [54] algorithms. These dates are picked as the population in the UK during this 297 period had not been vaccinated against the COVID-19 disease yet and a second strain 298 (i.e., the COVID-19 UK Variant -B.1.1.7) appeared according to [55] , which based their 299 analysis on clinical testing and later reverse engineering of the mutation appearance [56] . 300 Both point to the same period while do not have a full agreement on the specific dates 301 of the appearance of the mutation. Specifically, we randomly guess the values of the 302 model's parameters, solving the system of ODEs using the fourth-order Runge-Kutta 303 method and computing the Gaussian (L 2 ) distance from the historical data. In 304 particular, we used the daily number of infection, recovered, and deceased individuals. 305 As such, the fitness function takes the form: Afterward, we repeated this process while modifying the value of a single parameter 311 by some pre-defined value δ = 0.01, obtaining a numerical gradient. At this stage, we 312 used the gradient descent algorithm in order to find the values that minimize the 313 model's L 2 distance from the historical data using Eq. (18) . The process is stopped 314 once the gradient's (L 1 ) norm is smaller than some pre-defined threshold value ϵ = 0.1. 315 The entire process is repeated r = 100 times and the parameter values that are obtained 316 most often are decided to be the model's parameter value. The values for (δ, ϵ, r) are 317 manually picked. A schematic view of the fitting method is presented in Fig. 6 . In order to numerically evaluate the ability of the proposed model to fit real 320 epidemiological data, we decided to simulate the COVID-19 pandemic in the United 321 Kingdom (UK). This case is chosen due to the availability of epidemiological data and 322 since a COVID-19 strain is known to originate in the UK [7, 57] . Therefore, we where R ∅ (0) = 67200000 to represent the size of the UK population in the beginning of 325 the pandemic. A summary of the obtained parameter values is shown in Table 1 , such 326 that 27% of the random parameter value initial conditions converged to the values with 327 d L2 = 0.089. Namely, the model has a daily mean square error of 8.9%. Symbol Value Infection rate of the strain (i = 1) [1] β ∅,1 0.04 Infection rate of the strain (i = 2) [1] β ∅,2 0.07 Infection rate of the strain (i = 2), after recovery from the strain (i = 1) [1] β {1},2 0.01 Infection rate of the strain (i = 1), after recovery from the strain (i = 2) [1] β {2},1 0.02 The average duration that is takes for an individual to recover from the strain (i = 1) in days [t −1 ] γ ∅,1 0.08 The average duration that is takes for an individual to recover from the strain (i = 2) in days [t −1 ] The average duration that is takes for an individual to recover from the strain (i = 1) after recovering from the strain ( The average duration that is takes for an individual to recover from the strain (i = 2) after recovering from the strain ( The probability an infected individual will recover from the strain (i = 1) [1] ϕ ∅,1 0.98 The probability an infected individual will recover from the strain (i = 2) [1] ϕ ∅,2 0.96 The probability an infected individual will recover from the strain (i = 2) after recovering from the strain (i = 2) [1] ϕ {1},2 0.99 The probability an infected individual will recover from the strain (i = 3) after recovering from the strain (i = 1) [1] ϕ {2},1 0.99 Table 1 . A summary of the model parameters and values for the case of |M | = 2, obtained from the fitting process to the historical WHO COVID-19 data from April 1 (2020) to December 1 (2020). A fitting dynamics between the historical data (circle, black) and the model's 329 prediction (axes, blue) is shown in Fig. 7 , where the x-axis describes the time from We have shown that for the case of only two strains (e.g., |M | = 2), the only stable 341 equilibria states are when the pandemic is over for both strains Table 1 . equilibrium analysis is that the pandemic-free states are stable only when the epidemics 344 of the two strains cease; that is, after the end of the general pandemic (Eq. (13)). Moreover, an analytical computation of the basic reproduction number (R 0 ) requires 346 information on infections between individuals with different strains, which is not 347 realistically available. Therefore, an immediate result of the model is that once a 348 pandemic developed secondary strains, a numerical and statistical approximation of R 0 349 is left to be the only feasible approach. In addition, the proposed model is evaluated on the COVID-19 pandemic (for the 351 case of the UK) and has shown promising ability to fit a long period of historical data 352 with multi-strain (eight months, 8.9% daily mean square error). A prediction of the last 353 two months of this period is shown in Fig. 7 , based on the obtained model's parameter 354 values which are presented in Table 1 . Strain i = 1 is mapped to the original strain of 355 COVID-19. Since at the beginning of the pandemic, this was only a single strain, the 356 measured epidemiological values are necessarily associated with this strain. This is not 357 the case for measurements of periods where two or more strains existed. The proposed 358 model captures a general trend of decreasing R 0 during this period while not matching 359 the data closely as it does not take into consideration other social and epidemiological 360 dynamics intentionally, allowing analytical analysis to be considered. However, future 361 extensions of the proposed model should be able to predict more closely historical 362 pandemic events. According to Voinsky et al. [58] , the average recovery rate of strain (i = 1) is 0.0714 364 while the model predicted γ ∅,1 = 0.08 (where the approximation size is δ = 0.01), as 365 presented in Table 1 . In addition, according to WHO [7] , the average mortality rate of 366 this period is ∼ 0.0138 while the model predicted that the average mortality rate from 367 this strain is 1 − ϕ ∅,1 = 0.02. Thus, while the model is simple, it is able to capture the 368 biological and epidemiological properties of the pandemic. 369 Furthermore, we evaluated the influence of the number of strains on the mean basic 370 reproduction number (E[R 0 ]), mortality rate, and a maximum number of infected 371 individuals, as shown in Figs. 2, 4, and 3 , respectively. We show that the basic 372 reproduction number is upper bounded by taking into consideration only the most 373 aggressive strain. Formally, we perform a paired two-tail T-test in order to evaluate if 374 the processes differ in a statistically significant way with α = 0.05 and obtain that 0 is 375 not in the confidence interval of the statistical test. As such, the most aggressive strain 376 approximation to the pandemic can be used as an upper boundary for the mean basic 377 reproduction number. An immediate outcome is that the proposed model is upper Based on Eq. (7), the maximum number of infected individuals is growing in a 385 logarithmic manner to the number of strains when the latter occurs simultaneously. In a 386 similar manner, based on Eq. (8), the mortality rate is growing in a logarithmic manner 387 to the number of strains when the latter occur simultaneously. We numerically show in 388 Figs. 3 and 4 that the epidemiological properties which indicate the severity of the 389 pandemic in a well-mixed population grow in a logarithmic manner as a function of the 390 number of strains (|M |). This connection indicates that the first few strains make a 391 relatively large contribution to the mortality and pandemic spread dynamics, but as the 392 number of strains grows, each strain contributes less to these numbers. Policymakers 393 can take advantage of this link when planning intervention policies to contain the 394 spread of a pandemic, given that new strains can emerge during pathogen mutation. The code developed for this model is publicly available as an open source 2 . Several possible future research directions emerge from the proposed initial modeling. 397 First, one can introduce a fixed delay parameter to the occurrence of strains, 398 investigating the influence of this parameter on the epidemiological spread. Second, one 399 can take into consideration more detailed biological settings, assuming the stochastic 400 occurrence of the strains from some distribution. Both directions aim to better 401 represent a real pandemic where several strains do not exist from the beginning of the 402 pandemic. Moreover, one can introduce a similarity matrix between the strains as they 403 are mutations of an original strain, which are reflected by the immunity response to 404 re-infection of different strains or a cross-immunity response. In the same direction, 405 adding an Exposed state would make the proposed model more biologically accurate, 406 since most strains have an incubation period before the host becomes infectious. The 407 multi-strain model is a theoretical platform that will help guide the decision-making Historical Disasters in Context: Science Economic Change, Disasters, and Migration: The Historical Case 414 of Manchuria. Economic Development and Cultural Change The Historical Development of Public Health Responses to 416 Disasters Better Understanding Disasters by Better Using 418 Historical and methodological highlights of quarantine measures: from 421 ancient plague epidemics to current coronavirus disease (COVID-19) pandemic A Literature Review of the Economics 424 of COVID-19 WHO Coronavirus Disease (COVID-19) Dashboard Pandemic management by a 429 spatio-temporal mathematical model. International Journal of Nonlinear Sciences 430 and Numerical Simulation Medical Science, Infectious Disease, and the Unity of Humankind Economic 434 growth, urbanization, globalization, and the risks of emerging infectious diseases 435 in China: A review Dynamics of a competing two-strain SIS epidemic 437 model with general infection force on complex networks Genetic Diversity in the SIR 440 Model of Pathogen Evolution Modelling seasonality and viral mutation 442 to predict the course of an influenza pandemic A contribution to the mathematical theory of 445 epidemics Simulation of emotional contagion using 447 modified SIR model: A cellular automaton approach Temporal Influence of 450 Non-Pharmaceutical Interventions Policies on Pandemic Dynamics and the Economy: The Case of COVID-19. Epidemiologic-Economic Evidence from an Estimated Spatial Econ-SIR Model PDF Logo S-I-R Model with Directed Spatial Diffusion. An 456 A Spatial Stochastic SIR Model for Transmission 458 Networks with Application to COVID-19 Epidemic in China Comparison of Pandemic Intervention Policies in Several 460 Building Types Using Heterogeneous Population Model. medrxiv Antiviral treatment for pandemic 462 influenza: Assessing potential repercussions using a seasonally forced SIR model Management of blood supplies during 465 an influenza pandemic The SIR model and the Foundations of Public Health A SIR model assumption for the spread 469 of COVID-19 in different communities Modeling Spread of Polio with the Role of 472 Modelling seasonality and viral 474 mutation to predict the course of an influenza pandemic Modeling polio as a disease of 477 development Determination of the mutation rate of 479 RNA-dependent RNA polymerase An SIS epidemiology game with two subpopulations Biological Dynamics Stability of periodic solutions for an SIS model with pulse 483 vaccination Threshold behavior of a stochastic SIS model with 485 Levy jumps Influenza: Options to Improve Pandemic 487 Time-dependent heterogeneity leads to transient suppression of the Inefficiency of SIR models in forecasting COVID-19 epidemic: a case study of 493 Mathematical models to 495 characterize early epidemic growth: A review Improving the realism of deterministic multi-strain 498 models: implications for modelling influenza A Competitive exclusion in a multi-strain 501 Modeling neutral viral mutations in the spread 504 of SARS-CoV-2 epidemics Global dynamics of a multi-strain SEIR epidemic model with 506 general incidence rates: application to COVID-19 pandemic Optimal Control of Heterogeneous Mutating 509 Human mobility 511 networks and persistence of rapidly mutating pathogens A data-driven model of 514 the COVID-19 spread among interconnected populations: epidemiological and 515 mobility aspects following the lockdown in Italy Not just antibodies: B cells and T cells mediate immunity 518 to COVID-19 pathogenic IgG antibodies Immunoinformatics and structural analysis for identification of immunodominant 524 epitopes in SARS-CoV-2 as potential vaccine targets Cross-immunity between respiratory coronaviruses may limit 526 COVID-19 fatalities A fourth order Runge-Kutta RK(4,4) method with error 528 control Analytically Obtain the Asymptotic Stable Equilibria States of Extended 533 Reproduction numbers and sub-threshold 535 The construction of next-generation 538 matrices for compartmental epidemic models The Mathematical Writings ofÉvariste Galois Mémoire sur leséquations algébriques, ou l'on démontre l'impossibilité 543 de la résolution de l'équation générale du cinquième degré The method of steepest descent for non-linear minimization problems Covid-19: What have we learnt about the new variant in the UK? vaccination intention in the UK: results from the COVID-19 vaccination 551 acceptability study (CoVAccS), a nationally representative cross-sectional survey Covid-19: New coronavirus variant is identified in UK Effects of age and sex on recovery from 555 COVID-19: Analysis of 5769 Israeli patients