key: cord-1034375-ayg6qyoa
authors: Bogdanov, O. V.; Knopov, P. S.
title: Stochastic Models in the Problems of Predicting the Epidemiological Situation
date: 2022-01-28
journal: Cybern Syst Anal
DOI: 10.1007/s10559-022-00435-4
sha: 5a066a6ec53a2deedacecfe133de0f85eff3d988
doc_id: 1034375
cord_uid: ayg6qyoa

The paper investigates some stochastic models with discrete and continuous time for solving important problems of predicting the spread of epidemic diseases in the population. Various factors of epidemic spread and the main parameters influencing the forecast are taken into account. Some test calculations based on the proposed methods have been performed.

The COVID-19 pandemic has become a global challenge for humanity in the 21st century. It requires adequate methods and means of control. In the absence of herd immunity and coronavirus drugs, and with unequal access to vaccines, the epidemic threatens human life and health. At the same time, long-term quarantine and other measures to limit the pandemic cause economic damage and hamper economic development. Therefore, decisions in disease spread control need special consideration: on the one hand, the lives and health of a large number of people are at stake; on the other, there are significant economic losses and potential impoverishment. Under these conditions, there is a growing need for modeling and decision-making support tools based on accurate calculations of the consequences of such decisions. Such tools include various models for predicting the epidemiological situation and medical assistance needs, models for predicting the economic consequences of governmental (regional) decisions to limit the epidemic, etc. It is also necessary to take into account the various risks and uncertainties that arise in modeling such complex processes, whose components are stochastic (uncertain) in nature.
This requires appropriate mathematical methods, including random processes and fields, stochastic differential equations, regressions of a special kind, the modern apparatus of risk measures, etc. In what follows, we present some approaches to solving the above problems. As initial models, we took SIR (SEIR) and similar epidemiological models, which allow predicting the impact of restrictive measures on the dynamics of the spread of the disease. The main factor in these models is the reproduction rate of the virus (reproduction coefficient), which depends significantly on such measures. The study [1] analyzes the impact of school and workplace closures, public event cancellations, suspension of public transportation, and restrictions on domestic and international movement on the daily rate of spread of the disease. The main problem for such models is the difficulty of calibrating (identifying) them from real data. More detailed models require more complete data on the disease profile and its prevalence. In deterministic models, most parameters represent average values. Such models account for the stochasticity of the processes only on average and successfully approximate the situation for large populations in a homogeneous environment. Stochastic models, in contrast to their deterministic analogs, reflect the course of the processes more adequately, especially for local or transient processes. The present article describes some models for predicting the epidemiological situation and attempts to calibrate them. A further issue is incomplete testing of the population and the latent course of the disease with mild symptoms in some people.
According to [2], in the SIR model the population is divided into categories (compartments): S (Susceptible), I (Infected, i.e., detected illness cases), and R (Recovered), whose dynamics are described by the corresponding system of differential equations (1). Note that the distribution of the population over these categories satisfies the overall balance: the categories sum to the total population. Therefore, medium- and long-term modeling should take into account the dynamics of the population N_0(t) (demographics), given the number of births b per unit time and the mortality rate m in the population; this gives model (2). The SEIR model includes an additional category E (Exposed) for people in the incubation period, when a person has become infected but does not yet infect others. In addition to the dynamics of the category E(t), the previous equations are supplemented with the parameter T_inc, the average incubation period, which gives system (3). A variable D, which describes the mortality caused by the infection [3], can be introduced into this model. Then the third equation of system (3) is modified and a fifth equation, for D, is added, where Cfp is the average proportion of fatal (lethal) cases among infected people. The models described can then be refined by adding new categories, such as people who need hospitalization and those who require ventilators. However, detailed models require appropriate calibration. Even the simple model (3) requires data on the distribution of infected people between the categories E and I, i.e., between those who are still in the incubation period and those who have already left it. Since testing cannot be considered sufficient, another parameter, P_inf, can be introduced into model (1), which gives system (4). In this case, the parameters R_0, T_inf, and P_inf of system (4) can be estimated from the proximity of the trajectories I*(t) and S(t) to the respective observations. Since they can only be estimated from the trajectory I*(t), the parameter P_inf does not affect the solution.
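The compartment structure described above can be illustrated with a minimal numerical sketch. The right-hand sides below are the standard textbook SEIR parametrization in terms of R_0, T_inf, and T_inc with a transmission rate R_0 / T_inf; this form is an assumption and may differ in detail from systems (1)-(3). The function name `simulate_seir`, the explicit Euler scheme, and all numerical values are illustrative choices, not taken from the paper.

```python
def simulate_seir(n_pop, r0, t_inf, t_inc, e0, i0, days, steps_per_day=10):
    """Integrate an assumed standard SEIR system with explicit Euler steps.

    Returns a list of (S, E, I, R) tuples, one per day, starting at day 0.
    Demographics (births, background mortality) are not modeled here.
    """
    dt = 1.0 / steps_per_day
    s, e, i, r = float(n_pop - e0 - i0), float(e0), float(i0), 0.0
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            # Assumed flows: S -> E at rate (R_0 / T_inf) * S * I / N,
            # E -> I at rate E / T_inc, I -> R at rate I / T_inf.
            new_exposed = r0 / t_inf * s * i / n_pop * dt
            new_infectious = e / t_inc * dt
            new_removed = i / t_inf * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
        history.append((s, e, i, r))
    return history


if __name__ == "__main__":
    # Illustrative values only: R_0 = 2.5 and T_inc = 5.2 are assumptions.
    run = simulate_seir(n_pop=1_000_000, r0=2.5, t_inf=4.5, t_inc=5.2,
                        e0=0, i0=100, days=200)
    peak_day = max(range(len(run)), key=lambda d: run[d][2])
    print("day of peak I:", peak_day, "peak I:", round(run[peak_day][2]))
```

Every flow leaves one compartment and enters another, so the sketch preserves the balance S + E + I + R = N at each step. Setting `t_inc` very small makes E empty almost immediately, which approximates the SIR case.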
Because P_inf does not affect the solution, system (4) is effectively the same as system (1). Let us return to system (1), which we fit to the observations of active infected people y_t, t = 1, ..., n, defined as the daily "total number of infected" minus "the total number of those who are no longer ill." Since the process under study changes dynamically, the fitting criterion (5) is taken as a summation over these observations. Let the population of Ukraine at the time of modeling be N_0 = 41 million 858 thousand people (according to some estimates as of March 1, 2020). The population dynamics can be taken into account according to model (2). Note that the parameter T_inf is the average period of time during which an infected person spreads the disease. This parameter is determined by the interaction of the virus with the human body, so it cannot change arbitrarily. According to the Oxford model for COVID-19, T_inf ≈ 4.5. Therefore, we can restrict the parameter search by the condition 4.4 ≤ T_inf ≤ 4.6 and select the parameters R_0 and T_inf by minimizing criterion (5), which can be done using standard procedures for constrained global optimization. A large number of models of this "deterministic" type have been developed by now, and the behavior and properties of many of them have been studied. The main disadvantage of these models is the lack of a stochastic treatment, although the process of epidemic development is essentially random.

The following model is based on the results of [3], which propose a stochastic epidemic model with discrete time in which the daily number of new cases is binomially distributed and depends on the number of cases in the previous days. This model has the following advantages.
1. The model takes into account variations in the level of infectivity during the course of the disease, i.e., the probability of transmitting the disease to other individuals on each day of the illness.
2. The model is stochastic, which corresponds to the actual spread of infection among the population.
3. The model is easy to use: well-known formulas give the maximum likelihood estimate of the parameter R_0 (the basic reproductive rate), which is equal to the average number of people infected by one ill person during the entire period of the disease. This parameter determines the rate of epidemic spread. The estimate makes it possible to determine R_0 from the previous statistics of the daily number of new cases and to predict the further development of the epidemic.
The study [3] also considers versions of the model in which the population is divided into subgroups, for example, by age or by absence of acquired immunity. An extended version of the model has also been developed [4] [5] [6].
1. An additional parameter is introduced: the probability of detecting the disease (DR). Since in real life not all cases of the disease are detected or recorded in the statistics, an estimate of R_0 based on previous data is not accurate; therefore, DR is used to adjust the statistics, taking into account a certain level of inaccuracy.
2. The epidemic duration can be divided into several periods with different values of the parameters at different stages. Estimates of the parameters at individual stages are not independent; therefore, the fit to the statistics has to be optimized over the entire epidemic. Division into stages is necessary when new quarantine measures are introduced (R_0 changes in this case) or when the level of monitoring of the population changes (DR changes). In long-lasting epidemics (such as the COVID-19 pandemic), the disease spread dynamics may be seasonal (due to the effect of weather on the infectivity level and/or seasonal variation in the number of contacts among the population).
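A discrete-time model of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the model of [3] or its extension: the infection probability 1 - exp(-R_0 * pressure / N), the infectivity profile (weights that sum to 1 over the days of illness), and the names `draw_binomial` and `simulate_epidemic` are all hypothetical. The detection probability plays the role of DR: each true case is reported independently with that probability.

```python
import math
import random


def draw_binomial(n, p, rng):
    """Sample Binomial(n, p) as a sum of Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_epidemic(n_pop, r0, infectivity, initial_cases, days,
                      detection_rate=1.0, seed=0):
    """Simulate daily new cases and reported cases.

    infectivity[k-1] is the assumed share of a patient's total infectivity
    exerted k days after infection; the weights should sum to 1 so that
    r0 is the mean number of people infected by one case overall.
    """
    rng = random.Random(seed)
    susceptible = n_pop - initial_cases
    new_cases = [initial_cases]
    reported = [draw_binomial(initial_cases, detection_rate, rng)]
    for _ in range(days):
        # Infection pressure from cases that appeared k days ago.
        pressure = sum(w * new_cases[-k]
                       for k, w in enumerate(infectivity, start=1)
                       if k <= len(new_cases))
        # Assumed per-person probability of infection on this day.
        p_infect = 1.0 - math.exp(-r0 * pressure / n_pop)
        infected_today = draw_binomial(susceptible, p_infect, rng)
        susceptible -= infected_today
        new_cases.append(infected_today)
        # Only a fraction of true cases reaches the statistics.
        reported.append(draw_binomial(infected_today, detection_rate, rng))
    return new_cases, reported


if __name__ == "__main__":
    profile = [0.1, 0.3, 0.3, 0.2, 0.1]  # illustrative infectivity profile
    cases, seen = simulate_epidemic(5000, 2.0, profile, 10, 60,
                                    detection_rate=0.5, seed=42)
    print("true cases:", sum(cases), "reported cases:", sum(seen))
```

Stages with different parameter values could be imitated by calling the simulation piecewise with different `r0` or `detection_rate`, although carrying the state across stages is left out of this sketch.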
A program has been developed for parameter estimation and subsequent simulation of the development of the epidemic. Let us consider some models and methods that treat the spread of epidemics as a random rather than deterministic process [7] [8] [9] [10]. Let n be the number of people who fell ill. Every day, every sick person (independently of the other patients) may recover with probability β/n and die with probability γ/n. In addition, the patients receive a certain amount of medication x every day, which in our model is considered absolutely effective. The process ends when all patients have either recovered or died. The task is: for given values of the parameters γ, β, and n, find the x for which the effectiveness of the provided drugs is maximal. Let us consider the features of the problem.
1. Unlike most epidemiological models, no new individuals are added to the class of sick people during the treatment process. This is the case, for example, when the disease is genetic or caused by a single catastrophe.
2. All the problem parameters (γ, β, n, and x) are assumed to be positive numbers.
3. We assume that x does not change over time. In contrast to the parameters γ, β, and n, the number of drug units x is under our control, but this number remains constant.
Problem Solution. Let N(t) be the number of sick people at time t. Consider the process M(t) given by an equation in which ξ(i) is the number of people who died at time i and μ(i) is the number of those who recovered on their own at time i. A relation valid for any trajectory follows. In what follows, we will use M(t) and will show that the results can be applied to N(t). Three lemmas were proved to find an efficient method of drug delivery [10]; Lemma 1 is a statement about the mathematical expectation. The result of Theorem 2 makes it possible to find, by numerical methods, for given values of the parameters γ, β, and n, the value of x that maximizes the efficiency of providing drugs to patients.
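The treatment process can also be explored by direct simulation. The sketch below is not the analytical method of [10]. It assumes that x is a whole number of drug units, that each unit cures exactly one patient, and that on each day natural deaths and recoveries happen before the drugs are given. The function names and the reported quantities are hypothetical, and no particular definition of "effectiveness" is implied.

```python
import random


def simulate_treatment(n, gamma, beta, x, rng):
    """Run one realization until nobody is sick; return outcome counts."""
    if not (isinstance(x, int) and x >= 1):
        raise ValueError("this sketch assumes an integer x >= 1")
    p_die, p_recover = gamma / n, beta / n
    if p_die + p_recover > 1.0:
        raise ValueError("gamma / n + beta / n must not exceed 1")
    sick = n
    outcome = {"deaths": 0, "self_recovered": 0, "drug_cured": 0, "days": 0}
    while sick > 0:
        outcome["days"] += 1
        died = recovered = 0
        for _ in range(sick):
            # Each patient independently dies, recovers, or stays sick.
            u = rng.random()
            if u < p_die:
                died += 1
            elif u < p_die + p_recover:
                recovered += 1
        sick -= died + recovered
        # Assumed: x drug units cure up to x of the remaining patients.
        cured = min(x, sick)
        sick -= cured
        outcome["deaths"] += died
        outcome["self_recovered"] += recovered
        outcome["drug_cured"] += cured
    return outcome


def mean_outcome(n, gamma, beta, x, runs=1000, seed=0):
    """Average the outcome counts over several independent realizations."""
    rng = random.Random(seed)
    totals = {"deaths": 0, "self_recovered": 0, "drug_cured": 0, "days": 0}
    for _ in range(runs):
        result = simulate_treatment(n, gamma, beta, x, rng)
        for key in totals:
            totals[key] += result[key]
    return {key: value / runs for key, value in totals.items()}


if __name__ == "__main__":
    for units in (1, 2, 5, 10):
        print(units, mean_outcome(n=100, gamma=2.0, beta=3.0, x=units, runs=200))
```

Comparing the averaged outcomes over a range of x gives a rough numerical picture of the trade-off between the daily drug supply and the number of deaths, against which an analytical result could be checked.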
More and more attention is being paid to modeling, analysis, and forecasting of objects with variable structure and (or) time-varying parameters. Such processes occur in medicine, economics, and technology. There are several approaches to creating models of such objects. One of the most promising, in our opinion, is the use of switching regressions in which the switch points are unknown and therefore need to be estimated. The essence of switching regression is that the regression parameters are not constant throughout the observation interval. They are constant on its subintervals, which are separated from each other by switch points. By estimating the switch points, it is possible to determine the time intervals on which structural changes of the object took place. There are two forms of switching regressions: with a continuous regression line and with a regression line that has discontinuities at the switch points. In studies of such regressions with discrete time, the contribution of P. Perron and co-authors is significant [7] [8] [9]. They showed that switching regressions can be applied in economics. In [11], a new class of switching regressions in continuous time was introduced and a method of their construction was proposed. Based on this study, the article [12] proposes a method for constructing switching regressions in discrete time. Article [13] provides a preliminary statistical analysis of the spread of coronavirus disease in Ukraine based on switching regression. The calculation procedure described there can be automated, which will allow real-time data processing. Switching regression can also be used to determine the duration of treatment of people infected with coronavirus, as well as to monitor the course of various epidemics.

We have considered some approaches to creating stochastic models of epidemic prediction, as well as the models, mathematical methods, and software for their implementation.
The further development of these models for prediction and assessment of the spread of epidemics is associated with the use of regression models with continuous time and stochastic diffusion equations.

Developing a mathematical model of the COVID-19 epidemic spread in Ukraine
Modeling Infectious Diseases in Humans and Animals
Stochastic discrete-time age-of-infection epidemic models
Contributions to the mathematical theory of epidemics. I
Using a stochastic model to predict long-term epidemics
An analysis of the real interest rate under regime shifts
Estimating and testing linear models with multiple structural changes
Computation and analysis of multiple structural change models
Modeling of epidemics
Continuous-time switching regression method with unknown switching points
An approximate method of constructing a switching regression with unknown switch points
Statistical analysis of the dynamics of coronavirus cases using stepwise switching regression