key: cord-0056778-5nptpp5d authors: Shnip, A. I. title: Epidemic Dynamics Kinetic Model and Its Testing on the Covid-19 Epidemic Spread Data date: 2021-03-02 journal: J Eng Phys Thermophy DOI: 10.1007/s10891-021-02268-y sha: ca6f02903dcf5f211fd1b5eebc196606c3eb90d2 doc_id: 56778 cord_uid: 5nptpp5d A kinetic model has been proposed for the spread of epidemics, describing the dynamics of the variation in the number of disease-free, infected, and recovered (SIR) cases, based on a lag logistic equation. It has been established that this model predicts the possibility of existence of a quasi-steady-state mode of an epidemic in which the number of infected cases is constant due to the balance of the daily increment of infections and recoveries. Conditions have been identified under which such a mode can be a source of the advance of the second epidemic wave. The COVID-19 pandemic data were used to show the possibility of reliable forecasts based on this model of the spread of an epidemic for a period of up to two months. sick individuals, i.e., the part of infected individuals who have not recovered or died yet (z); the number of the recovered, i.e., persons who have acquired immunity, the insusceptible (x); the number of the deceased (d); the likelihood of transmitting infection in a single exposure to a healthy or disease-free individual, i.e., to a person having no immunity (k); the average number of differing contacts per day for an average person (b); the population of a country or a region (N 0 ); the average duration of the disease from infection to recovery or death (τ); the share of fatal outcomes in case of infection c d ; the time measured in days (t). Note. We clarify that the "average duration of the disease" is a conventional title for τ to an extent: it covers the period from the beginning of the latent phase of the disease (when a sick person becomes contagious) to the stage when the virus cannot be detected in the body and the person acquires immunity, although, in fact, disease symptoms may still persist, or, on the contrary, disease symptoms disappear, and the person remains infected for some time. Similarly, the number of infected individuals z implies the number of persons in this condition. The following balance relations are obvious for the introduced values: The likelihood w 1 (t) for a separate disease-free person to be infected during the day t is equal to the product of the likelihood of infection in a single exposure k by the number of his/her contacts per day b and by the likelihood of contacting a sick individual 0 ( ) z t N : Then the number of cases of infection per day or the infection growth rate is equal to the number of disease-free individuals multiplied by this likelihood: Substituting Eqs. (2) and (3) into (4), we obtain 0 ( ) ( ) ( ) 1 . dy t y t kbz t dt N For the initial epidemic phase [0, ] t ∈ τ when no one has recovered or died yet, the number of those infected (with symptoms) is equal to the number of those who have contracted the disease, i.e., ( ) ( ) . z t y t = Then from (5) with account for (6), we obtain for this stage a standard SI model [8, 9, 12] : in this case, d(t) = 0 and x(t) = 0. For the developed epidemic phase (t > τ), the number of the infected individuals (with symptoms) at a given moment z(t) is equal to the number of persons who have contracted the disease by this moment minus the number of those who had been infected before the moment t -τ and, hence, have either recovered or died by the current instant of time, i.e., The last term in (8) represents the sum of the recovered and the deceased by the instant of time t; therefore, At this stage, we obtain an epidemic development model from relation (5) by substituting expression (8) for z(t) in it; as a result we have Thus, the dynamics of epidemic development in this model breaks down into two phases and is described by Eqs. (7) and (10) for the initial and the developed phases respectively. Note that Eq. (10) belongs to the type of differential equations with delayed argument [13] that are characterized by certain elements of memory and fl uctuating behavior. It can also be considered as a certain analogue of the Kolmogorov-Petrovsky-Piskunov lag equation without account for diffusion [14] . Also, note that the parameters k and b are only present in the model in the form of a product that can be considered as one parameter. We introduce for it the designation kb kb λ = (11) and will call it the growth indicator. Since in its nature the process has a discrete pattern in both time (in increments of days) and for the dynamic variable y (in multiples of a whole number of people), it is natural to consider it in a fi nite-difference presentation. We designate the average duration of the infected condition τ expressed in days as a whole number n τ , and the serial number of the day since the simulation beginning as n. Then, replacing the time variable ( ) dy t dt with the fi nite-difference approximation ( ) dy t dt → 1 1 n n y y + − = y n+1 -y n , we rewrite the model equations (7) and (10) It is not diffi cult to have a numerical solution for these equations. For this purpose, it is necessary to assume initial conditions that can be assigned either as the number of individuals infected on day zero y 0 or as the number of those infected on each of n 0 (n 0 ≤ n τ ) days, i.e., the selection of values y 0 , y 0 , … , y n0 . After this, the subsequent values y n are found by simple iterations with respect to relations (12) and (13) . In simulating real epidemics, the statistical data for the initial phase are already known, as a rule; therefore, it is most logical to assign initial conditions as known values y n for the fi rst n τ days and to simulate only the developed epidemic phase according to Eq. (13) representing a variant of logistic mapping with delayed argument. After y n are found, the z, d, x, and u values are calculated from the above-cited relations (1), (2), and (9) . Thus, at the developed epidemic stage, for these variables and for the infection growth rate D in a discrete representation, we have On the face of it, the model has only two empirical parameters: n τ and λ kb but, in practice, the situation is more complex. The parameter of the average duration of the disease n τ is a characteristic of the virus, and it can be considered to be constant within a time interval of epidemic development if it is determined in some way. As for λ kb , it varies as the epidemic develops, since the values of its constituents k and b, characteristic of the "peaceful" time, are only valid for the initial stage. Later on, as quarantine and social restrictions are introduced, the average number of contacts b decreases, and, due to the expansion of the practice of using sanitary means and products (masks, gloves, disinfection, etc.), there is also a reduction in the likelihood of infection k. Both of these factors result in a decrease of λ kb . Subsequently, as the situation improves, sanitary measures can be relaxed, which leads to a certain rise in this parameter. Thus, in the course of the epidemic development, the parameter λ kb may assume various values at its different stages. General Properties of the Mathematical Model. Note that at low values of the indicator λ kb , there is a critical value of the initial condition (the number of infected individuals on day zero y 0 ) below which there is no epidemic outbreak. Indeed, if in Eq. (12) , at n = 0, y 0 assumes such a value that the expression after the angle parenthesis becomes smaller than unity, it turns into zero in rounding downwards, and we obtain a zero increment in the infectees, which is then repeated in the subsequent iterations too. Therefore, equating this expression to unity and expressing y 0 from this equality, we fi nd the ratio for the critical value of the initial condition y 0cr , which can be presented, with account for the inequality y 0 << N 0 , as 0cr 1 , where | · 〉 is the operation of rounding upwards. We will simulate an abstract case of a country with the population N 0 = 10 million residents. We assume the average duration of infected condition to be equal to n τ = 14. The substantiation for this choice will be given below. Figure 1 shows the results of simulating the epidemic dynamics according to Eqs. (12)- (14) for different values of the growth indicator λ kb (constant for the duration of the epidemic) at the initial condition y 0 = 20 identical for all cases. As we see, a substantial characteristic of this model is that it is not in every epidemic that the entire population becomes infected, as the case is in models [8, 9] . The percentage of a population infected as a result of an epidemic depends on the value of the indicator λ kb and rises with its increase. In this case, there is also a rise in the epidemic spread rate and a drop in its duration. The percentage of an infected population reaches the value of 99.9% at λ kb = 0.4195. Note that the maximum number of infection cases z(t) occurs some time later (about 7 days) after the passage of the peak in the infection growth rate (Fig. 1b and c) . Consider now how the initial conditions affect the pattern of the epidemic development. The initial conditions were such as to simulate an idealized situation when an epidemic starts with the arrival of y 0 infected patients in the country on its day zero. Figure 2 shows infection growth rate profi les for different y 0 at λ kb = 0.117. It is seen that the number of infectees on day zero affects only the duration of the initial slow stage of the epidemic, which decreases as y 0 rises. The parameters of the intensive epidemic stage such as the peak width at half height in the infection rate curve, the maximum infection rate, and the total number of cases do not change in practice, except that this stage shifts homeomorphically closer to day zero. Furthermore, this shift can be quite signifi cant; thus, for example, in the case of variation in the number of infection cases on day zero from 10 to 200, the infection rate maximum is reached 2 months earlier. The above-cited simulation implies that the growth indicator λ kb is constant throughout the duration of the epidemic, which is usually not the case in reality as was already mentioned above. Consider more realistic scenarios when, in the course of epidemic development, the λ kb value will begin to decrease due to the introduction of sanitary measures, and then, at the stage of epidemic decline, it will start rising again due to the lifting of restrictive measures. We assign the variation in the growth indicator by a continuous piecewise linear function that is constant and equal to λ kb1 = 0.22 at the initial epidemic stage (up to n (1) = 31 days), then, during 14 days (up to n (2) = 45 days), linearly falls to the value λ kb2 = 0.05 (introduction of sanitary measures), then remains constant for 122 days (up to n (3) = 167 days), and next, during 14 days, linearly rises (relaxing of sanitary measures) to the value λ kb3 = 0.12 (up to n (4) = 181 days) which, henceforth, does not change. This variation is assigned by the following piecewise linear function: 1 ( 1 ) 3 ( 4 ) for , for . Note that the case where the growth indicator does not change during the epidemic and remains equal to the initial value λ kb1 = 0.22 in (16) has already been shown earlier in Fig. 1 (the last curve in each of the plots). Figure 3 shows the results of simulating the epidemic development for the described conditions. A comparison of these data with the case indicated in Fig. 1 shows that the introduction of sanitary measures to the extent determined by function (16) reduces the intensity of epidemic development and the scale of population infection by two orders of magnitude. In this case, the percentage of infected population at the peak of infection growth rate drops approximately 300 times, and the peak value of the number of sick individuals decreases 430 times. However, here, quite unexpected results are possible too. The day of the beginning of relaxing quarantine restrictions [the parameter n (3) = 167 days in (16)] was not chosen by accident, but from a series of numerical experiments that resulted in establishing an interesting phenomenon. If this relaxation is started one day earlier, assuming n (3) = 166 in (16), we will obtain the result presented in Fig. 4 . As we see, in this case, the epidemic decreases at fi rst, and soon, after a rise in λ kb , a second epidemic wave originates. Since the amplitudes of the fi rst and second waves are incommensurable, the scale of the plots has been left, by way of illustration, the same as in Fig. 3 . We clarify that in the second epidemic wave, the plateau in the number of infections by the current day is reached at the level y max = 6,861,053; the peak number of cases per day is D max = 131,078, and the peak in the number of sick (infected) individuals is z max = 1,792,470. All this, of course, occurs under the condition that λ kb remains constant following the rise after the fi rst wave. In real epidemics, this is certainly not the case, and during the second wave, there is again an introduction of sanitary measures and restrictions that result in a repeated reduction of λ kb . Thus, the phenomenon of the second wave development has a threshold nature with regard to the duration of maintaining antiepidemic measures Δn l = n (3) -n (2) . That is, there is a threshold value Δn lt = n (3) -n (2) = 167 -45 = 122 such that, at all Δn l ≥ Δn lt , a second wave does not occur, and at Δn l < Δn lt , it does. To understand the mechanism of this phenomenon, we consider the plots in Fig. 3b and c on a large scale in the vicinity of the threshold value n (n (3) = 167) presented in Fig. 5 . The fi rst wave does not turn out to end in the total absence of sick people, but in the epidemic reaching a quasi-steadystate regime in which there is one infection and one recovery per day; in this case, the number of infected persons remains constant and equal to 14. A numerical simulation shows that this regime, in practice, continues for any random duration (so long as the number of the infected is by far smaller than the number of the population). In this case, the number 14 is not just a coincidence with the number n τ but a consistent pattern, since only in this case is there a possibility for a balance between the rise and decline in the number of infected people (between the number of those who have contracted the disease and those who have recovered from it). We will call this regime a latent (dormant) epidemic. Note that in all simulation scenarios with a constant λ kb , the epidemic ended in total absence of infection cases. Now we turn to Fig. 6 , which shows the plots (Fig. 4b and c ) on a large scale in the vicinity of the threshold value at n (3) = 166 when a second wave occurs. We see that in this case, the drop in the number of infections per day has no time to reach unity, since there is already a start in the rise of ( ) kb n λ . As a result, the balance between recoveries and infections shifts in favor of infections and a second wave starts. Despite the somewhat exotic nature of this scenario, its occurrence seems quite feasible in the form of local territorially isolated clusters Both these variables are cited in statistical data for epidemics, including the one of the most signifi cant current importance, viz. the COVID-19 epidemic. If the assumptions underlying this model are true, there must be a correlation in the statistical data of form (17), and the sign of the best implementation of this correlation can be used to determine n τ . To test this hypothesis, statistical data for Germany have been selected due to the author′s subjective opinion that, owing to the proverbial pedantry and orderliness of Germans, one can expect a high degree of data reliability. The materials henceforward have been taken from websites [15] . In what follows, we shall designate purely empirical date with a circumfl ex accent (up-arrow). Use has been made of two data arrays ˆn y and ˆn z covering a period of 120 days for March 3 to June 30, 2020 (n = 0, 1, … , 119) and representing the total number of those who have contracted the disease and those infected and manifesting symptoms by day n in Germany. This means that the array identifi ed as must correlate with the array ˆn z at a proper selection of n τ . The standard deviation Z n (n τ ) from ˆn z for the random n τ is determined as Figure 7 shows a diagram of Δ Z (n τ ) values in the range of the most realistic n τ values from which we can see that the best fulfi llment of relation (17) occurs at n τ = 14, and Fig. 8 shows a compression of statistical data on the number of infection cases, with them being calculated by (18) at n τ = 14, which confi rms a good correlation between Z n (n τ ) and ˆn z . Yet another welcome argument in favor of this choice of n τ is its coincidence with the established duration of a COVID-19 quarantine which is most likely not accidental. To be fair, it should be noted that, in contrast to Germany, for several other considered regions (Belarus, Italy, Sweden, etc.) such a good correlation at n τ = 14 is only observed at the initial epidemic stage, and then the statistical number of infection cases begins to exceed the calculated one. Let us illustrate this using the example of the data on the world pandemic. Figures 9 and 10 show the data for the fi rst 75 days of the world pandemic (through April 6), which are similar to those presented in Figs. 7 and 8 for Germany. As we see, here, at Δ Z = 14 the correlation of the two arrays is even more pronounced that it was in the previous case. However, after April 6, there is a sharp division of statistical and calculated data (Fig. 11) . Note that the same division in the case of Italy and Sweden begins in early April too, and for Belarus, it starts two weeks later. The causes of this division are not quite clear and can fi nd several explanations. This can be a change in the procedure of accounting for the data on infections (see Note), an abrupt expansion of the amount of testing, the impact of symptom-free patients, the absence of account for the territorial distribution of infections (a pandemic as a sum of separate relatively independent epidemics) in the simulation concept set forth herein, etc. Of interest is the hypothesis that the immunity acquired after the illness is of temporary rather than constant nature due to either virus mutation or the degradation of antibodies. In any case, this question requires separate additional investigations. In simulating real epidemics, variations in the parameter n τ do not have signifi cant effects on the simulation accuracy, since they are compensated for by variations in the λ kb adjusted for empirical data; therefore, henceforth, we will assume n τ = 14. Simulation of the COVID-19 Pandemic. Starting the simulation of real epidemics, we note that, if we have statistical data for a certain period for n τ , it is possible to solve an inverse problem and to fi nd empirical values of the growth indicator, expressing it from Eq. (13): (solid line). The blue color on the line is used to highlight the portion of data that were used for analytical approximation, and the green color was used for extrapolation. Fig. 13 . Simulation of the pandemic forecast based on a kinetic model: number of infection cases to date y n ; b) number of infections per day D n . The red circles (a) and the bars with circles (b) are statistical data; the solid lines are the simulation. The blue color on the lines is used to emphasize the portion of data that were used for analytical approximation, and the green color was used for the forecast. If these data are quite regular in nature, they can be approximated by a suitable analytical expression, and then this expression, in combination with Eq. (13), will represent an analytical model of the epidemic spread. We carry out this plan for the case of the spread of the COVID-19 pandemic. Use has been made of the data on its dynamics through August 6, 2020 (they are shown below in Fig. 13 ). According to (20), these data were used to calculate the values of the empirical growth indicator shown in Fig. 12 . As we see, after the variable initial stage, the said indicator approaches a regular regime making it possible to safely extrapolate its analytic representation beyond the approximation zone. The data obtained for approximating the recessive and convex zones into the variation of ( ) kb n λ by fourth-order polynomials, and the regular third zone was approximated by a power function with an additive constant. This function also contains the gauge factor C n , which is fi rst assumed to be unity, and then is adjusted by calibration for the results of preliminary simulation. The coeffi cients α k , β k , γ, and ε were found by the Levenberg-Marquardt method to minimize the residual of the analytical function (21) (at C n = 1) and the numerically assigned dependence of two variables (the Minerr procedure in the Mathcad package). After this, the correction of function (21) was carried out using the search for the value of C n such that it ensures the minimum standard deviation of an appropriate solution from the empirical data. In all this procedure, use was made of statistical data only until June 7 (0 < n ≤ 137, blue zone on the analytical approximation curve, Fig. 12) . Then, the obtained dependence was extrapolated to the period until August 6 (the green zone, Figs. 12 and 13 ). The numerical values of the coeffi cients α k , β k , γ, ε, and C n are shown in Table 1 . Thus, Eq. (13), in which λ kb was replaced by function (21) and the initial condition is the14 fi rst values of ˆn y , is an analytical model of the pandemic in which the results of the last 60 days (after June 7) will be a forecast obtained from the statistical data of the fi rst 137 days. The results of simulation are given in Fig. 13 ; as in the earlier cases, the green zone on the calculated curves corresponds to the forecast. As we see, if this forecast had been made two months before August 6, it would have been quite successful during the subsequent two months. Conclusions. Thus, at the regular stage in the spread of an epidemic, the developed mathematical model can yield quite accurate forecasts for a period of about two months, having the advantage of simplicity in this case. It has been established that this model predicts the possibility of existence of a quasi-steady-state regime of the epidemic in which the number of infection cases is constant due to the balance in the daily increment of infections and recoveries. Conditions have been identifi ed under which such a regime can be followed by the occurrence of a second epidemic wave. Calculation of the empirical growth indicator by formula (20) identifi es regular regime periods in the epidemic development, which is important from a practical standpoint. On the other hand, this fact serves as evidence that this model is consistent with the experimental data. b, average number of differing contacts per day; c d , share of fatal outcomes in infection cases; d, number of the deceased; k, likelihood of infection transmission from a sick person to a healthy individual in a single contact; N 0 , population of a country or region; n, serial number of a day calculated from the day of the epidemic beginning; n τ , number of days of illness; t, time; u, number of illness-free cases; x, number of the recoveries; y, number of infection cases; z, number of disease-affected cases; Δ Z , standard deviation of the calculated number of disease-affected cases from the registered one; τ, average duration of the disease; λ kb , growth indicator; ( ) kb n λ , empirical growth indicator. Subscripts: .ˆ, symbol designating empirical values of parameters. Eco-epidemiological assessment of the COVID-19 epidemic in China CoronaTracker: Worldwide COVID-19 outbreak data analysis and prediction Analysis and forecast of COVID-19 spreading in China, Italy and France A model to predict COVID-19 epidemics with applications to South Korea, Italy, and Spain. medRxiv Dynamics of COVID-19 epidemics: SEIR models underestimate peak infection rates and overestimate epidemic duration Monitoring Italian COVID-19 spread by an adaptive SEIRD model. medRxiv Why is it diffi cult to accurately predict the COVID-19 Prediction of the Development of Coronavirus Infection Rate in the Republic of Belarus in the Spring and Summer of 2020 Two Waves in the Dynamics of Coronavirus Rate in the Republic of Belarus A contribution to the mathematical theory of epidemics A mathematical model to investigate the transmission of COVID-19 in the Kingdom of Saudi Arabia Equations of Mathematical Biology General theory of differential equations with delayed argument Investigation into the equation of diffusion combined with the substance growth and its application to one biological problem Acknowledgments. The author extends his gratitude to his colleagues P. S. Grinchuk, V. L. Kolpashchikov, and S. P. Fisenko for stimulating discussions, and also to O. V. Molochko for preparation of statistical data.