Title: Logistic equation and COVID-19
Authors: Pelinovsky, Efim; Kurkin, Andrey; Kurkina, Oxana; Kokoulina, Maria; Epifanova, Anastasia
Date: 2020-08-13

The generalized logistic equation is used to interpret the COVID-19 epidemic data in several countries: Austria, Switzerland, the Netherlands, Italy, Turkey and South Korea. The model coefficients are calculated: the growth rate and the expected number of infected people, as well as the exponents in the generalized logistic equation. It is shown that the dependence of the number of infected people on time is well described on average by the logistic curve (within the framework of a simple or generalized logistic equation) with a determination coefficient exceeding 0.8. At the same time, the dependence of the number of infected people per day on time is very uneven and can be described only very roughly by the logistic curve. To describe it, it is necessary to take into account the dependence of the model coefficients on time or on the total number of cases. Variations of the growth rate, for example, can reach 60%. The variability spectra of the coefficients have characteristic peaks at periods of several days, which corresponds to the observed serial intervals. The use of the stochastic logistic equation is proposed to estimate the number of probable peaks in the coronavirus incidence.

Already in this century, several global epidemics have broken out (bovine spongiform encephalopathy, avian influenza, severe acute respiratory syndrome (SARS), etc.).
The latest coronavirus epidemic struck everyone with its scale and affected literally all countries, which were forced to take emergency measures to prevent the spread of the infection (closure of state borders, quarantine, self-isolation, temporary suspension of work at many enterprises and institutions, transition to distance work and training). The number of people infected worldwide exceeds 4.89 million (data from late May 2020), and the number of deaths is more than 320,000. General information about this viral infection can be found on the Internet.

The dynamics of the disease spread is illustrated in Fig. 1, built from the data on the World Health Organization (WHO) website (https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports) as of 05/20/2020. In this figure, the growth in the number of coronavirus cases in the world and in several countries is shown on a semi-logarithmic scale. The dashed lines show exponential asymptotics corresponding to a doubling of the number of cases in a certain number of days. Asterisks indicate the days when countries introduced restrictive measures. As one can see, the spread of the epidemic in each country follows almost the same scenario: first there is an exponential (or close to exponential) growth of the number of infected people, and then this growth slows down (however, the numerical values of the constants describing these curves differ from country to country). In some countries, the number of cases is no longer increasing, so the coronavirus epidemic in these countries is almost over. In other countries, the curves in these coordinates are still almost straight lines, which means an exponential increase in the number of cases, and the epidemic has not yet reached its peak. In general, these curves are quite smooth, although some of them show bends associated with the action of certain quarantine measures. The curves of the number of cases per day in Fig.
2 are not smooth, and sporadic outbreaks in the number of cases are noticeable in them, caused by many, often unpredictable, factors. These data show that the dynamics of the epidemic spread involves different time scales, from several months (the total epidemic duration) to several weeks (the incubation period) and even down to several days (the serial interval and local causes). Some of the scales are associated with certain properties of the virus, others with the actions of the state and local authorities that introduced restrictive rules. The noted features of the dynamics of the COVID-19 spread can be reproduced in mathematical models.

To explain the spread of epidemics and predict their consequences, a number of mathematical models of different complexity levels are used. Historically, the first model is the Verhulst logistic equation [1], a nonlinear first-order ordinary differential equation (ODE) with constant coefficients. It is also used as the simplest model to describe population growth and advertising performance. Qualitatively, it explains the increase in the number of disease cases over time presented in Fig. 1: the exponential increase in the number of infected people at the initial stage of the epidemic and the tendency towards a constant value by the end of the epidemic. In the context of COVID-19, this model is used in [2], [3]. The COVID-19 data analysis given in [4] showed that an exponential increase in the number of cases at the initial stage is found mainly in America and Australia, while in many European countries the increase follows a power law. In this case, one can use the generalized logistic equation [5], [6], and it was used in [3], [7], [8], [9]. From the mathematical point of view, the dynamics in the framework of the logistic equation is trivial.
More complex dynamics, including chaotic dynamics, arise in the difference logistic equation or when the delay due to the incubation period is taken into account [10], [11], [12], [13], [14], and these models are also used to interpret and forecast COVID-19 [15], [16], [17]. In more complex models, people are divided into different groups: (S) the susceptible class, those individuals who are capable of contracting the disease and becoming infected; (I) the infected class, those individuals who are capable of transmitting the disease to others; and (R) the removed class, infected individuals who are deceased, or have recovered and are either permanently immune or isolated. The corresponding mathematical model, called the SIR model, and its generalizations include a higher-order ODE system. The dynamics of such systems has not yet been sufficiently studied, and stochastic oscillations are possible in them [18], [19], [20], [21], [22], [23], [24], [25], [26]. However, models of this level can be implemented comparatively easily; they have shown their effectiveness and are actively used to model the spread of COVID-19 [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38]. There are also models that take into account, for example, super-spreading. Statistical methods for forecasting the epidemic development, based on Poisson statistics, are also worth mentioning [42], [43], [44], [45].

The main difficulty in applying mathematical models is associated with the uncertainty in the choice of the coefficients in the equations. The more complex the model, the larger the number of its coefficients. The experience of using models to interpret "old" epidemics may not always help, since the intensity of the virus impact on living organisms changes, many epidemics were local, and, accordingly, the measures to prevent the epidemic spread were different. The pattern of the curves shown in Figs.
1 and 2 differs strongly between countries, which is associated with different population densities and with differences in customs, traditions and administrative preventive measures. Therefore, any forecasts made at the initial stage of the epidemic regarding its final stage are very rough and unreliable. As the epidemic develops, more and more constants in the equations can be determined from medical databases, but the previously determined constants are also corrected. Therefore, in essence, equations with variable coefficients are solved for prognostic purposes, and their mathematical properties (existence, convergence and stability) have not been established. As a result, different models with permanently "corrected" coefficients can lead to close forecast results over short time horizons. At the same time, for long-term forecasts, it is necessary to understand the possible temporal variability of the model coefficients and its influence on the character of the obtained solutions.

In this study, we try to assess the character of the scatter of the coefficients of the logistic model and its generalizations on the basis of the currently available COVID-19 data. The data on the epidemic development were used for the following countries: Austria, Switzerland, the Netherlands, Italy, Turkey and South Korea. Section 2 presents the classical logistic equation and the calculated average values of its coefficients for the above-mentioned countries. It is shown that this model describes the number of patients with coronavirus with a high determination coefficient in most countries, except for South Korea. To take into account the randomness of the data on the number of cases per day, it is proposed to switch to a stochastic logistic equation with an external force. The spectral and statistical properties of the random parameters of this equation are investigated.
Section 3 describes the same procedure within the framework of the generalized logistic equation. It is shown that, on average, this model is suitable for all the countries listed above with a high determination coefficient. Section 4 summarizes the results.

Here we briefly give the main information on the theory of the logistic equation, written in the standard ODE form

$$\frac{dN}{dt} = rN\left(1 - \frac{N}{N_\infty}\right), \tag{1}$$

where $N(t)$ is the total number of people affected by the epidemic, $N_\infty$ is the maximum number of infected people during the whole epidemic, and $r$ is the growth rate of the epidemic. The solution of this equation with constant coefficients can be easily found in the form

$$N(t) = \frac{N_\infty N_0 e^{rt}}{N_\infty + N_0\left(e^{rt} - 1\right)}, \tag{2}$$

where $N_0$ is the initial number of infected people and $t$ is the time from the beginning of the epidemic. At the initial stage of the epidemic, it can be represented by an exponential function

$$N(t) \approx N_0 e^{rt}, \tag{3}$$

and, if this curve approximates the increase in the number of cases at the initial stage well, we are able to determine the growth rate $r$. At the same stage, the logistic model can be rejected if the data do not fit the exponential dependence. At the same time, the most important characteristic for prediction, the maximum possible number of infected people $N_\infty$, can be estimated only at the stage when the data differ noticeably from the exponential curve, that is, when the number of sick people is already not small.

To prepare medical institutions to function in an optimal way during an epidemic, it is important to know the number of infected people per day, which is easily obtained by differentiating Eq. (2):

$$\frac{dN}{dt} = \frac{r N_\infty N_0 \left(N_\infty - N_0\right) e^{rt}}{\left[N_\infty + N_0\left(e^{rt} - 1\right)\right]^2}. \tag{4}$$

This curve is nonmonotonic, with the maximum given by

$$\left(\frac{dN}{dt}\right)_{\max} = \frac{r N_\infty}{4}, \tag{5}$$

which corresponds to the time (the epidemic peak)

$$T = \frac{1}{r}\ln\left(\frac{N_\infty}{N_0} - 1\right). \tag{6}$$

As can be seen, these characteristics (Eqs. (5) and (6)) can only be estimated when the data are no longer described by an exponential curve and both model parameters $r$ and $N_\infty$ are found or known. Let us note that the time dependences (2) and (4) are smooth functions, while from Fig.
2 it follows that dependence (4) must be non-smooth and irregular. The study of the resulting irregularity is carried out below.

Since medical statistics operates with the number of cases per day, it is in fact necessary to solve the difference logistic equation

$$N_{n+1} - N_n = r N_n \left(1 - \frac{N_n}{N_\infty}\right). \tag{7}$$

After removing the index $n$, we obtain a simple relationship between the number of cases per day ($K$) and the total number of cases ($N$),

$$K = rN\left(1 - \frac{N}{N_\infty}\right), \tag{8}$$

which in these variables is a parabola.

As an example, we take the data on the coronavirus incidence in several countries where the epidemic is close to its end (at least its active phase is over). These countries are Austria (58 points), Switzerland (58 points), the Netherlands (64 points), Italy (72 points), Turkey (49 points) and South Korea (94 points). We use the data available as of 04/23/2020, taken from the WHO situation reports (https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports). Figure 3 shows the relationship between the number of cases per day ($K$) and the total number of cases ($N$) for each country. Parabolic approximations (the solid lines) arising from Eq. (8) are shown as well. Evidently, the parabolic approximation of the available data is good enough for almost all of the listed countries ($R^2 > 0.8$), but has low accuracy for South Korea ($R^2 \sim 0.55$). Therefore, later in this section we do not use the data on South Korea, for which the logistic model is not suitable (this case is analyzed in the next section).

Although the logistic curve approximates the data well for most countries, the scatter of points near the parabolic curve is still not small. This indicates that the coefficients of the parabolic curve should be considered as functions of time, which, in essence, is what happens in forecasts when these coefficients are refined as new data appear. Let us, for example, change only the coefficient $r$.
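The parabolic fit between the daily number of cases K and the total number of cases N described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are synthetic (generated from the difference logistic equation with an optional multiplicative noise factor), the parameter values are arbitrary, and the function names `simulate_daily_cases` and `fit_logistic_parabola` are hypothetical. The fit is an ordinary least-squares fit of K = aN + bN^2 without a constant term, from which r = a and N_inf = -a/b follow.

```python
import random


def simulate_daily_cases(r, n_inf, n0, days, noise=0.0, seed=0):
    """Iterate N[n+1] = N[n] + K[n] with K[n] = r*N[n]*(1 - N[n]/n_inf),
    optionally multiplied by a random factor. Synthetic data, not WHO data."""
    rng = random.Random(seed)
    totals, daily = [], []
    n = float(n0)
    for _ in range(days):
        k = r * n * (1.0 - n / n_inf) * (1.0 + noise * rng.uniform(-1.0, 1.0))
        totals.append(n)
        daily.append(k)
        n += k
    return totals, daily


def fit_logistic_parabola(totals, daily):
    """Least-squares fit of K = a*N + b*N**2 (no constant term).

    Returns (r, n_inf, r_squared), where r = a and n_inf = -a/b."""
    scale = max(totals)                    # rescale N to keep the sums well conditioned
    xs = [n / scale for n in totals]
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t1 = sum(x * k for x, k in zip(xs, daily))
    t2 = sum(x * x * k for x, k in zip(xs, daily))
    det = s2 * s4 - s3 * s3                # determinant of the 2x2 normal equations
    a = (t1 * s4 - t2 * s3) / det          # coefficient of x
    b = (s2 * t2 - s3 * t1) / det          # coefficient of x**2
    fitted = [a * x + b * x * x for x in xs]
    mean_k = sum(daily) / len(daily)
    ss_res = sum((k - f) ** 2 for k, f in zip(daily, fitted))
    ss_tot = sum((k - mean_k) ** 2 for k in daily)
    r = a / scale                          # undo the rescaling of N
    n_inf = -a / b * scale
    return r, n_inf, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Assumed illustrative values: r = 0.2 per day, N_inf = 15000, 30% noise.
    totals, daily = simulate_daily_cases(0.2, 15000.0, 10.0, 60, noise=0.3, seed=1)
    r, n_inf, r2 = fit_logistic_parabola(totals, daily)
    print(f"r = {r:.3f}, N_inf = {n_inf:.0f}, R^2 = {r2:.3f}")
```

With noise-free input the fit recovers the generating parameters; with noisy input the determination coefficient drops below 1, which is the kind of scatter around the parabola that the text attributes to variable coefficients.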
Within the framework of the logistic model, the variability of this coefficient can be determined from the available data by using the following formula, arising from Eq. (8):

$$r = \frac{K}{N\left(1 - N/N_\infty\right)}. \tag{9}$$

There are two ways of analyzing this coefficient: it can be treated either as a function of the number of cases ($N$) or as a function of time ($t$). In the former case, the logistic equation remains an ODE with constant coefficients, but it has a rather complex nonlinearity. In the latter case, we arrive at an ODE with variable coefficients. We study both possibilities for the variation of the growth rate $r$.

Fig. 4, top, shows the variability of the function $r(N)$ for Austria and Switzerland. For convenience, we switched to the dimensionless variable $r_{norm} = (r - \langle r \rangle)/\langle r \rangle$, which makes the variability more obvious. Here we did not take into account the first few days, when the course of the epidemic is still unclear, and the last few days, when the epidemic was essentially over, since these points correspond to small values of the denominator in Eq. (9). Reducing the number of points, of course, somewhat affects the average value of this coefficient (0.225 instead of 0.195 as in Fig. 3 for Austria, 0.2 instead of 0.16 for Switzerland), but the demonstration of the variability of the coefficient $r$ is more important. The functional dependence $r(N)$ can be rewritten in the more familiar terms of temporal variability $r(t)$, presented in Fig. 4, bottom, which demonstrates that this coefficient changes almost every day.

As an example, let us give the amplitude spectrum of the variation of the coefficient $r$ relative to its average for Austria and Switzerland (Fig. 5, left). Peaks corresponding to intra-weekly variability are clearly visible in the spectrum; they arise from fluctuations in the epidemic spread, which differs, for example, between apartment buildings with different numbers of apartments or between farms far from each other. In fact, changes in the model coefficients can be considered to be random.
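The two steps described above, recovering the day-by-day growth rate from K and N and then looking at the amplitude spectrum of its relative variation, can be sketched as follows. This is an illustration under assumptions, not the authors' code: the input is synthetic, with a growth rate that is deliberately modulated with a 7-day period, N_inf is assumed to be known, and all function names are hypothetical. The spectrum is computed with a direct discrete Fourier transform to keep the example dependency-free.

```python
import cmath
import math


def growth_rate_series(totals, daily, n_inf):
    """Day-by-day growth rate r = K / (N * (1 - N/n_inf))."""
    return [k / (n * (1.0 - n / n_inf)) for n, k in zip(totals, daily)]


def normalized_deviation(values):
    """Relative deviation from the mean: (r - <r>) / <r>."""
    mean = sum(values) / len(values)
    return [(v - mean) / mean for v in values]


def amplitude_spectrum(samples):
    """Return (period_in_days, amplitude) pairs for harmonics below Nyquist,
    assuming one sample per day."""
    n = len(samples)
    spectrum = []
    for m in range(1, n // 2):
        c = sum(x * cmath.exp(-2j * math.pi * m * j / n)
                for j, x in enumerate(samples))
        spectrum.append((n / m, 2.0 * abs(c) / n))
    return spectrum


def synthetic_weekly_data(days=56, r_mean=0.2, depth=0.4, n_inf=15000.0, n0=10.0):
    """Synthetic totals/daily series with a weekly modulated growth rate
    (an assumption made only to have something to analyze)."""
    totals, daily = [], []
    n = n0
    for day in range(days):
        r_day = r_mean * (1.0 + depth * math.sin(2.0 * math.pi * day / 7.0))
        k = r_day * n * (1.0 - n / n_inf)
        totals.append(n)
        daily.append(k)
        n += k
    return totals, daily


if __name__ == "__main__":
    totals, daily = synthetic_weekly_data()
    r_norm = normalized_deviation(growth_rate_series(totals, daily, 15000.0))
    period, amplitude = max(amplitude_spectrum(r_norm), key=lambda pa: pa[1])
    print(f"dominant period: {period:.1f} days, amplitude: {amplitude:.2f}")
```

For real data the first and last days would have to be excluded, as the text notes, because the denominator N(1 - N/N_inf) becomes small there and amplifies noise.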
The probability distribution of the same coefficient for Austria and Switzerland is characterized by the probability density (Fig. 5, right), which is well described by a normal curve. The standard deviation is not small (60-70%), which once again points to the necessity of taking the growth rate variability into account in the epidemic dynamics. Similar conclusions can be drawn for other countries, but we will not consider them in detail.

From the analysis given above it is clear that, on average, the epidemic development in a number of countries is well described by the logistic equation with constant coefficients. However, to give a more detailed understanding of the variations in the number of cases per day, it is reasonable to consider a stochastic logistic equation

$$\frac{dN}{dt} = r(t)\,N\left(1 - \frac{N}{N_\infty(t)}\right) \tag{10}$$

or its difference analogue, in the general case with two random functions. From the point of view of the available data, the coefficients can also be considered to be random functions of the number of cases. The properties of the stochastic equation (Eq. (10)) have not been investigated yet.

In fact, another way to take into account the irregularity of the initial data is possible, namely, the introduction of an external random force in Eq. (1), as is often done in problems of mechanics:

$$\frac{dN}{dt} = rN\left(1 - \frac{N}{N_\infty}\right) + f(t), \tag{11}$$

considering all the coefficients constant. The external force is easily found from the available data using the equation following from Eq. (11):

$$f = \frac{dN}{dt} - rN\left(1 - \frac{N}{N_\infty}\right). \tag{12}$$

Fig. 6 shows the "external force" $f$, calculated by formula (12) or its difference analogue, as a function of the number of cases or of time. A logistic equation with such an external random force can explain the irregularities in the number of cases per day, as well as the appearance of several peaks of incidence and their duration, which are not predicted by the deterministic logistic equation.

We will now consider a more general model, a logistic equation containing four constants [5], [6] (Eq. (14)). When its exponents are equal to 1, Eq. (14) coincides with Eq. (1).
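For the simple logistic model with an external force discussed above, the difference analogue of the force can be computed directly from the daily and total case counts. The sketch below is an assumption-laden illustration: it uses the discrete form f_n = K_n - r N_n (1 - N_n/N_inf) with constant r and N_inf, a synthetic periodic force instead of real data, and hypothetical function names. It also counts local maxima of the daily series, which is one simple way to look at the multiple incidence peaks that the deterministic equation does not produce.

```python
import math


def simulate_forced_logistic(r, n_inf, n0, force):
    """Iterate N[n+1] = N[n] + K[n], K[n] = r*N[n]*(1 - N[n]/n_inf) + f[n]."""
    totals, daily = [], []
    n = float(n0)
    for f in force:
        k = r * n * (1.0 - n / n_inf) + f
        totals.append(n)
        daily.append(k)
        n += k
    return totals, daily


def external_force(totals, daily, r, n_inf):
    """Difference analogue of the external force: f = K - r*N*(1 - N/n_inf)."""
    return [k - r * n * (1.0 - n / n_inf) for n, k in zip(totals, daily)]


def count_local_peaks(series):
    """Number of interior points strictly larger than both neighbours."""
    return sum(1 for i in range(1, len(series) - 1)
               if series[i - 1] < series[i] > series[i + 1])


if __name__ == "__main__":
    # Assumed illustrative forcing: a 7-day oscillation with amplitude 30 cases/day.
    force = [30.0 * math.sin(2.0 * math.pi * day / 7.0) for day in range(60)]
    totals, daily = simulate_forced_logistic(0.2, 15000.0, 100.0, force)
    recovered = external_force(totals, daily, 0.2, 15000.0)
    print("max recovery error:", max(abs(a - b) for a, b in zip(force, recovered)))
    print("peaks in daily cases:", count_local_peaks(daily))
```

In practice r and N_inf would first be estimated from the data, so the recovered force also absorbs any error in those estimates.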
Again, our goal is not to solve the equation, but to investigate the relationship between the number of cases per day ($K$) and the total number of infected people ($N$), which is expressed by the algebraic curve resulting from Eq. (14). Let us consider the applicability of this model to the description of the development of the COVID-19 epidemic in the same countries as above: Austria, Switzerland, the Netherlands, Italy, Turkey and South Korea (Fig. 8). The growth rate variability is given by a formula generalizing Eq. (9); Fig. 9 shows the corresponding dependence. Similarly, we can relate the discrepancy between the data and the theory to the "external force" introduced analogously to Eq. (12). Fig. 11 shows the calculated dependences of the external force on the number of patients and on time. Its spectral and probability characteristics are illustrated in Fig. 12. The corresponding graphs are similar to those obtained for the simple logistic curve. We would like to emphasize once again that a larger number of countries, in particular South Korea, are properly described by this model, and that the variability of the logistic model coefficients is qualitatively of the same nature.

... or with non-linear functions depending on the number of cases $N$.

Summarizing the results, we would like to emphasize that, with all its simplicity and crudity, the logistic model properly describes the growth in the number of COVID-19 cases with time. This is illustrated in Fig. 13, which shows the actual data and the logistic curves. It is evident that for many countries the simple logistic equation agrees very well with the available data. The use of the generalized logistic curve improves the agreement significantly, including for the countries for which the logistic model is too crude (for example, South Korea).
It is worth noting that the predicted total number of cases $N_\infty$ in the generalized logistic model is slightly higher than in the simple logistic model, and the approach to the limiting constant value is delayed in time. Fig. 14 illustrates the capabilities of the logistic model for describing the number of sick people per day. On average, the theoretical model describes the real data rather well, but the scatter of points is still not small, and sometimes the deviations can reach 50% and higher, although on average they are less than 50%. These differences are especially evident near the epidemic peak, when it is desirable to have a more accurate prognosis for medical facilities. The extent to which this scatter is better described by other models (such as SIR models) will become clear in the near future, when the results of relevant studies appear.

From the mathematical point of view, the difference in how well the logistic model describes the two characteristics, the total number of cases ($N$) and the number of cases per day ($K$), is to be expected: the curve $N(t)$ is the integral of $K(t)$ and is, therefore, smoother and more deterministic. To describe the dependence $K(t)$, at least on a qualitative level, it is better to use stochastic logistic equations of the Eq. (19) type, where external random forces or random coefficients are introduced. They will help to understand the degree of the data spread and, most importantly, the number of possible large outliers during the epidemic. Such work remains to be done.

The authors understand that a real forecast of the epidemic development requires multifactor models, which include the division of the population into different groups (children, the elderly, etc.) and living conditions (traffic flows between territories, the population density, etc.). Such models should include high-order ODEs and PDEs, taking into account lagging arguments and integral terms.
Such complex models are being developed now, yet we will not consider them here. Nevertheless, the analysis within the framework of simple low-parameter models is important because it allows us to describe the process qualitatively, to understand the role of certain factors, and to identify certain phenomena (stochastization, fractality, nonlinearity) that are also interesting for other branches of physics and technology. In this sense, the results obtained above demonstrate the capabilities of a well-developed logistic model for describing an epidemic of such a grand scale as COVID-19.

Notice sur la loi que la population suit dans son accroissement Mathematical model of infection kinetics and its analysis for COVID-19, SARS and MERS Dynamics of the COVID-19--Comparison between the theoretical predictions and real data Patterns of the COVID19 epidemic spread around the world: exponential vs power laws Logistic growth rate functions Modeling risk of extreme events in generalized Verhulst models Can we predict the occurrence of COVID-19 cases? 
Considerations using a simple model of growth Generalized logistic growth modeling of the COVID-19 outbreak in 29 provinces in China and in the rest of the world A simple mathematical model for the evolution of the corona virus Journal für die reine und angewandte Mathematik Applied problems of mathematical modeling in immunology Mathematical Immunology of Virus Infections Global attractivity of the zero solution for Wright's equation The Verhulst-Like Equations: Integrable OΔE and ODE with Chaotic Behavior Forecasting COVID-19 Mathematical Modeling of the Spread of COVID-19 in Moscow and Russian Regions Solvable delay model for epidemic spreading: the case of Covid-19 in Italy A stochastic differential equation SIS epidemic model Classification of asymptotic behavior in a stochastic SIR model SIRS-model with dynamic regulation of the population: Probabilistic cellular automata approach A multi-stage SIR model for rumor spreading Long-Term Analysis of a Stochastic SIRS Model with General Incidence Rates Dynamical analysis of a diffusive SIRS model with general incidence rate Failure of monotonicity in epidemic models Qualitative analysis of a stochastic SEITR epidemic model with multiple stages of infection and treatment Global dynamics of an epidemiological model with acute and chronic HCV infections Prospects and limits of SIR-type Mathematical Models to Capture the COVID-19 Pandemic Modeling infectious epidemics" A SIDARTHE model of COVID-19 epidemic in Italy Confinement strategies in a simple SIR model Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China Accurate closed-form solution of the SIR epidemic model The comparison of trends in Spain and the Nederland: a Dynamical compartment model of the transmission of Coronavirus Why differential equation based models fail to describe the dynamics of epidemics? 
Data analysis and modeling of the evolution of COVID-19 in Brazil Analysis and forecast of COVID-19 spreading in China Deterministic Critical Community Size For The SIR System and Viral Strain Selection Novel Corona virus Disease infection in Tunisia: Mathematical model and the impact of the quarantine strategy Mathematical modeling of COVID-19 transmission dynamics with a case study of Wuhan A model based study on the dynamics of COVID-19: Prediction and control Characterization of the COVID-19 pandemic and the impact of uncertainties, mitigation strategies, and underreporting of cases in South Korea, Italy, and Brazil" Chaos, Solitons & Fractals this issue in press Predicting turning point, duration and attack rate of COVID-19 outbreaks in major Western countries COVID-19: Estimating spread in Spain solving an inverse problem with a probabilistic model The dynamics of natural selection in dispersal-structured populations Propagation analysis and prediction of the COVID-19 The presented results were obtained with the financial support of the grant of the President of the Russian