title: Logistic wavelets and logistic function: An application to model the spread of SARS-CoV-2 virus infections
authors: Rzadkowski, Grzegorz
date: 2020-10-18

In the present paper, we model the cumulative number of persons reported to be infected by the SARS-CoV-2 virus, in a country or a region, by a sum of logistic functions. For a given logistic function, using Eulerian numbers, we find the zeros of its successive derivatives and their relationship with the saturation level of this function. In a given time series with a potential logistic trend, we use its second differences to determine the points corresponding to these zeros. To estimate the parameters of the approximating logistic function, we define and use logistic wavelets. Then we apply the theory to the cases of SARS-CoV-2 infections in the United States and the United Kingdom.

The mathematical modeling of epidemics has a long history; it began with the Kermack-McKendrick model [8], introduced in 1927. In this seminal paper, the whole population is divided into Susceptible, Infectious, and Recovered sub-populations. Then, ordinary differential equations are formulated that specify the time evolution of the functions representing these sub-populations.

Wavelet analysis is now frequently used to extract information from epidemiological and other time series. Grenfell et al. [7] introduced wavelet analysis for characterizing non-stationary epidemiological time series. Cazelles et al. [1] use the Morlet wavelets for applications in epidemiology. Lavrova et al. [10] modeled the disease dynamics caused by Mycobacterium tuberculosis in Russia using a sum of two logistic functions (3) (bi-logistic model).
SARS-CoV-2 initially emerged in China at the end of 2019; after Chinese scientists identified the sequence of the new virus [17], this information was shared with the international community. Since then, many articles have been published describing, from different points of view, the new SARS-CoV-2 coronavirus and COVID-19, the disease caused by the virus. We point out only some of them. Fokas et al. [4] used a generalization of the logistic function for forecasting the number of individuals reported to be infected with SARS-CoV-2 in different countries. Krantz et al. [9] proposed a two-phase procedure (combining discrete graphs and Meyer wavelets) for constructing the true epidemic growth. A method similar to that of Lavrova et al. [10] was used by E. Vanucci and L. Vanucci [16] for predicting the end date of the COVID-19 epidemic in Italy.

The outline of the present paper is as follows. In Sec. 2, we discuss the basic properties of Riccati's equation, the logistic equation, and the logistic curve. For this purpose, we use Eulerian numbers. Sec. 3 and Sec. 4 are devoted to logistic wavelets. In Sec. 5 we model the cumulative number of persons reported to be infected by SARS-CoV-2 in the United States as a sum of several logistic functions.

We use the following convention for the Fourier transform: [...] (1), where $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.

The logistic equation is defined as
\[ u'(t) = \frac{s}{u_{\max}}\, u\,(u_{\max} - u), \tag{2} \]
where $t$ is time, $u = u(t)$ is the unknown function, and $s$, $u_{\max}$ are constants. The constant $u_{\max}$ is called the saturation level. The integral curve $u(t)$ fulfilling the condition $0 < u(t) < u_{\max}$ is known as the logistic function. After solving (2) we get the logistic function in the following form
\[ u(t) = \frac{u_{\max}}{1 + e^{-s(t - t_0)}}, \tag{3} \]
where $t_0$ is its inflection point, which is related to the initial condition $u(0) = u_0 = \dfrac{u_{\max}}{1 + e^{s t_0}}$, therefore
\[ t_0 = \frac{1}{s} \ln \frac{u_{\max} - u_0}{u_0}. \]

Equation (2) is a particular case of Riccati's equation with constant coefficients
\[ u'(t) = r\,(u - u_1)(u - u_2). \tag{4} \]
The constants $r \neq 0$, $u_1$, $u_2$ can in general be real or complex numbers. Equation (2) corresponds to $r = -s/u_{\max}$, $u_1 = 0$, $u_2 = u_{\max}$.
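The relations between the logistic equation (2), its solution (3), and the inflection point $t_0$ can be checked numerically. The following is only a minimal Python sketch: the function names and the parameter values are hypothetical and are not taken from the paper.

```python
import math


def logistic(t, u_max, s, t0):
    """Logistic function (3): u(t) = u_max / (1 + exp(-s * (t - t0)))."""
    return u_max / (1.0 + math.exp(-s * (t - t0)))


def logistic_rhs(u, u_max, s):
    """Right-hand side of the logistic equation (2): (s / u_max) * u * (u_max - u)."""
    return s / u_max * u * (u_max - u)


def inflection_from_initial_value(u0, u_max, s):
    """Solve u(0) = u_max / (1 + exp(s * t0)) for the inflection point t0."""
    if not 0.0 < u0 < u_max:
        raise ValueError("need 0 < u0 < u_max")
    return math.log((u_max - u0) / u0) / s


# Illustrative values only (assumptions, not data from the paper).
u_max, s, u0 = 1000.0, 0.2, 10.0
t0 = inflection_from_initial_value(u0, u_max, s)

# Central-difference check that the curve satisfies the differential equation.
t, h = 12.0, 1e-5
numeric_slope = (logistic(t + h, u_max, s, t0) - logistic(t - h, u_max, s, t0)) / (2 * h)
model_slope = logistic_rhs(logistic(t, u_max, s, t0), u_max, s)
print(t0, numeric_slope, model_slope)
```

At $t = t_0$ the function takes the value $u_{\max}/2$, which is the property used later to estimate the saturation level from the inflection point.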
If $u(t)$ is a solution of (4), then the $n$th derivative $u^{(n)}(t)$ ($n = 2, 3, 4, \ldots$) of $u(t)$ can be expressed as a polynomial in the function $u(t)$ itself:
\[ u^{(n)}(t) = r^n \sum_{k=0}^{n-1} \left\langle {n \atop k} \right\rangle (u - u_1)^{k+1} (u - u_2)^{n-k}, \tag{5} \]
where $n = 2, 3, \ldots$ and $\left\langle {n \atop k} \right\rangle$ denotes the Eulerian number (the number of permutations of the set $\{1, 2, \ldots, n\}$ having exactly $k$ ascents, $k = 0, 1, 2, \ldots, n-1$; see Graham et al. [6]). The first few Eulerian numbers are given in Table 1.

Formula (5) was discussed at the ICNAAM 2006 conference (September 2006) held in Greece and appeared, with an inductive proof, in [11] (see also [12]). Independently, the formula was considered and proved, with a proof based on generating functions, by Franssens [5]. The polynomial in $u$ of degree $n + 1$ appearing on the right-hand side of (5) is known in the literature as a derivative polynomial. It is easy to see that all $n + 1$ roots of the polynomial are simple and lie in the interval $[u_1, u_2]$. Derivative polynomials have recently been studied intensively.

Formula (5) applied to the particular case of the logistic equation (2), i.e., with $r = -s/u_{\max}$, $u_1 = 0$, $u_2 = u_{\max}$, reads:
\[ u^{(n)}(t) = \left(-\frac{s}{u_{\max}}\right)^{n} \sum_{k=0}^{n-1} \left\langle {n \atop k} \right\rangle u^{k+1} (u - u_{\max})^{n-k}. \tag{6} \]
The polynomial in the variable $u$ of degree $n + 1$ on the right-hand side of (6) is uniform in the following sense.

Remark 1. If $u_0$ is a root of the polynomial on the right-hand side of (6), i.e.,
\[ \sum_{k=0}^{n-1} \left\langle {n \atop k} \right\rangle u_0^{k+1} (u_0 - u_{\max})^{n-k} = 0, \tag{7} \]
then dividing both sides of (7) by $u_{\max}^{n+1}$ we get
\[ \sum_{k=0}^{n-1} \left\langle {n \atop k} \right\rangle \left(\frac{u_0}{u_{\max}}\right)^{k+1} \left(\frac{u_0}{u_{\max}} - 1\right)^{n-k} = 0. \]
Thus $u_0$ is a root of the derivative polynomial on the right-hand side of (6) if and only if $u_0/u_{\max}$ is a root of the polynomial
\[ P_{n+1}(u) = (-1)^n \sum_{k=0}^{n-1} \left\langle {n \atop k} \right\rangle u^{k+1} (u - 1)^{n-k}. \tag{8} \]

Let us write down, using formula (6) and the notation of (8), the first few derivatives of the logistic function that fulfills equation (2). By Remark 1 we can assume, without loss of generality, that $u_{\max} = 1$ and $s = 1$.
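The derivative formula for the logistic case can be verified on small examples. The sketch below assumes the formula in the form $u^{(n)} = (-s/u_{\max})^n \sum_{k=0}^{n-1} \left\langle {n \atop k} \right\rangle u^{k+1}(u - u_{\max})^{n-k}$ and computes the Eulerian numbers by their standard recurrence; the function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def eulerian(n, k):
    """Eulerian number <n, k>: permutations of {1, ..., n} with exactly k ascents."""
    if n == 0:
        return 1 if k == 0 else 0
    if k < 0 or k >= n:
        return 0
    # Standard recurrence: <n, k> = (k + 1) <n-1, k> + (n - k) <n-1, k-1>.
    return (k + 1) * eulerian(n - 1, k) + (n - k) * eulerian(n - 1, k - 1)


def logistic_derivative(n, u, u_max=1.0, s=1.0):
    """n-th derivative of the logistic function, expressed through its value u."""
    r = -s / u_max
    return r ** n * sum(
        eulerian(n, k) * u ** (k + 1) * (u - u_max) ** (n - k) for k in range(n)
    )


print([eulerian(4, k) for k in range(4)])  # [1, 11, 11, 1]

# Compare with the directly differentiated expressions for u_max = s = 1.
u = 0.3
print(logistic_derivative(2, u), u * (1 - u) * (1 - 2 * u))
print(logistic_derivative(3, u), u * (1 - u) * (1 - 6 * u + 6 * u * u))
```

The two printed pairs should agree up to rounding, since $u'' = u(1-u)(1-2u)$ and $u''' = u(1-u)(1-6u+6u^2)$ follow from differentiating $u' = u(1-u)$ directly.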
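With $u_{\max} = 1$ and $s = 1$ we obtain successively:
\[ u''(t) = P_3(u) = 2u^3 - 3u^2 + u, \]
\[ u'''(t) = P_4(u) = -6u^4 + 12u^3 - 7u^2 + u, \]
\[ u^{(4)}(t) = P_5(u) = 24u^5 - 60u^4 + 50u^3 - 15u^2 + u, \]
\[ u^{(5)}(t) = P_6(u) = -120u^6 + 360u^5 - 390u^4 + 180u^3 - 31u^2 + u. \]
All roots of the polynomials $P_k(u)$ for $k = 3, 4, 5, 6$ can be calculated explicitly, so the polynomials can be factored and we get
\[ P_3(u) = u(u-1)(2u-1), \qquad P_4(u) = -u(u-1)(6u^2 - 6u + 1), \]
\[ P_5(u) = u(u-1)(2u-1)(12u^2 - 12u + 1), \qquad P_6(u) = -u(u-1)(120u^4 - 240u^3 + 150u^2 - 30u + 1). \]
Therefore the minimal positive roots of the polynomials $P_4(u)$, $P_5(u)$, $P_6(u)$ are, respectively,
\[ \frac{1}{2} - \frac{\sqrt{3}}{6} \approx 0.2113, \qquad \frac{1}{2} - \frac{\sqrt{6}}{6} \approx 0.0918, \qquad \frac{1}{2} - \sqrt{\frac{15 + \sqrt{105}}{120}} \approx 0.0413. \tag{9} \]
Thus, by Remark 1, if for example $t_1$ is the smallest time at which $u'''(t_1) = 0$ ($t_1$ is simultaneously the point where $u''(t)$ attains its maximum), then the value of the logistic function at this point is
\[ u(t_1) = \left(\frac{1}{2} - \frac{\sqrt{3}}{6}\right) u_{\max} \approx 0.2113\, u_{\max}. \tag{10} \]
Similar conclusions can be drawn for the smallest zero of $u^{(4)}(t)$ (polynomial $P_5(u)$) or $u^{(5)}(t)$ (polynomial $P_6(u)$) using the constants (9).

Let the wavelet $\psi_2(x)$ (see Figure 1) be the second derivative of the logistic function $u(x) = \dfrac{1}{1 + e^{-x}}$. Since $u'(x) = -u(u-1)$, by (5) or directly we get
\[ \psi_2(x) = u''(x) = u(u-1)(2u-1), \tag{11} \]
and by (11) it follows that the wavelet has the following exact form
\[ \psi_2(x) = \frac{e^{-x}(e^{-x} - 1)}{(1 + e^{-x})^3}. \tag{12} \]
Changing the variable to $u = \dfrac{1}{1 + e^{-x}}$, $u'(x) = u(1-u)$, in the following three integrals we calculate
\[ \int_{-\infty}^{\infty} \psi_2(x)\,dx = 0, \qquad \int_{-\infty}^{\infty} |\psi_2(x)|\,dx = \frac{1}{2}, \qquad \int_{-\infty}^{\infty} \psi_2(x)^2\,dx = \frac{1}{30}, \tag{13} \]
which proves that $\psi_2(x) \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$. In fact $\psi_2(x) \in S(\mathbb{R})$ (the space of rapidly decreasing functions on $\mathbb{R}$). We will discuss this in the next section. By $\psi_2^+(x)$ we denote the positive part of $\psi_2(x)$, i.e.,
\[ \psi_2^+(x) = \max\{\psi_2(x), 0\}, \]
so that $\psi_2^+(x) = \psi_2(x)$ for $x \le 0$ and $\psi_2^+(x) = 0$ for $x > 0$.

The Fourier transform of $\psi_2(x)$ is as follows: [...] (14). It is well known (see [2]) that a wavelet $\psi(x) \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ should satisfy the following admissibility condition: [...] (15). We will show that for $\psi_2(x)$ the condition (15) is satisfied and that the integral can even be expressed in closed form in terms of the Riemann zeta function. Namely, using (14) and the following formula from Dwight's Tables [3] (item no. 860.519): [...] (16), we have [...].

We generate a doubly-indexed family of wavelets from $\psi_2$ by dilating and translating,
\[ \psi_2^{a,b}(x) = \frac{1}{\sqrt{a}}\, \psi_2\!\left(\frac{x - b}{a}\right), \]
where $a, b \in \mathbb{R}$, $a > 0$, and denote by $\psi_2^{+a,b}(x)$ the positive part of $\psi_2^{a,b}(x)$.

As in the previous section, we define the wavelet $\psi_n(x)$ to be the $n$th ($n = 3, 4, \ldots$) derivative of the logistic function $u(x) = \dfrac{1}{1 + e^{-x}}$. Figure 2 shows the graph of the wavelet $\psi_3(x)$.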
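The basic properties of $\psi_2$ can be checked numerically. The sketch below is hedged in two ways: the values $0$, $1/2$ and $1/30$ in the comments are what the substitution $u = 1/(1+e^{-x})$ yields for the integrals of $\psi_2$, $|\psi_2|$ and $\psi_2^2$, and the factor $1/\sqrt{a}$ in `psi2_ab` is an assumption (the usual $L^2$ normalization of a dilated wavelet). All function names are hypothetical.

```python
import math


def logistic_u(x):
    """Standard logistic function 1 / (1 + exp(-x)), evaluated without overflow."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def psi2(x):
    """Wavelet psi_2 = u'' written through u: u (u - 1) (2u - 1)."""
    u = logistic_u(x)
    return u * (u - 1.0) * (2.0 * u - 1.0)


def psi2_plus(x):
    """Positive part of psi_2 (non-zero only for x < 0)."""
    return max(psi2(x), 0.0)


def psi2_ab(x, a, b):
    """Dilated and translated wavelet; the 1/sqrt(a) factor is an ASSUMPTION."""
    return psi2((x - b) / a) / math.sqrt(a)


def trapezoid(f, lo, hi, steps):
    """Composite trapezoidal rule on a uniform grid."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


x_peak = -math.log(2.0 + math.sqrt(3.0))                        # maximum of psi_2
mean = trapezoid(psi2, -40.0, 40.0, 8000)                       # expected: 0
l1_norm = trapezoid(lambda x: abs(psi2(x)), -40.0, 40.0, 8000)  # expected: 1/2
l2_norm_sq = trapezoid(lambda x: psi2(x) ** 2, -40.0, 40.0, 8000)  # expected: 1/30
print(x_peak, psi2(x_peak), mean, l1_norm, l2_norm_sq)
```

With the assumed normalization the squared $L^2$ norm of `psi2_ab` does not depend on `a` and `b`, which is what makes correlating it against second differences at different scales comparable.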
For the wavelet $\psi_n$, formula (5) gives
\[ \psi_n(x) = u^{(n)}(x) = (-1)^n \sum_{k=0}^{n-1} \left\langle {n \atop k} \right\rangle u^{k+1} (u - 1)^{n-k} = \frac{\sum_{k=0}^{n-1} (-1)^k \left\langle {n \atop k} \right\rangle e^{-(n-k)x}}{(1 + e^{-x})^{n+1}}. \tag{18} \]
By definition, the function $\psi_n(x)$ is an even function for odd $n$ and an odd function for even $n$. The numerator of the expression (18) is a polynomial of degree $n$ in the variable $e^{-x}$, while the denominator has degree $n + 1$. Therefore for any polynomial $p(x)$ we have $\lim_{x \to -\infty} p(x)\psi_n(x) = 0$. Since $\psi_n(x)$ has the symmetry property, also $\lim_{x \to \infty} p(x)\psi_n(x) = 0$. The last conclusion can also be drawn by multiplying the numerator and the denominator of (18) by $e^{(n+1)x}$. From this and from the fact that $\psi_{k+1}(x) = \psi_k'(x)$ for any integer $k \ge 2$ it follows that $\psi_n(x) \in S(\mathbb{R})$ ($n = 2, 3, \ldots$).

By (14) we have $\hat{\psi}_n$ [...] (19). Now using (19) and once again formula (16) we can calculate the integral of the admissibility condition (15) as follows: [...].

As usual, we generate a doubly-indexed family of wavelets from $\psi_n$ by dilating and translating,
\[ \psi_n^{a,b}(x) = \frac{1}{\sqrt{a}}\, \psi_n\!\left(\frac{x - b}{a}\right), \]
where $a, b \in \mathbb{R}$, $a > 0$, $n = 2, 3, \ldots$.

Denote by $y_n^*$ the total cumulative number of individuals reported to be infected up to the $n$th day in a country or a region, and by $y_n$ the 7-day central moving arithmetic average of the sequence $y_n^*$, i.e.,
\[ y_n = \frac{1}{7} \sum_{k=-3}^{3} y_{n+k}^*. \]
We will look, in the sequence $(y_n)$, for points corresponding to the zeros of the second or the third derivative of the logistic function. This is equivalent to detecting the points where the sequence of second differences, $\Delta^2 y_n = y_{n+1} - 2y_n + y_{n-1}$, takes a value close to zero or attains a maximum, respectively. We will find these points either directly, by observing the sequence of second differences $(\Delta^2 y_n)$, or by using the wavelet $\psi_2(x)$ and its positive part $\psi_2^+(x)$.

From the considerations in Sec. 2 and from (10) it follows that the parameter $b$ should be determined as the point where the sequence $(\Delta^2 y_n)$ changes sign. The parameter $a$ should be chosen in such a way that the distance between the zero and the maximum of $(\Delta^2 y_n)$ is approximately $a \ln(2 + \sqrt{3}) \approx 1.317a$, since the logistic function $1/(1 + e^{-x})$ takes the value $\frac{1}{2} - \frac{\sqrt{3}}{6}$ at $x = -\ln(2 + \sqrt{3})$.
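The search for the parameters $a$ and $b$ in the second differences can be sketched as follows. This is only an illustration on a synthetic, noise-free single wave: the helper names are hypothetical, the days are indexed from 0, and the global maximum of the second differences is assumed to belong to the wave of interest, which need not hold for real multi-wave data.

```python
import math

LOG_2_PLUS_SQRT3 = math.log(2.0 + math.sqrt(3.0))  # about 1.317


def centered_moving_average(values, window=7):
    """Centred moving average; entry j of the result belongs to day j + window // 2."""
    half = window // 2
    return [sum(values[i - half:i + half + 1]) / window
            for i in range(half, len(values) - half)]


def second_differences(y):
    """Entry j of the result is y[j + 2] - 2 * y[j + 1] + y[j], centred on index j + 1."""
    return [y[i + 1] - 2.0 * y[i] + y[i - 1] for i in range(1, len(y) - 1)]


def estimate_a_b(raw, window=7):
    """Rough estimates of a and b from the maximum and the sign change of the second differences."""
    y = centered_moving_average(raw, window)
    d2 = second_differences(y)
    offset = window // 2 + 1  # d2[j] refers to day j + offset of the raw series
    j_max = max(range(len(d2)), key=lambda j: d2[j])
    for j in range(j_max, len(d2) - 1):
        if d2[j] > 0.0 >= d2[j + 1]:
            # Linear interpolation of the zero crossing between j and j + 1.
            crossing = j + d2[j] / (d2[j] - d2[j + 1])
            b = crossing + offset
            a = (crossing - j_max) / LOG_2_PLUS_SQRT3
            return a, b
    raise ValueError("no sign change found after the maximum of the second differences")


# Synthetic cumulative series with y_max = 1000, a = 5, b = 30.4 (illustrative values).
raw = [1000.0 / (1.0 + math.exp(-(n - 30.4) / 5.0)) for n in range(80)]
a_hat, b_hat = estimate_a_b(raw)
print(a_hat, b_hat)
```

Because the maximum is located only to the nearest day, the estimate of $a$ is coarse; it is meant as a starting value that can be refined by the wavelet-based matching.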
Thus, we obtain two parameters defining the first logistic function (first wave) approximating the time series $(y_n)$. It remains to determine the third parameter of the first wave, i.e., its saturation level $y_{\max}$. Assuming that $(y_n)$ initially follows a logistic function
\[ y_n \approx y(n) = \frac{y_{\max}}{1 + \exp\left(-\frac{n - b}{a}\right)}, \]
and since by definition it holds
\[ y''(x) = \frac{y_{\max}}{a^2}\, \psi_2\!\left(\frac{x - b}{a}\right) = \frac{y_{\max}}{a\sqrt{a}}\, \psi_2^{a,b}(x), \]
then by (13) we get successively
\[ \int_{-\infty}^{\infty} y''(x)\, \psi_2^{+a,b}(x)\,dx = \frac{y_{\max}}{a\sqrt{a}} \int_{-\infty}^{\infty} \psi_2^{a,b}(x)\, \psi_2^{+a,b}(x)\,dx = \frac{y_{\max}}{60\, a\sqrt{a}}, \qquad \int_{-\infty}^{\infty} y''(x)\, \psi_2^{+a,b}(x)\,dx \approx \sum_n \Delta^2 y_n\, \psi_2^{+a,b}(n), \tag{21} \]
where the value $1/60$ of the last integral follows from (13) because $\psi_2$ is an odd function. Using (21) we can estimate $y_{\max}$ as follows
\[ y_{\max} \approx 60\, a\sqrt{a} \sum_n \Delta^2 y_n\, \psi_2^{+a,b}(n). \tag{22} \]
Parameters $a$ and $b$ can also be estimated by maximizing locally the integral on the left-hand side of (21), in practice the sum that approximates it. Thus we find in the sequence $\Delta^2 y_n$ the best pattern corresponding to the positive part of the wavelet $\psi_2^{a,b}$. To avoid a situation in which the next wave, immediately following the previous one, distorts our findings, we use here the positive part $\psi_2^{+a,b}$ and not the whole wavelet $\psi_2^{a,b}$. The saturation level of the first wave can also be estimated as twice the value of the sequence $(y_n)$ at the point where $(\Delta^2 y_n)$ changes sign (inflection point), or as the value of $(y_n)$ at the point where $(\Delta^2 y_n)$ attains its maximum (zero of the third derivative) multiplied by $1/0.211$.

Having found the values of the parameters $a$, $b$, and $y_{\max}$ for the first wave, we create a new time series by subtracting the first wave from $y_n$, i.e.,
\[ z_n = y_n - \frac{y_{\max}}{1 + \exp\left(-\frac{n - b}{a}\right)}, \]
and with the sequence $z_n$ we proceed in the same way as with $y_n$, calculating successive logistic waves. After this we use the nonlinear Generalized Reduced Gradient method to optimize the values of the saturation levels (but not the $a$'s and $b$'s).

All data were collected from the https://www.worldometers.info/coronavirus/ platform. Let us use the theory to build a model for the total cumulative number of individuals reported to be infected by SARS-CoV-2 in the USA. We took an observation period of 189 days, from March 13 ($n = 1$, the first day when the number of cases exceeded 2,000) to September 17, 2020 ($n = 189$). All calculations were performed in Excel.
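The estimation of the saturation level and the subtraction of a fitted wave can be sketched in a few lines. The sketch rests on assumptions that should be read as such: the wavelet family is taken as $\psi_2^{a,b}(x) = a^{-1/2}\psi_2((x-b)/a)$, which leads to the factor $60\,a\sqrt{a}$ used below; the data are a synthetic, noise-free single wave; days are indexed from 0; and all function names are hypothetical. The paper's own computations were done in Excel and its final optimization step (Generalized Reduced Gradient) is not reproduced here.

```python
import math


def logistic_u(x):
    """Standard logistic function 1 / (1 + exp(-x)), evaluated without overflow."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def psi2_plus_ab(x, a, b):
    """Positive part of the dilated wavelet; the 1/sqrt(a) factor is an ASSUMPTION."""
    u = logistic_u((x - b) / a)
    return max(u * (u - 1.0) * (2.0 * u - 1.0), 0.0) / math.sqrt(a)


def logistic_wave(n, y_max, a, b):
    """One logistic wave y_max / (1 + exp(-(n - b) / a))."""
    return y_max * logistic_u((n - b) / a)


def estimate_y_max(y, a, b):
    """Saturation level from the correlation of second differences with psi_2^{+a,b}."""
    total = 0.0
    for n in range(1, len(y) - 1):
        d2 = y[n + 1] - 2.0 * y[n] + y[n - 1]
        total += d2 * psi2_plus_ab(n, a, b)
    return 60.0 * a * math.sqrt(a) * total


def subtract_wave(y, y_max, a, b):
    """Residual series z_n = y_n - wave(n), to be searched for the next wave."""
    return [y[n] - logistic_wave(n, y_max, a, b) for n in range(len(y))]


# Synthetic series with known parameters (illustrative values only).
true_y_max, a, b = 1000.0, 5.0, 30.4
y = [logistic_wave(n, true_y_max, a, b) for n in range(80)]
y_max_hat = estimate_y_max(y, a, b)
z = subtract_wave(y, y_max_hat, a, b)
print(y_max_hat, max(abs(v) for v in z))
```

On real data the same steps would be repeated on the residual series for each further wave; the small bias of the estimate on the synthetic series comes from replacing the second derivative by second differences.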
Using the above-described procedure we obtained the approximating function as a sum of the following waves: $f(x) =$ [...] (23).

Figure 3 shows the positive part of the (scaled) wavelet $\psi_2^{+a,b}$, $a = 25$, $b = 6.6$, fitted to the second differences $(\Delta^2 y_n)$. Figure 4 shows the total cumulative number of individuals reported to be infected by SARS-CoV-2 in the USA in the period from March 13 ($n = 1$) to September 17 ($n = 189$) (blue points) and the approximating function $f(x)$ (23) (red points). The [...] $|y_n - f(n)| / (y_n + f(n))$ [...] $= 0.01195$.

The same procedure applied to the total cumulative number of individuals reported to be infected by SARS-CoV-2 in the United Kingdom for the period of 201 days from March 13 ($n = 1$) to September 29 ($n = 201$) gives the following approximating function (see Figure 5): [...]. The approximation errors are: MAD = 1,443.37, SMAPE$_1$ = 0.05955 (for the shorter period of 182 days, beginning from April 1, SMAPE$_1$ = 0.007123), SMAPE$_2$ = 0.003047.

In this paper we have proved, using the Eulerian numbers, properties of the logistic function related to the zeros of its successive derivatives. We also used logistic wavelets to estimate the parameters of the logistic curve that best fits a given time series with a potential logistic trend. Then we showed, based on the data from the United States and the United Kingdom, that the total reported number of SARS-CoV-2 infections can be modeled, in a natural way, as a sum of several logistic functions. The theory and the procedure can be applied to model the number of infections in any country or region.

In our further work, we intend to use, in a similar way, the logistic wavelets of higher order (see Sec. 4). Using some appropriate special numbers we are going to define analogous wavelets for the Gompertz function (for some initial calculations see [13], [14]) or for the fractional logistic functions (for some preliminary theorems see [15]).
[1] Wavelet analysis in ecology and epidemiology: impact of statistical tests
[2] Philadelphia: SIAM, 1992. CBMS-NSF Regional Conference Series in Applied Mathematics
[3] Tables of integrals and other mathematical data
[4] Mathematical models and deep learning for predicting the number of individuals reported to be infected with SARS-CoV-2
[5] Functions with derivatives given by polynomials in the function itself or a related function
[6] Concrete Mathematics: A Foundation for Computer Science
[7] Travelling waves and spatial hierarchies in measles epidemics
[8] A contribution to the mathematical theory of epidemics
[9] True epidemic growth construction through harmonic analysis
[10] Bi-logistic model for disease dynamics caused by Mycobacterium tuberculosis in Russia
[11] Eulerian numbers and Riccati's differential equation
[12] Derivatives and Eulerian numbers
[13] On some connections between the Gompertz function and special numbers
[14] The Gompertz function and its applications in management
[15] Some applications of the generalized Eulerian numbers
[16] Forecast Covid-19 end date in Italy by logistics waves
[17] A pneumonia outbreak associated with a new coronavirus of probable bat origin

This research was partially funded by the 'IDUB against COVID-19' project granted by the Warsaw University of Technology (Warsaw, Poland) under the program Excellence Initiative: Research University (IDUB).