key: cord-0822470-xmq502gm
authors: Cherednik, I.
title: A surprising formula for the spread of Covid-19 under aggressive management
date: 2020-05-02
journal: nan
DOI: 10.1101/2020.04.29.20084483
sha: a4a68df7030045d3eca3351ff050141d0630a521
doc_id: 822470
cord_uid: xmq502gm

We propose an algebraic-type formula that describes with high accuracy the spread of the Covid-19 pandemic under aggressive management for the periods of intensive growth of the total number of infections. The formula can be used as a powerful forecasting tool. The parameters of the theory are the transmission rate, reflecting the viral fitness and the "normal" frequency of contacts in the infected areas, and the intensity of prevention measures. The duration of the period of intensive growth is essentially inversely proportional to the square root of the intensity of hard measures. A more precise formula is based on Bessel functions. The data for the USA, UK, Sweden and Israel are provided.

Power law of epidemics. The simplest equation for the spread of communicable diseases results in exponential growth of the number of infections, which is mostly applicable to the initial stages of epidemics. See e.g. [1, 2, 3, 4] here and below, and [5] about some perspectives with Covid-19. We focus on the middle stages, where the growth is no greater than some power function of time, which requires a different approach and different equations. The equally classical logistic models of the spread, as well as the SIR and SID models and their generalizations, assume that the number of infections is comparable with the whole population, which we do not impose. Major new epidemics were not really of this kind during the last 100 years, which is obviously due to better disease control worldwide. The reality now is power-type growth of the total number of infections $U(t)$ after a possible short period of exponential growth, Covid-19 included. Our approach is based on this assumption.
Generally, the rate of change $dU(t)/dt$ of the total number of infections is mostly related to $U(t) - U(t-p)$, where $p$ is the period when the infected people spread the virus in the most intensive way. Assuming that $U(t) = t^{\alpha}$, we have $dU(t)/dt = \alpha t^{\alpha-1}$, and the leading term of $U(t) - U(t-p)$ is $p\alpha t^{\alpha-1} = p\alpha U(t)/t$, i.e. it is essentially proportional to $dU(t)/dt$. However, if $\alpha > 1$, there will be other terms in the expansion of $U(t) - U(t-p)$, and the proportionality with $U(t)/t$ can hold only if either the virus transmission strength diminishes over time or we reduce our contacts over time when the total number of cases grows faster than linearly. With Covid-19, we attribute it mostly to the latter, to the reduction over time of the contacts of infected individuals with the rest of the population, i.e. to behavioral and sociological factors. This is different from other power laws for infectious diseases; compare e.g. with [3].

Generally, if someone wants to "see" the trend of the epidemic using only the total number of infections to date, then $U(t)/t$ is of course the most natural quantity to look at. This is qualitative, but the corresponding differential equation $dU(t)/dt = c\,U(t)/t$ immediately gives $U(t) = C t^{c}$ for some constant $C$, i.e. it results in power growth. This can really be seen for Covid-19 and other epidemics before the active management begins. The coefficient $c$ is approximately $c \approx 2$ for Covid-19 initially; when the management begins, $c$ drops, and then the Bessel functions describe the increase of $U(t)$ until the "saturation", which is a technical end of the epidemic, though of course not its real end.

Some details. The full theory is presented in [6]. If the number of new infections is proportional to the current number of those infected, then exponential growth of the spread is guaranteed.
By analogy with news impact over time from [7], and similar sociological-type processes, it is quite likely that the number of such contacts is proportional to the current total number of infected individuals to date divided by the time to date. The coefficient of proportionality is essentially the intensity of the spread. Sociologically, it is related to the intensity of the discussion of the epidemic by the authorities in charge and everywhere else, which directly influences our understanding of the gravity of the situation and results in the reduction of our contacts. This can happen even before any active prevention measures begin. In the equation $dU(t)/dt = c\,U(t)/t$ for the total number of infections $U(t)$, the coefficient of proportionality $c$, the basic transmission rate, is therefore a combination of the transmission strength of the virus and the "normal" frequency of contacts in the infected place.

There can be some other mechanisms for the "power law", including biological ones. Self-isolation of infected individuals is common not only among humans (rabies being an exception); it can grow over time and as the intensity of the spread increases. Another possible mechanism can be due to the replication processes of viruses, but this is well beyond this article. Anyway, a sociological approach to the spread, which "explains" under some assumptions the power growth of the total number of cases, is quite natural in our work, because the active management of epidemics is clearly of a sociological nature, applicable only to humans.

The coefficient $c$ is one of the two main mathematical parameters of our theory. Practically, it can be seen as follows. Before the prevention measures are implemented, it reveals itself approximately in the growth $\sim t^{c}$ of the total number of detected infections, where $t$ is time. For Covid-19, $c$ is around 2, which results in quadratic growth of the total number of infections after a short period of exponential growth.
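The way $c$ "reveals itself" in the growth $\sim t^{c}$ can be illustrated by a log-log fit: if $U(t) = C t^{c}$, then $\log U$ is linear in $\log t$ with slope $c$. The following is only a minimal sketch under that assumption; the function name `fit_power_law` and the data are hypothetical (the series is synthetic, generated from $U = 3t^{2}$, and is not real case data).

```python
import math

def fit_power_law(times, totals):
    """Least-squares fit of log U = log C + c * log t; returns (C, c)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(u) for u in totals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c = sxy / sxx                      # slope of the log-log line
    C = math.exp(mean_y - c * mean_x)  # intercept, mapped back from logs
    return C, c

# Synthetic illustration only: totals following U(t) = 3 t^2 exactly.
times = [0.5 + 0.1 * k for k in range(20)]
totals = [3.0 * t ** 2 for t in times]

C_hat, c_hat = fit_power_law(times, totals)
print(C_hat, c_hat)
```

With real data the fit would only be meaningful on the stretch before active management begins, and the choice of the starting day shifts $t$ and hence the estimate of $c$.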
Upon the active management of Covid-19, the growth quickly becomes $t^{c/2}$, i.e. essentially linear, which is part of our theory in [6]. This can be observed in many countries.

Ending epidemics. The "power law" is only a starting point of our analysis. The main problem is of course to "add" here some mechanisms ending epidemics and then to prevent their possible recurrence. These are major challenges, biologically, psychologically, sociologically and mathematically. One can expect this megaproblem to be well beyond the power law itself, but we demonstrate that mathematically there is a path based on Bessel functions. The power growth alone obviously cannot lead to any saturation. We are of course fully aware of the statistical nature of the problem, but the formula we propose for the growth of the total number of infections works almost with the accuracy of fundamental physics laws. This is very surprising for such stochastic processes as epidemics.

medRxiv preprint doi: https://doi.org/10.1101/2020.04.29.20084483; this version posted May 2, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

An important outcome of our modeling is that the measures of "hard type", like detecting and isolating infected people and closing the places where the spread is almost inevitable, are the key for ending an epidemic. Moreover, such measures must be employed strictly proportionally to the current numbers of infections, not to their derivatives of any kind, which is the most aggressive "momentum" way to react to them, the "hardest" possible, as we will explain now. For instance, if the number of infections doubled during some period, then the operation formula from [6] requires that testing must be increased 4-fold.
Assuming that we approach the saturation, where the number of new infections becomes almost zero, the "hard way" is to keep testing at the same level as before, i.e. to perform the same number of tests every day. This is in spite of the almost zero number of new infections. Practically, this is not always the case. If we react to the average number instead of the absolute number of infections, then we are supposed to stop testing and other measures when there are no new "cases". This is the "soft way". If this is coupled with "soft" measures (see below), then mathematically the epidemic will never reach the saturation point.

The u-formula. The intensity of "hard" measures will be denoted by $a$ in this article. Upon composing the corresponding differential equations and integrating them we obtain the following. The function

$$u(t) = C\, t^{(c+1)/2}\, J_{(c-1)/2}\big(\sqrt{a}\, t\big)$$

for the Bessel function $J_{\alpha}(x)$ of the first kind models the growth of the total number of infections, where $C$ is the scaling parameter necessary to adjust this formula to the actual numbers. Here and below, time is normalized as $t$ = days/10, where the days are counted from the beginning of the period of intense growth of the epidemic. This function matches the total number of infections with high accuracy.

The parameter $a$ is 0.2 for the USA and UK; it can be about 0.3 for countries with a somewhat more proactive approach, like Austria and Israel, and it becomes 0.15 for "the world" and even 0.1 for Sweden. The basic transmission rates are: $c = 2.2$ for the USA, $c = 2.4$ for the UK and Sweden, and it is currently 2.8 for "the world". Recall that $c$ approximately corresponds to the growth $\sim t^{c}$ of total infections at the beginning of the epidemic, when no protection measures are in place.

Limitations. Importantly, since $J_{\alpha}(x)$ is proportional to $x^{\alpha}$ for small $x$, the function $u(t)$ is proportional to $t^{(c+1)/2+(c-1)/2} = t^{c}$ for $t$ near $t = 0$, which agrees with the initial growth $\sim t^{c}$, where $c \approx 2$. So the phase of "parabolic growth" is generally covered well by the u-formula too, though formally this does not follow from the theory in [6].
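To make the u-formula concrete, here is a minimal numerical sketch. It assumes the form $u(t) = C\,t^{(c+1)/2} J_{(c-1)/2}(\sqrt{a}\,t)$ and evaluates $J_{\alpha}$ by its standard power series, using only the Python standard library; the helper names `bessel_j` and `u` are hypothetical, introduced here for illustration. It checks two things: the small-$t$ behaviour $u(t) \approx \mathrm{const}\cdot t^{c}$, and the value at $t = 2.9$ (29 days after March 17) for the USA parameters $c = 2.2$, $a = 0.2$, $C = 1.7$ quoted in this paper.

```python
import math

def bessel_j(nu, x, terms=40):
    """J_nu(x) for x > 0 from its power series (adequate for moderate x)."""
    half = x / 2.0
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m * half ** (2 * m + nu)
                  / (math.factorial(m) * math.gamma(m + nu + 1)))
    return total

def u(t, a, c, C=1.0):
    """Assumed u-formula: C * t^((c+1)/2) * J_((c-1)/2)(sqrt(a) * t)."""
    return C * t ** ((c + 1) / 2) * bessel_j((c - 1) / 2, math.sqrt(a) * t)

a, c, C = 0.2, 2.2, 1.7           # USA parameters quoted in the text
nu = (c - 1) / 2

# Small-t check: u(t) / t^c should approach C * (sqrt(a)/2)^nu / Gamma(nu + 1).
small_t_ratio = u(0.01, a, c, C) / 0.01 ** c
small_t_limit = C * (math.sqrt(a) / 2) ** nu / math.gamma(nu + 1)

# Value 29 days after March 17, in units of 100K infections.
u_april_15 = u(2.9, a, c, C)

print(small_t_ratio, small_t_limit, u_april_15)
```

Under these assumptions the model value at $t = 2.9$ comes out at about 6.1, i.e. roughly 610K infections, which can be compared with the 609516 reported for 04/15 later in the text.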
Of course there can be other reasons for $u(t)$ to "serve epidemics"; it is not impossible that there are connections to the replication process of viruses, but this we do not touch upon. As always, there are limitations, which we will address now.

First and foremost, the available infection numbers are for the detected cases, which are mostly symptomatic. However, this is not of much concern for us. We understand managing epidemics sociologically. Those who are detected mostly have symptoms, but the management is mostly focused on them too. So all our arguments are fully applicable within this group; we just restrict ourselves to symptomatic cases. No assumptions on asymptomatic cases are necessary for the u-formula. We of course understand that when the number of new reported infections drops to zero, there can be many non-detected asymptomatic cases, which can potentially lead to the recurrence of the epidemic. Such "saturation" is only a technical end of the epidemic.

The second reservation concerns newly emerged clusters of infections and the countries where the spread is on the rise. The u-formula can be used in spite of such fluctuations, but it is then a statistical tool; see Figure 5. The predictions must be regularly updated.

The third reservation is related to the management of the epidemic. Not all countries employ the protection measures in similar ways, but this is not a problem for us. The problem arises if the intensity of the measures and the criteria are changed in some unpredictable ways. Diminishing the "hard" measures too early or even dropping them completely at later stages is quite possible.

Last but not least is the data quality.
Frequent changes in the ways the data are collected and in the criteria make such data useless for us. However, if the number of detected infections is underreported in some regular way, whatever the reasons, such data can generally be used. Thus this requirement is sufficiently relaxed, but the data for several countries, not too many, are not suitable for our "forecasting tool".

Forecasting the spread. With these reservations, the first point of maximum $t_{top}$ of $u(t)$ is a good estimate for the duration of the epidemic, and $u(t_{top})$ is the corresponding estimate for the top value of the total number of infections. This gives an important "forecasting tool", assuming that "hard measures" are the key in the practical management of the epidemic.

We note that the approximate reflection symmetry of $du(t)/dt$ for $u(t)$ in the range from $t = 0$ to $t_{top}$ can be interpreted as Farr's law of epidemics under aggressive management. Generally, the portions of the corresponding graph before and after the turning point are supposed to be essentially symmetric to each other. This is not exactly true for $du(t)/dt$, but close enough. See Figure 2 and the other ones; the turning point is at $\max\{du(t)/dt\}$.

Like any model, ours is based on various simplifications. We assume that the number of people susceptible to the virus is unlimited, i.e. we do not consider epidemics with the number of infections comparable with the whole population, nor herd immunity. Also, we disregard the average duration of the disease and that of the quarantine periods imposed. The total number of infections regardless of the outcome is what we are going to model, which is commonly used.
In spite of all these assumptions, the u-formula works unexpectedly well. We mention here a strong connection with behavioral finance; see [7]. For instance, practically the same $u(t)$ in terms of Bessel functions as above models "profit taking" in stock markets. The polynomial growth of $u(t)$ is parallel to the "power law" for share prices.

There is a long history and there are many aspects of mathematical modeling of epidemic spread; see e.g. [1] for a review. We restrict ourselves to the dynamics of momentum management of epidemics, naturally focusing mostly on the middle stages, when our actions must be as precise as possible. The two basic modes we consider are essentially as follows: (A) aggressive enforcement of the measures of immediate impact reacting to the current absolute numbers of infections, and equally aggressive reduction of these measures when these numbers decrease; (B) a more balanced and more defensive approach, when mathematically we react to the average numbers of infections to date and the employed measures are of a more indirect and palliative nature.

Hard and soft measures. The main examples of (A)-type measures are: prompt detection and isolation of infected people and of those at high risk of being infected, and closing places where the spread is the most likely. Actually, the primary measure here is testing; the number of tests is what we can really implement and control. The detection of infected people is its main purpose, but the number of tests is obviously not directly related to the number of detections, i.e. to the number of positive tests. The efficiency of testing requires solid priorities, a focus on the groups with the main risks, and solving quite a few problems. Even a simple listing of the problems with testing, detection and isolation is well beyond our article. However, numerically we can use the following.
During epidemics, essentially during the stages of linear growth, which are the key for us, the number of positive tests can mostly be assumed to be a stable fraction of the total number of tests. This is demonstrated in Figure 1. Such proportionality can be seen approximately from March 16. However, at the end of this chart, testing was reduced below the levels required for mode (A); it must remain at least constant until the saturation of the number of total cases.

Measures like wearing protective masks, social distancing, recommended self-isolation and restricting the size of events are typical for (B). This distinction heavily impacts the differential equations we obtained in [6]. However the main difference between the modes, (A) vs.

Hard measures are the key. Generally, the (A)-type approach provides the fastest possible and "hard" response to the changes in the number of infections, whereas under (B) we somewhat postpone our actions until the averages reach proper levels, and the measures we implement are "softer". Mathematically, the latter way is better protected against stochastic fluctuations, but (B) is slower and cannot alone lead to the termination of the epidemic, which we justify mathematically within our approach.

The main objective of any management of epidemics is to end them quickly. However, the excessive usage of hard measures can lead to the recurrence of the epidemic, some kind of "cost" of our aggressive interference in a natural process. This can be avoided only if we continue to stick to the prevention measures as much as possible even when the number of new infections goes down significantly.
Reducing them too much at the first signs of improvement is a way to the recurrence of the epidemic, which we see mathematically within our approach.

Some biological aspects. The viral fitness is an obvious component of the transmission rate $c$. Its diminishing over time can be expected, but this is an involved matter. It can happen because of virus replication errors. The RNA viruses, Covid-19 included, replicate with fidelity that is close to error catastrophe. See e.g. [8] for a review and some predictions. Such matters are well beyond this paper, but one biological aspect must be mentioned, concerning the asymptomatic cases. The viruses mutate at very high rates. They can "soften" over time to better coexist with the hosts, though fast and efficient spread is of course the "prime objective" of any virus. Such softening can result in an increase of asymptomatic cases, which are difficult to detect. So this can contribute to the diminishing of $c$ that we observe, though this is not because of a decrease of the spread of the disease. We model the available (posted) numbers of total infections, which mostly reflect the symptomatic cases.

To summarize, it is not impossible that the replication errors and "softening the virus" may result in diminishing $c$ at later stages of the epidemic, but we think that the reduction of the contacts of infected people with the others dominates here, which is directly linked to behavioral science, sociology and psychology.

3. The formula for the growth. We will need the definition of the Bessel functions of the first kind:

$$J_{\alpha}(x) = \sum_{m=0}^{\infty} \frac{(-1)^{m}}{m!\,\Gamma(m+\alpha+1)} \left(\frac{x}{2}\right)^{2m+\alpha}.$$

The key point is that measures of type (A) have "ramified" consequences, in some contrast to mode (B).
Namely, an isolated infected individual will not transmit the virus to many people, and the number of those protected due to this isolation grows over time. Combining this with our understanding of the power law of epidemics, we arrive at the differential equation for the total number of infections $u(t)$:

$$\frac{d^{2}u(t)}{dt^{2}} = c\,\frac{d}{dt}\left(\frac{u(t)}{t}\right) - a\,u(t),$$

which is satisfied by $u(t) = C\,t^{(c+1)/2} J_{(c-1)/2}(\sqrt{a}\,t)$.

There is a surprisingly perfect match of the total number of infections for Covid-19 in the USA and UK until April 14 with our solutions $u(t)$ above. This is from the moment when these numbers begin to grow "significantly", approximately around March 16 for the USA and UK. Epidemics are very stochastic processes, so such precision is surprising. The site https://ourworldindata.org/coronavirus is mostly used for data, updated at 11:30 London time. We take $x$ = days/10.

The USA data. The scaling coefficient 1.7 in Figure 2 is adjusted to match the real numbers. For the USA, we set $y$ = infections/100K, and take March 17 as the beginning of the period of "significant growth". The parameters are $c = 2.2$, $a = 0.2$. The red dots show the corresponding actual total numbers of infections. They perfectly match $u(t) = 1.7\,t^{1.6} J_{0.6}(t\sqrt{0.2})$ in Figure 2, which results in the following: the number of cases in the USA can be expected to reach its preliminary saturation at $t_{top} = 4.85$ (48.5 days from 03/17, i.e. around May 5) with $u_{top} = 10.3482$, i.e. with 1034820 infections (it was 609516 at 04/15). This is of course a projection: about 1M total cases at the saturation point $t_{top}$ near May 5. The black dots show the test period, until April 29.
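The saturation point for the USA can be reproduced numerically. The sketch below assumes the form $u(t) = C\,t^{\nu+1} J_{\nu}(\sqrt{a}\,t)$ with $\nu = (c-1)/2$ and uses the standard identity $\frac{d}{dx}\big[x^{\nu} J_{\nu}(x)\big] = x^{\nu} J_{\nu-1}(x)$, which gives $du/dt \propto J_{\nu}(x) + x J_{\nu-1}(x)$ with $x = \sqrt{a}\,t$; the first zero of this expression is located by a scan followed by bisection. The helper names `bessel_j`, `u` and `first_maximum` are hypothetical, and the series evaluation is an illustrative choice, not the method used for the figures.

```python
import math

def bessel_j(nu, x, terms=40):
    """J_nu(x) for x > 0 from its power series (adequate for moderate x)."""
    half = x / 2.0
    return sum((-1) ** m * half ** (2 * m + nu)
               / (math.factorial(m) * math.gamma(m + nu + 1))
               for m in range(terms))

def u(t, a, c, C=1.0):
    """Assumed u-formula: C * t^((c+1)/2) * J_((c-1)/2)(sqrt(a) * t)."""
    return C * t ** ((c + 1) / 2) * bessel_j((c - 1) / 2, math.sqrt(a) * t)

def first_maximum(a, c, step=0.01):
    """First t > 0 with du/dt = 0; assumes c > 1 so that nu - 1 > -1."""
    nu = (c - 1) / 2

    def g(x):
        # du/dt is a positive multiple of g(x), where x = sqrt(a) * t.
        return bessel_j(nu, x) + x * bessel_j(nu - 1, x)

    lo = step
    while g(lo + step) > 0:          # scan until the sign change is bracketed
        lo += step
    hi = lo + step
    for _ in range(60):              # bisection inside the bracket
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2 / math.sqrt(a)

# USA parameters quoted in the text: c = 2.2, a = 0.2, scaling C = 1.7.
t_top = first_maximum(0.2, 2.2)
u_top = u(t_top, 0.2, 2.2, 1.7)
print(t_top, u_top)
```

Under these assumptions the computed maximum lies close to the quoted $t_{top} = 4.85$ and $u_{top} \approx 10.35$ (in units of 100K infections). The same function can be called with the parameters given for the other countries, e.g. `first_maximum(0.2, 2.4)` for the UK.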
The predictions are of course based on the assumption that the intensity of hard measures continues to be proportional to the total number of detected infections to date, as was clearly the case for the red dots. Jumps like the one at about $x = 4.1$ can occur for various reasons that are impossible to forecast, as can some period of linear growth after it. However, the general trend of the black dots matches our $u(t)$ well enough.

Obviously $t = t_{top}$ cannot be the end of the epidemic. The data from South Korea, and those from other countries that went through the "saturation", demonstrate that a linear growth of the total number of cases can be expected around and after $t_{top}$, with periodic fluctuations. The obvious reasons are: (a) reducing the "hard" and "soft" measures, (b) no country is isolated from infections from other places, (c) continued testing can result in finding more asymptomatic cases, (d) new clusters of the disease are always possible. Anyway, fluctuations closer to the "saturation" and after it are quite likely. Resuming the corresponding measures is possible, but this is not always done. Our formula is just a forecasting tool, which is mostly applicable to the period when the protection measures, especially the "hard" ones, are applied in a regular manner. Then the match with the real data is good.

Covid-19 in UK. One of the reasons for the strong match of the red dots and our $u(t)$ can be that the USA consists of 50 states, and therefore the total number of infections is quite an average. However, the match is no worse for the UK. The data will be from 03/16 until 04/15; add 18 to our "red dots", the initial number at 03/16, to match the actual total numbers.
The black dots constitute the control period: 04/16-04/29. Now $c = 2.4$, $a = 0.2$ work fine, and the scaling coefficient is 2.2. The total number of cases is divided by 10K, not by 100K as for the USA. The expectation is now as follows: the "saturation moment" can be 5.17, i.e. about 51 days after March 16, somewhere around May 6. The estimate for the corresponding number of infections is about 170000, with all the ifs. This is assuming that the "hard" measures will be employed at the same pace as before April 15. The black dots confirm the trend.

Sweden 03/07-04/23. This is an example of a country that remains essentially "open". Actually, they actively do testing followed by the isolation of infected people, the key "hard" measure from our perspective. Also, the strength of the health care in this country must be taken into consideration, as well as the fact that it is surrounded by countries that fight Covid-19 aggressively. The growth of the total number of cases was essentially quadratic for a relatively long period, which is what the "power law" states for epidemics with minimal "intervention". By now, the growth is linear, which means that the measures they use appear to be working. Interestingly, our u-formula is applicable, but with a record low $a = 0.1$; it is $a = 0.15$ for "the world". Here $y$ is the total number of cases divided by 1000; add 137, the initial value, to our "dots". The projected saturation is around May 17. As always, it is with a reservation about their future policies, which seem sufficiently stable by now. In spite of their "soft" approach, our u-formula obviously can be used; see Figure 4.

Israel: "saturation". The last example we provide is what can be expected when the country went through the "saturation".
The Israeli population is diverse, which has a potential for significant fluctuations of the number of cases and for various clusters of infection. However, its solid response to Covid-19 and good overall health system made the growth of the spread sufficiently predictable. Generally, for small countries the fluctuations can be expected to be higher than for countries like the USA and UK, though it depends on how "homogeneous" they are. We divide the total number of cases by 1000, as for Sweden; see Figure 5. The red dots begin on March 13, when the total number of detected infections was 96, and stop on April 17; the remaining period until April 29, shown by the black dots, was the "control" one. The saturation forecast went through almost perfectly, but there were significant fluctuations in the process. After April 26, the predicted moment of the saturation, the growth of the total number of (known) infections is supposed to be mildly linear. The parameters are: $a = 0.3$, i.e. the intensity of hard measures is better than in the USA and UK, and $c = 2.6$. The latter means that the initial transmission coefficient was somewhat worse than $c = 2.2$ in the USA and $c = 2.4$ in the UK, possibly due to the greater number of "normal" contacts. Recall that $a$, $c$ are parameters of our theory, related to but not immediately connected with the real factors.

Some discussion. Recall that we consider only "total cases", the numbers of all detected infections, and begin at the moment when the "significant" growth begins, which is also essentially when the active measures start. According to our theory, the match with $u(t)$ above can be expected in some interval around the turning point. However, practically it holds better than this: there is an almost perfect match for about 80% of the whole period of intensive growth.

Our restriction to the period until April 15, 2020 when fixing the parameters $a$, $c$, except for Sweden, is not accidental. It was sufficient for the practical confirmations of the theory presented in Figures 2, 3, and for other countries. The latest data, the black dots, provide the "real-time checks"; they were obtained after the parameters were fixed. The pandemic is far from over, but within the scope of this paper, the data we needed for the "forecasting tool" were essentially present by 04/15.

When the epidemic approaches the saturation, which is the first maximum of $u(t)$ in our model, its management can be expected to evolve toward reducing and abandoning "hard" measures. This can result in a growth like $C t^{c/2}$ with some "mild" $C$ and $c/2 \approx 1$ and with significant fluctuations, which really happens. Such growth at the end is generally covered by our theory; it is not connected with Bessel functions.

The superb match of our $u(t)$ during a significant part of the period of intensive growth of the spread can be of real importance for the practical management of epidemics. On the basis of what we see, the best ways to use the u-curves seem to be as follows: (1) determine $a$, $c$ using about the first 30-40% of the period after the intensive growth begins, (2) update them constantly until the turning point and somewhat beyond, (3) try to adjust the measures at later stages "to stay close to the curve". With (3), a constant response is needed to new clusters of the disease, to jumps in the new cases due to reductions of the measures, and so on. Generally, our forecasting tool can serve best if the data and the measures are as uniform and "stable" as possible. Then underreporting, focusing on symptomatic cases, and inevitable fluctuations with
the data may not influence the match with the u-curves too much; this is of course understood statistically and with the usual reservations. Also, (4) testing the population and some "soft" measures must be continued well after the "saturation" to prevent the recurrence of the epidemic.

[1] The mathematics of infectious diseases
[2] Periodicity in Epidemiological Models
[3] Power-Law models for infectious disease spread
[4] Epidemic psychology: a model
[5] Modeling infectious disease dynamics
[6] Momentum managing epidemic spread and Bessel functions
[7] Artificial intelligence approach to momentum risk-taking
[8] Are RNA viruses candidate agents for the next global pandemic? A review
[9] A Treatise on the Theory of Bessel Functions

Acknowledgements. I'd like to thank ETH-ITS for outstanding hospitality. My special thanks go to Giovanni Felder and Rahul Pandharipande. I also thank David Kazhdan very much for his valuable comments and suggestions. Funding: partially supported by NSF grant DMS-1901796 and the Simons Foundation.