key: cord-0910293-f4g8us3t
authors: Kröger, M.; Schlickeiser, R.
title: Verification of the accuracy of the SIR model in forecasting based on the improved SIR model with a constant ratio of recovery to infection rate by comparing with monitored second wave data
date: 2021-09-22
journal: R Soc Open Sci
DOI: 10.1098/rsos.211379
sha: 6ead1cb456e57b03857a045917f43f4491a3272b
doc_id: 910293
cord_uid: f4g8us3t

The temporal evolution of second and subsequent waves of epidemics such as Covid-19 is investigated. Analytic expressions for the peak time and asymptotic behaviours, early doubling time, late half decay time, and a half-early peak law, characterizing the dynamical evolution of number of cases and fatalities, are derived, where the pandemic evolution exhibiting multiple waves is described by the semi-time SIR model. The asymmetry of the epidemic wave and its exponential tail are affected by the initial conditions, a feature that has no analogue in the all-time SIR model. Our analysis reveals that the immunity is very strongly increasing in several countries during the second Covid-19 wave. Wave-specific SIR parameters describing infection and recovery rates we find to behave in a similar fashion. Still, an apparently moderate change of their ratio can have significant consequences. As we show, the probability of an additional wave is however low in several countries due to the fraction of immune inhabitants at the end of the second wave, irrespective of the ongoing vaccination efforts. We compare with alternate approaches and data available at the time of submission. Most recent data serves to demonstrate the successful forecast and high accuracy of the SIR model in predicting the evolution of pandemic outbreaks as long as the assumption underlying our analysis, an unchanged situation of the distribution of variants of concern and the fatality fraction, do not change dramatically during a wave. With the rise of the α variant at the time of submission the second wave did not terminate in some countries, giving rise to a superposition of waves that is not treated by the present contribution.

MK, 0000-0003-1402-6714; RS, 0000-0003-3171-5079

The temporal evolution of second and subsequent waves of epidemics such as Covid-19 is investigated. Analytic expressions for the peak time and asymptotic behaviours, early doubling time, late half decay time, and a half-early peak law, characterizing the dynamical evolution of number of cases and fatalities, are derived, where the pandemic evolution exhibiting multiple waves is described by the semitime SIR model. The asymmetry of the epidemic wave and its exponential tail are affected by the initial conditions, a feature that has no analogue in the all-time SIR model. Our analysis reveals that the immunity is very strongly increasing in several countries during the second Covid-19 wave. Wave-specific SIR parameters describing infection and recovery rates we find to behave in a similar fashion. Still, an apparently moderate change of their ratio can have significant consequences. As we show, the probability of an additional wave is however low in several countries due to the fraction of immune inhabitants at the end of the second wave, irrespective of the ongoing vaccination efforts. We compare with alternate approaches and data available at the time of submission. Most recent data serves to demonstrate the successful forecast and high accuracy of the SIR model in predicting the evolution of pandemic outbreaks as long as the assumption underlying our analysis, an unchanged © 2021 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

Currently, many countries over the world have to cope with handling the subsequent (second, third …) wave outbreaks of Covid-19 pandemic. The onset of a subsequent wave is most probably caused by a sudden mutation of the virus resulting in an effective (about 40%) increase of the infection rate. Of high medical and economic interest are reliable predictions on the duration, peak days and total amplitudes of both the number of infections and the number of fatalities which are closely related to the maximum need for breathing apparati in hospitals in order not to overburden the medical clinics capacities and to avoid fateful triage decisions.

In the past reasonably accurate (within 50%) predictions on the first Covid-19 wave have been made [1] [2] [3] [4] adopting the Gaussian distribution for the daily rate of new cases and the corresponding cumulative numbers (including both infections and fatalities) for many countries in the world with wellmonitored case rates. By modelling simultaneously both infection and fatality rates the dark number of infections and the degree of herd immunity from the first wave has been determined in several countries [5] by adopting a mortality rate of 0.5%, i.e. one out of 200 infected persons ultimately dies from Covid-19. While the Gauss distribution can be justified both from the central limit theorem of statistics [1] and for early times until the peak time from the susceptible-infectious-recovered/removed (SIR) epidemic model [6] [7] [8] [9] , it is less accurate at late times of the wave evolution as compared to monitored data [10] .

The SIR model [6, 7] describes the time evolution of infectious diseases in human populations, and is the simplest and most fundamental of the compartmental models and its variations. It had been solved numerically using various approaches, including Monte Carlo methods, wavelets, fuzzy control, deep learning etc. and approximate solutions had been proposed [32] [33] [34] . Our recent work [35, 36] improved the analytical modelling of epidemics based on the well-established SIR epidemic model invented nearly 100 years ago.

There are two variants of implementing the SIR model, as described in detail previously [35, 36] . The alltime SIR model must use an initial condition that is compatible with the SIR equations of change, which allows it to 'predict' both the past and present consistently. The semi-time SIR model can be used with any initial condition, but makes predictions only about the future. Often the second variant is used without noticing the inconsistency, while the inconsistency had already been noted by Kendall [6] . Assuming a constant ratio between infection and recovery rates, the all-time SIR model does predict a single wave only, while the semi-time SIR model we are going to use in this work allows us to model many waves with their own specific parameters, that capture only the future development in each case.

For both variants, the considered population of N ≫ 1 persons is assigned to the three compartments s (susceptible), i (infectious) or r (recovered/removed). Persons from the population may progress with time between these compartments with given infection (a(t)) and recovery rates (μ(t)) which in general vary differently with time. As demonstrated before [35] , it is convenient to introduce with I(t) = i(t)/N, S(t) = s(t)/N and R(t) = r(t)/N the infected, susceptible and recovered/removed fractions of persons involved in the infection at time t, with the sum requirement I(t) + S(t) + R(t) = 1. Moreover, if t 0,n denotes starting time of the nth wave with the initial conditions I(t 0,n ) = η and S(t 0.n ) = 1 − η, it is also appropriate to introduce the dimensionless reduced time variable

accounting for arbitrary but given time-dependent infection rates a(t). With the introduction of the reduced time (1.1) the following analysis includes and addresses the effects of non-pharmaceutical interventions (NPIs). These affect the infection rate a(t) by lowering it to lower values providing substantial differences to the simple linear relation τ = a 0 t of the reduced time as a function of the real time t in case of an unchanged initial infection rate a 0 . This has been described and illustrated in detail in § §2.2 and 2.3 of earlier work [10] . For the special and important case of a time-independent ratio K(t) = μ(t)/a(t) = k = const., analytical results of the SIR model have been recently derived [35, 36] both for the all-time and semi-time cases appropriate for the first and subsequent wave evolution, respectively. For a growing epidemics outburst, royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 211379 exhibiting a peak at some time past t 0,n in the future, the constant ratio k has to be smaller than 1 − 2η, where as noted before η = I(t 0,n ) denotes the infected fraction at the starting time t 0,n of the nth wave. The requirement k < 1 − 2η corresponds to the initial infection rate a 0 at time t 0,n being larger than the initial recovery rate μ 0 , both in units of days −1 . Alternatively, for initial ratios k > 1 − 2η, the wave amplitude is purely decaying from its maximum at t 0,n without exhibiting a wave phenomenon [36] . Here, the time t 0.n refers to the observing time when the onset of a new nth wave is recognized by the monitoring of case rates.

It has been demonstrated [36] that the two parameters k and η determine the full evolution of each subsequent wave in reduced time (1.1) including in particular the effects of NPIs which determine the relation between reduced time and real time for non-stationary infection rates. Moreover, it is important to emphasize that in the peak case where k < 1 − 2η the SIR model not only provides a causal connection between the early (before the peak) and late time development of the pandemic wave, but that the parameters k and η obtained from fitting the early time evolution also determine the final and maximum values of the cumulative and daily case rates.

Below, we will derive simple analytic expressions for all measurable amounts of cases and fatalities during a pandemic evolution described by the semi-time SIR model, that share all relevant features with the exact solution of the semi-time SIR model, including time and position of the peak of daily new infections, as well as the asymptotic behaviours at small and large times. The expressions are so precise that they can be used instead of a numerical solution of the SIR model. The advantage of an analytical expression is obvious, as it allows us to quickly determine the SIR parameters from the measured data well ahead of the peak time, and thus allows for predictions that serve as a prerequisite to make decisions. We apply the approach to eight different countries from different continents. We begin by summarizing the exact features of the semi-time SIR model, stating the approximants for the reliably measurable quantities, and collect all the derivations of the new approximant in an appendix. The occurrences of a clearly separated (in time) second wave in these countries provide very suitable data sets to verify the forecasts of the SIR model which is one important motivation for this study here.

Throughout this work, we ignore the effect of vaccination campaigns on the temporal evolution which we investigate elsewhere [37] . In that reference, it is demonstrated that the generalization to solutions with vaccinations relies heavily on very accurate analytical solutions of the SIR model in order to interpolate especially at small values of the ratio of vaccination to infection rates. It is precisely the purpose of this study here to provide such accurate solutions of the SIR model. We furthermore assume an unchanged situation of the distribution of variants of concern and the fatality fraction during a single wave.

In order for the interested reader to comprehend why our analysis is based on predictions made in January 2021, we note that the original version of this manuscript had been received January 16, 2021 by the Royal Society. It was passed over internally to Royal Society Open Science in April 2021. The original forecast was based on data collected on January 11, 2021. For the purpose of this study, the original forecast has not been revised using later data, but later data was added to compare with the original forecast. The original preprint of this manuscript entitled 'Forecast made on January 11, 2021 for the second Covid-19 wave based on the improved SIR model with a constant ratio of recovery to infection rate' is available since January 21, 2021 at preprints.org. The title had been modified as the forecast is not a forecast anymore at the time of publication of this manuscript.

The exact solution of the SIR model with a constant ratio of the recovery to infection rate in the semi-time case [36] (hereafter referred to as KS-SIR model) is given by 

royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 211379

As shown before [36] without explicit inversion of equation (2.1), equations (2.1) and (2.2) yield for the final cumulative number fraction of infected persons J ∞ = lim τ→∞ J(τ) and for the maximum rate of new infections j max occurring at J 0 that

in terms of the principal (W 0 ) and non-principal (W −1 ) solutions of Lambert's equation [35] , the wellknown Lambert's functions. The asymmetry of j(τ) about its peak time is encoded in a difference between J 0 and J ∞ /2. We emphasize again that the integral quantities J ∞ and J 0 refer to the pandemic evolution in reduced time so it includes the effects of NPIs which determine the relation between real and reduced time.

We approximately invert the solution (2.1) by using the constrained Taylor-expansions of the reciprocal integrand nðyÞ ≃ j c ðyÞ ¼ about y c = η and y d = J ∞ up to second order, respectively. Here c 0 , c 1 and d 1 are the respectively positively valued Taylor expansion coefficients given by

and

Hereby, constrained expansion refers to choosing the second-order expansion coefficients c 2 and d 2 such that the respective expansion evaluated at y = J 0 yields the maximum value of the daily case rate, i.e.

with j max from equation (2.6). The requirement (2.12) then readily provides

and According to equations (2.7) and (2.8), we have thus constructed the approximation for the daily rate j(J ) as a function of the cumulative number jðJÞ ≃ c 0 þ c 1 ðJ À hÞ þ c 2 ðJ À hÞ 2 for J J 0

which is continuous at J = J 0 where j(J 0 ) = j max attains its peak value. As shown in the following by demanding that our approximation (2.15) attains the exact maximum value at J = J 0 and a vanishing final value at J = J ∞ we obtain a very high agreement between the approximated analytical and the exact numerical pandemic evolution as a function of the reduced time τ.

As we show next, inserting the approximation (2.15) then allows the analytical inversion of the solution (2.1) providing J(τ) as a function of the reduced time τ which then can be used either in equation (2.15) or equation (2.2) to infer also the rate j(τ) as a function of reduced time. The remaining SIR quantities are then obtained from J(τ) as well via

While we have expressed τ in terms of J above, for all practical purposes one is interested in the reverse relationship, the time τ-dependent behaviour of J and also j. With the approximations (2.7) and (2.8) we obtain for the solution (2.1)

ð2:16Þ

Introducing the peak time scale 

with the dimensionless peak time 

For very late reduced time τ ≫ τ m the rate j(τ), second case in equation (2.22) , approaches the decreasing exponential function in reduced time

with the decay half-reduced time [38] t 1=2 ¼ ln 2

Similarly, for very early reduced time τ ≪ τ m , provided such a regime exists, as its pronounced appearance depends on the values for η and k, the rate j(τ), first case in equation (2.22), is an increasing exponential function in reduced time with the doubling time

defined by j(τ + τ 2 ) = 2j(τ).

The early differential rate exhibits a very interesting feature referred to here as the half-early-peak law. According to equation (2.22) , the rate at half of the peak time j 1/2 = j(τ m /2) is given by

The corresponding cumulative half-early-peak law follows from the first case of equation (2.19) as

With this equation the half-early-peak law (2.26) reads

ð2:29Þ

In case of temporal wave distributions with an apparent peak the half-early-peak law (2.29) provides an important test for the derived parameters of the wave as it relates directly the monitored quantities J(0) = J(τ = 0), J 0 = J(τ m ), j 1/2 , j max and c 3 τ m /2, where the latter can also be written in terms of the ratio between peak time τ m and early doubling time τ 2 as c 3 t m =2 ¼ lnð2 1=4 Þt m =t 2 % 1 6 t m =t 2 .

The rates (2.22) and the cumulative number (2.19) refer to the relative time τ defined in equation (1.1), whereas the monitored data refer to real time t. We therefore adopt, well justified for the initial phase of any new emergent wave, a constant infection rate a(t) = a 0 throughout so that τ = a 0 (t − t 0 ), where for ease of exposition we drop the index n and simply write t 0 = t 0,n . We then obtain in real time for the peak time

royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 211379

Likewise, the pandemic time evolutions in real time follow readily from

and are written down in appendix B. Likewise, the doubling time at early times t ≪ t m and the decay halftime at late times t ≫ t m are given by

whereas the half-early-peak law in real time becomes

This will be made more precise below, as we apply the semi-time SIR model to two waves, each with its own onset.

The above derivations resulted in explicit expressions for the dimensionless fraction J of infected persons and the dimensionless rate of infections j, both in terms of reduced time τ, the inverse reproduction number k and the initial condition J(0) = η. Measured, reliable data is usually available for the total number of deceased persons D(t) as function of time t in units of days, and the total population N can be considered known. Since the number of deaths follows the number of infections with a delay of about 10 days [2] , we can use D(t) to make predictions about the number of infected persons at t − 10 days.

To uniquely determine the model parameters from the data, which allows us then to draw conclusions about the future time-evolution of measurable quantities, we need to assume a fixed relationship between real and reduced time. To this end, we adopt for the time during each of the pandemic waves a constant infection rate a 0 so that t ¼ Ð t t0 djaðjÞ ¼ a 0 ðt À t 0 Þ where t 0 is the real time marking the beginning of the nth wave, and τ the reduced time that vanishes at the beginning of the nth wave.

With the known fatality ratio of f ≈ 0.005, the cumulative number of infected persons (including those that have not been identified) is fD(t)/N. More precisely, J(τ) = f D(t)/N during the first wave. Because the cumulative number accumulates during subsequent waves, it is more convenient to model the measured daily rate _ JðtÞ of newly infected persons instead of the cumulative numbers. More precisely, one has jðtÞ ¼ f _ DðtÞ=a 0 N, and it is this dimensionless j(τ) which we have expressed in terms of k and η above, while η is contained in the initial condition, jð0Þ ¼ hð1 À hÞ ¼ f _ Dðt 0 Þ=a 0 N. For each wave, there are thus three parameters to be determined, k, η and a 0 , or alternatively, k, t 0 and a 0 . In practice, as the measured data is fluctuating considerably, especially at the beginning of a pandemic wave, it turns out to be even more convenient to work with four unknowns, k, t 0 , η, and a 0 , and to determine these parameters upon requiring that the absolute deviation between measured and predicted j(τ) achieves a global minimum for each wave separately. Start values for the global minimization are readily available from our above consideration about the early time evolution, the position of the maximum in real time, t m = t 0 + a 0 τ m , the value of the dimensionless J(τ m ) = J 0 = fD(t 0 + τ m /a 0 )/N, where τ m and J 0 are known in terms of k and η. With the fitted parameters at hand the model-predicted cumulative fraction of infected persons in real time t is

For the sake of clarity, because the number of infections is f −1 times the number of fatalities, and because the number of fatalities is more reliably reported than the true number of infections, we do not add any delay time when presenting figures, so that the time of outbreak of the pandemic can be considered 10 days earlier than t 0 . Similarly, the peak time of daily new infections is roughly 10 days earlier than the tabulated t m , which is the peak time of daily fatalities.

royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 211379

In the following, we determine the two sets of four SIR parameters by fitting the early time evolution before the peaks of first and second waves with the monitored early daily case rates j(t) during the waves. The obtained numbers are tabulated in table 1 for selected countries, while the procedure has been applied to more than 100 countries during the course of this study (supplementary website). The measured data are compared with the SIR predictions in figures 1 and 2. With the four coefficients c 1 , c 2 , t 0 and a 0 determined we then can infer (1) the values of the parameters η and k characterizing the waves, (2) the final number of infected persons NJ ∞ , the maximum daily rate j max at the peak time τ m according to equations (2.3), (2.6) and (2.17) and (3) the late time evolution of the second wave after its peak from equations (2.19) and (2.31) determined by the coefficients d 1 and d 2 given uniquely by equations (2.11) and (2.14) in terms of k, J ∞ , c 0 , c 1 and c 2 , which in turn are given by equations (2.9), (2.10) and (2.13), i.e. finally, in terms of k, the initial condition ɛ, and reduced time τ.

All these results are collected in table 1. In the last column of this table, we have added data that became available in the meantime. Deviations between the early predictions from 11 January 2021 and the reported numbers for the second wave are about 10%. We regard this as a convincing proof that the homogeneous SIR model equations successfully and highly accurately (within 10%) forecast the evolution and outcome of Covid-19 pandemic events. At the same time, the model is not able to capture the onset of a (n + 1)th wave, if it occurs simultaneously with the nth wave. This is evident from the data for Italy shown in figure 1. At the time of submission of the original manuscript, the α variant started to rise in Italy and shortly afterwards gave rise to a third wave. This situation is not captured by the current approach, as it requires a clear time separation between waves. Still, we estimated the second wave to terminate when the new variant started to dominate ( figure 3 ).

Despite the differences in the response of authorities to the emerging pandemic, length and completeness of lockdowns etc. there are very comparable patterns for all countries. Both the inverse reproduction number k and the infection rates a 0 have significantly dropped during the second wave when compared with the first wave. The decreasing k = μ 0 /a 0 tends to increase the peak height, while the decreasing a 0 tends to lower it. The reduction of a 0 comes as a surprise to us, but a decrease of a 0 is in agreement with the observed broadening of the second wave.

One likely explanation may be that the early phases of the second waves in all considered countries occurred under light lockdown conditions whereas no lockdown measures have been taken during the initial phases of the first wave. Moreover, the infection rates a 0 of the second wave have dropped as a result of the overall, increasingly cautious, self-protective behaviour of the population.

While the second peak times are comparable in Italy, Switzerland, France and Russia, they are delayed in Canada, Germany and Great Britain. On the positive side, according to our analysis, the peak time t m of the second wave has passed already in France, Belgium, Italy, Germany, Switzerland and Russia, and more than half of the population in Great Britain will have been infected already after the second wave, thus getting closer to herd immunity. In general, the second wave increased the immunity more significantly than the first wave, as is obvious from the sharp rise of J ∞ between first and second wave (table 1). The last column in this table contains the SIR predictions for the number of fatalities at the end of the second wave. While countries like Germany were hit by a moderately low number of fatalities during the first wave, the number of fatalities will be increasing by a factor of about 7 during the second wave, despite rigorous interventions. Still, the fraction of seriously affected population remains slightly below in this country compared with countries like Switzerland, that did not install a similarly rigorous scenario with hard lockdowns. The measured data is captured by the semi-time SIR model by a relative error of about 5-10%. At present, the situation in Great Britain is obviously the darkest for these sets of data. The trend is most likely caused by the recent, more transmittable variant (SARS-CoV-2 VUI 202012/01) of the virus. As of today (13 January 2021) the peak time in the death rate in Great Britain is still ahead.

Within the semi-time SIR model, the exponential behaviour of the differential rate of newly infected persons j(τ) is characterized by the early doubling time τ 2 and the late half decay times τ 1/2 , which we have included in table 1. Their analytic expressions were given by equations (2.25) and (2.24). It seems that τ 1/2 is for all cases reported in our table approximately equal to the doubling time τ 2 . To see if this is a generic feature of the SIR model, one has to just inspect the ratio τ 2 /τ 1/2 = (J ∞ − 1 + k)/2c 3 , royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 211379 Table 1 . Results from the analysis of nth (first and second) pandemic waves in selected countries based on data from 11 January 2021. The columns contain the population N, SIR parameters k, η, infection rate a 

in agreement with table (1), as k is sufficiently close to unity. To stress an important aspect, and because we can take the opportunity to compare with related recent work [35] , the ratio over the whole k range is shown in figure 4 for various finite η. This ratio reflects the asymmetry of the pandemic wave, as J 0 − J ∞ /2 does. It is important to notice here that the initial condition affects not only the asymmetry of the wave but the asymptotic behaviour qualitatively. The corresponding result for the all-time SIR model is shown for comparison. It is however recovered by the semi-time SIR for sufficiently small η. The appearance and relevance of power law and Gaussian rather than exponential tails, that are not predicted by the SIR model with time-independent k, had been discussed in several recent works [1] [2] [3] [4] 40] . Early outbreak estimates of k and their uncertainty have been discussed in detail [41] .

We have derived simple analytic expressions for all measurable amounts of cases and fatalities during a pandemic evolution described by the semi-time SIR model, that share all relevant features with the exact solution of the semi-time SIR model, including time and position of the peak of daily new infections, as well as the asymptotic behaviours at small and large times. We show, in particular, how the asymmetry of the epidemic wave and its exponential tails are affected by the initial conditions; a feature that has no analogue in the all-time SIR model. The expressions are so precise that they can be used instead of a royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 211379

numerical solution of the SIR model. The advantage of an analytical expression is obvious, as it allows us to quickly determine the SIR parameters from the measured data well ahead of the peak time, and thus allows for predictions that serve as a prerequisite to make decisions. We applied the approach to second waves in eight different countries from different continents. We summarized the exact features of the semi-time SIR model, stated the approximants for the reliably measurable quantities, and collected all the derivations of the new approximant in an appendix. Our analysis reveals that the immunity is very strongly increasing during the second wave, while it was still at a very moderate level of a few per cent in several countries at the end of the first wave. The wavespecific SIR parameters μ 0 and a 0 describing the infection and recovery rates we find to behave in a [39] . Note that all data are not necessarily representative. Sometimes some samples are more likely to be sequenced than others [39] . Data shown for (a) ITA and (b) FRA (time frame: 1 January 2020 to 9 August 2021). The vertical red line marks the date (16 January 2021) at which this manuscript had originally been submitted. This date happens to roughly coincide with the time of rising relevance of the α variant, while the forecast for the second wave done in the present manuscript assumes an unaltered majority of 20A variants.

royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 211379 similar fashion, while their ratio k = μ 0 /a 0 was decreasing only by about 5% for the countries mentioned in table 1 and figures 1 and 2. Still, an apparently moderate change of k can have significant consequences for the relevant numbers like the final number of infected or deceased population, captured by J ∞ . For k close to unity and small η ≪ 1, our results imply j max % 1 2 ð1 À kÞ 2 and J ∞ ≈ 2(1 − k), cf. figure 5. This implies a typical duration of the differential fraction of newly infected persons, w ≈ J ∞ /a 0 j max ≈ 4/ a 0 (1 − k), that decreases dramatically with decreasing k, but increases with decreasing a 0 . It is this qualitative feature of the SIR model that leads to its counterintuitive parameters we reveal having analyzed the two pandemic waves, and that has to be taken into account when speculating about possible additional waves. As we have shown, the probability for an additional wave exceeding the peak fatality rate of previous waves is however low in several countries due to the fraction of immune inhabitants at the end of the second wave (second and third for cases where both waves are strongly overlapping), irrespective of the currently ongoing vaccination efforts.

The SIR-modelling is founded on a mean-field approach where the injection and recovery rates are averaged over a large number of persons in the considered countries. Therefore, the well-established influence of pre-symptomatic and asymptomatic disease carriers on the spread of the Covid-19 is washed-out and interwoven with the far more infected persons with less intense pathegonic transmissions. For this reason, our results should only be applied to populations with large enough sizes.

Making use of data that became available during review time allowed us to test the original forecast. The accuracy of our early predictions on the second wave temporal evolution is convincingly good. We predicted the fraction of infected population at the end of the second wave ( figure 1) Figure 4 . Ratio between early doubling time τ 2 and late half decay time τ 1/2 versus k, for various initial conditions. For comparison, we include the corresponding result [35] for the all-time SIR model, (Exact behaviour (solid lines) of the peak differential fraction of infected persons j max , the final fraction of infected persons J ∞ , and the ratio J ∞ /j max , that characterizes the dimensionless width of the wave. All curves are shown to be well captured by the analytic expressions mentioned in the legends (dashed lines), that we use to discuss qualitatively, in concert with the dimensional infection rate a 0 , the obtained SIR parameters. The exact expressions for j max and J ∞ are given in terms of Lambert's function in equations (2.6) and (2.3) . The plot is done with η = 10 −6 , but it is essentially unaffected by the choice of η as long as η ≪ 1.

royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 211379 predicted 0.29 while the true value is 0.33 (13% off ), for DEU we predicted 0.20 compared with 0.22 (10% off ), for CHE 0.24 versus 0.26 (8% off ) and for GBR 0.41 versus 0.40 (2.5% off ). This is an outstanding agreement between predictions and measurements given all the parameters of the problem. This comparison proves the high accuracy of the SIR model in predicting the evolution of pandemic outbreaks.

At first sight, it seems that we were unsuccessful in forecasting possible further Covid-19 waves in some countries. However, it is too early for such a statement since so far the appearance of such additional waves did not show up clearly in the very reliable monitored death rates in six out of the eight considered countries (with the exception of Russia and Canada) but only in the monitored rate of new infections which are known to be highly incomplete and therefore less trustworthy. Secondly, our analysis of the second waves has been based on the monitored death rates adopting a mortality rate of 1/200 with respect to the rate of new infections, and it has assumed a clear separation of second and potential third wave in time. The model could be extended to consider the occurrence of simultaneous waves, non-local effects, multiple seeds, spreaders or outbreaks to the expense of additional parameters, with the help of numerical simulation [42] [43] [44] [45] [46] [47] [48] [49] [50] [51] [52] [53] [54] [55] [56] .

Data accessibility. All data points shown in this contribution are available from https://doi.org/10.5061/dryad.jsxksn09n.

The data used to create all graphs that mentioned reported numbers were collected from the following website at github: https://pomber.github.io/covid19/timeseries.json. See https://github.com/pomber/covid19 and https:// github.com/CSSEGISandData/COVID-19 and [57] for a description of the data sources. The analysis has been performed for many additional countries. Results are available online at https://www.complexfluids.ethz.ch/ covid19-wave2. and equation (A 15) is solved by 

We next offer two different routes A and B to come up with equivalent, but formally very different expressions for both Y(τ ≤ τ m ) and J(τ ≤ τ m ). Route (A): Using τ m from equation (A 7), we find

According to the first equation (2.15) 

The latter two equations can be combined to yield

Inserting equation (A27) into equation (A19) leads to the alternative expression for Y, Figure 7 . (a,c) Comparison between exact and approximate solutions to the semi-time SIR model, denoted as version II in our earlier work [36] , for various k. (b,d ) Relative error of the approximants shown in (a,b): j(τ) from equation (2.22) , which is identical with dJ/τ and also identical with j(τ) according to equation (2.15) (solid) and j(τ) = n(J(τ)) according to equation (2.2), with J(τ) from equation (2.19 ). The plot is done with η = 10 −6 , but it is qualitatively unaffected by the choice of η, not only for small η ≪ 1. While for small k the error is significantly smaller for this approximant compared to the approximant used in the present work, the present approximant has the neat features that it exactly reaches the exact j max at peak time, and that it possesses the correct asymptotic behaviour, which is not reflected by watching the relative error. Furthermore, the present approximant has a smaller error than the so-called version I approximant [36] .

royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 211379

where we used equation (A 22) and the earlier introduced abbreviation X = c 3 (τ m − τ). figure 6a , we show the reduced time evolution of j(τ) for various k. The exact solution (solid black) is visually matching the approximant (2.22) (solid red). For comparison, we include (dashed red) the j(τ) calculated via the identity (2.2). The relative error of the approximant, defined by the difference between approximant and exact solution, divided by j max , is shown in figure 6b , for the same k's, and again for our approximant (solid) and for the j(τ) calculated from the approximant J(τ) given by equation (2.19) , and inserted into equation (2.2) . The relative error of the approximant is below 2% for the relevant regime of k ≥ 8, and drops to about 5% for k = 0.6. It is important to realize from figure 6b that the error is only large during a small interval in time, but afterwards vanishes again, so that the deviations are negligible for any practical purposes. Similarly, deviations between exact and analytic approximant for J(τ) vanish exactly at τ = τ m and for both small τ ≪ τ m , and large times τ ≫ τ m . This means for the differential rate j(τ) that deviations are not accumulating, but compensating in time. Since the measured data is also fluctuating by amounts that easily exceed 10% between subsequent days, the precision can be considered excellent. For the readers' convenience, we include a similar comparison for the so-called version II approximant we derived in our previous work ( figure 7) . The present approximant has the neat features that it exactly reaches the exact j max at peak time τ m , and that it possesses the correct asymptotic behaviour, which is not fully reflected by watching the relative error. Furthermore, the present approximant has a smaller error than the so-called version I approximant [36] , and a simpler analytic form than the so-called version II approximant [36] .

A Gaussian model for the time development of the SARS-CoV-2 corona pandemic disease. Predictions for Germany made on March 30

Covid-19 predictions using a Gauss model

Propagation analysis and prediction of the Covid-19

Mathematical prediction of the time evolution of the Covid-19 pandemic in Italy by a Gauss error function and Monte Carlo simulations

2020 Dark numbers and herd immunity of the first Covid-19 wave and future social interventions

Deterministic and stochastic epidemics in closed populations

A contribution to the mathematical theory of epidemics

Modeling infectious diseases in humans and animals

COVID-19 and SARS-CoV-2. Modeling the present, looking at the future

Epidemics forecast from SIR-modeling, verification and calculated effects of lockdown and lifting of interventions. Front. Phys. 8, 593421

An efficient numerical method for fractional SIR epidemic model of infectious disease by using Bernstein wavelets

Estimation of COVID-19 dynamics 'on a back-of-envelope': Does the simplest SIR model provide quantitative parameters and predictions?

Dynamics of COVID-19 pandemic at constant and time-dependent contact rates

The susceptibleinfected-recovered model on a Euclidean network

Susceptibleinfected-recovered and susceptible-exposedinfected models

Global dynamics of a delayed two-patch discrete SIR disease model

Probability analysis of a perturbed epidemic system with relapse and cure

Using the contact network model and Metropolis-Hastings sampling to reconstruct the COVID-19 spread on the

Novel exponents control the quasi-deterministic limit of the extinction transition

2020 Stability analysis and approximate solution of SIR epidemic model with Crowley-Martin type functional response and Holling type-II treatment rate by using homotopy analysis method

Studying the progress of COVID-19 outbreak in India using SIRD model

2020 Inversion of a SIR-based model: a critical analysis about the application to COVID-19 epidemic

2020 Spreading of infections on random graphs: a percolation-type model for COVID-19

Automata network SIR models for the spread of infectious-diseases in populations of moving individuals

Numerical modeling and theoretical analysis of a nonlinear advection-reaction epidemic system

An infection agespace structured SIR epidemic model with royalsocietypublishing.org/journal

2020 Fractional order SIR model with generalized incidence rate

An agent-based computational framework for simulation of global pandemic and social response on planet X

Numerical solution and parameter estimation for uncertain SIR model with application to COVID-19. Fuzzy Optim

Mathematical models and deep learning for predicting the number of individuals reported to be infected with SARS-CoV-2

An approximate solution of the standard deterministic epidemic model

Accurate closedform solution of the SIR epidemic model

Exact analytical solutions of the susceptible-infectedrecovered (SIR) epidemic model and of the SIR model with equal death and birth rates

Analytical solution of the SIR-model for the temporal evolution of epidemics. Part A: timeindependent reproduction factor

Analytical solution of the SIR-model for the temporal evolution of epidemics. Part B: semi-time case

Analytical modeling of the temporal evolution of epidemics outbreaks accounting for vaccinations

2020 Gaussian doubling times and reproduction factors of the COVID-19 pandemic disease. Front. Phys. 8, 276

2017 Data, disease and diplomacy: GISAID's innovative contribution to global health

2020 Patterns of the COVID-19 pandemic spread around the world: exponential versus power laws

Reconciling early-outbreak estimates of the basic reproductive number and its uncertainty: framework and applications to the novel coronavirus (SARS-CoV-2) outbreak

Identifying multiple influential spreaders based on maximum connected component decomposition method

2021 A kernel-modulated SIR model for Covid-19 contagious spread from county to continent

On a class of nonlocal SIR models

Multiple outbreaks in epidemic spreading with local vaccination and limited vaccines

Outbreaks in susceptible-infected-removed epidemics with multiple seeds

Effects of multiple spreaders in community networks

Effects of the distance among multiple spreaders on the spreading

Outbreak size distributions in epidemics with multiple stages

Towards a characterization of behaviordisease models

Existence of multiple periodic solutions for an SIR model with seasonality. Nonlinear Anal.-Theory Methods Appl

Multiple epidemic waves in delayed susceptible-infected-recovered models on complex networks

Spontaneous behavioural changes in response to epidemics

The onset of oscillatory dynamics in models of multiple disease strains

The effect of crossimmunity and seasonal forcing in a multi-strain epidemic model

2021 Modeling the impact of SARS-CoV-2 variants and vaccines on the spread of COVID-19

An interactive web-based dashboard to track COVID-19 in real time

Authors' contributions. Both authors contributed equally to this work. Competing interests. We declare we have no competing interests. Funding. No funding has been received for this article. Acknowledgements. M.K. thanks the Swiss National Supercomputing Centre for providing computational resources (CSCS projects go11 and s987).

Equation (2.18) can be written asin terms of the three integrals τ m , I c and I d ,where in the last step we substituted x = 1/w. The integral (A4) then becomes for d 1 ≠ 0As the coefficient c 2 < 0 is negative the expressionroyalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 21137913 is always real-valued and positive. Consequently, the two integrals (A 2) and (A 3) are given byandwhere in both calculations we made use of equation (A 6) in the last step. To make the expression (A8) more readable, we used the abbreviations β 1 = J 0 + J − 2η and β 2 = (J 0 − η)(J − η). As an aside, we note that for J = η the last formula (A 8) correctly reduces to equation (A 7).A.1. Early reduced times τ ≤ τ m Collecting terms in equation (A 1), we obtain for early reduced times τ ≤ τ m with the abbreviation X = c 3 (τ m − τ) as well asandthe linear relationInserting equation (2.13) for c 2 , we findFrom equations (A 10) and (A 7), we identify yieldingFrom equation (2.14), we findin agreement with the second case in equation (2.19 ). The late time solution correctly provides J 0 for τ = τ m and J ∞ in the limit τ → ∞.Taking the derivative of equation (A 36) with respect to τ, the corresponding rate becomesconfirming the second case in equation (2.22) .

As data are usually available in real time t, we here write down the cumulative and differential rates of newly infected persons, which follow from their dimensionless counterparts equations (2.19) and (2.22) , upon using the transformation (2.31). The cumulative number of infected persons at time t isðB 1Þ and the differential rate _ JðtÞ ¼ dJðtÞ=dt reads 

Here, we evaluate the quality of the analytic approximant for the solution j(τ) of the semi-time SIR model. These calculations can be done for all remaining SIR quantities S(τ), I(τ), R(τ) as well as J(τ), but the results royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 211379