key: cord-0828157-6l7igbmx
authors: Lee, Fook Hou
title: A Heuristic Model for Spreading of COVID 19 in Singapore
date: 2020-04-18
journal: nan
DOI: 10.1101/2020.04.15.20067264
sha: c1d2ca7c8c0ec25c535ae1273aca01a05aed62a0
doc_id: 828157
cord_uid: 6l7igbmx

This paper presents a simple heuristic model for COVID 19 spreading. The model is based on a propagation unit of time. The state of the epidemic at the end of the time unit is then related to that at the start through recurrence relationships. By propagating these relationships over the required number of time units, a projection can be made over time. The model is readily implemented on a spreadsheet and is therefore potentially widely accessible. It can serve as a useful tool for scenario planning and forecasting not just for an entire population, but also for a specific community within a population.

This paper presents a heuristic epidemic spreading model, which can be readily implemented, modified, up-scaled and generalised personal computer spreadsheet. The model based on two parameters, namely detection efficiency and daily infectivity. The model formulation is first discussed. Examples are then presented to verify and calibrate the parameters, with respect to COVID 19. Some findings from the simulation, relevant to COVID 19, particularly for the Singapore context, are highlighted. Finally, an extension of the model is also made to include the effects of herd immunity and asymptomatic period.

The proposed model is akin to the SIR model, in that both are process-based. However, this model is formulated in discrete format, wherein the starting point is a propagation unit, taken herein to be of a duration of 1 day. Fig. 1 illustrates the propagation process of this model. At the start of day 0, an assumed number of active infected cases is seeded into the population. During the course of the day, detection processes will identify and remove a certain fraction, designated by k, from the infected cases.

The remaining undetected cases will then go on to re-infected other cases, based on a daily infectivity ratio r. The typical sequence of operations for three days is shown in Fig. 1 , wherein the variables are defined as follows:

Ni, Ni+1 and Ni+2 = the active number of infected cases still in the population at the start of the three respective days i.e. i, i+1 and i+2. This does not include those cases which have been previously detected and removed from the infected reservoir. Hence, it is not the same as the total number of infected cases. k = the detection efficiency. k = 1 means 100% detection, i.e. every infected case is detected. r = the average daily infectivity of each undetected infected case, e.g. r = 2 means every undetected case went on to infect another 2 persons within a day.

As the arrows highlighting the sequence of operations in Fig. 1 show, the number detected during the course of each day is subtracted from the number at the start of the day. The undetected number is then allowed to re-infect new cases which, together with the undetected cases, goes on to form the active cases at the start of the following day. For simple scenarios, mathematical relationships for this recurrence model can be derived, as shown in the sections below.

The total number of case that is usually reported M is actually the cumulative number of cases 

If k and r are constants, then Eq. 1 is a geometric series, which, when summed to the j th term gives

We note that Mj increases monotonically with j, i.e. the number of days.

Mj does not relate directly to the probability of encountering an infected person at any point in the population. This probability is represented more by the active number of undetected infected persons on a given day, say the j th day, Pj, which is given by

The ratio

and decrease as j increases. Hence, the total number of reported cases is not representative of the actual probability of encountering an infected person in the population.

The number of active cases at the start of the j th day is Nj. For this number not to increase,

Or 1

If r = 2, then k ≥ 0.67 (67% efficiency).

If r = 3, then k ≥ 0.75 (75% efficiency).

If r = 4, then k ≥ 0.80 (80% efficiency).

Hence, to decrease the active number of infection day-by-day, the detection efficiency has to be sufficiently high in relation the daily infectivity. As discussed below, the latter is a function of the disease, infectious period, community and social setting. For instance, a disease in a setting where people are isolated may be a low r-value. The same disease in a setting where people are highly socialised would have a higher r-value.

It should be noted that the N-value is not the number of new cases detected each day. That number Qj is given by Qj = kNj. However, the criterion for stabilization is similar, if the number detected each day is to remain steady or decrease, then

which also requires Eq. 6 to be satisfied. It should be noted that if the incremental number of reported cases is steady day-upon-day, then

Hence, Eq. 9 is the condition for the curve of cumulative reported number of case to reach the point of inflexion. At this point inflexion, the rate of detection is just sufficient to offset the rate of new infection.

At this point, the battle is not won, but it is also not lost. It is akin to a stalemate situation.

The undetected cases in each day Pj is given by   

The number of undetected cases Pj affects the probability of encountering an infected person in the population. If k can be estimated, then Pj can be readily estimated from Qj.

To illustrate the significance of this, if the number of detected cases on a certain day Qj = 120 (this being the peak number in Singapore, on 5 April 2020) and the detection efficiency k = 0.5, then number of undetected cases Pj = 120. If the daily infectivity r = 2, then at the start of the following day, we would expect the number of active cases to be ~360 and an upper-bound estimate of the probability of encountering an infected person in Singapore ~ (https://www.worldometers.info/coronavirus/#countries), or 1 in 4200, which is ~3.6 times higher.

Hence, the cumulative number of detected cases is not directly related to the chances of encountering an infected person. On the other hand, if k = 0.1, then Pj = 9Qj and every detected cases implies 9 undetected cases. In other words, the reservoir of undetected active cases will far exceed the number of detected cases and the latter ceases to be meaningful.

The mathematical results presented above only applies to very simple scenarios. However, the recurrence relationships can be readily upscaled and generalised to more complex scenarios using a spreadsheet or other programming environments. The examples below were calculated by incorporating these recurrence relationships into a Excel spreadsheet.

. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint . https://doi.org/10. 1101 6.1 Examples of application

We now consider the simplest case where the daily detection efficiency k and infectivity r are both constant. Figs. 2a and b shows the effect of different values of k for r = 2. As can be seen, the trend is highly sensitive to the detection efficiency and daily infectivity. A slight drop in k or a slight increase in r is enough to cause a drastic increase in the numbers of cumulative and incremental detected cases.

The case of k=0.6 in Fig. 2a also shows what happens in a yet-to-stabilised epidemic situation. As can be seen, the upper part of this curve is almost linear, indicating a log-linear relationship between Mj and number of days N, of the form

In which N0 = exp(β) is the initial number of infected persons (at 0 th day). The parameter α governs the rate of exponential increase and is dependent upon k and r. The effect of α can be assessed by considering the value of Mj at intervals of a specified number of days ΔN. For instance, on the j th day,

If SΔN is chosen to be 2 and ΔN2 is the number of days for a doubling of the detected cases, then . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint . https://doi.org/10.1101/2020.04.15.20067264 doi: medRxiv preprint 1 National University of Singapore (non-medical researcher)

decreases monotonically in all cases. In scenarios which have yet to stabilise e.g. Fig. 4c for k = 0.6, j j P M stagnates at a relatively high value, but which is, nonetheless, still significantly less than 1, indicating that the cumulative detected cases may not serve as a good representative of the number of active cases.

We now consider a case where the detection efficiency starts at a lower level but was ramped up at different points and with different rates, using the following strategies:

(a) k = 0.6 up to the 10 th day and then ramped up to 0.9 on the 16 th day.

(b) k = 0.6 up to the 15 th day and then ramped up to 0.9 on the 30 th day.

(c) k = 0.5 up to the 15 th day and then ramped up to 0.9 on the 35 th day.

As Fig. 5a shows, the initial exponential followed by inflexion and then stabilisation is now clearly reflected. Furthermore, as Fig. 5b shows, the peaking of the incremental number of detected cases is also reflected in all three cases but is more pronounced in Strategy 3 owing to the later and slower ramp-up. Finally, the effect of increasing the k-value is also reflected in a decrease in The daily infectivity r is similar, but not identical to the basic reproduction number R0 widely used in epidemiology (https://www.healthknowledge.org.uk/public-health-textbook/research-methods/1aepidemiology/epidemic-theory). The basic reproduction number R0 is defined as "the average number of secondary infections produced by a typical case of an infection in a population where everyone is susceptible". This ignores the effect of herd immunity. To account for herd immunity, an effective reproductive number R is also defined, such that

Where x is the fraction of the population that is still susceptible. Since R0 is the reproduction number over the infectious period of an infected individual and r is a daily rate

where n = the number of days over which infection can occur. If herd immunity is insignificant, then

It is well-established that R0 is not just dependent upon the infecting agent or the disease, it is also dependent upon a host of other factors including . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint . https://doi.org/10.1101/2020.04.15.20067264 doi: medRxiv preprint 1 National University of Singapore (non-medical researcher)  The rate of contacts in the community. This depends upon the social setting, proximity between people and whether they are using protective equipment and practising good personal hygiene etc. For instance, Zhang et al. (2020) , using early stage data from the Diamond Princess cruise ship, estimated the R0 to range from 2.06-2.52, with a median at 2.28. also estimated a similar range for R0, from 2.2 to 2.7. Leung et al. (2020) also estimated an R0 = 2.68. On the other hand, Sanche et al. (2020) estimated a much higher R0 ranging from 4.7 to 6.6. These differences may be attributed to different social settings and data accuracy. There may also be issues relating to undetected cases which may impact the estimated R0. Niehus et al. (2020) , for instance, postulated that "…the global ability to detect imported cases is 38% (95% HPDI 22% -64%) of Singapore's capacity".

 The infectious period. According to WHO (2020) Asymptomatic cases have been reported as part of contact tracing efforts in some countries".

. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint . https://doi.org/10.1101/2020.04.15.20067264 doi: medRxiv preprint  The probability of infection being transmitted during contact, which is largely a characteristics of the disease.

Based on this, one can surmise that the infected person is likely to highly contagious within the first 3 days of onset of symptoms. Some transmission may occur during the asymptomatic period but does not appear to be as significant. It is also likely that after the first 3 days, he would have sought medical attention and would likely have been detected. Hence, it is postulated herein that an average estimate of the infectious period n ~ 4 to 5 days. Together with the upper bound value of R0 of 6.6, this would imply r ~ 1.3 -~1.7. However, this is an average figure; for special social setting, r may be much higher.

Less is known about the detection efficiency k. Niehus et al.'s (2020) postulation only relates to the relative (rather than absolute) detection capacity and they noted that Singapore's detection capacity is likely to be less than 100% (or 1.0) and some other countries may be much lower. One way to backestimate k is through the geometric rate of increase. If we assume a median value of r ~1.5, based on the discussion above, then k can be estimated by back-fitting the α-value, Table 1 . As Table 1 shows, Singapore's back-fitted detection efficiency k is significantly higher than that of China (in the early stage) and the US. This is consistent with Niehus et al.'s (2020) postulated that "…the global ability to detect imported cases is 38% (95% HPDI 22% -64%) of Singapore's capacity". The current backfitted results indicate a somewhat higher percentage (53%) instead of 38%. Nonetheless, the trend is generally correct and it explains the slower rise in infected cases in Singapore. Moreover, the detection rate of 35.5% for China is also reasonably consistent with Wang et al.'s (2020) finding that "at least 59% of infected cases were unascertained in Wuhan". The latter would imply a detection efficiency of 41% or less. Fig. 6 shows the cumulative number of cases from dormitory clusters, non-dormitory clusters and unlinked cases from 30 March 2020 till 9 April 2020. As can be seen, both the non-dormitory clusters and unlinked cases plot to almost the same gradient, implying almost equal α-value, which can be fitted using k = 0.54. This suggests a possible decrease in the detection efficiency from the initial high value of 0.67. The alternative explanation is an increase in r-value, but this seems rather unlikely since the social setting, virological characteristics and infectious period have not changed. However, there may be some over-estimation of the unlinked cases as some of these might have been subsequently linked to existing clusters or cases. The dormitory clusters show a much higher α-value. Since the kvalue is unlikely to be different for this group, compared to the non-dormitory clusters and unlinked cases, the only alternative explanation is a higher value of r (=2.2).

. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint . https: //doi.org/10.1101 //doi.org/10. /2020 Fig. 6. Cumulative cases from dormitory clusters, non-dormitory clusters and unlinked cases starting from 30 March 2020.

Herd immunity can be readily incorporated through Eqs. 18 and 19 together with

in which Tj is the total number infected up to the point of time, and G is the size of the population. Fig.   7 shows the effect of herd immunity on a population size of 5.7 million, with an initial r of 1.5, and different values of k. As can be seen, the stabilization time and number are both affected by the detection efficiency k. With a low detection efficiency, stabilization occurs faster as the epidemic works through the population, but stabilization occurs at a higher total number of infected cases.

The model can be extended to asymptomatic spread, although this requires some re-organization of the spreadsheet, since the detected and undetected cases arising from each day of the asymptomatic period need to be tracked. There is also a possibility that the k and r-values may differ between asymptomatic and symptomatic periods. To cater for this, separate parameters are used for asymptomatic and symptomatic period; these being designated as ka and ks, and ra and rs, where the subscripts a and s denote asymptomatic and symptomatic parameter. Fig. 8 shows the effect of asymptomatic period on the overall spreading for the values of k and r as shown in Table 2 . The main effect of a low asymptomatic detection efficiency is a slower pick-up in the initial number of detected cases; but this is more than offset by a faster pick-up as the larger number of symptomatic cases kicks in. Not much is known about the asymptomatic detection efficiency but one may surmise that this should be lower than the symptomatic value. The asymptomatic daily infectivity is also highly debated.

As more information comes to light in future, these parameters may be more reliably estimated.

An important feature of this model is its simplicity. It can be readily implemented on a spreadsheet or any other programming environment, and therefore potentially widely accessible. There are only two parameters in the models although these parameters may be not time-invariant. The detection efficiency, for instance, may changes as contact tracing and testing are ramped up or down. Similarly, the daily infectivity may also be function of the community setting. The fact that these two parameters can be related to real changes in conditions is an advantage since they allow the effect of any changes in conditions to quantified. As the body of data on the relationship between these parameters and real conditions grows, the model will become increasingly reliable and definitive.

The detection efficiency is not affected by undetected case because back-fitting is based on the rate of increase in the number of detected (or reported) cases rather than the total number of infected cases.

In fact, as illustrated in the examples above, if the daily infectivity can be independently estimated, the detection efficiency can be estimated through back-fitting the rate of increase in the number of detected cases.

It can also be readily up-scaled and adapted to different scenarios and conditions. This is especially true for the spreadsheet implementation. As such, this model may be useful as a tool for scenario planning and forecasting not just for an entire population, but also for a specific community within a population. For instance, back-fitting of the data suggests a higher daily infectivity for the dormitory clusters than for the other clusters. Gevorg Yeghikyan (2020). Modelling the coronavirus epidemic in a city with Python. Towards Data Science. https://towardsdatascience.com/modelling-the-coronavirus-epidemic-spreading-in-acity-with-python-babd14d82fa2

. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint . https: //doi.org/10.1101 //doi.org/10. /2020 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint . https://doi.org/10.1101/2020.04.15.20067264 doi: medRxiv preprint 1 National University of Singapore (non-medical researcher) CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint . k=0.6 k=0.7 k=0.8 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. Incremental number of detcted cases Day r=2 0.6 (10th day) -> 0.9 16th day 0.6 (15th day) -> 0.9 (30th day 0.5 (15th day) -> 0.9 (35th day . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not peer-reviewed) The copyright holder for this preprint . https://doi.org/10.1101/2020.04.15.20067264 doi: medRxiv preprint 1 National University of Singapore (non-medical researcher) (c) Non-domitory cluster Dormitory clusters Unlinked cases . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not peer-reviewed) The copyright holder for this preprint . https://doi.org/10.1101/2020.04.15.20067264 doi: medRxiv preprint 1 National University of Singapore (non-medical researcher) Fig. 7 . Effect of herd immunity on a population size of 5.7 million. Fig. 8 . Effect of asymptomatic period on overall spread for different cases, with parameters as shown in Table 2 . . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not peer-reviewed) The copyright holder for this preprint . https://doi.org/10. 1101 

Special report: The simulations driving the world's response to COVID-19

Transmission dynamics of the COVID-19 outbreak and effectiveness of government intervention

Forecasting COVID-19 impact on hospital bed-days, ICU-days, ventilator days and deaths by US state in the next 4 months

A Contribution to the Mathematical Theory of Epidemics

Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia

Estimating underdetection of internationally imported COVID-19 cases

Estimation of the reproductive number of novel coronavirus (COVID-19) and the probable outbreak size on the Diamond Princess cruise ship: A data-driven analysis

The Novel Coronavirus, 2019-nCoV, is Highly Contagious and More Infectious Than Initially Estimated

Evolving Epidemiology and Impact of Nonpharmaceutical Interventions on the Outbreak of Coronavirus Disease

Coronavirus disease 2019 (COVID-19) Situation Report -73