key: cord-0992175-hfd4eqkv
authors: Brennan, P. V.; Brennan, L. P.
title: Susceptibility-adjusted herd immunity threshold model and potential R0 distribution fitting the observed Covid-19 data in Stockholm
date: 2020-05-22
journal: nan
DOI: 10.1101/2020.05.19.20104596
sha: 2887b2e0d5356c6cb8ad29f029ee12f7e10890c6
doc_id: 992175
cord_uid: hfd4eqkv

The reproduction number, R0, is commonly used, and sometimes misused, in conjunction with the classic Kermack and McKendrick theory based on the assumption of homogeneity, in order to estimate herd immunity threshold (HIT). This provides a crude first estimate of HIT, with more elaborate modelling required to arrive at a more realistic value. Early estimates of HIT for Covid-19 were based on this simplistic homogeneous approach, yielding high HIT values that have since been revised downwards with more sophisticated network modelling taking account of R0 heterogeneity and with reference to the low HIT found from serological sampling in Stockholm County. The aim of this paper is to describe a simple model in which host susceptibility is directly linked to the heterogeneous R0 distribution, to shed further light on the mechanisms involved and to arrive at a bimodal R0 distribution consistent with the Covid-19 HIT observed in Stockholm County.

The herd immunity threshold (HIT) is a widely-used concept to estimate the infection rate within a population at which infection ceases to grow exponentially [1, 2]. A basic expression is frequently used for this estimate based on the assumption of homogeneity: each member of the population is assumed to have the same reproduction number, R0. However, this assumption is simplistic and unrealistic in a great many cases and it is often observed that the actual HIT is substantially lower than given by this elementary approach [3].

This work describes a simple model to take account of heterogeneity of R0 in the estimate of HIT, by assuming that host susceptibility is directly proportional to reproduction number. A variety of R0 distributions are explored to examine the effect, followed by working backwards from real data to arrive at a bimodal distribution that yields a HIT close to that observed with Covid-19 in Stockholm County.

The HIT, the proportion of the population that must be immune for the effective reproduction number to fall to unity, is easily deduced and given by:

$$\mathrm{HIT} = 1 - \frac{1}{R_0} \qquad (1)$$

This is based on a very simple, naïve model of a homogeneous population in which each infected individual infects, on average, R0 other individuals, all of whom are susceptible hosts at the outset. It is assumed that the entire population has the same R0 value, i.e. R0 is constant with p, the cumulative infection variable, and the same susceptibility to infection.
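As a quick numerical check of the homogeneous estimate HIT = 1 - 1/R0, the short sketch below evaluates it for R0 = 2.4, the value used later in this work. The function name `homogeneous_hit` is illustrative only and does not come from the paper.

```python
def homogeneous_hit(r0: float) -> float:
    """Classic homogeneous herd immunity threshold, 1 - 1/R0.

    Returns 0.0 when R0 <= 1, since infection cannot grow in that case.
    """
    if r0 <= 1.0:
        return 0.0
    return 1.0 - 1.0 / r0


# R0 = 2.4 is the mean value used for all cases in this work.
print(f"Homogeneous HIT for R0 = 2.4: {homogeneous_hit(2.4):.1%}")
```

For R0 = 2.4 this evaluates to 58.3%, the homogeneous figure quoted throughout.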

In reality, R0 must vary, since some people are more likely than others to transmit infection due to occupation, environment, lifestyle and other factors. For instance, an infected nurse may be many times as likely to infect others as a single person working from home. Hence there is actually a statistical distribution of R0 across the population [4, 5]. If R0 is variable (heterogeneous) but host susceptibility is assumed to remain constant then it is valid to use the mean value of R0 in the population, $\bar{R}_0$, to calculate the herd immunity threshold,

$$\mathrm{HIT} = 1 - \frac{1}{\bar{R}_0} \qquad (2)$$

and it is common practice to do this by default [6]. In practice, however, host susceptibility is also variable, and in general those with a higher R0 value are likely to exhibit greater susceptibility to infection for the very same reasons that they are more likely to transmit infection to others, for instance by working in professions such as medical practice, social care or hospitality where they are frequently in contact with infected people. A reasonable premise and first approximation, used in this work, is to assume that host susceptibility is in direct proportion to the associated R0 value, i.e. the probability of becoming infected is proportional to the probability of infecting others. Based on this principle, the R0 distribution after a small proportion δp of the population have become infected may be obtained by adjustment of the p-values. This is illustrated in Figure 1 with reference to a population with two R0 values, R01 and R02 (1 and 3 for illustration), in initial proportion p01 : p02. Infection occurs in proportion to the respective R0 values, giving a change in distribution as follows:
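The update equation itself is not present in this copy of the text. A reconstruction consistent with the description above, assuming that the δp new infections are shared between the two cohorts in proportion to the product of each cohort's remaining proportion and its R0 value, would be as follows. The symbols p_11 and p_12 for the updated proportions are notation introduced here, not recovered from the source.

```latex
p_{11} = p_{01} - \delta p \, \frac{p_{01} R_{01}}{p_{01} R_{01} + p_{02} R_{02}},
\qquad
p_{12} = p_{02} - \delta p \, \frac{p_{02} R_{02}}{p_{01} R_{01} + p_{02} R_{02}}
```

With R01 = 1, R02 = 3 and equal initial proportions, this form assigns one quarter of the new infections to the first cohort and three quarters to the second.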

It is clear from Figure 1 that cohorts with higher R0 values have a proportionately higher share of infection, biasing the distribution towards lower R0 values as the infection rate increases. It follows from Equation (3) that the R-value after δp infections is given, in the general case, by the relation,
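The relation is missing from this copy. One form consistent with the surrounding text, offered as a sketch and not as the authors' exact expression, weights each cohort's R0 by its remaining susceptible proportion. The index m over cohorts and the symbol R_1 are assumptions made for illustration.

```latex
R_1 = \sum_m p_{1m} R_{0m},
\qquad
p_{1m} = p_{0m} - \delta p \, \frac{p_{0m} R_{0m}}{\sum_k p_{0k} R_{0k}}
```

For a single cohort this reduces to R = (1 - δp) R0, which reaches unity at δp = 1 - 1/R0, in agreement with the homogeneous threshold.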

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license. This version posted May 22, 2020. https://doi.org/10.1101/2020.05.19.20104596

This process is repeated in a recursive fashion, each time updating the probability distribution, p_nm, according to Equation (3). The herd immunity threshold is reached when the reproduction number falls to 1. A non-recursive approximation to the last result is obtained by regarding δp as the total change in population, Δp, giving for a continuous distribution,

and, by equating this to 1, a herd immunity threshold of:
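Neither expression is present in this copy. Applying a single update step of size Δp to the initial density p_0(R_0) suggests the reconstruction below. It is a sketch under that assumption, and the overbar notation for the first and second moments of R0 is introduced here.

```latex
R(\Delta p) \approx \int p_0(R_0)\, R_0 \, dR_0
  - \Delta p \, \frac{\int p_0(R_0)\, R_0^2 \, dR_0}{\int p_0(R_0)\, R_0 \, dR_0}
  = \bar{R}_0 - \Delta p \, \frac{\overline{R_0^2}}{\bar{R}_0},
\qquad
\mathrm{HIT} \approx \frac{\bar{R}_0 \left( \bar{R}_0 - 1 \right)}{\overline{R_0^2}}
```

When R0 is constant, the second moment equals the square of the mean and the threshold becomes 1 - 1/R0.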

For the homogeneous case of a constant reproduction number across the population, the last result converges to that of Equation (1), as would be expected.
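The homogeneous limit can be checked numerically. The sketch below assumes the non-recursive threshold takes the form mean(R0) × (mean(R0) - 1) / mean(R0²), which is a reconstruction and not a formula recovered from the source. The function name `approx_hit` and the discrete cohort representation are likewise illustrative.

```python
def approx_hit(r_values, weights):
    """Non-recursive HIT approximation from the first two moments of R0.

    ASSUMPTION: HIT ~ m1 * (m1 - 1) / m2, where m1 and m2 are the weighted
    mean of R0 and of R0 squared. This form is a reconstruction.
    """
    total = sum(weights)
    m1 = sum(w * r for w, r in zip(weights, r_values)) / total
    m2 = sum(w * r * r for w, r in zip(weights, r_values)) / total
    if m1 <= 1.0:
        return 0.0  # infection cannot grow, so no threshold is needed
    return m1 * (m1 - 1.0) / m2


# Homogeneous population: a single cohort with R0 = 2.4.
print(f"Homogeneous: {approx_hit([2.4], [1.0]):.1%}")

# Two equal cohorts with R0 = 1 and R0 = 3, as in the illustration above.
print(f"Two cohorts: {approx_hit([1.0, 3.0], [0.5, 0.5]):.1%}")
```

The single-cohort case returns 1 - 1/R0 exactly, while the two-cohort case (mean R0 of 2) returns 40% against a homogeneous value of 50%, illustrating the direction of the effect.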

The recursive method described in Section 2 (Equations (3) and (4)) is used to explore the HITs obtained with a range of candidate R0 distributions in relation to the HIT obtained with a homogeneous population. A mean R0 of 2.4 is taken for all cases. Continuous distributions are used, with probability density functions presented at the outset and calculated at the threshold of herd immunity, the area beneath the curve being unity at the outset and equal to the HIT at herd immunity.

Figure 4 shows a population with uniformly-distributed R0 over the range 0 to 4.8. At herd immunity, the distribution is biased heavily towards the lower R0 values with a steadily declining trend with R0. There is a significant reduction in HIT from 58.3% to 46.1%, with a corresponding equivalent R0 value (the value that would produce the same HIT from the homogeneous model) of 1.85.
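The recursive procedure can be sketched in a few lines for the uniform case. The code below assumes that each step of size `dp` removes susceptibles from every cohort in proportion to the product of its remaining share and its R0 value, and that the effective R is the sum of those products; both are reconstructions of the missing equations, and the bin count and step size are arbitrary choices made for illustration.

```python
def recursive_hit(r_values, proportions, dp=1e-4):
    """Step the susceptible distribution forward until effective R reaches 1.

    ASSUMPTIONS (not recovered from the source): effective R is
    sum(p_m * R0_m) over remaining susceptibles, and each increment dp of
    new infections is shared between cohorts in proportion to p_m * R0_m.
    """
    total = sum(proportions)
    p = [x / total for x in proportions]  # remaining susceptible share
    infected = 0.0
    while infected < 1.0:
        r_eff = sum(pi * ri for pi, ri in zip(p, r_values))
        if r_eff <= 1.0:
            break
        p = [pi - dp * pi * ri / r_eff for pi, ri in zip(p, r_values)]
        infected += dp
    return infected


# Uniform R0 on [0, 4.8], discretised into equal-weight bins at midpoints.
n_bins = 200
width = 4.8 / n_bins
r_grid = [(i + 0.5) * width for i in range(n_bins)]
hit_uniform = recursive_hit(r_grid, [1.0] * n_bins)

# Homogeneous reference with the same mean R0 of 2.4.
hit_homogeneous = recursive_hit([2.4], [1.0])

print(f"Homogeneous: {hit_homogeneous:.1%}, uniform: {hit_uniform:.1%}")
```

Under these assumptions the homogeneous case recovers 58.3% and the uniform case lands at about 46%, close to the figure quoted above, which lends some support to the reconstructed update rule.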

Figure 5 shows the case of a Rayleigh-distributed population with a mean value of 2.4. Infection occurs more rapidly among those with higher R0 values, skewing the distribution towards lower values. This results in a modest reduction in HIT from 58.3% to 49.2%, with a corresponding equivalent R0 value (the value that would produce the same HIT from the homogeneous model) of 1.97. Note that in this case, as in all cases, the total area under the red and green curves is 1 and the area under the green curve is the HIT. It is clear that the HIT value can only be reduced by this mechanism, regardless of the R0 distribution, relative to the homogeneous model.

The final case is an attempt to arrive at a distribution close to the observed 17% HIT from random serological testing of Covid-19 in Stockholm County [7]. This requires a quite severe bimodal or multi-modal distribution. An example of such a bimodal distribution fitting the Stockholm data is shown in Figure 6, comprising a Rayleigh region with a mean close to 1 and a Gaussian region with a mean of 12, in the ratio 85:15, not far from the 80:20 Pareto rule often cited in the context of super-spreaders [8]. In practice, this may represent the situation where the bulk of the population has a relatively low R0, whilst a minority cohort in professions or situations with much greater exposure to infection, such as medical or public-facing occupations, has a much higher R0: the so-called 'super-spreaders' [9].

From Figure 6, the R-value with the postulated bimodal distribution is seen to decline very rapidly with infection rate, reaching herd immunity at just 17.6% prevalence. At the onset of herd immunity, most of the high-R0 cohort are infected whilst only a small minority of the low-R0 cohort are infected. The R0 value found in Stockholm County and used here was 2.4. Under the elementary Kermack and McKendrick approach based on a homogeneous population this corresponds to a HIT of 58.3%, which was indeed the estimate made by Ferguson [10] and is some 3.4 times the observed value.

Figure 6. Homogeneous and susceptibility-adjusted herd immunity thresholds, bimodally distributed R0.
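The behaviour described for the bimodal case can be illustrated with a deliberately crude two-cohort stand-in. The sketch below does not use the Rayleigh and Gaussian components of the paper, whose widths are not given in the text. Instead it assumes a point cohort of 15% at R0 = 12 and a point cohort of 85% whose R0 is chosen so that the overall mean is 2.4. The update rule is the same reconstruction used for the recursive method, and all names are hypothetical.

```python
def recursive_cohorts(r_values, proportions, dp=1e-4):
    """Run the recursive update and return (HIT, remaining susceptibles).

    ASSUMPTIONS: effective R is sum(p_m * R0_m), and new infections are
    shared between cohorts in proportion to p_m * R0_m (reconstructed rule).
    """
    p = list(proportions)
    infected = 0.0
    while infected < 1.0:
        r_eff = sum(pi * ri for pi, ri in zip(p, r_values))
        if r_eff <= 1.0:
            break
        p = [pi - dp * pi * ri / r_eff for pi, ri in zip(p, r_values)]
        infected += dp
    return infected, p


# Hypothetical two-point stand-in for the 85:15 bimodal distribution.
mean_r0 = 2.4
high_share, high_r0 = 0.15, 12.0
low_share = 1.0 - high_share
low_r0 = (mean_r0 - high_share * high_r0) / low_share  # about 0.71

hit, remaining = recursive_cohorts([low_r0, high_r0], [low_share, high_share])
low_infected = 1.0 - remaining[0] / low_share
high_infected = 1.0 - remaining[1] / high_share

print(f"HIT: {hit:.1%}")
print(f"Infected in low-R0 cohort:  {low_infected:.1%}")
print(f"Infected in high-R0 cohort: {high_infected:.1%}")
```

With these assumed values the threshold comes out close to 18%, with most of the high-R0 cohort and only a small fraction of the low-R0 cohort infected. This matches the qualitative picture above, although the number should not be read as a reproduction of the paper's fitted result.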

The recursive model described in Section 3 is extended to calculate the total infection rate versus time/generation for the bimodal case representing Stockholm County, with the result as shown in Figure 7 . The final infection rate is 32.7% in the absence of intervention or 28.6%, 24.4% and 17.9% with intervention equivalent to arresting the cases at infection rates of 5%, 10% and 15%, respectively. This suggests that intervention to temporarily slow or halt the infection rate just below the HIT is effective at limiting the final infection rate to just over the HIT. This would require careful timing and monitoring of infection rate across the population, which may not be possible in practice.


An enhanced yet simple model for herd immunity threshold has been described based on a non-uniform reproduction number distribution and the assumption that host susceptibility is directly proportional to reproduction number. Results for a number of R0 distributions show that the herd immunity threshold is reduced relative to the homogeneous model and may be substantially reduced, for instance by a factor of 3.4 from the Stockholm County data used here, if there is a large variation in R0 across the population. The herd immunity threshold is strongly influenced not just by the mean R0 value but by its distribution, and simply using the mean R0 value for a given population, as done by Ferguson [10] for Covid-19, is likely to lead to an unrealistic overestimate.

An attempt has been made to postulate a distribution that yields the HIT of 17% observed from serological sampling of Covid-19 in Stockholm County. This requires a quite severe bimodal distribution with a cohort of super-spreaders having much higher R0 and associated susceptibility than the bulk of the population. Whether this situation is realistic cannot easily be determined, but it serves to illustrate the principle and offer insight into the likely R0 distribution responsible for the observed result. In addition to demographic R0 heterogeneity, spatial and other heterogeneities are likely to contribute to modulation of the HIT, as described by network models [5], so it is likely that the simple mechanism modelled here is a partial explanation for the observed low HIT seen for Covid-19 in Stockholm County.

The significance for infections such as Covid-19 is that in order to accurately estimate HIT it is crucial to take account not just of the basic R0 value averaged over a given population but also of its distribution, which results in a reduced and possibly substantially-reduced HIT estimate.


Herd immunity: recent uses in vaccine assessment

Herd immunity: a rough guide

The disease-induced herd immunity level for COVID-19 is substantially lower than the classical herd immunity level

Individual variation in susceptibility or exposure to SARS-CoV-2 lowers the herd immunity threshold

Beyond R0: Heterogeneity in secondary infections and probabilistic epidemic forecasting

Herd immunity - estimating the level required to halt the COVID-19 epidemics in affected countries

Super-spreaders in infectious diseases

Superspreading and the effect of individual variation on disease emergence

Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand