key: cord-0130382-c75irg4r authors: McLure, Angus; Glass, Kathryn title: Some simple rules for estimating reproduction numbers in the presence of reservoir exposure or imported cases date: 2018-09-04 journal: nan DOI: nan sha: 8760f206472b115c9508781190223bf55b45e404 doc_id: 130382 cord_uid: c75irg4r

The basic reproduction number ($R_0$) is a threshold parameter for disease extinction or survival in isolated populations. However, no human population is fully isolated from other human or animal populations. We use compartmental models to derive simple rules for the basic reproduction number for populations with local person-to-person transmission and exposure from some other source: either a reservoir exposure or imported cases. We introduce the idea of a reservoir-driven or importation-driven disease: diseases that would become extinct in the population of interest without reservoir exposure or imported cases (since $R_0<1$), but nevertheless may be sufficiently transmissible that many or most infections are acquired from humans in that population. We show that in the simplest case, $R_0<1$ if and only if the proportion of infections acquired from the external source exceeds the disease prevalence, and explore how population heterogeneity and the interactions of multiple strains affect this rule. We apply these rules in two case studies of Clostridium difficile infection and colonisation: C. difficile in the hospital setting accounting for imported cases, and C. difficile in the general human population accounting for exposure to animal reservoirs. We demonstrate that even the hospital-adapted, highly transmissible NAP1/RT027 strain of C. difficile had a reproduction number $<1$ in a landmark study of hospitalised patients and therefore was sustained by colonised and infected admissions to the study hospital. We argue that C. difficile should be considered reservoir-driven if as little as 13.0% of transmission can be attributed to animal reservoirs.
Many pathogens affecting humans circulate between humans and animals through contact, food or indirectly through common disease vectors in the environment. Other pathogens move across population boundaries due to the movement of people. In the absence of transmission from other populations or reservoirs, the basic reproduction number (the average number of secondary cases arising from each primary case in a susceptible population) determines whether a disease will die out or persist through ongoing person-to-person transmission. Effective interventions can interrupt transmission by reducing the basic reproduction number below 1, causing the disease to die out in that population. However, any reservoir exposure or imported cases will continue to replenish the infected population, and so a disease will die out in a population if and only if the basic reproduction number is $<1$ and all reservoir exposure and importation are avoided. There is a rich literature in metapopulation models that capture the interactions of populations that introduce or reintroduce pathogens to one another (e.g. [1-4]). However, one often only has data on, or interest in, a single population but needs to account for external sources of infections. It is in this context that we wish to generate some simple principles or rules for estimating the reproduction number. Methods have been developed to estimate the human reproduction numbers of emerging zoonoses with limited person-to-person spread [5, 6]. Others have developed methods to account for the often large proportion of imported cases at the beginning of new epidemics, which if excluded or treated as if locally acquired would overestimate the reproduction number [7]. Though the term 'elimination' has been defined in many different ways [8], reducing the local reproduction number below one is one measure of this progress [9], and is a necessary step towards global eradication.
Methods have been developed to estimate the reproduction number that account for the potentially large proportion of imported cases in settings where progress is being made towards elimination [9]. However, none of these methods account for susceptible depletion, and so they are either restricted to diseases with very low prevalence [5, 6, 9] or calculate the effective reproduction number [7], which is not a threshold parameter for disease persistence. Starting with simple models and incorporating heterogeneity or multiple strains, we have derived simple rules for estimating the reproduction number in a population where the disease is at endemic equilibrium due to a combination of local person-to-person transmission and reservoir exposure or imported cases. Many of these rules only require knowledge of disease prevalence and the proportion of infections attributable to the external source. We have applied these rules in two case studies of C. difficile infections.

We begin by adapting the simplest possible compartmental model: the standard SIS model with a homogeneous, well-mixed population without demographics. We include two sources of infection: (1) person-to-person transmission, which is proportional to the number of people infected (rate: $\beta I$), and (2) constant transmission from some reservoir that does not depend on the number of people infected (rate: $f$). Person-to-person transmission could be through direct contact, or mediated via fomites, airborne droplets, water or food, provided this transmission scales with the infectious population. For our purposes a reservoir is anywhere the pathogen persists apart from the human population, for instance a population of wild animals or livestock that carry the disease. The disease in the human population can be described by a system of ODEs for the proportion of the population that is susceptible ($S$) and infected ($I$):

$$\frac{dS}{dt} = -\lambda S + \gamma I, \qquad \frac{dI}{dt} = \lambda S - \gamma I, \tag{1}$$

where $\lambda = \beta I + f$ is the force of infection and $\gamma$ is the rate at which infected individuals recover.
Diseases that are acquired entirely from food or animals, and diseases that are spread entirely by person-to-person transmission, are extreme cases of this model with person-to-person transmission rate $\beta = 0$ and reservoir exposure rate $f = 0$ respectively. Many diseases lie between these two extremes. Almost all human cases of H7N9 avian influenza have been acquired from birds, but there has been some person-to-person transmission which is not enough to maintain endemic disease [10]. Meanwhile, human-adapted seasonal influenza strains (H1N1, H3N2) are mainly transmitted to humans by other humans, though there are low-frequency transmission events from animal reservoirs (e.g. [11]). Middle East respiratory syndrome coronavirus sits somewhere in the middle of the spectrum, with significant human-to-human and animal-to-human transmission [12].

The reproduction number for this simple model in the next-generation sense [13] is the same as for the standard SIS model ($R_0 = \beta/\gamma$, with $\gamma$ the recovery rate) but is a threshold parameter for extinction of the disease only when there is no transmission from the reservoir ($f = 0$), i.e. when the model reduces to the standard SIS model. Otherwise, the reservoir will continually replenish the infected population whatever the value of $R_0$. If there is no transmission from the reservoir we have the well-known relationship between the basic reproduction number ($R_0$) and the proportion of the population susceptible at equilibrium ($S^*$): $R_0 = 1/S^*$. The model parameters are difficult to measure directly and so we wish to estimate $R_0$ through observable quantities by generalising this rule.

Let $S^*$, $I^*$ and $\lambda^*$ be the non-trivial (i.e. $S^*, I^* \neq 0$) equilibrium values of $S$, $I$ and the force of infection $\lambda = \beta I + f$. As equilibrium points of (1) they satisfy $\lambda^* S^* = \gamma I^*$, or equivalently $\gamma = \lambda^* S^*/I^*$. Now the proportion of transmission that is from the reservoir at equilibrium, $\pi$, is

$$\pi = \frac{f}{\lambda^*} = 1 - \frac{\beta I^*}{\lambda^*},$$

which re-arranged for $\beta$ gives $\beta = (1-\pi)\lambda^*/I^*$. Substituting these expressions for $\beta$ and $\gamma$ into the expression for the reproduction number we get

$$R_0 \equiv \frac{\beta}{\gamma} = \frac{1-\pi}{S^*}. \tag{2}$$
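The derivation above can be checked numerically. The following is a minimal sketch, assuming a person-to-person rate `beta`, a recovery rate `gamma` and a constant reservoir rate `f` (the variable names and the parameter values are illustrative assumptions, not taken from the paper). It solves the SIS equilibrium in closed form, computes the reservoir share of transmission, and compares the reproduction number recovered from those two observables with `beta / gamma`.

```python
import math


def sis_equilibrium_prevalence(beta, gamma, f):
    """Endemic prevalence I* solving (beta*I + f) * (1 - I) = gamma * I."""
    if beta == 0:
        return f / (f + gamma)  # reservoir-only case
    # Rearranged: beta*I^2 + (gamma + f - beta)*I - f = 0; take the root in [0, 1).
    b = gamma + f - beta
    return (-b + math.sqrt(b * b + 4.0 * beta * f)) / (2.0 * beta)


def r0_from_observables(pi, prevalence):
    """R0 = (1 - pi) / (1 - I*), with pi the reservoir share of transmission."""
    return (1.0 - pi) / (1.0 - prevalence)


# Hypothetical parameter values, chosen only for illustration.
beta, gamma, f = 0.5, 1.0, 0.02
i_star = sis_equilibrium_prevalence(beta, gamma, f)
pi = f / (beta * i_star + f)  # reservoir share of the force of infection
r0_observed = r0_from_observables(pi, i_star)
r0_true = beta / gamma

# Low-prevalence illustration: prevalence 2%, reservoir share 3%.
r0_low_prevalence = r0_from_observables(0.03, 0.02)
print(r0_true, r0_observed, r0_low_prevalence)
```

In this sketch the two estimates agree up to floating-point error, and the comparison `pi > i_star` gives the same answer as `r0_true < 1`.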
We can also write this in terms of the proportion infected (which is usually what is reported rather than the proportion susceptible):

$$R_0 = \frac{1-\pi}{1-I^*}, \tag{3}$$

where $\pi$ is the proportion of transmission that is from the reservoir at equilibrium and $I^* = 1 - S^*$ is the equilibrium prevalence. These expressions simplify to $R_0 = 0$ if the disease is only acquired from the reservoir (person-to-person rate $\beta = 0$, $\pi = 1$) or to $R_0 = 1/S^*$ when all transmission is person-to-person (reservoir rate $f = 0$, $\pi = 0$). The general cases of these expressions lead to a simple rule for the reproduction number: $R_0 < 1$ if and only if $\pi > I^*$. The disease can be maintained by person-to-person transmission in the absence of reservoir exposure if and only if the prevalence exceeds the proportion of transmission from the reservoir.

This simple rule has surprising implications. For diseases with low prevalence (e.g. 2%), if a small but larger portion (e.g. 3%) of transmission is from the reservoir, then the disease cannot be sustained in the population by person-to-person transmission alone (since $R_0 = 0.97/0.98 < 1$). Preventing the small proportion of transmission from the reservoir (reducing $f$ and $\pi$ to 0) will cause the disease to become extinct in the population. Nevertheless, names like 'food-borne' or 'zoonotic' may be misleading for such diseases because the source of transmission is another human in most (e.g. 97%) individual infections. Instead we call these diseases reservoir-driven. We define the reservoir-driven threshold as the minimum proportion of transmission which must be from the reservoir for the disease to be considered reservoir-driven ($I^*$ in our simple SIS model).

The rest of this article will consider variants and extensions of the simple SIS model to demonstrate which assumptions do and do not affect the above expressions for the reproduction number and reservoir-driven threshold. We will also show that an equivalent rule of thumb and threshold exists when a disease is driven by imported cases due to travel or immigration. We will then consider how this rule of thumb can be applied to case studies in real diseases.

Simple demographics do not change our rule for the reproduction number.
A modified model including deaths from both classes at rate $\mu$ and births that balance deaths is described by the equations

$$\frac{dS}{dt} = \mu - \lambda S + \gamma I - \mu S, \qquad \frac{dI}{dt} = \lambda S - (\gamma + \mu) I,$$

where $\lambda = \beta I + f$ is the force of infection, with person-to-person transmission rate $\beta$, reservoir exposure rate $f$ and recovery rate $\gamma$. In this model $R_0 = \beta/(\gamma+\mu)$. Let $S^*$, $I^*$ and $\lambda^*$ be the non-trivial (i.e. $S^*, I^* \neq 0$) equilibrium values of $S$, $I$ and $\lambda$. Then $\lambda^* S^* = (\gamma+\mu) I^*$, or equivalently $\gamma + \mu = \lambda^* S^*/I^*$. The force of infection terms are the same as for our original model, so again we have $\beta = (1-\pi)\lambda^*/I^*$, where $\pi = f/\lambda^*$ is the proportion of transmission from the reservoir. Substituting this into the expression for the reproduction number we get the same result as (2) and (3):

$$R_0 \equiv \frac{\beta}{\gamma+\mu} = \frac{1-\pi}{S^*} = \frac{1-\pi}{1-I^*},$$

and the reservoir-driven threshold is still $I^*$. We have assumed that the death rates are the same for infected and susceptible persons, but it is simple to show that a higher (or lower) death rate for infected individuals does not affect the reasoning.

The simplest SIR model without births and deaths or waning immunity does not have an endemic equilibrium point, so our method for estimating the reproduction number is not applicable to these models. Instead, consider the SIR model with births and deaths:

$$\frac{dS}{dt} = \mu - \lambda S - \mu S, \qquad \frac{dI}{dt} = \lambda S - (\gamma+\mu) I, \qquad \frac{dR}{dt} = \gamma I - \mu R,$$

where $\lambda = \beta I + f$ is the force of infection and $R$ is the proportion recovered. Note that adding the recovered class to the SIS model with births and deaths does not change the reproduction number, the equation governing the number of infected individuals or the force of infection, and so the reasoning is identical to the previous section. However, since there are more than two classes, $S^* \neq 1 - I^*$. Therefore (3) does not hold but instead

$$R_0 = \frac{1-\pi}{S^*} = \frac{1-\pi}{1 - I^* - R^*}.$$

The reservoir-driven threshold here is $1 - S^* = I^* + R^*$, i.e. the disease can be sustained by person-to-person transmission in the absence of reservoir exposure if and only if the proportion of transmission which is due to reservoir exposure is less than the total proportion of people infected or immune/recovered. The same reasoning can be used for models with waning immunity, vaccination, or latent/exposed classes. Since these modifications do not affect the equations governing the number of infected individuals or the force of infection, equation (2) still holds and therefore the reservoir-driven threshold is $1 - S^*$ in all these cases.
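A numerical check of the SIR case may help here. The sketch below assumes a constant reservoir rate `f > 0`, a birth/death rate `mu`, and otherwise hypothetical parameter values. It finds the endemic equilibrium by bisection on the force of infection, then confirms that dividing one minus the reservoir share by the susceptible proportion recovers `beta / (gamma + mu)`, while dividing by one minus the prevalence does not.

```python
def sir_equilibrium(beta, gamma, mu, f, iterations=200):
    """Equilibrium (S*, I*, lambda*) of the SIR model with births and deaths
    and constant reservoir exposure f > 0, found by bisection on lambda*."""

    def residual(lam):
        s = mu / (lam + mu)            # from dS/dt = 0
        i = lam * s / (gamma + mu)     # from dI/dt = 0
        return lam - (beta * i + f)    # zero when lambda = beta*I + f

    # residual(lo) <= 0 and residual(hi) > 0, and the residual is convex in lambda.
    lo, hi = f, f + beta * mu / (gamma + mu) + 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
    lam = 0.5 * (lo + hi)
    s = mu / (lam + mu)
    i = lam * s / (gamma + mu)
    return s, i, lam


# Hypothetical parameter values, chosen only for illustration.
beta, gamma, mu, f = 3.0, 1.0, 0.02, 0.001
s_star, i_star, lam_star = sir_equilibrium(beta, gamma, mu, f)
pi = f / lam_star                                  # reservoir share of transmission
r0_true = beta / (gamma + mu)
r0_from_susceptible = (1.0 - pi) / s_star          # holds for SIR
r0_from_prevalence = (1.0 - pi) / (1.0 - i_star)   # SIS-only formula, biased here
threshold = 1.0 - s_star                           # reservoir-driven threshold
print(r0_true, r0_from_susceptible, r0_from_prevalence, threshold)
```

With these inputs the prevalence-based formula underestimates the reproduction number, because most non-susceptible people are recovered rather than infected.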
For diseases with comprehensive vaccination programs (or common diseases with lifelong immunity), almost all the population can be immune (e.g. 95%) and the proportion susceptible very low. If reservoir exposure accounts for nearly all cases but is still less than the reservoir-driven threshold (e.g. 90%), the disease could be sustained by person-to-person transmission alone if reservoir exposure was eliminated (since $R_0 = (1-0.90)/(1-0.95) = 2 > 1$), and so eliminating exposure to the reservoir would not eliminate the disease from the human population.

Analogous rules can be derived for settings where some infections are acquired locally and others are imported through immigration or those returning from travel. We assume that susceptible and infected individuals emigrate/leave at the same rate $\mu$, that immigration balances emigration and that a proportion $p$ of those entering the population are infected. The governing equations are

$$\frac{dS}{dt} = (1-p)\mu - \lambda S + \gamma I - \mu S, \qquad \frac{dI}{dt} = p\mu + \lambda S - (\gamma+\mu) I,$$

where $\lambda = \beta I$ is the force of infection, $\beta$ is the person-to-person transmission rate and $\gamma$ is the recovery rate. Again, $R_0 = \beta/(\gamma+\mu)$, but $R_0$ is not by itself a threshold parameter for disease extinction because the continuous immigration of new infected individuals will sustain the disease (unless $p = 0$). The equilibrium proportion infected ($I^*$), proportion susceptible ($S^*$) and force of infection ($\lambda^*$) satisfy $p\mu + \lambda^* S^* = (\gamma+\mu) I^*$, or equivalently $\gamma + \mu = (p\mu + \lambda^* S^*)/I^*$. Meanwhile the proportion of new cases that are imported, $\pi$, is

$$\pi = \frac{p\mu}{p\mu + \lambda^* S^*} = 1 - \frac{\beta S^*}{\gamma+\mu},$$

which we can rearrange for the transmission parameter, giving $\beta = (1-\pi)(\gamma+\mu)/S^*$. Therefore, we can write the reproduction number as

$$R_0 \equiv \frac{\beta}{\gamma+\mu} = \frac{1-\pi}{S^*} = \frac{1-\pi}{1-I^*}.$$

These expressions lead to simple rules for the reproduction number analogous to those derived for reservoir exposure: $R_0 < 1$ if and only if $\pi > I^*$. That is, in this simple model, the disease can be sustained without importation by local transmission if and only if the prevalence exceeds the proportion of new cases that are imported through migration or travel. By analogy to the reservoir exposure model, we call this threshold the importation-driven threshold.
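The importation rule can be checked in the same way as the reservoir rule. This is a minimal sketch, assuming a turnover (emigration and immigration) rate `mu` and an infected share of arrivals `p`; the names and the hospital-like parameter values are hypothetical. It solves the equilibrium prevalence in closed form and recovers the reproduction number from the imported share and the prevalence alone.

```python
import math


def importation_equilibrium_prevalence(beta, gamma, mu, p):
    """Prevalence I* solving p*mu + beta*I*(1 - I) = (gamma + mu)*I."""
    if beta == 0:
        return p * mu / (gamma + mu)
    # Rearranged: beta*I^2 + (gamma + mu - beta)*I - p*mu = 0; non-negative root.
    b = gamma + mu - beta
    return (-b + math.sqrt(b * b + 4.0 * beta * p * mu)) / (2.0 * beta)


# Hypothetical daily rates: 30-day carriage, 5-day stay, 5% of arrivals infected.
beta, gamma, mu, p = 0.1, 1.0 / 30.0, 1.0 / 5.0, 0.05
i_star = importation_equilibrium_prevalence(beta, gamma, mu, p)
imported_share = p * mu / ((gamma + mu) * i_star)  # imported / all new cases
r0_observed = (1.0 - imported_share) / (1.0 - i_star)
r0_true = beta / (gamma + mu)
print(i_star, imported_share, r0_observed, r0_true)
```

At equilibrium the total rate of new cases equals the removal rate `(gamma + mu) * i_star`, which is why the imported share can be computed without separating out the locally acquired cases.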
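This analogy still holds when heterogeneity or multiple strains are incorporated into these models, extensions we consider in sections 4 and 5.

It is known that accounting for population heterogeneity tends to increase estimates of reproduction numbers [14]. Therefore, we might expect that introducing heterogeneity into models with reservoir exposure will increase the reservoir-driven threshold. Consider a general SIS model with separable mixing in a heterogeneous population consisting of people of different $x$-types with susceptibility $\alpha(x)$, transmission parameter $\beta(x)$ and mean infectious period $1/\gamma(x)$, distributed according to the probability density function $p(x)$. Then

$$\frac{\partial S(x,t)}{\partial t} = -\alpha(x)\lambda(t)S(x,t) + \gamma(x)I(x,t), \qquad \frac{\partial I(x,t)}{\partial t} = \alpha(x)\lambda(t)S(x,t) - \gamma(x)I(x,t),$$

where $\lambda(t) = f + \int \beta(x) I(x,t)\,dx$, $f$ is the rate of exposure to the reservoir and $S(x,t) + I(x,t) = p(x)$. For this model, $R_0 = \int \alpha(x)\beta(x)p(x)/\gamma(x)\,dx$ [15], but as before $R_0$ is only a threshold parameter for disease extinction if $f = 0$. Let $S^*(x)$ and $I^*(x)$ be the non-trivial equilibrium distributions of $S(x,t)$ and $I(x,t)$, and $\lambda^*$ the equilibrium value of $\lambda(t)$. As equilibrium points they satisfy

$$\alpha(x)\lambda^* S^*(x) = \gamma(x) I^*(x), \tag{4}$$

or equivalently, $I^*(x) = \lambda^* \alpha(x) S^*(x)/\gamma(x)$. At equilibrium, the proportion of infections acquired from the reservoir, which is the proportion of the force of infection attributable to the reservoir, is

$$\pi = \frac{f}{\lambda^*} = 1 - \frac{1}{\lambda^*}\int \beta(x) I^*(x)\,dx.$$

If we let $\rho(x) := \alpha(x)/\gamma(x)$ we can write the reproduction number and the proportion of infections from the reservoir in simpler terms:

$$R_0 = \int \rho(x)\beta(x)p(x)\,dx = \overline{\rho\beta},$$

where $\overline{\rho\beta}$ is the mean value of $\rho\beta$ across the population, and

$$1 - \pi = \int \rho(x)\beta(x)S^*(x)\,dx = S^*\,\overline{\rho\beta}_S,$$

where $S^* := \int S^*(x)\,dx$ is the total susceptible population and $\overline{\rho\beta}_S$ is the mean value of $\rho\beta$ across the susceptible population. Therefore

$$R_0 = \frac{1-\pi}{S^*\,\overline{\rho\beta}_S/\overline{\rho\beta}}.$$

By similar reasoning one can show that

$$R_0 = \frac{1-\pi}{1 - I^*\,\overline{\rho\beta}_I/\overline{\rho\beta}}, \tag{5}$$

where $I^* := \int I^*(x)\,dx$ is the proportion of the whole population that is infected and $\overline{\rho\beta}_I$ is the mean value of $\rho\beta$ across the infected population. The quantity we want to estimate, $R_0$, appears as $\overline{\rho\beta}$ in the right-hand sides of each equation and the quantities $\overline{\rho\beta}_S$ and $\overline{\rho\beta}_I$ are unlikely to be known, so this does not provide a practical way to estimate the reproduction number. However, these statements provide some insight into how heterogeneity can affect our estimates of the reproduction number or reservoir-driven threshold.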
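The relationship between the reproduction number, the reservoir share and the infected-population mean can be explored with a small discretised version of the heterogeneous model. The sketch below is an illustration under stated assumptions: a finite number of types with population shares `p`, susceptibility-to-recovery ratios `rho` (susceptibility divided by recovery rate), transmission parameters `beta`, and a reservoir rate `f > 0`. All names and values are hypothetical. It solves the equilibrium by bisection and checks that the reproduction number equals one minus the reservoir share divided by one minus a prevalence term weighted by the infected-population mean.

```python
def heterogeneous_equilibrium(p, rho, beta, f, iterations=200):
    """Solve I_k = p_k*rho_k*lam / (1 + rho_k*lam) with lam = f + sum_k beta_k*I_k.
    Requires f > 0. Returns (lam, [I_k])."""

    def infected(lam):
        return [pk * rk * lam / (1.0 + rk * lam) for pk, rk in zip(p, rho)]

    def residual(lam):
        return lam - f - sum(bk * ik for bk, ik in zip(beta, infected(lam)))

    # residual(lo) <= 0 and residual(hi) >= 1 > 0; the residual is convex in lam.
    lo, hi = f, f + sum(bk * pk for bk, pk in zip(beta, p)) + 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
    lam = 0.5 * (lo + hi)
    return lam, infected(lam)


# Two hypothetical types: fixed infectiousness, heterogeneous rho.
p, rho, beta, f = [0.8, 0.2], [0.5, 2.0], [1.0, 1.0], 0.01
lam_star, i_by_type = heterogeneous_equilibrium(p, rho, beta, f)

r0_true = sum(pk * rk * bk for pk, rk, bk in zip(p, rho, beta))
i_star = sum(i_by_type)
pi = f / lam_star
mean_rho_beta_infected = sum(
    rk * bk * ik for rk, bk, ik in zip(rho, beta, i_by_type)
) / i_star
threshold = i_star * mean_rho_beta_infected / r0_true  # reservoir-driven threshold
r0_check = (1.0 - pi) / (1.0 - threshold)
print(r0_true, r0_check, i_star, threshold)
```

In this example the threshold exceeds the raw prevalence, because the more susceptible type is over-represented among the infected.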
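The rule of thumb is similar to the rule for a homogeneous population: $R_0 < 1$ if and only if $\pi > I^*\,\overline{\rho\beta}_I/\overline{\rho\beta}$, i.e. the reservoir-driven threshold is $I^*\,\overline{\rho\beta}_I/\overline{\rho\beta}$, where $\rho(x) := \alpha(x)/\gamma(x)$ and $\overline{\rho\beta}_I$ and $\overline{\rho\beta}$ are the mean values of $\rho\beta$ across the infected population and the whole population. If those who are infected have higher-than-average (or lower-than-average) $\rho\beta$, then accounting for this heterogeneity increases (or decreases) the reservoir-driven threshold. We derive simple expressions for the value of $\overline{\rho\beta}_I/\overline{\rho\beta}$ under some specific assumptions.

If we assume that the infectiousness ($\beta$) is fixed but the product of susceptibility and length of infectious period ($\rho := \alpha/\gamma$) is heterogeneous, then the reservoir-driven threshold is always at least as high as for a homogeneous population. Consider the ratio $\overline{\rho\beta}_I/\overline{\rho\beta} = \bar\rho_I/\bar\rho$. Since the odds of infection for an individual of type $x$ is proportional to $\rho(x)$ (by (4), $I^*(x)/S^*(x) = \rho(x)\lambda^*$), individuals with high $\rho$ (i.e. more susceptible individuals or individuals with longer infectious periods) will be overrepresented in the infected portion of the population at equilibrium. Therefore $\bar\rho_I \geq \bar\rho$ and so the reservoir-driven threshold is at least as high as for a homogeneous population.

If the prevalence is low for people of all $x$-types (i.e. $\rho(x)\lambda^* \ll 1$) there is a simple approximation for the reservoir-driven threshold. We can rearrange (4) to get

$$I^*(x) = \frac{\rho(x)\lambda^*}{1+\rho(x)\lambda^*}\,p(x) \approx \rho(x)\lambda^* p(x), \tag{6}$$

and so $I^* \approx \lambda^*\bar\rho$ and $\bar\rho_I \approx \overline{\rho^2}/\bar\rho$. If the population variance of $\rho$ is $\sigma_\rho^2 := \overline{\rho^2} - \bar\rho^2$, the ratio can be written approximately as

$$\frac{\bar\rho_I}{\bar\rho} \approx 1 + \frac{\sigma_\rho^2}{\bar\rho^2}$$

and the reproduction number is

$$R_0 \approx \frac{1-\pi}{1 - I^*\left(1 + \sigma_\rho^2/\bar\rho^2\right)}.$$

When there is no heterogeneity in $\rho$ (i.e. when $\sigma_\rho = 0$) this simplifies to the result for the homogeneous SIS model. The larger the variance for a given mean, the greater the basic reproduction number and the higher the reservoir-driven threshold, $I^*(1 + \sigma_\rho^2/\bar\rho^2)$. For example, if $\alpha$ and $\gamma$ are such that the distribution of $\rho$ across the population is gamma with mean $m$ and shape parameter $k$ (a convenient and often used assumption [14]), then $\sigma_\rho^2/\bar\rho^2 = 1/k$ and the reservoir-driven threshold is approximately $I^*(1 + 1/k)$ (Figure 1).

If the type of an individual corresponds to some easily determined risk class (for instance if $x$ denotes gender or smoker status) then the proportion of people in each class, $p_x$, and the odds of infection within each group, $I_x^*/S_x^*$, may be known.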
Since the odds of infection is proportional to $\rho$, we can express $\bar\rho_I$ and $\bar\rho$ in terms of these observed quantities:

$$\frac{\bar\rho_I}{\bar\rho} = \frac{\sum_x I_x^*\,\left(I_x^*/S_x^*\right)}{I^*\sum_x p_x\,\left(I_x^*/S_x^*\right)}. \tag{7}$$

If infectiousness ($\beta$) is heterogeneous, but the product of susceptibility and length of the infectious period ($\rho := \alpha/\gamma$) is fixed, then the reservoir-driven threshold is the same as for a homogeneous population. Consider the ratio $\overline{\rho\beta}_I/\overline{\rho\beta}$, which can be simplified as $\bar\beta_I/\bar\beta$, where $\bar\beta$ and $\bar\beta_I$ are the mean values of $\beta$ across the whole population and across the infected portion of the population respectively. Now by (6), if $\rho$ is constant across the population the odds of infection at equilibrium is the same for people of every $x$-type, i.e. independent of their infectiousness. Therefore, the mean infectiousness amongst the infected population is the same as the mean infectiousness across the whole population and $\bar\beta_I/\bar\beta = 1$. Equation (5) then simplifies to

$$R_0 = \frac{1-\pi}{1-I^*},$$

which is the same as the result for a homogeneous population.

Heterogeneous infectiousness will affect the reservoir-driven threshold in a population which is also heterogeneous with respect to susceptibility or infectious period. If those who are more likely to be in the infected class (high $\rho$) are less infectious (low $\beta$), this will reduce the reservoir-driven threshold relative to homogeneous infectiousness but heterogeneous susceptibility and infectious period. As a simple example of this, assume that $\beta(x) = 1/\rho(x)$. Then $\rho\beta = 1$, $\overline{\rho\beta}_I/\overline{\rho\beta} = 1$ and the reservoir-driven threshold is simply $I^*$, less than what it would be if $\beta$ were constant across the population. On the other hand, if those who are more likely to be colonised (high $\rho$) are also more infectious (high $\beta$), the reservoir-driven threshold will increase relative to homogeneous infectiousness but heterogeneous susceptibility and infectious period. As another simple example consider the proportional mixing assumption, i.e. $\beta \propto \rho$. In this case $\rho\beta \propto \rho^2$ and so $\overline{\rho\beta}_I/\overline{\rho\beta} = \overline{\rho^2}_I/\overline{\rho^2}$. When the prevalence is low for people of all $x$-types (i.e. $\rho(x)\lambda^* \ll 1$) we can use the same reasoning as in the previous section to approximate this ratio as

$$\frac{\overline{\rho^2}_I}{\overline{\rho^2}} \approx \frac{\overline{\rho^3}}{\bar\rho\;\overline{\rho^2}}$$

and the reproduction number by

$$R_0 \approx \frac{1-\pi}{1 - I^*\,\overline{\rho^3}/\left(\bar\rho\;\overline{\rho^2}\right)},$$

where $\overline{\rho^3}$ is the third raw moment of $\rho$ across the population. If, for example, $\alpha$ and $\gamma$ are such that $\rho$ is gamma distributed with shape parameter $k$, then the reservoir-driven threshold is approximately $I^*(1 + 2/k)$, which is higher than if $\beta$ were homogeneous. Figure 1 summarises how the reservoir-driven threshold changes for the different types of heterogeneity explored so far.

Heterogeneous exposure to the reservoir in an otherwise homogeneous population does not change the reservoir-driven threshold. Consider an SIS model where the population consists of people of type $x$ distributed according to $p(x)$, each with their own level of exposure to the reservoir $f(x)$. Then the differential equations governing the system are

$$\frac{\partial S(x,t)}{\partial t} = -\lambda(x,t)S(x,t) + \gamma I(x,t), \qquad \frac{\partial I(x,t)}{\partial t} = \lambda(x,t)S(x,t) - \gamma I(x,t),$$

where $\lambda(x,t) = \beta\int I(y,t)\,dy + f(x)$ is the force of infection acting on individuals of type $x$. The basic reproduction number for this model is $R_0 = \beta/\gamma$. Let $S^*(x)$ and $I^*(x)$ be the non-trivial equilibrium distributions of $S(x,t)$ and $I(x,t)$ (i.e. $I^*(x) \neq 0$), $I^*$ and $S^*$ be the total number of people infected and susceptible at equilibrium, and $\lambda^*(x)$ be the equilibrium force of infection. As equilibrium points they satisfy $\lambda^*(x)S^*(x) = \gamma I^*(x)$. The proportion of transmission that is acquired from the reservoir is then

$$\pi = \frac{\int f(x)S^*(x)\,dx}{\int \lambda^*(x)S^*(x)\,dx} = 1 - \frac{\beta I^* S^*}{\gamma I^*} = 1 - R_0 S^*,$$

so that $R_0 = (1-\pi)/S^* = (1-\pi)/(1-I^*)$, leaving the reservoir-driven threshold unchanged.

However, interactions with additional heterogeneities will affect the reservoir-driven threshold. Consider the case where both reservoir exposure ($f$) and the person-to-person transmission rate ($\beta$) depend on the $x$-state. In this case the equilibrium force of infection is $\lambda^*(x) = \int\beta(y)I^*(y)\,dy + f(x)$ and the reproduction number is $R_0 = \bar\beta/\gamma$, where $\bar\beta$ is the mean value of $\beta$ in the population. The proportion of infections that are acquired from the reservoir is

$$\pi = 1 - \frac{S^*\int\beta(y)I^*(y)\,dy}{\gamma I^*} = 1 - R_0 S^*\,\frac{\bar\beta_I}{\bar\beta},$$

where $\bar\beta_I$ is the mean value of $\beta$ across the infected population. Those that have greater exposure to the reservoir are more likely to be infected and so will have a disproportionately large effect on $\bar\beta_I$.
If those with more exposure to the reservoir are also more infectious then $\bar\beta_I > \bar\beta$ (the mean infectiousness is higher among the infected than in the whole population) and the reservoir-driven threshold is lower, and conversely if those with more exposure to the reservoir are also less infectious then $\bar\beta_I < \bar\beta$ and the reservoir-driven threshold is higher (Figure 2). Note that this is opposite to the relationship for heterogeneous infectiousness $\beta$ and heterogeneous $\rho$, the product of susceptibility and infectious period (Figure 1).

There is frequently more than one strain of a pathogen co-circulating within human populations and the dynamics of multi-strain interactions have been modelled extensively (e.g. [16-20]). In the few simple multi-strain models we consider, accounting for host competition increases the reservoir-driven threshold for each strain compared to the single-strain model. Consider a simple competitive multi-strain extension of our basic SIS model with reservoir exposure. Each strain has its own transmission parameter ($\beta_i$), recovery rate ($\gamma_i$) and reservoir exposure rate ($f_i$). We assume that infection with one strain prevents infection from all other strains for the duration of the infection. With $n$ strains the $n+1$ equations governing this system are

$$\frac{dS}{dt} = -\sum_{i=1}^{n}\lambda_i S + \sum_{i=1}^{n}\gamma_i I_i, \qquad \frac{dI_i}{dt} = \lambda_i S - \gamma_i I_i, \quad i = 1, \dots, n,$$

where $\lambda_i = \beta_i I_i + f_i$ is the force of infection for each strain. Each strain has its own basic reproduction number in a fully susceptible population: $R_{0,i} = \beta_i/\gamma_i$. Here, the $R_{0,i}$ are not threshold parameters for strain extinction, because reservoir exposure will cause the disease to persist and strain competition for hosts may cause a strain without reservoir exposure to die out even if that strain's reproduction number exceeds one. Let $S^*$ be the equilibrium number of susceptible people at the non-trivial equilibrium where the number of people infected with each strain ($I_i^*$) and the force of infection for each strain ($\lambda_i^*$) are non-zero. For each strain we have the following relation: $\lambda_i^* S^* = \gamma_i I_i^*$, or equivalently $\gamma_i = \lambda_i^* S^*/I_i^*$. The proportion of transmission of strain $i$ that is from the reservoir is $\pi_i = f_i/\lambda_i^* = 1 - \beta_i I_i^*/\lambda_i^*$. Rearranging for $\beta_i$: $\beta_i = (1-\pi_i)\lambda_i^*/I_i^*$. We can re-write the basic reproduction number for strain $i$ as

$$R_{0,i} = \frac{1-\pi_i}{S^*} = \frac{1-\pi_i}{1-\sum_j I_j^*}.$$

Consequently $R_{0,i} < 1$ if $\pi_i > \sum_j I_j^*$.
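The per-strain relation above can be checked numerically in the exclusive-competition model. The following is a minimal sketch, assuming each strain is described by a hypothetical tuple `(beta_i, gamma_i, f_i)` with `f_i > 0`. It solves the joint equilibrium by bisection on the susceptible proportion and recovers each strain's basic reproduction number from that strain's reservoir share and the shared susceptible proportion.

```python
def multistrain_equilibrium(strains, iterations=200):
    """Equilibrium of the exclusive-competition SIS model.
    strains: list of (beta_i, gamma_i, f_i) with every f_i > 0.
    Returns (S*, [I_i*])."""

    def infected(s):
        # From (beta_i*I_i + f_i)*S = gamma_i*I_i for a given susceptible share S.
        return [f * s / (g - b * s) for b, g, f in strains]

    # S* solves S + sum_i I_i(S) = 1 with S < gamma_i/beta_i for every strain.
    lo = 0.0
    hi = min([1.0] + [g / b for b, g, f in strains if b > 0])
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + sum(infected(mid)) > 1.0:
            hi = mid
        else:
            lo = mid
    s = 0.5 * (lo + hi)
    return s, infected(s)


# Two hypothetical strains (beta_i, gamma_i, f_i), chosen only for illustration.
strains = [(0.9, 1.0, 0.01), (0.5, 1.0, 0.02)]
s_star, i_star = multistrain_equilibrium(strains)

reservoir_shares = [f / (b * i + f) for (b, g, f), i in zip(strains, i_star)]
r0_observed = [(1.0 - pi_i) / s_star for pi_i in reservoir_shares]
r0_true = [b / g for b, g, f in strains]
total_prevalence = sum(i_star)
print(s_star, i_star, r0_observed, r0_true)
```

Both hypothetical strains have a reproduction number below one here, and each strain's reservoir share exceeds the total prevalence of all strains, consistent with the rule.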
It follows that a given strain cannot persist without reservoir exposure if the proportion of transmission of that strain due to reservoir exposure is more than the total prevalence of all strains. We also want to account for strain competition, which can lead to the extinction of strains that would otherwise persist in a population. Therefore, we consider the invasion reproduction number for each strain, i.e. not the reproduction number in a fully susceptible population, but in a population at endemic equilibrium with all the other strains. Consider the equilibrium point without any infections of strain $i$ that exists if there is no reservoir exposure for strain $i$ (i.e. $f_i = 0$). Let $\hat S, \hat I_1, \dots, \hat I_n$ be the equilibrium values of $S, I_1, \dots, I_n$ when $f_i = 0$, such that $\hat I_i = 0$ and $\hat I_j > 0$ if $j \neq i$. The invasion reproduction number for strain $i$ is then $R_{\mathrm{inv},i} = \beta_i \hat S/\gamma_i = R_{0,i}\hat S$. It is possible to calculate $R_{\mathrm{inv},i}$ in terms of $I_1^*, \dots, I_n^*$ and $\pi_1, \dots, \pi_n$, but the exact form is cumbersome (even for the $n = 2$ case) so instead we consider a simple bound. Consider that the equilibrium proportion of each strain other than $i$ will certainly not decrease in the absence of the competition with strain $i$, i.e. $\hat I_j \geq I_j^*$ for $j \neq i$. Consequently $\hat S \leq S^* + I_i^*$, since $\hat S = 1 - \sum_{j\neq i}\hat I_j \leq 1 - \sum_{j\neq i} I_j^* = S^* + I_i^*$. We can bound the invasion reproduction number for strain $i$ by

$$R_{\mathrm{inv},i} = R_{0,i}\hat S \leq \frac{(1-\pi_i)(S^* + I_i^*)}{S^*} = \frac{1-\pi_i}{1 - I_i^*/(S^* + I_i^*)}.$$

Consequently $I_i^*/(S^* + I_i^*)$ is an upper bound for the reservoir-driven threshold in the presence of other strains, since $R_{\mathrm{inv},i} < 1$ whenever $\pi_i > I_i^*/(S^* + I_i^*)$.

Our simple competitive model assumes complete exclusion, but in reality, strains are unlikely to completely exclude one another. Suppose one allows for the possibility of coinfection or superinfection, assuming that persons infected only with strains other than strain $i$ ($J_i$) are $\sigma$ times as susceptible to infection with strain $i$ as those not infected with any strain ($S$), and that coinfecting/superinfecting strains do not affect infectiousness or infectious period for the infecting strains. Then at endemic equilibrium

$$\lambda_i^*\left(S^* + \sigma J_i^*\right) = \gamma_i I_i^*,$$

where $I_i^*$ is the proportion of people infected with strain $i$ (who may also be infected with other strains) and $J_i^* = 1 - S^* - I_i^*$.
One can use the same reasoning as above to show that

$$R_{0,i} \equiv \frac{\beta_i}{\gamma_i} = \frac{1-\pi_i}{S^* + \sigma J_i^*} = \frac{1-\pi_i}{1 - I_i^* - (1-\sigma)J_i^*}$$

and

$$R_{\mathrm{inv},i} \leq \frac{(1-\pi_i)\left(1 - (1-\sigma)J_i^*\right)}{1 - I_i^* - (1-\sigma)J_i^*},$$

where $\sigma$ is the relative susceptibility to strain $i$ of those infected only with other strains and $J_i^* = 1 - S^* - I_i^*$ is the proportion of such people, and so $I_i^*/\left(1 - (1-\sigma)J_i^*\right)$ is an upper bound for the reservoir-driven threshold. If $\sigma = 0$, this reduces to the case of complete exclusion we considered above. If $\sigma = 1$, that is if infection with another strain neither prevents nor predisposes a patient to infection with strain $i$, then the reservoir-driven threshold and reproduction number for strain $i$ are the same as for a model with only a single strain. In general, the greater the exclusion against strain $i$ (i.e. as $\sigma \to 0$), the higher the reservoir-driven threshold and reproduction number. Consequently the case of complete exclusion is an upper bound for these quantities in these simple models.

Clostridium difficile is a bacterium that colonises the intestines of many mammals including humans and livestock [21]. Most human hosts do not have symptoms despite being colonised. Colonisation is typically transient, lasting approximately one month in adults [22], due to competition and interactions with other intestinal flora [23]. Disruption of the gut flora, often caused by consumption of antibiotics or proton-pump inhibitors, allows C. difficile to proliferate in large numbers [23]. Toxigenic strains of C. difficile then produce a number of toxins that can cause diarrhoea which is often severe and sometimes life-threatening [24]. A robust immune response to these toxins is able to neutralise their effect [25] and most of the population have anti-toxin antibodies starting at a young age [26]. Immune responses protect against symptoms but do not protect against colonisation [27]. Asymptomatically colonised carriers are also infectious [28], while animal models have shown that disruption of gut flora, even in the absence of symptoms, increases spore shedding and infectiousness [29]. Since immunity does not prevent colonisation or infectiousness, a simple SIS model is an appropriate starting point for C. difficile, provided we identify the I-class with all C. difficile positive individuals (not just those with symptoms). We will use variations on the SIS model below to determine whether C. difficile is importation-driven in a hospital setting, and calculate the reservoir-driven threshold for C. difficile for the human population as a whole (where animals are the reservoir).

Historically, C. difficile has been of most concern, and thus most studied, in hospitalised patients, where it complicates the care of many initially hospitalised for other conditions [30]. However, there is growing recognition of community-acquired cases that manifest during hospital stay. Since C. difficile is consistently present in many hospitals, it has been assumed that C. difficile is endemic in these settings and is responsible for many cases in the community. If we begin by modelling C. difficile in hospitals as a (homogeneous) SIS model with very high rates of migration (hospital admission and discharge) then we can estimate the reproduction number using the method outlined in Section 3.3. In words, we will estimate the within-hospital reproduction number as

$$R_0 = \frac{1 - \text{Proportion of colonisations and infections acquired prior to admission}}{1 - \text{Prevalence of colonisation and infection}}.$$

One study of colonisation and infections in hospitalised patients found 184 patients colonised at admission, and another 240 patients that acquired colonisation or infection after admission [27]. They identified an additional 60 or so patients that developed a CDI within 72 hours of admission, who were therefore deemed to have been exposed prior to admission. Thus, the proportion of C. difficile positive patients that acquired the pathogen prior to admission was approximately 50%. In the same study 528/5422 patients were colonised or developed an infection for part of their hospital stay. Some patients were excluded from their analysis (mostly for missing data), leaving 424/4143 patients that were colonised or developed an infection for part of the hospital stay.
While these do not provide an estimate of prevalence (since many of the colonised or infected patients were only colonised for part of the hospital stay), these figures provide upper bounds to the prevalence of colonisation and infection in the study hospital: 9.7% amongst all study patients and 10.2% after exclusions. Putting this into the above formula gives an upper bound for the within-hospital reproduction number of approximately 0.55.

Unlike the study cited above, most studies focus on symptomatic patients and do not test asymptomatic patients at admission. However, the proportion of patients diagnosed with a C. difficile infection that were admitted for a C. difficile infection (principal diagnosis) is routinely reported. As patients admitted with asymptomatic colonisation who subsequently develop symptoms will not have C. difficile infection as their principal diagnosis, this proportion is a lower bound for the total proportion of infections that are due to exposure prior to admission and thus lets us estimate an upper bound for the reproduction number. In the USA in the years 1993-2014, 20-34% of admissions who had a C. difficile infection had it as their principal diagnosis [31]. This is in excess of the typical prevalence of colonisation and infection amongst hospitalised patients: a review of colonisation prevalence reported a range of 4-29% [32]. Therefore, our upper bound for the reproduction number lies in the range 0.69-1.1.

So far, we have assumed hospitalised patients are homogeneous, but this is not the case. Patients who have recently been administered antibiotics are not more susceptible to colonisation but are more likely to develop symptoms and be more infectious [27]. Thus, an SIS model with heterogeneous infectiousness is perhaps more appropriate. However, heterogeneity in infectiousness alone does not affect the estimate of the reproduction number (Section 4.2).
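The arithmetic behind these hospital bounds can be reproduced in a few lines. The sketch below uses only the figures quoted above. Two things are assumptions made here: the additional early-onset group is taken as exactly 60 patients, and the extremes of the admission-diagnosis range are paired with the extremes of the prevalence range to form the widest interval.

```python
def r0_upper_bound(imported_share_lower, prevalence_upper):
    """Upper bound on R0 = (1 - pi) / (1 - I*): R0 falls as the imported share pi
    rises and rises with prevalence I*, so use a lower bound on pi and an upper
    bound on I*."""
    return (1.0 - imported_share_lower) / (1.0 - prevalence_upper)


# Single-hospital study: 184 colonised at admission, about 60 with onset within
# 72 hours of admission, and 240 acquired after admission.
imported_share = (184 + 60) / (184 + 60 + 240)
bound_all_patients = r0_upper_bound(imported_share, 528 / 5422)
bound_after_exclusions = r0_upper_bound(imported_share, 424 / 4143)

# US admissions: 20-34% with C. difficile infection as principal diagnosis,
# against a reported colonisation prevalence range of 4-29%.
bound_low = r0_upper_bound(0.34, 0.04)
bound_high = r0_upper_bound(0.20, 0.29)

print(round(imported_share, 3))
print(bound_all_patients, bound_after_exclusions)
print(bound_low, bound_high)
```

Under these assumptions both single-hospital bounds come out close to 0.55, and the paired extremes give roughly 0.69 and 1.1.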
Factors affecting susceptibility to colonisation exist and adjusting for these will increase our estimate of the reproduction number (Section 4.1), but this is unfortunately beyond the scope of this case study. However, our simple estimates of $R_0$ are in agreement with more sophisticated models of C. difficile transmission in hospitals that have found that the reproduction number is likely to be less than one in many or most hospital settings [33, 34].

There are many strains and types of C. difficile and it has been suggested that certain strains or types, such as NAP1/RT027, are particularly hospital-adapted [35, 36]. It is possible that these strains have significantly higher reproduction numbers in the hospital than we have estimated above and thus may be self-sustaining in hospitals. Unfortunately, we do not have strain-level or type-level data for all strains or types. However, the article used to calculate our first estimate of the reproduction number reports the proportion of infections and colonisations typed as NAP1/RT027 [27]. As the authors did not type all isolates, we assume that un-typed isolates were equally likely to be NAP1/RT027 as the isolates from similar patients that were typed, and that the proportion of NAP1/RT027 infections in those with onset <72h after admission (not reported) was similar to that in patients with colonisation at admission (13%). Under these assumptions, approximately 32 out of 150 (21%) colonisations or infections with NAP1/RT027 were present at admission. Of the approximately 10% of patients that were colonised or infected for some part of their hospital stay, approximately 3% were with NAP1/RT027 and the remaining 7% were with other types. Though colonisation with non-toxigenic strains appears to be protective against infection with toxigenic strains [37], we do not have good information about the interaction of C. difficile types.
Nevertheless, we can use the argument we presented in Section 5 to bound the invasion reproduction number. This becomes
$$\frac{1 - \text{Proportion of NAP1 colonisations and infections acquired prior to admission}}{1 - \dfrac{\text{Prevalence of NAP1 colonisation and infection}}{\text{Proportion C. difficile negative} + \text{Prevalence of NAP1 colonisation and infection}}} = \frac{1 - 0.21}{1 - \dfrac{0.03}{0.9 + 0.03}} \approx 0.8.$$
This suggests that even if other strains were eliminated and NAP1/RT027 did not compete for hosts, the continual importation of colonised and infected individuals would be required to sustain endemic disease in the study hospital. If we perform the same analysis for the pooled non-NAP1/RT027 strains in the study (approximately 212 of 334 colonisations and infections were present on admission), the equivalent upper bounds for the invasion reproduction number and basic reproduction number are both approximately 0.4. Therefore it appears that NAP1/RT027, though importation-driven, was better adapted for transmission in the study hospital than other strains. Carriage of C. difficile in the general adult population is less common than in hospitals or aged-care facilities, with reported prevalence in the range 0-15%, though ≲ 5% is perhaps most typical [32]. C. difficile is also commonly found colonising pets and livestock, while C. difficile spores are frequently isolated on meat, fresh produce and in water [21]. Crucially, there is significant overlap in strains observed in human and non-human sources [35]. However, the proportion of human cases that are acquired from a non-human reservoir is unknown. Consequently, we cannot use our methods to estimate the reproduction number, but we can calculate the reservoir-driven threshold. If it is reasonable to suspect that reservoir exposure accounts for a proportion equal to or exceeding the threshold, then C. difficile may be sustained in the human population by exposure to animal reservoirs.
If we begin with a homogeneous SIS model with reservoir exposure, then our estimate of the reservoir-driven threshold is simply the prevalence in the community, which is typically ≲ 5% for adults (Section 2). Given the ubiquity of non-human exposure it is plausible that reservoir exposure exceeds this very low threshold. Some individuals will have higher exposure to these reservoirs (depending on diet and lifestyle factors), but this alone will not affect the reservoir-driven threshold unless those with greater exposure are also more (or less) infectious (Section 4.3). If we allow for heterogeneous infectiousness of those with and without symptoms, or with and without recent antimicrobial exposure, this also does not affect the reservoir-driven threshold in isolation (Section 4.2). However, communities are not homogeneous with regard to C. difficile colonisation risk, as demonstrated by the higher rates of colonisation and infection in hospitals and aged-care facilities and the very high colonisation rates amongst infants. Accounting for this heterogeneity will increase our estimate of the reservoir-driven threshold (Section 4.1). If we split our population into four risk categories, namely (A) hospitalised patients, (B) aged-care residents, (C) infants under 12 months and (D) the rest of the population, we can begin to account for some of this heterogeneity. If we assume separable mixing with heterogeneous susceptibility and infectious period, we need only the prevalence in each group and the proportion of the population that is in each group to estimate the reservoir-driven threshold (equation 7). The reported range of colonisation prevalence in each of these groups is (A) 0-29%, (B) 0-51%, (C) 18-90% and (D) 0-15% respectively [32], while the total proportion of the population in each of these groups in a developed country like Australia is (A) <0.5% [38], (B) <1% [39], (C) <1.5% [40] and (D) >97% respectively.
If we use the upper end of the prevalence range for each risk group, though only 16.6% of the population is colonised, the reservoir-driven threshold is 48.0%. Assuming a lower colonisation prevalence in the majority population (D) decreases overall prevalence but increases heterogeneity and can increase the reservoir-driven threshold. If only 1% of the healthy adult population is colonised, then overall prevalence is 3.0% but the reservoir-driven threshold is much higher at 81.1%. These extreme values taken from across the literature are not typical and are unlikely to coincide in a single population. If we consider more typical values of colonisation prevalence, the picture is quite different. With prevalence half of the maximum reported values (i.e. (A) 14.5%, (B) 25.5%, (C) 45% and (D) 7.5%), which is still probably much higher than typical for infants in particular [41], the reservoir-driven threshold is only 13.0%. The reservoir-driven threshold is lower still if prevalence is lower in any of the high-risk minority groups (A-C). Figure 3 explores the effect of different prevalence assumptions on the reservoir-driven threshold. This model and the resulting estimate of the reservoir-driven threshold are of course very rough. Transmission is not well mixed between or within the four risk categories. Furthermore, the pathogen's interactions with medications, gut flora and host immunity lead to greater complexity than can be captured with a simple SIS model. The risk categories of individuals change over time as patients age or move in and out of hospitals, and so a multi-patch model with age structure would provide better estimates. Nevertheless, this very simple calculation serves as a back-of-the-envelope estimate for the plausible range of the reservoir-driven threshold, demonstrating that under a range of reasonable assumptions a relatively small amount of transmission from animals could be sustaining endemic disease in human populations.
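The three thresholds quoted above can be reproduced with a few lines of code. Equation 7 is not reproduced in this section, so the sketch below assumes the form that a separable-mixing SIS model with homogeneous infectiousness gives at equilibrium: the threshold is one minus the ratio of overall prevalence to the population-weighted mean odds of colonisation. The population shares are taken as 0.5%, 1%, 1.5% and 97%, and the function and variable names are illustrative only.

```python
def reservoir_driven_threshold(shares, prevalences):
    """Return (overall prevalence, reservoir-driven threshold).

    Assumed form (separable mixing, heterogeneous susceptibility and
    infectious period, homogeneous infectiousness):
        RDT = 1 - sum(n_i * P_i) / sum(n_i * P_i / (1 - P_i))
    """
    if len(shares) != len(prevalences):
        raise ValueError("one prevalence is needed per population share")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    if any(not 0.0 <= p < 1.0 for p in prevalences):
        raise ValueError("prevalences must lie in [0, 1)")

    overall = sum(n * p for n, p in zip(shares, prevalences))
    mean_odds = sum(n * p / (1.0 - p) for n, p in zip(shares, prevalences))
    if mean_odds == 0.0:
        raise ValueError("threshold is undefined when nobody is colonised")
    return overall, 1.0 - overall / mean_odds


# (A) hospitalised, (B) aged care, (C) infants, (D) rest of the population
SHARES = (0.005, 0.01, 0.015, 0.97)

SCENARIOS = {
    "upper end of each range": (0.29, 0.51, 0.90, 0.15),
    "upper end, 1% in group D": (0.29, 0.51, 0.90, 0.01),
    "half of the maximum values": (0.145, 0.255, 0.45, 0.075),
}

for label, prevalences in SCENARIOS.items():
    overall, rdt = reservoir_driven_threshold(SHARES, prevalences)
    print(f"{label}: prevalence {overall:.1%}, threshold {rdt:.1%}")
```

With a single homogeneous group the expression collapses to the prevalence itself, which matches the simple rule for the homogeneous model.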
Our simple calculations with figures from the middle of the reported prevalence range agree with a detailed model of hospitals and communities that found the reservoir-driven threshold was between 3.5% and 26.0% for a wide range of plausible assumptions. There are many strains or types of C. difficile that circulate in human populations and the arguments set out in Section 5 can be used to determine whether individual strains or types are reservoir-driven. It could be the case that some strains are sustained by exposure to animals, while other strains, though also present in animal populations, are sufficiently transmissible between humans to persist without transmission from animals. C. difficile PCR ribotype 078 (RT078) is a particularly good candidate to consider as a reservoir-driven strain. Though it is not known what proportion of human RT078 cases can be attributed to transmission from an animal source, whole-genome sequencing of isolates of this strain from livestock and humans strongly suggests frequent transmission between these groups [42]. On the other hand, NAP1/RT027, which is found in livestock but appears to be more transmissible between people than other strains, might have some human cases attributable to animal sources but is less likely to be animal-driven [43]. Finally, RT001, which accounts for many human infections in European settings, appears to be uncommon in livestock [43]. We have outlined the theory and application of very simple rules to estimate reproduction numbers in the presence of reservoir exposure or imported cases. The rules require minimal information about the population and the pathogen of interest and could be a useful starting point or alternative to more complex models tailored to a population or pathogen. Churcher et al. have developed a statistical test using branching process theory to infer whether the reproduction number is below 1 in a population nearing disease elimination but with many imported cases [9]. Cauchemez et
al. use a similar approach that accounts for incomplete case detection and the overrepresentation of larger outbreaks to estimate the reproduction number for emerging zoonoses [5]. However, their models assume almost all the population is susceptible and so are not suitable for situations where the prevalence of infection or immunity is far from zero. Moreover, the latter method assumes that the reproduction number is less than one, so it is not appropriate in settings where there is genuine uncertainty as to whether the reproduction number is above or below one [5]. Our model accounts for susceptible depletion and works for infections where the reproduction number is above or below one, but relies on estimates of prevalence to do so. This can pose a potential difficulty as incidence rather than prevalence is usually reported. Reliable estimates of prevalence require either near-perfect case acquisition or surveys with large sample sizes, especially when prevalence is low. Indeed, a good deal of the variability in colonisation prevalence reported for C. difficile outside hospitals might be attributed to the relatively small sample sizes involved [32]. Some caution is required when using the reservoir-driven and importation-driven thresholds. It does not follow that if a disease is reservoir-driven or importation-driven, then interventions targeting the external source and transmission from the external source will be most effective or 'best'. The 'best' control strategy will depend on the relative effort required to prevent each kind of exposure, the impact of these interventions and the metric used to compare these.
If it is equally feasible and desirable to eliminate all (or most) exposure from either source, eliminating transmission from the reservoir or importation is clearly the better choice for a reservoir-driven or importation-driven disease, as this will prevent all local human cases, while preventing all person-to-person transmission will prevent only the proportion of human cases spread locally by humans. However, if only modest reductions are feasible, then targeting local human transmission may be more effective. One can calculate the normalised derivatives of equilibrium prevalence to estimate the reduction in prevalence achieved by a small reduction in person-to-person transmission or exposure to the external source. For example, in the homogeneous SIS model with reservoir exposure, a greater reduction in prevalence is achieved by reducing person-to-person transmission whenever less than half of cases are acquired from the reservoir (footnote 1). This is true whether or not the disease is reservoir-driven. A similar rule can be derived for the SIS model with imported cases. The major limitation of our method is the assumption that the disease and population are at equilibrium. Many diseases, including our case study disease C. difficile, exhibit seasonal variation [44]. It is possible that an infection is sufficiently transmissible to be locally sustained in high-transmission seasons, but reservoir-driven or importation-driven in low-transmission seasons [9]. Similarly, it is possible that exposure to the reservoir is seasonal [45]. It is possible that an epidemic in one setting is driven by exposure to a population or reservoir where an epidemic is ongoing. Our model does not account for these kinds of temporal variability when estimating reproduction numbers and reservoir-driven thresholds.
Footnote 1: For this simple model, the normalised derivative of equilibrium prevalence with respect to the person-to-person transmission rate exceeds the normalised derivative with respect to the reservoir exposure rate whenever the proportion of cases acquired from the reservoir is less than 1/2.
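The rule quoted in this discussion, that reducing person-to-person transmission gives the larger reduction in prevalence whenever less than half of cases are acquired from the reservoir, can be checked with a short derivation. The sketch below assumes one common way of writing the homogeneous SIS model with reservoir exposure; the symbols $\beta$ (person-to-person transmission rate), $f$ (reservoir exposure rate), $\gamma$ (recovery rate), $I^*$ (equilibrium prevalence) and $p$ (proportion of cases acquired from the reservoir) are notation chosen here and may differ from the paper's own.

```latex
% Assumed model: dI/dt = (\beta I + f)(1 - I) - \gamma I, with equilibrium I^*.
\begin{align*}
  0 &= (\beta I^* + f)(1 - I^*) - \gamma I^*
     && \text{(equilibrium condition)} \\
  \frac{\partial I^*}{\partial \beta} &= \frac{I^*(1 - I^*)}{\beta I^* + f/I^*},
  \qquad
  \frac{\partial I^*}{\partial f} = \frac{1 - I^*}{\beta I^* + f/I^*}
     && \text{(implicit differentiation)} \\
  \frac{\beta}{I^*}\frac{\partial I^*}{\partial \beta}
     &= \frac{\beta I^* (1 - I^*)}{\beta I^{*2} + f},
  \qquad
  \frac{f}{I^*}\frac{\partial I^*}{\partial f}
     = \frac{f (1 - I^*)}{\beta I^{*2} + f}
     && \text{(normalised derivatives)} \\
  \frac{\beta}{I^*}\frac{\partial I^*}{\partial \beta}
     &> \frac{f}{I^*}\frac{\partial I^*}{\partial f}
  \iff \beta I^* > f
  \iff p = \frac{f}{\beta I^* + f} < \frac{1}{2}.
\end{align*}
```

The second line uses the equilibrium condition to eliminate $\gamma$, which is why the recovery rate does not appear in the final comparison. The ratio of the two normalised derivatives is $\beta I^*/f$, the ratio of locally acquired to reservoir-acquired infections.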
The simplicity, minimal data requirements, generality and extensibility of the method we have presented here make it a useful starting point for understanding the impact and interaction of transmission sources both internal and external to a population.
AM is supported by an Australian Government Research Training Program Scholarship. We thank Laith Yakob for suggested reading that proved valuable for framing our analysis of models with imported cases.
Figure 1 The reservoir-driven threshold (RDT), the minimum proportion of transmission attributable to the reservoir above which the basic reproduction number is <1, as a function of disease prevalence. Each curve indicates the RDT for different population heterogeneity assumptions for infectiousness and for the product of susceptibility and infectious period. The RDT for a homogeneous population is equal to the disease prevalence (black line). Heterogeneous infectiousness alone does not change the RDT (black line). The RDT is higher if the product of susceptibility and infectious period is heterogeneous and infectiousness is homogeneous (solid curves). The size of the effect increases with increasing heterogeneity (green and blue curves). Heterogeneity in infectiousness interacts with heterogeneity in the product of susceptibility and infectious period, further increasing the RDT if the two are proportional (dashed curves) but decreasing the RDT if they are inversely proportional (black line).
Figure 2 The reservoir-driven threshold (RDT) for different assumptions for heterogeneity of reservoir exposure and person-to-person transmission across the population. The RDT for a homogeneous population is equal to the disease prevalence (black line). The RDT does not change if only reservoir exposure or only person-to-person transmission is heterogeneous (black line). The RDT is lower if both are heterogeneous and proportional to one another (dashed curves). The RDT is higher if one decreases as the other increases (solid curves). The size of the effect increases with increasing heterogeneity (green and blue curves).
Figure 3 Estimates of the reservoir-driven threshold for C. difficile in human populations and its dependence on the prevalence of each of four risk groups.
In each subfigure, the prevalence in one risk group is varied across the reported range [32] (x-axes) while the other three prevalences are fixed at the values indicated by the vertical lines in the other subfigures. We consider two scenarios: one where each of the fixed prevalences is assumed to be in the middle of the reported range (solid lines and curves); the other the same except the prevalence in infants is only 25% (dotted lines and curves). We assume that 0.5%, 1%, 1.5% and 97% of the population are in the hospital, aged-care, infant and 'other' risk groups respectively.
[1] Forecast and control of epidemics in a globalized world
[2] The effect of household distribution on transmission and control of highly infectious diseases
[3] Multiscale mobility networks and the spatial spreading of infectious diseases
[4] A general model for stochastic SIR epidemics with two levels of mixing
[5] Using routine surveillance data to estimate the epidemic potential of emerging zoonoses: application to the emergence of US swine origin influenza A H3N2v virus
[6] Distinguishing Between Reservoir Exposure and Human-to-Human Transmission for Emerging Pathogens Using Case Onset Data
[7] Effective reproduction numbers are commonly overestimated early in a disease outbreak
[8] How absolute is zero? An evaluation of historical and current definitions of malaria elimination
[9] Measuring the path toward malaria elimination. Science
[10] Clusters of Human Infection and Human-to-Human Transmission of Avian Influenza A(H7N9) Virus
[11] Novel Swine-Origin Influenza A (H1N1) Virus Investigation Team et al. 2009 Emergence of a novel swine-origin influenza A (H1N1) virus in humans
[12] Middle East respiratory syndrome
[13] The construction of next-generation matrices for compartmental epidemic models
[14] Superspreading and the effect of individual variation on disease emergence
[15] On the definition and the computation of the basic reproduction ratio R0 in models for infectious diseases in heterogeneous populations
[16] The dynamics of cocirculating influenza strains conferring partial cross-immunity
[17] The effect of cross-immunity and seasonal forcing in a multi-strain epidemic model. Phys. D Nonlinear Phenom.
[18] Incorporating demographic stochasticity into multi-strain epidemic models: application to influenza A
[19] The effect of antibody-dependent enhancement on the transmission dynamics and persistence of multiple-strain pathogens
[20] Five challenges in modelling interacting strain dynamics
[21] Clostridium difficile infection in the community: a zoonotic disease?
[22] Treatment of asymptomatic Clostridium difficile carriers (fecal excretors) with vancomycin or metronidazole. A randomized, placebo-controlled trial
[23] Interaction between the intestinal microbiota and host in Clostridium difficile colonization resistance
[24] Clostridium difficile Toxins: Mechanism of Action and Role in Disease
[25] The host immune response to Clostridium difficile
[26] Serum Antibody Response to Toxins A and B of Clostridium difficile
[27] Host and Pathogen Factors for Clostridium difficile Infection and Colonization
[28] Asymptomatic Carriers Are a Potential Source for Transmission of Epidemic and Nonepidemic Clostridium difficile Strains among Long-Term Care Facility Residents
[29] Antibiotic Treatment of Clostridium difficile Carrier Mice Triggers a Supershedder State, Spore-Mediated Transmission, and Severe Disease in Immunocompromised Hosts
[30] The epidemiology of Clostridium difficile infection inside and outside health care institutions
[31] Healthcare Cost and Utilization Project (HCUP)
[32] Asymptomatic Clostridium difficile colonization: epidemiology and clinical implications
[33] Healthcare-Associated Clostridium difficile Infections are Sustained by Disease from the Community
[34] Epidemiological Model for Clostridium difficile Transmission in Healthcare Settings
[35] Diversity and Evolution in the Genome of Clostridium difficile
[36] Enhanced surveillance of Clostridium difficile infection occurring outside hospital
[37] Administration of spores of nontoxigenic Clostridium difficile strain M3 for prevention of recurrent C. difficile infection: a randomized clinical trial
[38] Healthcare Activities: Hospital Beds
[39] Residential aged care in Australia 2010-11: A statistical overview
[40] Australian Demographic Statistics, 'TABLE 59. Estimated Resident Population By Single Year Of Age, Australia', time series spreadsheet
[41] Longitudinal Investigation of Carriage Rates, Counts, and Genotypes of Toxigenic Clostridium difficile in Early Infancy
[42] Zoonotic Transfer of Clostridium difficile Harboring Antimicrobial Resistance between Farm Animals and Humans
[43] Clostridium difficile in foods and animals: history and measures to reduce exposure
[44] Clostridium difficile Infection Seasonality: Patterns across Hemispheres and Continents - A Systematic Review
[45] Possible Seasonality of Clostridium difficile in Retail Meat