key: cord-0998329-xum44wt7 authors: Meehan, Michael T.; Cope, Robert C.; McBryde, Emma S. title: On the probability of strain invasion in endemic settings: Accounting for individual heterogeneity and control in multi-strain dynamics date: 2020-02-21 journal: J Theor Biol DOI: 10.1016/j.jtbi.2019.110109 sha: da4d9fd9673c937311cc269e5851ac2cc3100a4f doc_id: 998329 cord_uid: xum44wt7 Pathogen evolution is an imminent threat to global health that has warranted, and duly received, considerable attention within the medical, microbiological and modelling communities. Outbreaks of new pathogens are often ignited by the emergence and transmission of mutant variants descended from wild-type strains circulating in the community. In this work we investigate the stochastic dynamics of the emergence of a novel disease strain, introduced into a population in which it must compete with an existing endemic strain. In analogy with past work on single-strain epidemic outbreaks, we apply a branching process approximation to calculate the probability that the new strain becomes established. As expected, a critical determinant of the survival prospects of any invading strain is the magnitude of its reproduction number relative to that of the background endemic strain. Whilst in most circumstances this ratio must exceed unity in order for invasion to be viable, we show that differential control scenarios can lead to less-fit novel strains invading populations hosting a fitter endemic one. This analysis and the accompanying findings will inform our understanding of the mechanisms that have led to past instances of successful strain invasion, and provide valuable lessons for thwarting future drug-resistant strain incursions. Anti-microbial drug resistance (AMR) currently presents one of the most significant challenges to public health worldwide ( Levy and Marshall, 2004 ) . The incidence of cases infected with non-wild-type variants of pathogenic diseases continues to rise with endemic levels already witnessed in several regions ( McKenna, 2013; Laxminarayan et al., 2013 ) . Exacerbating the problem is the growing misuse and overuse of antimicrobial agents which, rather than suppressing pathogen strain diversity, often acts to accelerate the spread and diversification of mutant variants ( Barbosa and Levy, 20 0 0; Neu, 1992 ) . More broadly, the emergence of new pathogen strains that have evolved to escape natural immunity within a population pose substantial risks to public health. Alarmingly, in many settings, direct transmission of these mutant pathogens is believed to have become the primary source of incident cases ( Luciani et al., 2009; others, 2018 ) suggesting that intervention strategies targeting the transmission and treatment of wild-type strains alone will likely fail. Even with * Corresponding author. E-mail address: Michael.Meehan1@jcu.edu.au (M.T. Meehan) . 1 These authors contributed equally to this work. recent increases in funding ( Simpkin et al., 2017 ) , the ongoing arms race between pathogen evolution and the development of novel antibiotics is still being won by the former. Consequently, we are now faced with a rapidly diminishing arsenal of effective therapies -one which may soon prove inadequate ( Hancock, 2007 ) . The evolution of infectious pathogens and the emergence of mutant variants are fundamentally driven by random processes -both at the within-host and population levels. Not only do mutant strains arise as a result of random microbiological processes ( Alekshun and Levy, 2007 ) , but phenotypically distinct variants initially appear in small numbers of hosts and must therefore escape a treacherous stochastic regime (i.e., avoid extinction) in order to establish themselves within the community. Often furthering this challenge is the presence of a resident endemic (possibly ancestral) strain that deprives the newly-emergent mutant of susceptible hosts (e.g., through cross-immunity). In spite of these challenges, countless mutant pathogens -often exhibiting varying levels of drug-resistance -have flourished and been able to firmly entrench themselves among host populations ( Chang et al., 2015 ) . Previous modelling investigations of pathogen ecology and diversity have often assumed that the multi-strain dynamics are well established ( Blanquart et al., 2016; Cobey et al., 2017; Colijn et al., 2010; Lehtinan et al., 2017; Meehan et al., 2018 ) ; whilst those studies that have specifically modelled the emergence of https://doi.org/10.1016/j.jtbi.2019.110109 0022-5193/© 2019 Elsevier Ltd. All rights reserved. mutant variants, have predominantly done so deterministically (see e.g., ( Cohen and Murray, 2004; Blower and Chou, 2004; Tanaka and Francis, 2006; D'Agata et al., 2009 ) ). The goal of this paper is to analyze the early-time dynamics of novel strain emergence in a stochastic setting whilst accounting for the background population dynamics associated with endemic infection. In particular, we investigate the conditions under which invasion by a novel strain is possible and calculate the probability of this event occurring, depending on the relative fitness (i.e., reproductive capacity) of the two strains, and the relevant characteristics of the host population (i.e., the heterogeneity in individual reproductive capacity). We then analyze the effects of control strategies and timing on the emergence of novel strains and discuss the implications of differential control efficacy when the invading strain boasts some level of resistance to treatment. The results of this investigation will extend analogous findings in single-strain settings and elucidate the consequences of different control mechanisms/timings in multistrain systems -particularly the impact of differential control. Moreover the analytical framework that we develop will provide a flexible and robust theoretical basis for future modelling work. Before turning to the multi-strain scenario, we first review the analogous computation in an infection-naïve setting -for which several classical results are available ( Griffiths, 1973; Britton, 2010; Diekmann et al., 2012 ) : the probability of an epidemic outbreak when an infectious disease is introduced into an entirely susceptible host population. To begin, when the susceptible population is sufficiently large and well mixed, it is common to approximate early epidemic behavior by a branching process ( Griffiths, 1973; Britton, 2010; Diekmann et al., 2012 ) , as infected individuals are likely to have contacts only with susceptible individuals. In this case the depletion of susceptible individuals due to infection is negligible such that (in an infection-naïve population) the susceptible fraction remains approximately equal to one. Thus, to determine the epidemic trajectory we need only track the number of infected individuals and the rate at which they produce secondary cases, or offspring. This framework allows for tractable computations of the probability an outbreak becomes extinct in its early stages, using branching process theory. When our goal is solely to determine the probability of extinction or establishment of the outbreak, we need only consider the infection process in terms of discrete, non-overlapping generations. If the duration of infectiousness is constant, the number of secondary cases, or offspring, Z , generated by each member of the infectious population during their infectious lifetime (i.e., one generation) is a Poisson random variable with rate parameter R 0 , where R 0 is the basic reproduction number: Z ~Poisson( R 0 ). In this case, the probability that an epidemic outbreak, that begins with a single infectious individual, goes extinct, q , is given by the smallest root of the equation (see e.g., ( Diekmann et al., 2012 ) ): (1) For R 0 ≤ 1 the unique root of Eq. (1) is q = 1 and extinction is guaranteed. For R 0 > 1 a second solution q < 1 emerges and an epidemic outbreak is possible. This outcome becomes increasingly likely as R 0 increases. A significant recent contribution to this theory is the work by Lloyd-Smith et al. ( Lloyd-Smith et al., 2005 ) in which the authors highlighted the importance of heterogeneity in individual infectiousness among members of the host population and its role in determining the fate of an infectious disease outbreak. By analyzing secondary case counts from several historical infectious disease outbreaks, including the 2003 outbreak of severe acute res-piratory syndrome (SARS) in Singapore, the authors demonstrated that in order to accurately reconstruct observed epidemics, models must account for individual heterogeneity in reproductive potential. In particular, they found that observed secondary case counts follow highly skewed distributions where outbreaks are driven by a small subset of so-called 'super-spreaders': individuals that generate a disproportionately large number of secondary cases, and that the large majority of infected individuals lead to no, or few, secondary infections. As a consequence of this disparity, the probability of extinction increases and epidemics become rarer (but more explosive), as heterogeneity increases. Importantly, their findings show that previous analyses that rely solely on population mean parameters -such as the basic reproduction number R 0 -to model outbreak dynamics inadequately capture the true stochastic nature of disease spread, because the prospects of an epidemic outbreak depend critically on the distribution of infection potential within the host population. This work has since been extended in Yates et al. (2006) using a multi-type branching process framework to incorporate other forms of heterogeneity including susceptibility and assortative mixing leading to similar conclusions. To model the infectious cohort as a heterogeneous mixture of infected individuals, the authors in Lloyd-Smith et al. (2005) assigned each member of the infectious population a random reproductive capacity ν drawn from a Gamma probability distribution with population mean R 0 = E (ν) and dispersion (i.e. shape) parameter k . As such, the probability density function for ν was defined as Here, smaller values of k indicate higher levels of heterogeneity in the infected population and a greater propensity for superspreading behaviour. Therefore, whilst this framework assumes a well-mixed host population whereby any two individuals can come into contact, this mixing is heterogeneous, in that some individuals will make more contacts throughout their infectious lifetime. The variation in reproductive capacity can be interpreted as representing some combination of, for example, variation in duration of infectiousness, differences in social contacts or behaviour between individuals, or higher transmissibility among children as compared to adults. Being able to represent this heterogeneity without needing to explicitly construct social contact networks or multi-class processes provides a flexible and tractable means of exploring a range of scenarios. More importantly, the Gamma distribution specifically, in conjunction with the branching process approximation, was shown in Lloyd-Smith et al. (2005) to accurately reproduce the secondary case counts of several observed epidemic outbreaks. For these reasons we adopt this modelling approach in our investigation. When the reproductive potential of each infectious individual, ν, is Gamma distributed, the total number of offspring generated by each infectious cohort, Z , follows a Negative Binomial distribution with parameters p and k: Z ~NegBin( p, k ), (1) generalizes to ( Lloyd-Smith et al., 2005 ) In the limit k → ∞ , that is, for a homogeneous population with ν = R 0 , this equation reduces to the familiar form (1) , and 2 The Negative Binomial distribution arises as the composition of the Gamma distribution for individual reproductive potential ν, p ν ( x ), and the Poisson distribution for secondary cases Z , given ν, p Z ( z | ν), i.e., for k = 1 (i.e., the typical continuous-time Markov chain with Exponentially-distributed infectious periods and inter-arrival times between infection events), we have the simple expression q = 1 R 0 . Once again, for R 0 < 1 the smallest root of Eq. (3) is q = 1 and extinction is guaranteed. For R 0 > 1 the probability of extinction decreases monotonically (i.e., an epidemic becomes more likely) both with increasing R 0 and k . Therefore, as highlighted in Lloyd-Smith et al. (2005) , higher levels of host heterogeneity (i.e., smaller k ) lead to a higher probability of epidemic extinction. Now we consider the multi-strain scenario, in which a new invading pathogen strain i is introduced into a population that already hosts a resident endemic strain r . We apply a branching process approximation to calculate the probability that strain i establishes itself within the community, i.e., avoids extinction, but note that the approximation used must necessarily account for the impact of the resident strain on the capacity of the invader to infect individuals. We note that Leventhal et al. (2015) take a similar branching process-based approach, but consider network structure explicitly, rather than allowing contact between any pairs of individuals as is the case here, i.e., we consider a well-mixed population. For simplicity, we assume that the background infection is in some quasi-stationary distribution ( sensu Bartlett (1957) ). We also assume that the population is large, so that given the presence of the resident strain r , there remain sufficient susceptible individuals that the introduction of the invading strain i can be approximated by a branching process, and that early in the outbreak of i , the fraction of the population that are susceptible does not change substantially. Analysis is possible when these assumptions are broken, with results similar to those presented here; however these scenarios are often substantially more complex, and do not cover the full range of intervention scenarios considered here. For example, Hartfield and Alizon (2014) consider a scenario in which the resident strain is not in equilibrium, resulting in a more complex branching process approximation, and Humplik et al. (2014) evaluate establishment probability in a simple SIS model, but with finite N explicitly without a branching process approximation. More broadly, we emphasise that invasion of a second disease against an endemic strain in a fully-mixed population is a setting that has been considered several times previously, with early results in deterministic systems producing similar outcomes to those presented here (see e.g., Levin and Pimentel, 1981 ) . One difference between this scenario and the one considered in the previous section is that, assuming perfect cross-immunity, the presence of a resident strain r diminishes the pool of available susceptibles presented to the invading strain i , thus impeding its epidemic potential. Specifically, in our multi-strain scenario, infectious individuals come into contact with (other) individuals according to a Poisson process in the normal way, but rather than those individuals necessarily being susceptible (as in the naïve case), they are only susceptible with probability (approximately) S / (N − 1) , where S is the mean endemic susceptible population, i.e., S = N/R r 0 , and N is the population size. (For convenience, we note that N ≈ (N − 1) when N is large, and thus discard the −1 hereafter. Thus, S /N is the susceptible fraction of the population.) Consequently, given the presence of the endemic strain, we find that the effective reproduction number, R i eff , of the invading strain i at the time of its introduction is rescaled accordingly: (4) Comparison of the probability of extinction, q , for a new invading strain entering an endemically infected population under a uniform partial (solid) and polarized (dot-dashed) control policy as a function of the level of control c and varying dispersion parameter k . Here, we have set R i 0 = 5 and R r 0 = 5 / 3 which gives a critical control level c crit = 1 − R r 0 /R i 0 = 2 / 3 . In this figure we have assumed that control has been implemented coincident with the emergence of the invading strain i and that it is equally effective against both strains. R i 0 and R r 0 are the mean basic reproduction numbers of the invader and resident strain, respectively, as measured in the absence of the other. Note that this does not rely on a specific formulation around other disease dynamics, e.g., the presence of a recovered class and how waning immunity or replenishment of susceptibles occurs. To calculate the extinction probability of the invading strain in this case we need only replace R 0 in Eq. (3) with R i eff = R i 0 /R r 0 , and solve numerically. The results are shown in Fig. 1 where we effectively reproduce Fig. 2 b) of Lloyd-Smith et al. (2005) with the x -axis rescaled from R 0 → R i eff . Accordingly we now find that the epidemic threshold condition becomes R i eff > 1 such that extinction of the invading strain is guaranteed for R i 0 < R r 0 . Once again, we observe that the extinction probability decreases monotonically with increasing R i eff and, as in Lloyd-Smith et al. (2005) , that increasing individual heterogeneity (decreasing k ) favours extinction. To verify the utility of this branching process approximation for strain invasion against an endemic background, and test the conditions under which it can be applied with confidence, we performed a simulation study. Full details appear in Appendix A. Exact simulations of a SIRS-type model with two strains and heterogeneity in individual reproductive capacity identical to that considered here were produced across a range of population sizes and parameter choices. The simulation results demonstrated that the branching process approximation was accurate for sufficiently high population sizes (e.g., N ≥ 500), and that this accuracy was robust to choices of k , R r 0 , and R i 0 . We now consider the impact of control on the invasion (or extinction) probability of a new strain entering an endemic population focussing separately on several key factors: first, the mechanism by which control is implemented and its effect on the distribution of reproductive potentials ν among members of the infected population; second, the timing of the introduction of control and the implications for the background population dynamics; and, finally, the relative efficacy of intervention measures when applied to the resident strain r and the invading strain i , which we anticipate enjoys some level of resistance to control (e.g., drug resistance). From the description above we see that for the same level of control c , the mean effective reproduction number in the presence of control R i eff ,c = (1 − c) R i eff is the same in both cases. Importantly however, in general the polarized control policy acts to increase the spread (i.e. heterogeneity) in individual infectiousness, ν, among the infectious cohort, whilst uniform partial control reduces it. We emphasise that control here is something that impacts infectious individuals rather than susceptible individuals; specifically, under polarized control, the controlled individual becomes infected, but then produces zero offspring. This definition follows from our choice to model the outbreak as a branching process where we only track the infectious cohort. As the uniform partial control scenario scales the transmissibility of every individual equally, its impact on the emergence of the invader follows directly from the results in the previous section, and existing literature. In contrast, however, the polarised control scenario requires a modification to the branching process and its outputs therefore do not follow directly from preceding sections or existing literature. Assuming (for now) that the background susceptible population is not affected by the application of control, for the uniform partial and polarized control scenarios described above, the probabilities of epidemic extinction q u and q p are respectively determined according to the following equations: Uniform partial control : In the uniform partial control case all individuals are impacted equally and the extinction equation resembles Eq. (3) , only now the effective reproduction number R i eff has been appropriately scaled according to the level of control. In the polarized case, a fixed proportion of infected individuals c are guaranteed to produce no subsequent infections (i.e., those branches of the infection process become extinct) whilst the extinction probability of the remaining (1 − c) infected individuals remains unaltered. Since polarized control acts to increase population heterogeneity and uniform partial control reduces it, we expect that q p > q u for all which, in fact, can be proven analytically ( Lloyd-Smith et al., 2005 ) . For now, we assume that control measures are equally effective against both the resident strain r and the invading strain i and that they are implemented coincident with the emergence of the invading strain. We relax these assumptions in the upcoming sections. In Fig. 2 we compare the extinction probabilities of the invading strain i under the uniform partial (solid) and polarized (dotdashed) control policies discussed above where, for reference, we have set R i 0 = 5 and R r 0 = 5 / 3 such that R i eff = 3 . Similar to before, in the uniform partial control case the effect of adding control is to rescale the extinction curves presented in Fig. 1 For both control policies, we see that extinction of the invading strain is guaranteed for c ≥ c crit = 1 − 1 /R i eff and That is, the polarized control policy, for which the extinction probability q increases almost linearly with c for c < c crit , outperforms the uniform partial control policy, in terms of preventing the establishment of the invading strain. Note that applying control to the resident strain will cause the susceptible background to shift. How quickly this occurs depends upon how control is applied: if it were applied to all existing infected individuals immediately, the background would likely shift quite quickly, causing the branching process approximation for invader success to be inaccurate. If control is instead applied only to new infectious individuals, the susceptible background will shift more slowly, and the branching process approximation may remain relatively accurate. However, as control coincident with the introduction of the invading strain is somewhat unrealistic, we will not investigate this in more detail. Instead, in the next section we consider the more realistic scenario, where control precedes the introduction of the invader. In the previous section we assumed that control measures were implemented coincident with the emergence of the invading strain i . There it was understood that any intervention measures applied to the resident strain r will not have had sufficient time to suppress the prevalence of the endemic strain r and replenish the susceptible pool, S . We now consider an alternate (more realistic) scenario in which control has been implemented over the long-term to the endemic disease population prior to strain i 's emergence, i.e., at t < 0. In this case we assume that control has been applied sufficiently in advance that the population dynamics have had sufficient time to re-equilibrate. The major difference between the two scenarios described above is the number of susceptibles greeting the invading strain i : In this figure we have assumed that control has been implemented prior to the emergence of the invading strain i and that it is equally effective against both strains. controlling the resident strain r over the long term replenishes the susceptible pool, S and, if the level of control is sufficient, eventually the resident strain is eliminated and the whole population becomes susceptible, i.e., S(t = 0) → N. Therefore, in the prior control scenario the invading strain i has an increased effective reproduction number which saturates upon elimination of the resident strain: Note that this expression does not yet take into account the effect of control applied to the invading strain i itself; for that we need to substitute (6) into the extinction probability Eq. (5) . In doing so, we observe that for the uniform partial control case the factor (1 − c) cancels and the extinction probability q u is independent of the level of control c , at least until the resident strain becomes extinct at c > 1 − 1 /R r 0 . Below this threshold ( c < 1 − 1 /R r 0 ), the increase in the susceptible population awaiting strain i (as a result of controlling the resident strain r ) negates the effects of any intervention measures that are applied to the invading strain. Therefore, in this regime, uniform partial control is ineffective at preventing the invasion of the novel strain (though we should certainly keep in mind that such measures would still reduce the overall/combined disease prevalence, relative to a scenario with no control). A similar cancellation does not occur for the polarized control case and given that q p > q u we anticipate that control will always be effective in this case. A plot of the extinction probability is given in Fig. 3 where again we compare the extinction probabilities of the invading strain i under the uniform partial (solid) and polarized (dotdashed) control policies discussed above. For ease of comparison with Fig. 2 , we have once again set R i 0 = 5 and R r 0 = 5 / 3 . Immediately we observe that for c < 1 − 1 /R r 0 uniform partial control measures have no impact on the invasion prospects of strain i . Not until we enter the second regime, 1 − 1 /R r 0 ≤ c < 1 − 1 /R i 0 , for which the resident strain r has been eradicated prior to the emergence of the invading strain i , does increasing the level of control c in-crease the extinction probability of strain i . Conversely, polarized control is always effective for c > 0, and becomes increasingly so for c > 1 − 1 /R r 0 . We note that differences in the im pact of uniform partial versus polarised control observed here resemble those observed for a single strain being introduced into an infection-naive population, for which polarised control is also known to be more effective ( Gomes et al., 2014 ) . In the preceding analysis we assumed that the control measure being implemented is equally effective against both the resident strain r and the invading strain i . A more general scenario to consider is when the control measure that is implemented differentially impacts the two strains. Indeed, if the invading strain is a mutant variant of the wild-type resident strain it is possible that this new strain enjoys some level of resistance to active control measures. Therefore, in this section we generalize the approach taken above by assuming that for a fixed level of control c applied to the background resident or endemic strain r , only a fractional component αc -where 0 ≤ α ≤ 1 -inhibits the progress of the invading (possibly drug-resistant) strain. In this case the extinction Eqs. (5) given above generalize to Uniform partial control : where we previously observed that the effective reproduction number in the simultaneous and prior control scenarios are given respectively by Simultaneous control : Prior control : For the simultaneous control case (with R i eff = R i 0 /R r 0 ) the effect of differential control is to simply rescale the control factor c → αc . As a result, in this particular case, the extinction probabilities (for the invading strain) calculated using Eqs. (7) and (8) will be qualitatively the same as that presented in Fig. 2 , with the x -axis c reduced to αc . Therefore, in this section we concentrate only on the prior control scenario. If we substitute the expression for the effective reproduction number under the prior control scenario into (7) we obtain results that are both quantitatively and qualitatively distinct from those found previously (see Figs. 4 and 5 ). Of particular interest is the combination of uniform partial control implemented prior to the emergence of the invading strain in the sub-elimination regime c ≤ 1 − 1 /R r 0 (i.e., below the critical level required to eliminate strain r ). In this particular instance, the effective reproduction number in the presence of control becomes such that increasing c will actually increase the effective reproduction number of strain i and enhance its invasion prospects relative to the no control (c = 0) scenario. Similarly, although a closed analytical expression cannot be found, it is also possible to show that an analogous region of parameter space exists in which the application of polarized control prior to the emergence of the invading strain leads to an enhanced probability of invasion, relative to the no control scenario. taken k = 1 , R i 0 = 5 and R r 0 = 5 / 3 , and assumed that control has been implemented prior to the emergence of the invading strain. For comparison, we have also drawn the reference plane for the extinction probability in the absence of control, i.e., for c = 0 . (b) A schematic describing the meaning of the different regimes that occur. The dashed line represents the critical threshold of the resident strain, i.e., for control below this threshold, the resident may persist in the absence of the invader, for control above the threshold, the resident becomes extinct. The solid red line indicates the boundary of the region where control facilitates invader success relative to the control-free scenario. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) An important corollary to these results is that under the differential control scenario, invasion can occur even when the invading strain is less transmissible than the resident strain, i.e., when R i 0 < R r 0 it is still possible to obtain R i eff ,c > 1 . That is, while implementing control can have substantial positive impacts in terms of reducing the prevalence of a resident disease strain, it may facilitate the emergence of new, drug-resistant strains (that would not have emerged in the absence of control). This is a relatively unsurprising result that is consistent with similar outcomes in a range of different disease modelling scenarios e.g., ( Ballesteros et al., 2009; Colijn and Cohen, 2015 ) . To illustrate this finding we solve Eq. (7) using the prior control effective reproduction number given in (8) (see Figs. 4 and 5 ). As a specific example we have taken k = 1 (corresponding to the Exponentially-distributed infectious potential ν), but, we point out that the behaviour is qualitatively the same for different values of R i 0 = 5 and R r 0 = 5 / 3 and assumed that control has been implemented prior to the emergence of the invading strain. For comparison, we have also drawn the reference plane for the extinction probability in the absence of control, i.e., for c = 0 . (b) A schematic describing the meaning of the different regimes that occur. The dashed line represents the critical threshold of the resident strain, i.e., for control below this threshold, the resident may persist in the absence of the invader, for control above the threshold, the resident becomes extinct. The solid red line indicates the boundary of the region where control facilitates invader success relative to the control-free scenario. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) k -the difference being that the region for which control promotes invasion by the invading strain increases as k increases. We observe that the region for which control enhances the invasion prospects of strain i is larger for the uniform partial control case compared to the polarized control scenario ( Figs. 4 and 5 ) . This observation follows from our earlier remark that q p > q u meaning that invasion is always less successful in the polarized case. However, even in the polarized control case we still observe enhanced invasion probabilities provided the level of relative control α is sufficiently small, which in the event of multi-or even extensively-drug-resistant pathogens could well be the case. Finally, we reiterate that while control may in some cases facilitate the establishment of an invader (that may not have established in a control-free environment), the application of control always necessarily reduces overall disease prevalence. Moreover, note that, except in the case where the endemic disease has been eradicated due to control efforts, introduction in competition with an endemic background is always more challenging (from the infection's perspective) than introduction into a naïve population. Even in the extreme case in which the level of differential control α → 0 + , the effective reproduction number R i eff ,c < R i 0 . Hence the presence of an endemic strain always acts to protect the host population from strain invasion provided there is some level of cross-immunity. The number of phenotypically distinct strains of various pathogens circulating in the global population continues to rise, with new strains continually emerging. Due to the stochastic nature of disease transmission, the successful emergence of a new strain is not certain, even when that strain is more fit than any pre-existing strains circulating in the population. Understanding the relationships between relative strain transmissibility, population heterogeneity, and the impact of different control measures on strain emergence will enable more effective modelling of multistrain disease dynamics in the face of challenges such as AMR. In this article we advanced previous work on disease emergence in an infection-naïve population ( Lloyd-Smith et al., 2005 ) to address emergence of novel mutant strains competing against resident endemic strains. This is a simple but powerful theory that can inform multi-strain modelling projects going forward (e.g., Meehan et al. (2018) ; McBryde et al. (2017) ), including deterministic approximations, which can be applied once the outbreak has successfully established ( Rebuli et al., 2017 ) . We also demonstrated that the time at which control measures are implemented has a substantial impact on the effectiveness of control for preventing the emergence of new strains, and that this varied by the chosen means of control. Additionally, in the case of differential control, where the invading strain shows some level of resistance to the applied control measure, we found regions of parameter space for which control was detrimental and increasing c increased the invasion probability of the new strain. Hence, in accordance with previous findings, we showed that when control impacts the invading strain differently to the resident, this can (in some cases) facilitate the emergence of the invading strain, relative to the control-free case. We observed that, for equal levels of effective coverage, polarized control measures that completely neutralize a random fraction of the population (e.g., perfect quarantine of a proportion of infected individuals) consistently outperform imperfect control measures that partially control the entire population (e.g., vaccination that reduces, but does not eliminate, the susceptibility of all individuals). That is, random, polarized control provides greater protection against potential strain invasions. This is again consistent with results around invasion in naïve populations ( Lloyd-Smith et al., 2005 ) . This could provide an important consideration for managers choosing control strategies when concerns exist around AMR. Of course, more targeted control measures directed towards the most highly infectious members of the host population -so-called 'super-spreaders' -would outperform both the polarized and uniform partial control scenarios considered here; however, identifying such groups a priori , if possible, may be difficult in practice. For control timing, we considered two separate scenarios: implementing the various control strategies (i.e., polarized, and uniform partial) only upon emergence of the invading strain; and doing so prior to its emergence, such that the population dynamics have had sufficient time to re-equilibrate. We note that the latter case, where control exists prior to the introduction of the invading strain, is likely to be more realistic, given the desire to control existing outbreaks. The difference between these two cases is the number of susceptible individuals made available to the newly introduced invading strain. In the simultaneous case, the susceptible pool remains depleted by the presence of the resident strain such that the invading strain is inhibited by both a reduced S(t = 0) = N/R r 0 and the added burden of control. In the prior scenario, the susceptible population has been replenished by control measures applied to the resident strain, S(t = 0) = N/ ((1 − c) R r 0 ) . In this latter case, the effects of uniform partial control on the invading strain are negated by the increase in available susceptibles: R i eff ,c = R eff . Only once the level of control is sufficient to eradicate the resident strain, i.e., c > 1 − 1 /R r 0 (and the susceptible population then becomes the whole population, N ) does uniform partial control start to impede the invasion prospects of the invading strain. Conversely, implementing a random, polarized control strategy is always beneficial provided the control measure is equally effective against both the resident and invading strains. One limitation of our analysis is that we assume that the susceptible individuals are the same, and mix homogeneously, so that they are equally likely to become infected. If this were not the case, the resident infection would likely impact which susceptibles are available to the invader, requiring a more complex modelling framework. For example, given that both infection potential and susceptibility are functions of the number of contacts an individual experiences, these quantities may be naturally correlated. Leventhal et al. (2015) approach this problem by imposing an explicit contact-network structure on the study population, such that heterogeneity in susceptibility was directly linked to the heterogeneity in infectiousness through the (fixed) number of contacts shared by each individual. In this context, invasion events were found to be less likely because potential 'super-spreading' hosts (or hubs in the network) were often already infected with the resident strain meaning they were unable to perpetuate the spread of the invading strain, whilst simultaneously providing protection to susceptible satellite nodes. Given these findings, it would be interesting to consider more general correlation structures between susceptibility and infectiousness and their effect on strain invasion potential in a generalized branching process framework. We also assume perfect cross-immunity holds between the resident and invading strains and thereby neglect the possibility of replacement infection or within-host co-infection -we do however, still allow for subsequent infection with an alternate strain following recovery. This assumption may be valid for particular infectious pathogens, e.g., measles, varicella, for which infection with one strain precludes coincident infection with any other, but this property does not hold in general. For instance, replacement infection or co-infection with e.g., influenza, streptococcus, staphylococcus is known to occur. As a result of our perfect cross-immunity assumption, strain invasion (and eventual replacement) or strain extinction become the only two possible outcomes of our simulation scenario. However, relaxing this assumption allows for more complex dynamics including strain coexistence ( Spicknall et al., 2013 ) . Throughout our analysis, we considered the potential for heterogeneous individual reproductive capacity (i.e., the capacity for 'super-spreaders') among members of the host population. While this does not influence the impacts of control qualitatively, we emphasize that under the branching process approximation it does impact how likely the invading strain is to establish. In particular, invaders are less likely to establish when individual reproductive capacities are more highly dispersed. However, we note that this heterogeneity also has a subtler effect: the error between simulated epidemic results and the branching process approximation varies in both magnitude, and direction, with k ( Fig. 9 ) . One clear possibility is that if a 'super-spreader' individual emerges early in an outbreak, it may allow a strain to establish itself when it otherwise would not have. In addition, since the heterogeneity also influences the resident strain, increased heterogeneity both enables the natural extinction of the resident strain ( Fig. 10 ) , and more generally variation of the initial susceptible proportion faced by the invader away from N/R r 0 . Applying a branching process approximation for the emergence of a new strain in the presence of an endemic background, rather than into a naïve population, requires care. Our simulation study determined that the approximation was quite robust even for relatively small populations (e.g., N ≥ 500). However, there was some variation due to different parameter choices at smaller population sizes. In particular, accuracy varied when heterogeneity in individual reproductive capacity was high, and when parameter choices caused the endemic disease to be at risk of natural extinction (even without the influence of the invader). We also emphasize that the branching process approximation is limited in that it can only quantify the probability of initial strain extinction: There is likely some additional probability of subsequent fade-out due to depletion of susceptibles (e.g., sensu Ballard et al. (2016) ). A separate analysis would be necessary to investigate this phenomenon. One key assumption we make here is that, in effect, we require that (functional) mutations occur sufficiently rarely that the new strain will either establish, or become extinct, before a repeat mutation can occur. If instead there was a possible ancestral relationship between the resident strain and the invader, where the resident effectively fuels the invading cohort with continual reinforcements, invasion dynamics would likely be different. It would be interesting then to re-evaluate our results with a coupling between the two strains: our expectation is that with a perpetual supply of reinforcements, replacement with any fitter (i.e., R i 0 > R r 0 ) variants is guaranteed -it just a question of when. Further, assuming an ancestral relationship between the competing strains we could also model multiple epochs (i.e., we would not terminate the simulation after a single replacement event has occurred) and follow the evolution of the pathogen (see also ( Yates et al., 2006 ) ). These considerations are left as avenues for future research. The results of this analysis point to the role of endemic infection as an immunizing agent and its ability to impede the emergence of new exotic strains. In addition to population-wide effects, this has important implications at the individual level, where the use of broad-spectrum antibiotics can inadvertently eliminate colonizing bacterial infections (e.g., Staphylococcus aureus ), and thus further highlights the importance of antibiotic stewardship. Perhaps the most critical outcome of this work is the potential for the invading strain to become established even when it is less transmissible than the resident due to differential control. The emergence of mutant strains in the presence of endemic backgrounds is of substantial concern in many disease settings (particularly e.g., Tuberculosis), and the impact of differential control highlighted here should be considered in any future control effort s as well as effort s to model multi-strain dynamics in these systems. To determine conditions under which the branching process approximation for invader establishment success could be confidently applied, we performed a simulation study. We considered a relatively simple stochastic epidemic model ( Fig. 6 ) , in which each of the N individuals in the population are either: susceptible ( S ); infectious with the resident strain ( I r ), or the invader ( I i ); or recovered ( R ). Transitions in this model are somewhat unconventional. We assume the infectious duration of each individual is constant , rather than random, with value 1/ γ (and for simplicity we set γ = 1 , without loss of generality). Each infected individual j (when infected) has their own individual reproductive capacity, ν j , Gamma distributed with mean R s 0 and dispersion parameter k , where s is the strain they are infected with. Infection events occur according to a Poisson process, with rate S N−1 j∈ infected ν j , i.e., where one would normally expect a βI transmission term when infection rates are homogeneous, we instead have j ∈ infected ν j . The combination of the deterministic infectious period and the Gamma-distributed individual reproductive capacity results in this situation being equivalent to that considered throughout the manuscript. Recovered individuals have Example simulation of this process, with N = 10 0 0 , k = 1 , γ = 1 , R r 0 = 2 , R i 0 = 4 . The simulation is initialised in an approximate equilibrium state for the resident disease, and a single infectious individual of the invader strain is introduced at t = 1 . The simulation ends when the resident strain is extinct. waning immunity, occurring according to a Poisson process with rate ηR . While this process is not Markov, it can nonetheless be simulated through a small modification of the standard Doob-Gillespie stochastic simulation algorithm ( Algorithm 1 ). The key step is to record the recovery times r j and individual reproductive capacities ( ν j ) of each infected individual in a first-in-first-out (FIFO) queue data structure. Then, rather than generating the time of the next event as in a standard Markov process, we generate a candidate time for the next infection or recovery event, and check if it is due to occur before the next recovery that is due (i.e., the first r j in the queue). If the candidate infection time is before the next recovery, infection occurs, otherwise the recovery occurs. For consistency with the situation described in the main article, we initialise the simulations at t = 0 in an approximate equilibrium state for I r (i.e., such that the average transition rates into and out of each compartment are approximately equal; effectively a moment-closure approximation for the mean of the process), and introduce a single I i at t = 1 . Full details appear in Algorithm 1 . An example realisation of this process appears in Fig. 6 . Simulations were produced for: • N ranging between 100 and 10 0 0. • k taking values 0.1, 0.3, 1, 3, and 10 0 0 (as in the main manuscript, but with 10 0 0 in place of ∞ ). • R r 0 taking values 2, 3, 4 and 10. • R i eff taking values 1, 1.1, 1.5, 2, 3, 4 and 5. At least 10,0 0 0 simulations were produced for each combination of parameters. η = 1 / 10 was used throughout. Simulations were terminated at the later of time t = 5 , and the time when one or other of the strains became extinct. The t = 5 minimum was established as in some cases (particularly with small N ) the resident strain could become extinct with little influence from the invading strain, with the invading strain becoming extinct shortly after -in which case the invading strain should not be recorded as successfully becoming established. Note that we do not consider extinction of the invader at a later time, e.g., epidemic fade out in the first trough (as in Ballard et al. (2016) ), unless that fade out occurs within the t ≤ 5 interval. When the simulated population size was small, there was substantial error between simulated establishment success of the invading strain, and the branching process approximation; however, as population size increased beyond 500, the error between these quantities reduced ( Fig. 8 ) . This behaviour was robust to variations in k , and to R r 0 ( Fig. 7 ) . The most substantial difference was that, particularly at smaller population sizes, the invader had some chance to become established even when R i eff was 1.0; and this was more likely for more heterogeneous individual reproductive capacity distributions. One possible explanation for this is that the empirical distribution of transmissibility in a (small) finite population may not match the generating distribution (and, in particular, the average transmissibility of the invader may be substantially greater than expected). Intuitively, this could occur when the initial invading individual was a 'superspreader' (i.e., an individual with high individual reproductive capacity). Alternatively, the resident strain could fluctuate substantially from the average quasi-stationary behavior (even to the point of being at risk of extinction without the influence of the invader), which would provide windows of opportunity for the invader to establish. Similar phenomena were observed by Humplik et al. (2014) , wherein the resident strain Data : N, k, R r 0 , R i 0 , η, γ = 1 , T max Result : Produces an exact simulation of the epidemic process. initialise: t = 0 ; end U r ← sort (U r ) ; // Existing infectious individuals at time 0 must recover between time 0 and 1, uniformly. Algorithm 1: Algorithm to simulate the full epidemic process. Note that queue () creates an empty queue data structure, push (X , y ) adds element y to the end of queue X , peek (X ) looks at the first value in X (without removing it), and pop (X ) removes the first value in X , and returns it. For simplicity we do not include initialising the first I i individual in this description; it is introduced at time 1, with a transition from a susceptible individual, and its death time and individual reproductive capacity replace the first element in U i and V i , respectively. Comparison between the branching process approximation for invader establishement success, and empirical invader establishment success in simulations of a full epidemic process. The process was simulated for a range of values of k , R r 0 , and R i eff (i.e., R i 0 /R r 0 ), as the total population size, N varied from 100 to 1,0 0 0. The red line indicates the branching process approximation solution. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) Fig. 8 . Error between the branching process estimate of invader establishment probability, and establishment success in simulations of size N . R r 0 is fixed at 2.0 in these simulations. Error is calculated as 5 R eff =1 ˆ q s (R eff ) − q (R eff ) , where ˆ q s (R eff ) is the proportion of simulations in which establishment was successful at the given R eff level. could become naturally extinct within the lifetime of the invading strain, even without the heterogeneity in reproductive capacity considered here. More broadly, the interactions between N, k , the reproductive capacity of the resident strain, and the effective reproductive capacity of the invader have complex impacts on the correspondence between the branching process approximation and simulation results ( Fig. 8 ) . For intuition around the accuracy of the branching process approximation, we can consider two key drivers. • The size of the population of susceptibles available to the invading strain, early in its outbreak. A remark of Ball and Donnelly (1995) suggests that, in the case they consider, the epidemic grows like a branching process until approximately N 1/2 of individuals in the population are infected. While the process we consider is not the same, we can use this as a heuristic. Rather than the full population, we are interested in the population of susceptible individuals, given the circulation of the endemic disease in the population. We denote this S 0, and note that it is approximately N R r 0 . We observe that for small values of N , the growth of the process can only be approximated by a branching process up to very small outbreaks, e.g., until 7 or fewer infected individuals when N = 100 . • The capacity of the resident strain (or the invader) to persist in the population. In small populations, it is possible for the resident strain itself to become extinct quickly, due to the stochastic nature of disease transmission. As heterogeneity in individual reproductive capacity increases, this becomes more likely to occur, particularly at smaller values of R 0 ( Fig. 10 ). As such, when the invader is introduced, it becomes impossible to distinguish success of the invader strain from natural extinction of the resident; or failure of the invader from successful establishment followed quickly by its natural extinction. In short, the branching process approximation is definitely to be avoided for small population sizes, but can produce reasonable results as population size increases (particularly under conditions when the endemic disease is capable of persisting for substantial time periods). We advise caution, and, where necessary, verifying results through simulation. that, when R i eff is 1.0, or k is 0.1, the branching process approximation underestimates the success of the invader; but when R i eff is large, its success is overestimated, particularly when R r 0 is small. Fig. 10 . Simulated extinction times for the endemic disease process (initiated in equilibrium at time 0), in the absence of the invader strain. Note that the simulation was truncated at time t = 100 , and so outbreaks that persisted at that time, were assigned that extinction time. Lines show median values, error bars indicate 25% and 75% quantiles. Molecular mechanisms of antibacterial multidrug resistance Strong approximations for epidemic models The probability of epidemic fade-out is nonmonotonic in transmission rate for the markovian SIR model with demography Influenza a gradual and epochal evolution: insights from simple models 20 0 0. The impact of antibiotic use on resistance development and persistence. Drug Resist On theoretical models for competitive and predatory biological systems An evolutionary model to predict the frequency of antibiotic resistance under seasonal antibiotic use, and an application to Streptococcus pneumoniae Modeling the emergence of the 'hot-zones': tuberculosis and the amplification dynamics of drug resistance Stochastic epidemic models: a survey Origin and proliferation of multiple-drug resistance in bacterial pathogens. Microbiol Host population structure and treatment frequency maintain balancing selection on drug resistance Modeling epidemics of multidrug-resistant M. tuberculosis of heterogeneous fitness How competition governs whether moderate or aggressive treatment minimizes antibiotic resistance What is the mechanism for persistent coexistence of drugsusceptible and drug-resistant strains of Streptococcus pneumoniae ? Modeling the invasion of community-acquired methicillin-resistant Staphyloccocus aureus into hospitals Mathematical tools for understanding infectious diseases A missing dimension in measures of vaccination impacts Multivariate birth-and-death processes as approximations to epidemic processes The end of an era? Epidemiological feedbacks affect evolutionary emergence of pathogens Evolutionary dynamics of infectious diseases in finite populations Antibiotic resistance: the need for global solutions Evolution of antibiotic resistance is linked to any genetic mechanism affecting bacterial duration of carriage Evolution and emergence of infectious diseases in theoretical and real-world networks Selection of intermediate rates of increase in parasite-host systems Antibacterial resistance worldwide: causes, challenges and responses Superspreading and the effect of individual variation on disease emergence The epidemiological fitness cost of drug resistance in Mycobacterium tuberculosis The risk of global epidemic replacement with drug-resistant Mycobacterium tuberculosis strains Antibiotic resistance: the last resort Coupled, multi-strain epidemic models of mutating pathogens The crisis in antibiotic resistance Hybrid markov chain models of s disease dynamics Incentivising innovation in antibiotic drug discovery and development: progress, challenges and next steps A modeling framework for the evolution and spread of antibiotic resistance: literature review and model categorization Detecting emerging strains of tuberculosis by using spoligotypes How do pathogen evolution and host heterogeneity interact in disease emergence? This research was supported by a seed funding grant (to M.T.M. and R.C.C.) from the NHMRC Centre of Research Excellence in Policy Relevant Infectious disease Simulation and Mathematical Modelling (PRISM 2 ). R.C.C. received funding from the Data to Decisions Cooperative Research Centre (D2DCRC), and the Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (project number CE14010 0 049 ). This work was supported with supercomputing resources provided by the Phoenix HPC service at the University of Adelaide.