key: cord-1038621-ibgm0fer
authors: Großmann, Gerrit; Backenköhler, Michael; Wolf, Verena
title: Heterogeneity matters: Contact structure and individual variation shape epidemic dynamics
date: 2021-07-20
journal: PLoS One
DOI: 10.1371/journal.pone.0250050
sha: 6f5fd671d6b70e40e0204666ba1e8deb094ce29e
doc_id: 1038621
cord_uid: ibgm0fer

In the recent COVID-19 pandemic, mathematical modeling constitutes an important tool to evaluate the prospective effectiveness of non-pharmaceutical interventions (NPIs) and to guide policy-making. Most research is, however, centered around characterizing the epidemic based on point estimates like the average infectiousness or the average number of contacts. In this work, we use stochastic simulations to investigate the consequences of a population’s heterogeneity regarding connectivity and individual viral load levels. Therefore, we translate a COVID-19 ODE model to a stochastic multi-agent system. We use contact networks to model complex interaction structures and a probabilistic infection rate to model individual viral load variation. We observe a large dependency of the dispersion and dynamical evolution on the population’s heterogeneity that is not adequately captured by point estimates, for instance, used in ODE models. In particular, models that assume the same clinical and transmission parameters may lead to different conclusions, depending on different types of heterogeneity in the population. For instance, the existence of hubs in the contact network leads to an initial increase of dispersion and the effective reproduction number, but to a lower herd immunity threshold (HIT) compared to homogeneous populations or a population where the heterogeneity stems solely from individual infectivity variations.

At the beginning of 2020, the world was struck by the coronavirus (SARS-CoV-2) pandemic. Faced with the approaching overload of healthcare systems, the international community turned to non-pharmaceutical interventions (NPIs) in an attempt to contain the spread of the pathogen [1] . Computational epidemiological modeling became an important asset to predict the propagation and to evaluate the prospective effectiveness of various measures such as school closings and travel restrictions [2, 3] . For an overview of COVID-19 models and their successes and failures, we refer the reader to [4, 5] . Literature abounds with new studies describing and forecasting the spread of COVID-19. Instead, we focus on fundamental properties of popular models and the consequences of popular modelling assumptions.

In this work, we highlight the importance of population heterogeneity for computational epidemiology and explain why many models used in the current COVID-19 pandemic do not adequately capture it. In particular, we analyze how individual variations in contact numbers Contribution (2) is based on preliminary results that were published during the first COVID-19 wave in the spring of 2020 [25] .

Our manuscript is structured as follows: We present relevant literature in Section Related work. In Section Method, we show how to translate ODE models to network-based models and discuss their relationship. Section Experiments provides numeral experiments based on synthetically generated random contact networks. Conclusions completes the manuscript.

Mathematical modeling of epidemics is a wide research area to control, predict, and understand epidemics or similar types of propagation processes (like opinions, or computer viruses). Here, we mostly focus on the network-based spreading paradigm [26] and its relation to other model types. In particular, we study which types of population heterogeneity can be expressed and how models are used in the current COVID-19 pandemic. Note that currently all models that study COVID-19 quantitatively suffer from the poor quality of data and uncertainty about parameters [27, 28] . We focus on the three model types we consider most relevant for epidemiological modeling in general. While ODE models and network-based models are directly part of the manuscript, we include branching processes because they are the de facto standard for formalizing and studying dispersion in epidemics. For a comparative analysis of models specific to COVID-19, we refer the reader to [4, 29] .

The most wide-spread epidemiological model type is based on a system of ordinary differential equations (ODEs) in which coupled fractions of individuals in disease compartments change deterministically and continuously over time [30] . For an overview on various applications, we refer the reader to [31] [32] [33] [34] . Compartments refer to different disease stages (e.g., susceptible (S), infected (I), recovered (R), exposed (E), dead (D)). Most commonly used is the threecompartment SIR-model (cf. Fig 1) . Note that ODE models use a single parameter (λ) to model the chance to meet someone (interaction structure) and the probability to transmit the infection.

Population heterogeneity. Expressing population heterogeneity is only possible to a very limited degree. The typical way is to introduce additional compartments that encode a membership to a certain group (e.g., susceptible and "younger than 20"). These extended models are often referred to as meta-population [35] models. Apart from that, a homogeneous interaction structure is assumed. Effects like super-spreaders (or overdispersion) are practically not expressible. The same holds for local die-outs. Moreover, the deterministic nature makes it difficult to conceptualize risk and uncertainty. ODE models arise as the mean-field limit of a well-mixed Markov population model (corresponding to a complete graph in the networkbased paradigm) [ . The spread of COVID-19 was studied for many more countries and settings [43] [44] [45] [46] [47] . Other studies use a meta-population approach and group individuals based age [48] [49] [50] [51] or region [52] [53] [54] . Moreover, [55, 56] , modify ODE models to account for individual variation in susceptibility. Roda et al. use an ODE model to illustrate the general difficulty of predicting the spread of COVID-19 data [57] . Limitations in the applicability of ODE models regarding data from Italy is reported in [58] . Similar results are found by Castro et al. using COVID-19 data from Spain [59] . General concerns are articulated in [60] .

Stochastic branching processes operate in discrete or continuous-time and are useful when studying the underlying stochastic nature of an epidemic. They are based on a tree that grows over time and represents the infected individuals. The children (offspring) of each node represent an individual's secondary infections and the number of children is drawn from an offspring distribution with mean R 0 that is provided by the modeler [14, [61] [62] [63] .

Population heterogeneity. The offspring distribution makes it straight-forward to encode individual variations in infectiousness or connectivity. The paradigm allows to study random extinction probabilities of the epidemic and the effects of super-spreaders/overdispersion [12] . However, branching processes do not admit a (model intrinsic) saturation due to growing immunity in the population. Moreover, the high level of abstraction makes it difficult to study the effects of NPIs and the characteristics of the spatial diffusion of the pathogen. COVID-19. Zhang et al. use a branching process to measure the dispersion of COVID-19 inside China [64] and Endo et al. estimate the dispersion based on local clusters outside China [16] . Moreover, [65] , use a branching process to infer epidemiological values and [22] study the influence of temporal viral load variation. Alternative branching process models to study dynamical properties specific to COVID-19 were proposed in [66] [67] [68] .

Network-based epidemic models use graphs to express the interactions (edges) among individuals (nodes). They are stochastic in nature and can be formulated in discrete or continuous time. Each node occupies a compartment (node state) at each point in time and infected nodes can (randomly) infect their susceptible neighbors [26, 69, 70] . Generalizations to multi-layer and weighted networks have been suggested [71] .

Population heterogeneity. The network-based paradigm decouples the connectivity of the population from the infectiousness of the virus. Moreover, each individual is represented by an autonomous agent which adds flexibility and makes it straightforward to include individual variations of the population. The key advantage of networks is that they represent a universal way of encoding different types of complex interaction structures like hubs, communities, households, small-worldness, different mixing within in population-groups, etc. The contact network can also represent spatial or geographical constraints. Network-based models relate to ODE models in the sense that the ODE model represents the mean propagation of an epidemic on an infinite complete graph (all nodes are directly connected), assuming that all nodes are attributed with exponentially distributed jump times. Conceptually, the completeness "removes" the heterogeneity from the interaction structure and the infinite size prevents artifacts from the stochasticity.

COVID-19. Effects of different contact networks were studied in [25, [72] [73] [74] . Contact networks are being used to build realistic simulations of a society, for instance by creating household-structures with various types of inter-household connections [75] [76] [77] [78] . The flexibility to control networks modeling NPIs easy [76, 77, 79, 80] . Moreover, [81] [82] [83] , use a networkbased approach for spatial properties (e.g., flow between geographical regions). Although the importance of hubs was recognized very early [84] , the concrete relation to overdispersion as it is studied in branching processes remains under-explored. Networks, where the contact structure changes over time, are particularly well-suited to study quarantine measure and social distancing [85] [86] [87] .

In this section, we show how to translate a ODE model to a network-based model in order to impose variation in connectivity and infectiousness while keeping the population averages of clinical and transmission parameters fixed. We use a COVID-19 ODE model that is heavily inspired by the SIR-extension of [76] . A summary of the model is depicted in Fig 2 

Our model contains six disease stages or compartments (cf. Fig 2) : susceptible (S), exposed (E) (infected but not yet infectious), removed (R) (immune or dead), as well as mild, severe, and critical infection stages (I 1 , I 2 , I 3 ). In contrast to [76] , we merge dead and recovered stages to a single removed stage, as both do not influence the infection dynamics further (we assume immunity after recovery. Note that perfect and permanent immunity is not given for COVID-19. In this study, we ignore the impact of re-infected individuals. The fraction of individuals in each compartment evolves according to a system of ordinary differential equations (ODEs) given in Fig 2b. Unlike network-based models, ODE models admit a semantics that is invariant to the population size. Thus, we assume that the population is normalized. A further difference to [76] is that we only have a single infection parameter γ. All other parameters have a meaningful clinical interpretation and can be specified accordingly (cf. Table 1 ). The set of transition parameters γ, μ j , β j gives raise to a specific R 0 . We can compute R 0 by assuming that an infinitesimal fraction � (representing patient zero in an infinitely large population) is infected (I 1 ) and that the rest of the population (1 − �) is susceptible. Specifically, R 0 is the ratio between � and the population fraction that leaves S due to �. Therefore, we consider the outflow from S to E caused by this initially infected fraction while it passes the three disease stages (taking into consideration that only an even smaller fraction of � reaches I 2 and I 3 ):

For instance, m 2 m 2 þb 1 refers to the fraction of � that reaches I 2 and g=z b 2 þm 3 corresponds to the outflow of S attributed to this fraction.

Hence, we can fix R 0 and thereby control γ. We use R 0 = 2.5 which leads to γ � 0.394.

We consider a static, undirected, unweighted, strongly connected graph. At each point in time, each node (individual) is annotated with a compartment. The dynamics is specified as a The transition rates refer to exponentially distributed residence times. The expected residence time in each disease stage is the inverse of the sum of the outgoing transitions (e.g. for I 1 it is 1/(β 1 + μ 2 ) = 6 (days)). Likewise, the probability to go to I 2 is 0.2. We refer to [76] for clinical justification of μ j , β j .

https://doi.org/10.1371/journal.pone.0250050.t001

continuous-time Markov chain (CTMC) [26] where the labeling changes randomly over time. We use the compartments described in Fig 2. Nodes change their compartment following exponentially distributed residence times corresponding to specific instantaneous rates. For the transition from susceptible to exposed, the rate depends on the neighborhood of the node (cf. Fig 2a) . We consider two cases: (i) all nodes have the same infection rate λ and (ii): each node i has an individual infection rate λ i , sampled from a probability distribution with density ν. We start with the former case. Case (i): Homogeneous infectiousness. Each S − I 1 can transmit an infection with rate λ. If the infected node is in compartment I 2 or I 3 the infectiousness decreases (e.g., because people stay home) and is given by λ/z. Note that we use exponentially distributed residence times which are potentially less realistic than, for instance, beta-distributions [76] , but these relate directly to ODE models. Hence, observed differences in the dynamics can be attributed to the connectivity/stochasticity not the shape of the residence time.

R 0 is defined as the expected number of neighbors that a randomly chosen patient zero infects in a susceptible population, thus R 0 cannot be larger than the mean degree (number of neighbors). In the case of a fixed infection rate λ, fixing k mean also determines R 0 . We can approximate R 0 as shown in Fig 3. We use that each infection happens independently and approximate R 0 � p I k mean where p I denotes the probability that patient zero infects a random neighbor (while potentially transitioning to I 2 , I 3 ). The approximation comes from the fact that an already infected neighbor can infect another neighbor of patient zero violating the independence assumption, rendering this an over-approximation. Note that p I is conceptually similar to the secondary attack rate in a completely susceptible population. We construct the networks such that k mean = 8 (except for the complete graph where k mean is the number of nodes minus one). Like in the ODE-approach, we fix R 0 = 2.5 and determine λ = 0.0706 (cf. Fig 3) .

Case (ii): Individual differences in infectiousness. In the case of individually varying infectiousness, we associate each node i with infection rate λ i that is drawn from a distribution with density ν. Again, our goal is to introduce variation while keeping the population mean. Hence, we construct ν such that λ = E[λ i ] = 0.0706. We define R i 0 as the node-dependent basic reproduction number when the infection starts in node i. Moreover, we define the node-independent basic reproduction number as the corresponding unweighted mean R 0 ¼ E½R i 0 �. Interestingly, different ν (with the same mean) can lead to different R 0 . Theoretically, this follows from the computation of p I which is now based on an integral over ν. In the next section, we set ν to be an exponential distribution and study the resulting changes in the dynamics. A key takeaway of our study is that increasing the variance in the degree distribution does not change R 0 , increasing the variance in the individual infectiousness does so (in fact, it decreases R 0 ). For the evaluation, we use an exponentially distributed λ i with mean 0.0706. That is, nðxÞ ¼le Àlx withl ¼ 1=0:0706.

We test different types of contact networks that highlight different characteristics of real-world human-to-human connectivity. To this end, we describe the contact networks using random graph models. In each simulation, we create a specific realization (variate) of such a random graph model. A schematic visualization of example networks is provided in Fig 4. We use a complete graph (each possible pair of nodes is connected) as a baseline to study the evolution of the epidemic when no contact structure is present. Thereby, we can mimic the effects of stochasticity and variation in infectiousness while keeping the simulation as close as possible to the assumptions underlying the ODE. We use Power-law Configuration Model networks to study the effects of hubs (potential super-spreaders). These networks are-except from being constrained on having power-law degree distribution-completely random. The power-law degree distribution is omnipresent in real-world networks and entails a small number of nodes with a very high degree. We fix the minimal degree to be two and choose the power-law parameter numerically such that the network admits the desired mean degree. We also test a synthetically generated Household network that was loosely inspired by [88] . Each household is a clique, the edges between households represent connections stemming from work, education, shopping, leisure, etc. We use a geometric network to generate the global inter-household structure. The household size is 4. In the case of k mean = 8, each node has 3 edges within its household and (on average) 5 outgoing edges. We also compute results for Watts-Strogatz (WS) random networks. They are based on a ring topology with random re-wiring. Each node has exactly k mean neighbors. We use a small re-wiring probability of 5% to highlight the locality of real-world epidemics.

Apart from the baseline (complete graph), we use specifically these three network models because they are well-studied in literature and very different in their respective global properties. Moreover, they all encode important properties of human-to-human connectivity like hubs (power-law), small-worldness (power-law and WS) and tightly connected household structures.

Given a set of independent simulation runs, we measure dispersion by analyzing the empirical offspring distribution at day t (averaged over all nodes). Specifically, we consider the offspring distributions of the nodes that were exposed within day t (the actual secondary infections may happen later). We also perform a discretization of time over intervals of one day. We quantify dispersion in three ways:

• CoV: Together with the mean of the offspring distribution R t , we report the coefficient of variation (CoV), that is the ratio of the standard deviation to mean. The CoV is a widely-used measure of dispersion of a probability distribution.

• Top-k: We explicitly report how many new infections within day t are caused by which fraction of infected nodes (e.g., 80% of the new infections are caused by 20% of the nodes). We report which fraction of new infections can be traced back to (the most harmful) 10%, 20%, and 30% of infected nodes.

• Offspring: We report the fraction of nodes (that were infected within day t) that lead to 0, 1, 2, . . . children.

Note that overdispersion inevitably indicates not only the existence of super-spreaders but also the existence of nodes that are unlikely to pass the infection at all. Like super-spreaders, these individuals might be the result of host properties (i.e., a low viral load) or connectivity differences.

We compare the evolution and dispersion of the four network models. We have two main experiments. In the first experiment (overview in Fig 7) , we study a fixed infection rate (mimicking the case that there is only variation in the connectivity). In the second experiment (Fig  8) , we additionally impose individual variation in the infectiousness λ i . Recall that the networks (aside from the complete graph) have approximately the same density (number of edges) and that nodes approximately admit the same mean infectiousness, thereby, ensuring that the resulting differences are solely a consequence of the corresponding variation. Python code is available at www.github.com/gerritgr/Covid19Dispersion.

For each network, we perform 1000 simulation runs on a network with 1000 nodes. Networks are generated such that the mean degree is approximately eight. For network models where we cannot directly control k mean , we start by generating sparse networks and increase the density until k mean has the desired value. In each simulation run, we start with three randomly chosen infected nodes in I 1 (to reduce the likelihood of initial instantaneous die-outs). The ODE (cf. Fig 6) starts with a value of 3/1000 in I 1 . The number of simulation runs is enough to estimate the mean fractions (and the standard deviations) corresponding to each compartment with high accuracy (confidence intervals are not shown but would be barely visible anyway). The number of 1000 nodes was used for practical reasons, however, increasing the network size preserves the qualitative characteristics of the dynamics.

We characterize epidemics in terms of the evolution of population fractions, that is, mean fraction of nodes in compartment S, I 1 , R for each time point t (the remaining compartments evolve approximately proportional to I 1 , thus, we leave them out for clarity). This evolution informs about the time point and the height of the infection peak and the final epidemic size (or HIT) that is equivalent to the (mean) fraction of recovered nodes when the epidemic is over (which is mostly the case at 200 time units). Note that the final death toll is proportional to the final epidemic size. Moreover, we study the effective reproduction number R t (2nd row in Figs 7 and 8). We define R t to be the average number of secondary infections for a node that got exposed at day t 2 N �0 (we discretize time for this purpose). Thereby, we also report an empirical R 0 based on the three initially infected nodes that diverges slightly from the theoretical R 0 in Table 1 . Dispersion is quantified using the three techniques explained in Section 1 (2nd to 4th row in Figs 7 and 8).

Results for a fixed λ on different network types are shown in Fig 7. In most simulation runs, the power-law dynamics admits a very early peak and the epidemic dies out early with a comparably small final epidemic size. This effect can directly be attributed to the hubs that get infected very early (because of their high centrality) and jump-start the epidemic. In contrast, in household networks-and even more so in WS networks-the infection peak is lower and happens at a later time point. This is no surprise as the connectivity in both networks imposes a level of locality that slows down the propagation. For better visibility, the differences in the infection curve (based on I 1 ) are summarized in Fig 6. We also see that the complete graph behaves very similarly to the ODE model.

The effective reproduction R t starts from around 2.5 (the theoretical overapproximation) and decreases in most cases monotonically. The exception is again the power-law network where hubs cause a huge jump of R t in the first day from 2.5 to around 12. This jump is also reflected in the dispersion measure, most noticeable in the offspring plot (4th row). The power-law topology generally admits a higher dispersion than the other networks. For instance, the fraction of nodes with zero offspring is much higher. Moreover, on most days, the top 20% of the infected nodes account for more than 80% of the new infections which would roughly fit the estimations for COVID-19 in a typical population. In the other network types, there is less temporal variation of the dispersion. The dispersion is the lowest in the WS networks which is unsurprising as all nodes have degree 8 which provides an upper bound to the offspring number. Generally speaking, we see that dispersion can be measured robustly using the empirical offspring distribution.

Note that measuring the dispersion becomes increasingly difficult over time for the powerlaw network. The reason for this is that the epidemic tends to die out early with high probability. Thus, the dispersion is estimated on an comparatively small amount of samples. At the same time, the variance of the distribution is comparably high. This leads to a noticeable amount of noise.

In this experiment, we draw in each simulation run for each node i a random λ i that is distributed according to ν. Here we use an exponential distribution. The effects on the evolution and dispersion are reported in Fig 8. In all networks, the epidemic becomes "weaker" in terms of final epidemic size and infection peak height (see also Fig 6) . This effect is strongest in the WS network (where the epidemic dies out almost immediately) and weakest in the complete graph. This is also mirrored in the difference of R 0 compared to Experiment 1 (as explained in Fig 3, the relationship between λ and R 0 is now non-linear). The complete graph leaves R 0 almost unchanged (i.e., around 2.5) while it goes down to around 2.0 in the WS network. Effects on the household and power-law network are less drastic but still evident.

Regarding the dispersion, the differences to Experiment 1 are expected. The variance in the empirical offspring distribution generally increases. Interestingly, this happens in all networks by roughly the same amount regardless of whether the dispersion was high or low in the first case. We can also consistently see the change in dispersion in all three dispersion measures, but it is especially evident in the top-k plots (3rd row). It is also interesting to see that all networks admit a characteristic signature in the histogram of the offspring distribution (4th row). The infection rate variation shifts these plots (in particular, because the number of nodes without offspring increases) but they still entail a clear distinction between networks.

We also tested uniform and gamma distributions for ν (results not shown), and found that the epidemic generally becomes weaker with higher variance. We expect that this is due to an increased likelihood of local and global die-outs.

The two experiments show that heterogeneous interaction structures and variations in infectiousness strongly influence key quantities of an epidemic. However, there are important differences between the sources of variations:

• The existence of hubs in the network can cause R t to increase, variations in λ generally do not cause this behavior.

• Different networks with the same k mean and a fixed λ will (approximately) admit the same R 0 . However, choosing densities ν of different form (while keeping the mean of the distribution fixed) changes R 0 .

• Allowing infectiousness to vary between individuals generally weakens the epidemic's impact. Allowing infectiousness to vary between individuals can make some aspects of the epidemic worse (e.g., height of the infection peak in power-law networks).

• λ has the weakest influence when the interaction structure is homogeneous (i.e., on a complete graph) and the strongest when the interaction structure is based on locality (the average distances in the graph increase) as in the Watts-Strogatz network.

• Varying λ increases the stochasticity (e.g., the variance of the number of infected nodes at any given point in time) of the epidemic.

• The interaction structure has a large influence on the dispersion. Individual infectiousness variations induces a smaller but consistent increase of dispersion.

• Hubs influence how the dispersion changes over time. Infectiousness variations increase the dispersion consistently for all time points.

Our results underline that networks are a feasible tool to encode a wide variety of different features of a population's interaction structure. Generally speaking, it is not surprising that some networks support the formation of epidemics better than others. To some extent, this has been studied in terms of the epidemic threshold of graphs [89] . However, the variety of the influence of the networks and the interplay between heterogeneity in the degree of infectiousness and dispersion is remarkable and, to our knowledge, underexplored in literature.

There are even further possibilities to adjust population heterogeneity, e.g., by adding non-Markovian residence times in the compartments, by varying the remaining transition rates, or by imposing more temporal variability in infectiousness. Our results already show that models, based on point estimators of population averages (i.e., most mean-field ODE models), are not adequate for analyzing or predicting the dynamics of an epidemic.

Regarding the dispersion, we see that none of the considered network structures by itself leads to a dispersion where 80% of the infections are caused by only 15% of the infected nodes (at least not right from the beginning, in the power-law graph this point is approximately reached within a few days). However, the differences between networks are remarkable. From branching process theory, it is known that a higher dispersion increases the die-out probability [12] . Generally, this effect also holds for networks. For a fixed network, increasing dispersion by using a probabilistic infection rate does, in fact, increase the die-out probability. However, the network topology strongly modulates the strength of this effect.

Conclusively, we find that in most cases population diversity makes an epidemic less harmful but increases the dispersion and the variability of the evolution. Hubs in the contact networks are an exception to this rule. These are drivers of the epidemic as they become infected very early and infect many others. This distinguishes them from very infectious people (due to a high viral load) with an average number of contacts who also potentially infect many others. However, a high infectiousness alone does not make them more likely to be infected early enough (i.e., on average earlier than other nodes) to cause an early explosive surge of the epidemic. Hubs also highlight that the effective reproduction number can change significantly while the environmental conditions remain the same simply because the prevalence shifts towards highly connected individuals in the beginning.

Considering that an exponentially distributed λ i can be considered a fairly strong assumption about individual differences, our work can-with necessary caution-be seen as further evidence that the network structure plays a more important role for the dispersion than individual viral load variability.

Transferring these characteristics to NPIs, our work indicates that reducing long-range connections (e.g., by corresponding mobility restrictions) and keeping degree-variability small (to avoid hubs) are particularly effective control strategies. Reducing mobility seems to be especially effective for overdispersed epidemics. Note that the differences between the WS networks (that admit a high level of locality) and the other networks become even more evident when we vary infectiousness. This can be explained by the observation that in overdispersed epidemics the virus has to be introduced to a susceptible population multiple times before an outbreak becomes likely.

We tested the influence of heterogeneity in the population's interaction structure and degree of individual infectiousness on the dynamical evolution of an epidemic. We find that the dynamics depends strongly on these properties and that there is an intriguing interplay between these two sources of variation. Our work also highlights the role of small-worldness, local die-outs, and super-spreaders for an epidemic.

Naturally, mathematical modeling is based on assumptions and abstractions. However, heterogeneity seems to be particularly vital and excluding it should only be done with great caution. It is particularly difficult to capture population heterogeneity in the widely-used class of ODE models. This is due to their inherent homogeneity assumptions w.r.t. each compartment. Although effects such as overdispersion can be modelled to some extent using this paradigm [90] , the complex interplay of varying infectiousness and connectivity remains mostly elusive for such models. Discussing epidemics in terms of population averages may not adequately reflect the complexity of the emerging dynamics.

On a high-level, this work highlights limitations of certain model classes and shows that subtle differences in assumptions can make important differences. For future work, it would be interesting to study the effects of heterogeneity and their implications for NPIs empirically. Moreover, we plan to test implications of further sources of heterogeneity, for instance regarding compliance with NPIs, susceptibility to infections, or whether people tend to meet indoors or outdoors.

In a broader spectrum, population heterogeneity is only one aspect that may cause models to perform much worse in the real-world than one might expect. This simulation study is a reminder that models are prone to hidden assumptions, and that we should be cautious with their interpretation. 

Inferring the effectiveness of government interventions against COVID-19

Special report: The simulations driving the world's response to COVID-19

What 5 coronavirus models say the next month will look like

Data-driven modeling of COVID-19-Lessons learned

Wrong but useful-what covid-19 epidemiologic models can and cannot tell us

Superspreading events in the transmission dynamics of SARS-CoV-2: Opportunities for interventions and control

Review of Ferguson et al "Impact of nonpharmaceutical interventions

COVID-19 super-spreaders: Definitional quandaries and implications

Pattern of early human-to-human transmission of Wuhan

Clustering and superspreading potential of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections in Hong Kong. PRE-PRINT (Version 1) available at Research Square

Superspreading in Early Transmissions of COVID-19 in Indonesia. medRxiv

Superspreading and the effect of individual variation on disease emergence

Maximum likelihood estimation of the negative binomial dispersion parameter for highly overdispersed data, with applications to infectious diseases

Contact Tracing & Super-Spreaders in the Branching-Process Model

SARS-CoV-2 transmission dynamics should inform policy. Available at SSRN 3692807

Estimating the overdispersion in COVID-19 transmission using outbreak sizes outside China

Real-time monitoring the transmission potential of COVID-19 in Singapore

Beyond R0: Heterogeneity in secondary infections and probabilistic epidemic forecasting. medRxiv

Transmission heterogeneities, kinetics, and controllability of SARS-CoV-2

An analysis of SARS-CoV-2 viral load by patient age. medRxiv

Viral load in community SARS-CoV-2 cases varies widely and temporally. medRxiv

Wrong person, place and time: viral load and contact network structure predict SARS-CoV-2 transmission and super-spreading events

SARS-CoV-2 detection, viral load and infectivity over the course of an infection: SARS-CoV-2 detection, viral load and infectivity

Implications of heterogeneous SIR models for analyses of COVID-19

The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study. The Lancet Public Health

Contacts in context: large-scale setting-specific social mixing matrices from the BBC Pandemic project

Diverse local epidemics reveal the distinct effects of population density, demographics, climate, depletion of susceptibles, and intervention in the first wave of COVID-19 in the United States

A metapopulation network model for the spreading of SARS-CoV-2: Case study for Ireland. medRxiv

A SIR model assumption for the spread of COVID-19 in different communities

Power-law population heterogeneity governs epidemic waves

Individual variation in susceptibility or exposure to SARS-CoV-2 lowers the herd immunity threshold. medRxiv

Why is it difficult to accurately predict the COVID-19 epidemic? Infectious Disease Modelling

Inversion of a SIR-based model: A critical analysis about the application to COVID-19 epidemic

The turning point and end of an expanding epidemic cannot be precisely forecast

The challenges of modeling and forecasting the spread of COVID-19

Mathematical biosciences lecture series, stochastics in biological systems

Branching process models for surveillance of infectious diseases controlled by mass vaccination

The theory of branching processes

Evaluating transmission heterogeneity and super-spreading event of COVID-19 in a metropolis of China

Reporting, epidemic growth, and reproduction numbers for the 2019 novel coronavirus (2019-nCoV) epidemic

Branching processes modelling for coronavirus (COVID'19) pandemic

Branching stochastic processes as models of Covid-19 epidemic development

A model of COVID-19 propagation based on a gamma subordinated negative binomial branching process

Epidemic processes in complex networks. Reviews of modern physics

Analysis and control of epidemics: A survey of spreading processes on complex networks

Spreading processes in multilayer networks

An agent-based model of Covid-19

Modeling COVID-19 on a network: super-spreaders, testing and containment. medRxiv

A new SAIR model on complex networks for analysing the 2019 novel coronavirus (COVID-19)

Implications of the schoolhousehold network structure on SARS-CoV-2 transmission under different school reopening strategies in England. medRxiv

Dynamics of COVID-19 under social distancing measures are driven by transmission network structure

Covasim: an agent-based model of COVID-19 dynamics and interventions. medRxiv

Data-driven contact structures: from homogeneous mixing to multilayer networks

A Social Network Model of COVID-19. Available at SSRN 3584895

COVID-19 superspreading suggests mitigation by social network modulation. medRxiv

Complex network model for COVID-19: human behavior, pseudo-periodic solutions and multiple epidemic waves

Covid-19 spread: Reproduction of data and prediction using a SIR model on Euclidean network

Multi-city modeling of epidemics using spatial networks: Application to 2019-nCov (COVID-19) coronavirus in India. medRxiv

Epidemic spreading in scale-free networks

Active and inactive quarantine in epidemic spreading on adaptive activity-driven networks

Balancing quarantine and self-distancing measures in adaptive epidemic networks

Assessing the Impact of Social Network Structure on the Diffusion of Coronavirus Disease (COVID-19): A Generalized Spatial SEIRD Model

Analysis of a stochastic SIR epidemic on a random network incorporating household structure

Threshold conditions for arbitrary cascade models on arbitrary networks. Knowledge and information systems

Full genome viral sequences inform patterns of SARS-CoV-2 spread into and within Israel