key: cord-0149857-gpnaldjk
authors: Gomes, M. Gabriela M.; Feasey, Nicholas A.; Ferreira, Marcelo U.; LaCourse, E. James; Langwig, Kate E.; Reimer, Lisa; Ringwald, Beate; Rylance, Jamie; Stothard, J. Russell; Taegtmeyer, Miriam; Terlouw, Dianne J.; Tolhurst, Rachel; Wingfield, Tom; Gordon, Stephen B.
title: Unfolding selection to infer individual risk heterogeneity for optimising disease forecasts and policy development
date: 2020-09-02
sha: ae21feee424b25121c9c1b93ab99c2ab013350f5
doc_id: 149857
cord_uid: gpnaldjk

Mathematical models are increasingly adopted for setting targets for disease prevention and control. As model-informed policies are implemented, however, the inaccuracies of some forecasts become apparent, for example overprediction of infection burdens and overestimation of intervention impacts. Here, we attribute these discrepancies to methodological limitations in capturing the heterogeneities of real-world systems. The mechanisms underpinning single factors for infection and their interactions determine individual propensities to acquire disease. These are potentially so numerous that to attain a full mechanistic description may be unfeasible. To contribute constructively to the development of health policies, model developers either leave factors out (reductionism) or adopt a broader but coarse description (holism). In our view, predictive capacity requires holistic descriptions of heterogeneity which are currently underutilised in infectious disease epidemiology but common in other disciplines.

Setting realistic targets and developing feasible strategies for disease prevention and control depend on representative models. These can be conceptual, experimental, or mathematical. Mathematical modelling was established in infectious diseases over a century ago [1] [2] [3].
Propelled by the discovery of aetiological agents for infectious diseases, and Koch's postulates, models have focused on the complexities of pathogen transmission and evolution to understand and predict disease trends in greater depth [4]. This has led to their adoption by decision makers to inform national and international policy. However, as model-informed policies are being implemented, the inaccuracies of some forecasts are increasingly apparent, most notably their tendency to overpredict infection burdens and overestimate the impact of control interventions [5] [6] [7]. Here, we discuss how these discrepancies could be explained by methodological limitations in capturing the effects of individual variation in real-world systems. We suggest improvements that derive from theory developed in demography to study frailty variation [8]. Using simulations, we illustrate the problem by incorporating individual variation within infectious disease models and formulate a pragmatic approach to estimate the most impactful forms of heterogeneity. We use the examples of acquired immunodeficiency syndrome (AIDS) and coronavirus disease 2019 (COVID-19) to illustrate the effects that individual heterogeneity can have on the performance of mathematical models for the dynamics of endemic and epidemic diseases.

Since the detection of AIDS in the early 1980s, it has been evident that heterogeneity in individual sexual behaviours needed to be considered in mathematical models for the transmission of the causative agent, the Human Immunodeficiency Virus (HIV) [9]. Much research has been devoted to measuring contact networks in diverse settings and by different methods, to attempt to reproduce transmission dynamics accurately [10] [11] [12]. However, other equally important sources of inter-individual variation were overlooked. For example, unmodelled heterogeneity in infectiousness and susceptibility led to over-emphasis of acute-phase HIV as a driver of new infections [13].
This resulted in an overlooked opportunity in "treatment as prevention" measures.

The problem of unaccounted heterogeneity in disease forecast models can be illustrated with the simplest mathematical description of infectious disease transmission in a host population. Figure 1 shows the prevalence of infection over time under three alternative scenarios: all individuals are at equal risk of acquiring infection (black trajectories); individual risk is affected by a factor that modifies either their susceptibility to infection (blue) or exposure through connectivity with other individuals (green). Homogeneous models assign every individual a risk factor of 1 (black frequency plot), whereas heterogeneous risk derives from a distribution with mean one (blue and green density plots). As the virus spreads within the population, individuals at higher risk are predominantly infected as indicated at endemic equilibrium (Figure 1 A, B, C, density plots on the right, coloured red) and after 100 years of control (Figure 1 D, E, F). The control strategy applied to endemic equilibrium in the figure is the 90-90-90 treatment as prevention target advocated by the Joint United Nations Programme on HIV/AIDS (UNAIDS) whereby 90% of HIV-infected individuals should be detected, with 90% of these receiving antiretroviral therapy, and 90% of these should achieve viral suppression (becoming effectively non-infectious). Figure 1 shows that heterogeneous models that account for wide biological and social variation require higher basic reproduction numbers (R0) to reach a given endemic level and predict less impact for control efforts when compared with the homogeneous counterpart model. This holds true regardless of whether heterogeneity affects susceptibility or connectivity and extends to more realistic combinations of the two traits.
At endemic equilibrium, individuals at higher risk are predominantly infected (red distributions have mean greater than one as marked by the red vertical lines), and hence those who remain uninfected are individuals with lower risk (blue and green distributions have mean lower than one as marked by the black vertical lines). Thus, the mean risk in the uninfected but susceptible subpopulation decreases, and the epidemic decelerates (thin blue and green curves); higher values of R0 are consequently required if the heterogeneous models are to attain the same endemic level as the homogeneous formulation (heavy blue and green curves). Finally, interventions are less impactful under heterogeneity because R0 is implicitly higher. Indeed, these biases could help explain trends in HIV incidence data which lag substantially behind targets informed by model predictions, even in settings that have reached the 90-90-90 implementation targets [5, 6].

At the end of 2019, a novel severe acute respiratory syndrome coronavirus (SARS-CoV-2) isolated from a patient in China began to spread worldwide causing the COVID-19 pandemic. Countrywide epidemics have been extensively analysed and modelled throughout the world. Initial studies projected attack rates of around 90% if transmission had been left unmitigated [14], while subsequent reports noted that individual variation in susceptibility or exposure might flatten epidemic curves and reduce these estimates substantially [15] [16] [17], as shown in Figure 2. Figure 2 illustrates how reinfection risk is likely to be overestimated when heterogeneity is neglected (black horizontal line represents individual risk ratio while blue and green curves depict time-dependent population risk ratios under heterogeneous susceptibility and connectivity, respectively). Representing individual variation is necessary to predict infectious disease dynamics and inform policy.
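The effect of selective depletion on unmitigated attack rates can be sketched numerically. The example below is an illustration under stated assumptions, not the model behind the figures: it assumes a closed SIR-type epidemic without interventions, heterogeneity in susceptibility only, and a gamma distribution of susceptibility with mean one and coefficient of variation `cv`. Under those assumptions the final attack rate z solves z = 1 - E[exp(-x R0 z)], where the expectation over individual susceptibility x has the closed form (1 + cv^2 R0 z)^(-1/cv^2). The function name `final_size` and its parameters are hypothetical.

```python
import math


def final_size(r0, cv=0.0, tol=1e-12):
    """Final attack rate of an unmitigated SIR epidemic (illustrative sketch).

    Assumes susceptibility is gamma distributed with mean 1 and coefficient
    of variation `cv`; cv = 0 recovers the homogeneous model.
    Solves z = 1 - E[exp(-x * r0 * z)] by bisection.
    """
    if r0 <= 1.0:
        return 0.0  # no large epidemic below the threshold

    if cv == 0.0:
        # Homogeneous population: every individual has x = 1.
        def laplace(s):
            return math.exp(-s)
    else:
        k = 1.0 / cv ** 2  # gamma shape parameter (scale is 1/k, so mean is 1)

        def laplace(s):
            return (1.0 + s / k) ** (-k)

    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The residual is positive below the non-trivial root, negative above.
        if 1.0 - laplace(r0 * mid) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for cv in (0.0, 0.5, 1.0, 2.0):
        print(f"R0 = 3, CV = {cv}: attack rate = {final_size(3.0, cv):.3f}")
```

With the same R0, the heterogeneous versions give smaller attack rates than the homogeneous one because the most susceptible individuals are infected first, lowering the mean susceptibility of those who remain. Heterogeneity in connectivity would require a different expression and is not covered by this sketch.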
Epidemic curves for COVID-19 are widely available, and it is possible to construct models with inbuilt risk distributions. Their shapes can be inferred by assessing their ability to mould simulated trajectories to observed epidemics, while accounting for realistic social and biomedical interventions [17]. Variation in infectiousness has been critical to the occurrence of explosive outbreaks resulting from superspreaders in both 2002 SARS-CoV-1 and 2019 SARS-CoV-2 [18, 19]. This heterogeneity is different, however: variation in infectiousness does not lead to selective depletion of the susceptible pool as variation in susceptibility or connectivity does, i.e., models with and without variation in infectiousness perform identically when implemented deterministically and only differ through stochastic processes.

The need to account for heterogeneity in risk of acquiring infections is generally applicable across other models of infectious disease epidemiology. Moreover, similar issues arise in methods intended to evaluate the efficacy of interventions from experimental studies. Individual variation in susceptibility or exposure to infection induces biases in cohort studies and clinical trials. Vaccine efficacy trials offer a useful illustration of the problem and give insight into a potential solution. In a vaccine trial, two groups of individuals are randomised to receive a vaccine or placebo and disease occurrences are recorded in each group. As disease affects predominantly higher-risk individuals, the mean risk among those who remain unaffected decreases and disease incidence declines. In the vaccine group the same trend will occur at a slower pace (presuming that the vaccine protects to some degree). As a result, the two randomised groups become different over time with more highly susceptible individuals remaining in the vaccine group.
The vaccine efficacy, estimated from the ratio of cases in the vaccinated group compared to the control group, therefore appears to wane (Figure 3) [20, 21]. This effect will be stronger in settings where transmission intensity is higher, inducing a trend of seemingly declining efficacy with disease burden [22]. The concept is illustrated in Figure 3 by simulating a vaccine trial with heterogeneous and homogeneous models analogous to those utilised in Figures 1 and 2. Selection on individual variation in disease susceptibility thus offers an explanation for vaccine efficacy trends that is entirely based on population level heterogeneity, in contrast with individual waning of vaccine-induced immunity [23]. It is important to disentangle their roles, as both may occur concurrently in a trial and lead to different interpretations of the same data. For example, waning of individual vaccine-induced immunity may superficially look the same as a population decline due to selection on individual variation. Capturing this in a timely manner requires multicentre trial designs with sites carefully selected over a gradient of transmission intensities (e.g., optimally spaced along the incidence axis in Figure 3 C, F), and analyses performed by fitting curves generated by models that incorporate individual variation. An alternative and more tightly controlled approach would be to use experimental designs in human infection challenge studies where these are available [24] to generate dose-response curves and apply similar models [25]. These approaches have recently been successfully tested in animal systems [26] [27] [28]. An essential purpose in suggesting these study designs (randomised controlled trials with long follow-up, multicentre trials over a gradient of transmission intensities, or dose-response infection challenges) is to enable the unfolding of selection gradients in such a way that individual risk heterogeneity can be inferred from observed patterns of infection.
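The apparent waning described above can be reproduced with a short calculation. The sketch below is an illustration under assumptions that are not taken from the trial simulations in Figure 3: individual risk is gamma distributed with mean one and coefficient of variation `cv`, the force of infection is constant, and the vaccine is "leaky", multiplying every individual's risk by one minus a constant true efficacy. Under these assumptions the hazard among those still unaffected at time t is lambda / (1 + cv^2 lambda t), and apparent efficacy is taken here as one minus the ratio of the two group-level hazards (a hazard-based measure rather than a cumulative case ratio). The function name and parameters are hypothetical.

```python
def apparent_efficacy(t, true_efficacy, force_of_infection, cv):
    """Hazard-based vaccine efficacy observed at time t (illustrative sketch).

    Assumes gamma-distributed individual risk (mean 1, coefficient of
    variation `cv`), a constant force of infection, and a leaky vaccine with
    constant individual-level efficacy. No individual waning is modelled.
    """
    v = cv ** 2  # variance of the individual risk distribution
    lam_control = force_of_infection
    lam_vaccine = (1.0 - true_efficacy) * force_of_infection

    # Hazard among individuals still unaffected at time t in each arm.
    # Selection removes high-risk individuals faster in the control arm.
    hazard_control = lam_control / (1.0 + v * lam_control * t)
    hazard_vaccine = lam_vaccine / (1.0 + v * lam_vaccine * t)

    return 1.0 - hazard_vaccine / hazard_control


if __name__ == "__main__":
    for lam in (0.05, 0.1, 0.2):  # low to high transmission intensity
        row = [apparent_efficacy(t, 0.5, lam, cv=1.0) for t in (0, 5, 10, 20)]
        print(f"lambda = {lam}: " + ", ".join(f"{e:.3f}" for e in row))
```

In this sketch the individual-level efficacy never changes, yet the measured efficacy declines over follow-up and declines faster where the force of infection is higher. Setting `cv=0` removes the selection effect and the measured efficacy stays at its true value.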
Heterogeneities in predisposition to infection depend on the mode of transmission. In respiratory infections, heterogeneity may arise from variation in exposure of the susceptible host to the pathogen, or the competence of host immune systems to control it. These two processes have multiple component factors. Some of the most studied are age, patterns of inter-personal contacts, exposure to smoke, nutritional status, pre-existing respiratory illness such as asthma or chronic obstructive pulmonary disease, and the presence of other The mechanisms underpinning single factors for infection and their interactions determine individual propensities to acquire disease. These factors are potentially so numerous and intertwined that to attain a full mechanistic description is likely unfeasible. Even if a list of all putative factors were available, the measurement of effect sizes might be subject to selection within cohorts resulting in underestimated variances [29] . To contribute constructively to the development of health policies, model building involves compromises between leaving factors out (reductionism) or adopting a broader but coarse description (holism). Holistic descriptions of heterogeneity are currently underutilised in infectious diseases. Descriptive measures of individual variation can be formulated into disease transmission models, whether they depict endemic [30, 31] or epidemic [17] processes, in much the same way that they are used to describe risk inequality in non-communicable diseases, such as cancer [32] , or non-health disciplines, such as economics, offering a holistic approach to improve the predictive capacity of models. Having conceived the model, the challenge becomes the quantification of relevant statistical dispersion parameters. In epidemic diseases, characterised by marked temporal dynamics, individual variation can be most simply estimated by fitting dynamic models to series of reported infections, hospitalisations, or deaths [17] . 
Endemic diseases typically display less change over time from which to learn, so more creative approaches are needed to unfold selection gradients. This may involve stratifying the population into groups of individuals with similar risk, which may be as granular as individual level for frequent diseases, such as influenza or malaria [31], geographical units for diseases which cluster by proximity, such as tuberculosis [30], or familial relatedness when there is a clear genetic contribution to risk, such as cancer [32]. By recording disease events in each group, specific incidence rates can be calculated and ranked. The formulated models, which incorporate explicit distributions of individual risk, are then fitted to the stratified data to estimate the extent of individual variation among other parameters of interest. Once developed, these models will automatically adjust average risks in susceptible subpopulations to changes in transmission intensity, whether these occur naturally or in response to interventions. Not subject to the selection biases described in this paper, this modelling approach inherently enables more accurate impact forecasts for use in policy development.

There is compelling evidence for the utility of holistic indicators that account for individual variation in disease risk, admitting that heterogeneity is so vast in real-world systems that complete mechanistic reconstructions may be unachievable. Inspired by other population disciplines and supported by successful applications in both infectious and noncommunicable diseases, we describe methods of study design and analyses that enable holistic inferences of heterogeneity by estimating how much selection occurs as susceptible subpopulations are depleted through infection. These methods rely on unfolding selection gradients.
Applying these approaches to epidemiology offers significant advantages: disease models could provide more accurate descriptions of intervention effects, and better disease forecasts.

MGMG conceived the idea and all authors contributed to the development and writing of this article.

We declare no competing interests.

[1] An application of the theory of probabilities to the study of a priori pathometry, Part I
[2] An application of the theory of probabilities to the study of a priori pathometry, Part II
[3] A contribution to the mathematical theory of epidemics
[4] Modeling infectious disease dynamics in the complex landscape of global health
[5] Is the UNAIDS target sufficient for HIV control in Botswana?
[6] Joint United Nations Programme on HIV/AIDS (UNAIDS). 2017. Global AIDS update
[7] Elimination of lymphatic filariasis in South East Asia
[8] Impact of heterogeneity in individual frailty on the dynamics of mortality
[9] A preliminary study of the transmission dynamics of the human immunodeficiency virus (HIV), the causative agent of AIDS
[10] Heterogeneities in the transmission of infectious agents: implications for the design of controls programs
[11] Networks and epidemic models
[12] Transmission network parameters estimated from HIV sequences for a nationwide epidemic
[13] Reassessment of HIV-1 acute phase infectivity: accounting for heterogeneity and study design with simulated cohorts
[14] Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand (Imperial College COVID-19 Response Team
[15] A mathematical model reveals the influence of population heterogeneity on herd immunity to SARS-CoV-2
[16] Herd immunity under individual variation and reinfection
[17] Individual variation in susceptibility or exposure to SARS-CoV-2 lowers the herd immunity threshold
[18] Superspreading and the effect of individual variation on disease emergence
[19] Clustering and superspreading potential of SARS-CoV-2 infections in Hong Kong
[20] Estimability and interpretability of vaccine efficacy using frailty mixing models
[21] Apparent declining efficacy in randomized trials: Examples of the Thai RV144 HIV vaccine and CAPRISA 004 microbicide trials
[22] Clinical trials: the mathematics of falling vaccine efficacy with rising disease incidence
[23] Seven-year efficacy of RTS,S/AS01 malaria vaccine among young African children
[24] Design, recruitment, and microbiological considerations in human challenge studies
[25] A missing dimension in measures of vaccination impacts
[26] Vaccine effects on heterogeneity in susceptibility and implications for population health management
[27] Unveiling time in dose-response models to infer host susceptibility to pathogens
[28] Variation in Wolbachia effects on Aedes mosquitoes as a determinant of invasiveness and vectorial capacity
[29] Understanding variation in disease risk: the elusive concept of frailty
[30] Introducing risk inequality metrics in tuberculosis policy development
[31] Modelling the epidemiology of residual Plasmodium vivax malaria in a heterogeneous host population: a case study in the Amazon Basin
[32] Inequality in genetic cancer risk suggests bad genes rather than bad luck

TW is supported by grants from the Wellcome Trust, UK (209075/Z/17/Z) and the Medical Research Council, Department for International Development, and Wellcome Trust (Joint Global Health Trials, MR/V004832/1).