key: cord-0821312-wqw14s93 authors: Rasigade, J.-P.; Barray, A.; Shapiro, J. T.; Coquisart, C.; Vigouroux, Y.; Bal, A.; Destras, G.; Vanhems, P.; Lina, B.; Josset, L.; Wirth, T. title: SARS-CoV-2 phylodynamics differentiates the effectiveness of non-pharmaceutical interventions date: 2020-08-26 journal: nan DOI: 10.1101/2020.08.24.20180927 sha: 24669b71fca9b16e9058b5b0277cc26adf07dc49 doc_id: 821312 cord_uid: wqw14s93

Quantifying the effectiveness of large-scale non-pharmaceutical interventions against COVID-19 is critical to adapting responses against future waves of the pandemic. By combining phylogenetic data of 5,198 SARS-CoV-2 genomes with the chronology of non-pharmaceutical interventions in 57 countries, we examine how interventions and combinations thereof alter the divergence rate of viral lineages, which is directly related to the epidemic reproduction number. Home containment and education lockdown had the largest independent impacts and were predicted to reduce the reproduction number by 35% and 26%, respectively. However, we find that in contexts with a reproduction number >2, no individual intervention is sufficient to stop the epidemic and increasingly stringent intervention combinations may be required. Our phylodynamic approach can complement epidemiological models to inform public health strategies against COVID-19.

effective reproduction number R (13), we quantify the reduction of R independently attributable to each intervention, exploiting heterogeneities in their nature and timing across countries in multivariate models. In turn, these results allow us to estimate the probability of stopping the epidemic (R < 1) when implementing selected combinations of interventions.

The dissemination and detection of a virus in a population can be described as a transmission tree (Fig. 1A) whose shape reflects that of the dated phylogeny of the sampled pathogens (Fig. 1B).
In a phylodynamic context, it is assumed that each lineage, represented by a branch in the phylogenetic tree, belongs to a single patient and that lineage divergence events, represented by tree nodes, coincide with transmission events (10). Thus, branches in a dated phylogeny represent intervals of time between divergence events interpreted as transmission events. This situation can be translated in terms of survival analysis, which models rates of event occurrence, by considering divergence as the event of interest and by treating branch lengths as time-to-event intervals (Fig. 1C-D). Phylogenetic survival analysis was devised by E. Paradis and applied to detecting temporal variations in the divergence rate of tanagers (11) or fishes (14), but it has not been applied to pathogens so far (12, 15, 16). To quantify the effect of non-pharmaceutical interventions on the transmission rate of COVID-19, we adapted the original model in (11) to account for the specific setting of viral phylodynamics (see Methods). Hereafter, we refer to the modified model as phylodynamic survival analysis. In survival analysis terms, we interpret internal branches of the phylogeny (those that end with a transmission event) as time-to-event intervals and terminal branches (those that end with a sampling event) as censored intervals (Fig. 1C; see Methods). The time-to-event intervals are loosely related to the so-called clinical onset serial interval, which is the delay between the onset of symptoms in the source and infected patients in a transmission pair (but see (17)). The predictors of interest in our setting, namely, the non-pharmaceutical interventions, vary both through time and across lineages depending on their geographic location. To model this, we assigned each divergence event (and subsequent branch) to a country using maximum-likelihood ancestral state reconstruction (18). Each assigned branch was then associated with the set of non-pharmaceutical interventions that were active or not in the country during the interval spanned by the branch. Intervals containing a change of intervention were split into subintervals (19). These (sub)intervals were the final observations (statistical units) used in the survival models. Models were adjusted for the hierarchical dependency structure introduced by interval splits and country assignations (18).

Figure 1. Under the assumption that each viral lineage in a phylogeny belongs to an infected patient, the dates of viral transmission and sampling events in a transmission tree (A) coincide with the dates of divergence events (nodes) and tips, respectively, of the dated phylogeny reconstructed from the viral genomes (B). Treating viral transmission as the event of interest for survival modelling, internal branches connecting two divergence events are interpreted as time-to-event intervals while terminal branches, which do not end with a transmission event, are interpreted as censored intervals (C). Translating the dated phylogeny in terms of survival events enables visualizing the probability of transmission through time as a Kaplan-Meier curve (D) and modelling the transmission rate using Cox proportional hazards regression.

medRxiv preprint, this version posted August 26, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. https://doi.org/10.1101/2020.08.24.20180927
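The translation of a dated phylogeny into survival data described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Branch` class, the toy tree and the hand-written Kaplan-Meier routine are hypothetical and only show how internal branches become events and terminal branches become censored intervals.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """Hypothetical record for one branch of a dated phylogeny (dates in days)."""
    start_date: float   # date of the divergence event at the branch origin
    end_date: float     # date of the node (internal) or tip (terminal) at the branch end
    is_terminal: bool   # True if the branch ends with a sampled tip


def to_survival_intervals(branches):
    """Return (duration, event) pairs: event=1 for divergence, 0 for censoring."""
    return [(b.end_date - b.start_date, 0 if b.is_terminal else 1) for b in branches]


def kaplan_meier(intervals):
    """Kaplan-Meier estimate of the probability that no divergence occurred by time t."""
    at_risk = len(intervals)
    survival = 1.0
    curve = []
    for t in sorted({duration for duration, _ in intervals}):
        events = sum(1 for d, e in intervals if d == t and e == 1)
        leaving = sum(1 for d, _ in intervals if d == t)
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored intervals both leave the risk set
    return curve


# Toy tree (assumed data): two internal branches and four terminal branches.
toy_tree = [
    Branch(0.0, 3.0, False), Branch(0.0, 6.0, True),
    Branch(3.0, 8.0, False), Branch(3.0, 12.0, True),
    Branch(8.0, 15.0, True), Branch(8.0, 20.0, True),
]
intervals = to_survival_intervals(toy_tree)
km_curve = kaplan_meier(intervals)
print(km_curve)
```

In this toy tree the two internal branches (3 and 5 days) are the only events, so the curve steps down twice while the four censored terminal branches only shrink the risk set.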
The evolution of lineages in a dated viral phylogeny can be described as a birth-death process with a divergence (or birth) rate λ and an extinction (or death) rate μ (20). In a phylodynamic context, the effective reproduction number R equals the ratio of the divergence and extinction rates (20). Coefficients of phylodynamic survival models (the so-called hazard ratios; see Methods) act as multiplicative factors of the divergence rate λ, independent of the true value of λ, which need not be specified nor evaluated. As R = λ/μ, multiplying λ by a coefficient also multiplies R, independent of the true value of μ. Thus, coefficients of phylodynamic survival models estimate variations of R in response to predictor variables without requiring external knowledge of, or making assumptions about, λ and μ.

We assembled a composite dataset by combining a dated phylogeny of SARS-CoV-2 (Fig. 2A), publicly available from Nextstrain (21) and built from the GISAID initiative data (22), with a detailed timeline of non-pharmaceutical interventions available from the Oxford COVID-19 Government Response Tracker (OxCGRT) (18, 23). Figure S1 shows a flowchart outlining the data sources, sample sizes and selection steps of the study. Phylogenetic and intervention data covered the early phase of the epidemic up to May 4, 2020. The 5,198 SARS-CoV-2 genomes used to reconstruct the dated phylogeny were collected from 74 countries. Detailed per-country data including sample sizes are shown in Data S1. Among the 10,394 branches in the phylogeny, 2,162 branches (20.8%) could not be assigned to a country with >95% confidence and were excluded, also reducing the number of represented countries from 74 to 59 (Fig. S1; a comparison of included and excluded branches is shown in Fig. S2). The remaining 4,025 internal branches had a mean time-to-event (delay between transmission events) of 4.4 days (Fig. 2B).
These data were congruent with previous estimates of the mean serial interval of COVID-19 ranging from 3.1 days to 7.5 days (24). The 4,207 terminal branches had a mean time-to-censoring (delay from infection to detection) of 10.6 days (Figure 2A-B). This pattern of longer terminal vs. internal branches is typical of a viral population in fast expansion (10).

We compared the timing and dynamics of COVID-19 spread in countries represented in our dataset (Figure 2C-D

Figure 2. Comparison of the timing and reproduction numbers of the COVID-19 epidemic in 74 countries based on a dated phylogeny. A: dated phylogeny of 5,198 SARS-CoV-2 genomes where internal (time-to-event) and terminal (time-to-censoring) branches are colored red and blue, respectively. B: histogram of internal and terminal branch lengths. C: box-and-whisker plots of the distribution over time of the inferred transmission events in each country, where boxes denote interquartile range (IQR) and median, whiskers extend to dates at most 1.5× the IQR away from the median date, and circle marks denote dates farther than 1.5 IQR from the median date. D: point estimates and 95% confidence intervals of the relative effective reproduction number, expressed as percent changes relative to China, in 27 countries with ≥10 assigned transmission events. Countries with <10 assigned transmission events (n=32) were pooled into the 'Others' category. E, F: representative Kaplan-Meier survival curves of the probability of transmission through time in countries with comparable (E) or highly different (F) transmission rates. '+' marks denote censoring events. Numbers denote counts of internal branches and, in brackets, terminal branches. G, H: scatter plots of the reported numbers of COVID-19 cases and deaths per country, in absolute values (G) and per million inhabitants (H).

The implementation and release dates of large-scale non-pharmaceutical interventions against COVID-19 were available for 57 countries out of the 59 represented in the dated phylogeny. Definitions of the selected interventions are shown in Table 1 (18). Branches assigned to countries with missing intervention data, namely, Latvia and Senegal, were excluded from further analysis.

Contrasting with previous approaches that constrained coefficients (9), this intuition was not enforced a priori in our multivariate model, in which positive coefficients (increasing R) might have arisen due to noise or collinearity between interventions. The absence of unexpectedly positive coefficients suggests that our model correctly captured the epidemic slowdown that accompanied the accumulation of interventions.

A reduction of R through time, independent of the implementation of interventions, might lead to overestimating their effect in our model. Several potential confounders might reduce R through time but cannot be precisely estimated and included as control covariates. These included the progressive acquisition of herd immunity, the so-called artificial diversification slowdown possibly caused by incomplete sampling, and time-dependent variations of the sampling effort (see Methods). To quantify this potential time-dependent bias, we constructed an additional model including the age of each branch as a covariate (Table S1). The coefficients in this time-adjusted model only differed by small amounts compared to the base model. Moreover, the ranking by effectiveness of the major interventions remained unchanged, indicating that our estimates were robust to time-dependent confounders. We also quantified the sensitivity of the estimated intervention effects to the inclusion of other interventions (collinearity) by excluding interventions one by one in 9 additional models (Fig. 3C). This pairwise interaction analysis confirmed that most of the estimated effects were strongly independent.
Residual interferences were found for gathering restrictions, whose full-model effect of -22.3% was reinforced to -33.5% when ignoring home containment; and for cancelling public events, whose full-model effect of -0.97% was reinforced to -15.1% when ignoring gathering restrictions. These residual interferences make epidemiological sense because home containment prevents gatherings and gathering restrictions also prohibit public events. Overall, the absence of strong interferences indicated that our multivariate model reasonably captured the independent, cumulative effect of interventions, enabling ranking their impact on COVID-19 spread.

Fig. 2C. B: point estimates and 95% confidence intervals of the independent % change of the effective reproduction number predicted by each intervention in a multivariate, mixed-effect phylogenetic survival model adjusted for between-country variations. C: matrix of pairwise interactions between the interventions (in rows) estimated using 9 multivariate models (in columns), where each model ignores exactly one intervention. Negative (positive) differences in blue (red) denote a stronger (lesser) predicted effect of the intervention in row when ignoring the intervention in column. D, E: simulated impact of interventions implemented independently (D) or in sequential combination (E) on the count of simultaneous cases in an idealized population of 1 million susceptible individuals using compartmental SIR models with a basic reproduction number R0 = 3 (black lines) and a mean infectious period of 2 weeks. Shaded areas in (D) denote 95% confidence bands.
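The percent changes of R reported above are hazard ratios expressed on another scale. The following Python sketch illustrates the conversion and why it does not depend on the extinction rate. The function names and the (divergence rate, extinction rate) pairs are hypothetical; only the -34.6% figure for home containment comes from the text.

```python
import math


def percent_change_in_r(log_hazard_ratio):
    """Percent change of R implied by a Cox coefficient (log-hazard ratio)."""
    return (math.exp(log_hazard_ratio) - 1.0) * 100.0


def r_after(divergence_rate, extinction_rate, log_hazard_ratio):
    """R = lambda / mu once the divergence rate is multiplied by the hazard ratio."""
    return divergence_rate * math.exp(log_hazard_ratio) / extinction_rate


# Log-hazard ratio matching the reported -34.6% change for home containment.
beta_home = math.log(1.0 - 0.346)

# Hypothetical (lambda, mu) pairs: the relative change of R is the same for all of them.
rate_pairs = [(0.3, 0.1), (1.5, 0.5), (6.0, 2.0)]
relative_changes = [r_after(lam, mu, beta_home) / (lam / mu) for lam, mu in rate_pairs]

print(round(percent_change_in_r(beta_home), 1))
print([round(rc, 3) for rc in relative_changes])
```

Whatever the assumed rates, the ratio of R after and before the intervention equals the hazard ratio, which is why the models need no external estimate of the divergence or extinction rate.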
To facilitate the interpretation of our estimates of the effectiveness of interventions against COVID-19, we simulated each intervention's impact on the peak number of cases, whose reduction is critical to prevent overwhelming the healthcare system (Fig. 3D and Fig. S8). We used compartmental Susceptible-Infected-Recovered (SIR) models with a basic reproduction number R0 = 3 and a mean infectious period of 2 weeks (see Methods). In this idealized setting, home containment, independent of all other restrictions, only halved the peak number of cases from 3.0×10^5 to 1.5×10^5 (95% CI, 1.0×10^5 to 2.0×10^5) (Fig. 3D). However, a realistic implementation of home containment also implies other restrictions including, at least, restrictions on movements, gatherings, and public events. This combination resulted in a relative change in R of -50.8% (95% CI, -59.4% to -40.2%) and a 5-fold reduction of the peak number of cases to 6.0×10^4 (95% CI, 1.9×10^4 to 1.2×10^5). Nevertheless, if R0 = 3 then a 50% reduction is still insufficient to reduce R below 1 and stop the epidemic. This suggests that even when considering the most stringent interventions, combinations may be required. To further examine this issue, we estimated the effect of accumulating interventions by their average chronological order shown in Fig. 3A, from information campaigns alone to all interventions combined including home containment (Fig. 3E).
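The arithmetic behind the statement that a 50% reduction cannot stop an epidemic with R0 = 3 can be checked with a short Python sketch. The probability calculation follows the normal approximation on the log scale described in Methods, but the standard error is an assumption: it is back-calculated here from the reported 95% confidence interval and is not taken from the authors' model output. Function names are hypothetical.

```python
import math


def required_reduction(r0):
    """Smallest relative reduction of R (as a fraction) that brings R below 1."""
    return 1.0 - 1.0 / r0


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def prob_stop(r0, coef_mean, coef_sd):
    """P(R < 1), assuming log R ~ Normal(log r0 + coef_mean, coef_sd**2)."""
    return normal_cdf(0.0, math.log(r0) + coef_mean, coef_sd)


# Combined effect of home containment and its implied restrictions (from the text).
coef_mean = math.log(1.0 - 0.508)
# Assumption: standard error recovered from the reported 95% CI (-59.4% to -40.2%).
coef_sd = (math.log(1.0 - 0.402) - math.log(1.0 - 0.594)) / (2.0 * 1.96)

r_with_combination = 3.0 * math.exp(coef_mean)  # still above 1
print(round(r_with_combination, 3), round(required_reduction(3.0), 3))
print(prob_stop(3.0, coef_mean, coef_sd), prob_stop(1.5, coef_mean, coef_sd))
```

With R0 = 3, about two thirds of transmission must be removed to reach R < 1, so the combination leaves R near 1.5. The same combination would very likely suffice if the starting reproduction number were 1.5.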
We present a phylodynamic analysis of how the divergence rate and reproduction number of SARS-CoV-2 vary in response to large-scale non-pharmaceutical interventions in 57 countries. Our results suggest that no single intervention, including home containment, is sufficient on its own to stop the epidemic (R < 1). Increasingly stringent combinations of interventions may be required depending on the effective reproduction number.

Home containment was repeatedly estimated to be the most effective response in epidemiological studies from China (27), France (28), the UK (29), and Europe (9). Other studies modelled the additional (or residual) reduction of R by an intervention after taking into account those previously implemented (4, 8). Possibly because home containment was the last implemented intervention in many countries, these studies reported a weaker or even negligible additional effect compared to earlier interventions. In our study, home containment, even when implemented last, had the strongest independent impact on epidemic spread (R percent change, -34.6%), which was further amplified (-50.8%) when taking into account implicit restrictions on movements, gatherings and public events.

We found that education lockdown substantially decreased COVID-19 spread (R percent change, -25.6%). Contrasting with home containment, the effectiveness of education lockdown has been more hotly debated. This intervention ranked among the most effective ones in a study of 41 countries (4) but had virtually no effect on transmission in other reports from Europe (8, 9).
Children have been estimated to be poor spreaders of COVID-19 and less susceptible than adults to developing disease after an infectious contact, counteracting the effect of their higher contact rate (6, 30). However, the relative susceptibility to infection was shown to increase sharply between 15 and 25 years, from 0.40 to 0.79 (30). Importantly, we could not differentiate the effect of closing schools and universities because both closures coincided in all countries. Thus, our finding that education lockdowns reduce COVID-19 transmission might be driven by contact rate reductions in older students rather than in children, as hypothesized elsewhere (4), and, in addition, by parents staying at home with their children.

Restrictions on gatherings of >100 persons appeared more effective than cancelling public events (R percent changes, -22.3% vs. -1.0%, respectively) in our phylodynamic model, in line with previous results from epidemiological models (4). Notwithstanding that gathering restrictions prohibit public events, possibly causing interferences between estimates (Fig. 3C), this finding is intriguing. Indeed, several public events resulted in large case clusters, the so-called superspreading events, that triggered epidemic bursts in France (31), South Korea (32) or the U.S. (33). A plausible explanation for not detecting the effectiveness of cancelling public events is that data-driven models, including ours, better capture the cumulative effect of more frequent events such as gatherings than the massive effect of much rarer events such as superspreading public events. This bias towards ignoring the so-called 'Black Swan' exceptional events (34) suggests that our findings (and others' (4)) regarding restrictions on public events should not be interpreted as an encouragement to relax these restrictions but as a potential limit of modelling approaches (but see (35)).

There are other limitations to our study, including its retrospective design.
We could not consider important non-pharmaceutical interventions that are difficult to date and quantify, such as contact tracing or case isolation policies. Data were analyzed at the national level, although much virus transmission was often concentrated in specific areas and some non-pharmaceutical interventions were implemented at the sub-national level (36). From a statistical standpoint, the interval lengths in the dated phylogeny were treated as fixed quantities in the survival models. Ignoring the uncertainty of the estimated lengths might underestimate the width of confidence intervals, although this is unlikely to have biased the pointwise estimates and the ranking of interventions' effects. The number of genomes included by country did not necessarily reflect the true number of cases, which might have influenced country comparison results in Fig. 2, but not intervention effectiveness models in Fig. 3, which were adjusted for between-country variations. Finally, our estimates represent averages over many countries with different epidemiological contexts, healthcare systems, cultural behaviors and nuances in intervention implementation details and population compliance. This global approach facilitates unifying the interpretation of intervention effectiveness, but this interpretation still needs to be adjusted to local contexts by policy makers.

Beyond the insights gained into the impact of interventions against COVID-19, our findings highlight how phylodynamic survival analysis can help leverage pathogen sequence data to estimate epidemiological parameters.
Contrasting with the Bayesian approaches adopted by most, if not all, previous assessments of intervention effectiveness (4, 7, 9), phylodynamic survival analysis does not require any quantitative prior assumption or constraint on model parameters. The method should also be simple to implement and extend by leveraging the extensive software arsenal of survival modelling. Phylodynamic survival analysis may complement epidemiological models as pathogen sequences accumulate, making it possible to address increasingly complex questions relevant to public health strategies.

indicators can be found in (37). We focused on large-scale interventions against transmission that did not target specific patients (for instance, we did not consider contact tracing) and we excluded economic and health interventions except for information campaigns. This rationale led to the selection of the 9 indicators shown in Table 1. To facilitate interpretation while constraining model complexity, the ordinal-scale indicators in OxCGRT data were recoded as binary variables in which we only considered government requirements (as opposed to recommendations) where applicable. We did not distinguish between localized and nation-wide interventions because localized interventions, especially in larger countries, targeted the identified epidemic hotspots. As the data did not allow differentiating closures of schools and universities, we use the term 'education lockdown' (as opposed to 'school closure' in (23)) to avoid misinterpretation regarding the education levels concerned.
The original phylogenetic survival model in (11) and its later extensions (38) considered intervals backward in time, from the tips to the root of the tree, and were restricted to trees with all tips sampled at the same date relative to the root (ultrametric trees). Censored intervals (intervals that do not end with an event) in (11) were used to represent lineages with known sampling date but unknown age. In contrast, viral samples in ongoing epidemics such as COVID-19 are typically collected through time. A significant evolution of the viruses during the sampling period violates the ultrametric assumption. To handle phylogenies of these so-called measurably evolving populations (39), we propose a different interpretation of censoring compared to (11). Going forward in time, the internal branches of a tree connect two divergence events while terminal branches, those that end with a tip, connect a divergence event and a sampling event (Fig. 1B). Thus, we considered internal branches as time-to-event intervals and terminal branches as censored intervals representing the minimal duration during which no divergence occurred (Fig. 1C).

SARS-CoV-2 genome sequences have been continuously submitted to the Global Initiative on Sharing All Influenza Data (GISAID) by laboratories worldwide (22). To circumvent the computational limits of phylogeny reconstruction and time calibration techniques, the sequences of the GISAID database are subsampled before analysis by the Nextstrain initiative, using a balanced subsampling scheme through time and space (21, 40).
Phylogenetic reconstruction uses maximum-likelihood phylogenetic inference based on IQ-TREE (41) and time-calibration uses TreeTime (42). See (43) for further details on the Nextstrain bioinformatics pipeline. A dated phylogeny of 5,211 SARS-CoV-2 genomes, along with sampling dates and locations, was retrieved from nextstrain.org/ncov on May 12, 2020. Genomes of non-human origin (n = 13) were discarded from analysis. Polytomies (unresolved divergences represented as a node with >2 descendants) were resolved as branches with an arbitrarily small length of 1 hour, as recommended for adjustment of zero-length risk intervals in Cox regression (44). Of note, excluding these zero-length branches would bias the analysis by underestimating the number of divergence events in specific regions of the phylogeny. Maximum-likelihood ancestral state reconstruction was used to assign internal nodes of the phylogeny to countries in a probabilistic fashion, taking the tree shape and sampling locations as input data (45).

To prepare data for survival analysis, we decomposed the branches of the dated phylogeny into a set of time-to-event and time-to-censoring intervals (Fig. 1C). Intervals were assigned to the most likely country at the origin of the branch when this country's likelihood was >0.95. Intervals in which no country reached a likelihood of 0.95 were excluded from further analysis (Figs. S1-S2). Finally, intervals during which a change of intervention occurred were split into sub-intervals, such that all covariates, including the country and interventions, were held constant within each sub-interval and only the last sub-interval of an internal branch was treated as a time-to-event interval. This interval-splitting approach is consistent with an interpretation of interventions as external time-dependent covariates (19), which are not dependent on the event under study (the viral divergence).
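The interval-splitting step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names, the day-number dates and the two-intervention timeline are hypothetical, and release dates of interventions are ignored for brevity.

```python
def split_branch(start, end, is_internal, change_dates):
    """Split the interval [start, end] at every intervention change date inside it.

    Returns (sub_start, sub_end, event) tuples. Only the last sub-interval of an
    internal branch carries event=1; all other sub-intervals are censored.
    """
    cuts = sorted(d for d in change_dates if start < d < end)
    bounds = [start] + cuts + [end]
    pieces = []
    for i in range(len(bounds) - 1):
        is_last = i == len(bounds) - 2
        event = 1 if (is_internal and is_last) else 0
        pieces.append((bounds[i], bounds[i + 1], event))
    return pieces


def active_interventions(date, start_dates):
    """Names of interventions already in force at `date` (assumed never released)."""
    return sorted(name for name, start in start_dates.items() if start <= date)


# Hypothetical country timeline, in days since an arbitrary origin.
timeline = {"education lockdown": 14.0, "home containment": 20.0}

# Internal branch spanning days 10 to 25, crossing both implementation dates.
pieces = split_branch(10.0, 25.0, True, timeline.values())
rows = [(s, e, ev, active_interventions(s, timeline)) for s, e, ev in pieces]
for row in rows:
    print(row)
```

Each resulting row has constant covariates, which is the (start, stop, event) layout that counting-process Cox regression software generally expects.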
Variations of the divergence rate in response to non-pharmaceutical interventions were modelled using mixed-effect Cox proportional hazard regression (reviewed in (46)). Models treated the country and phylogenetic branch as random effects to account for non-independence between sub-intervals of the same branch and between branches assigned to the same country. The predictors of interest were not heritable traits of SARS-CoV-2; thus, phylogenetic autocorrelation between intervals was not corrected for. Time-to-event data were visualized using Kaplan-Meier curves with 95% confidence intervals. The regression models had the form

$$\lambda_i(t) = \lambda_0(t)\,\exp\left(x_i^\top \beta + b_j + c_k\right),$$

where $\lambda_i(t)$ is the hazard function (here, the divergence rate) at time $t$ for the $i$th observation, $\lambda_0(t)$ is the baseline hazard function, which is neither specified nor explicitly evaluated, $x_i$ is the set of predictors of the $i$th observation (the binary vector of active non-pharmaceutical interventions), $\beta$ is the vector of fixed-effect coefficients, $b_j$ is the random intercept associated with the $j$th phylogenetic branch and $c_k$ is the random intercept associated with the $k$th country. Country comparison models (Fig. 2D), in which the country was the only predictor and branches were not divided into sub-intervals, did not include random intercepts. Raw model coefficients (the log-hazard ratios) additively shift the logarithm of the divergence rate λ.

with mean equal to the sum of the means and variance equal to the sum of the variance-covariance matrix of the deviates. Thus, the coefficient corresponding to a sum of coefficients $\beta_i$ with means $\hat{\beta}_i$ and variance-covariance matrix $\Sigma$ has mean $\sum_i \hat{\beta}_i$ and variance $\sum_i \sum_j \Sigma_{ij}$, from which we derive the point estimates and confidence intervals of a combination of predictors. Importantly, summing over the covariances captures the correlation between coefficients when estimating the uncertainty of the combined coefficient.

A central question regarding the effectiveness of interventions or combinations thereof is whether their implementation can stop an epidemic by reducing R below 1 ( where Φ is the cumulative distribution function of the normal distribution with mean equal to the logarithm of the reproduction number without intervention plus the estimated coefficient, and variance equal to that of the coefficient. By integrating over the coefficient distribution, this method explicitly considers the estimation uncertainty of the coefficient when estimating this probability.

Time-dependent phylodynamic survival analysis assumes that variations of branch lengths through time directly reflect variations of the divergence rate, which implies that branch lengths are conditionally independent of time given the divergence rate. When the phylogeny is reconstructed from a fraction of the individuals, as is the case in virtually all phylodynamic studies including ours, this conditional independence assumption can be violated. This is because incomplete sampling increases the length of more recent branches relative to older branches (47), an effect called the diversification slowdown (48, 49).
Notably, this effect can be counteracted by a high extinction rate (16, 47), which is expected in our setting and mimics an acceleration of diversification. Moreover, whether the diversification slowdown should be interpreted as a pure artifact has been controversial (49, 50). Nevertheless, we considered incomplete sampling as a potential source of bias in our analyses because a diversification slowdown might lead to an overestimation of the effect of non-pharmaceutical interventions. Additionally, the selection procedure used by Nextstrain to collect the genomes included in the dated phylogeny possibly amplified the diversification slowdown by using a higher sampling fraction in earlier phases of the epidemic (40). To verify whether the conclusions of our models were robust to this potential bias, we built an additional multivariate model including the estimated date of each divergence event (the origin of the branch) as a covariate. Because the relation between time and the divergence rate is expected to be non-linear (47) and the coefficient variations resulting from controlling for time were moderate (Table S1), we refrained from including a time covariate in the reported regression models, as this might lead to overcontrol. Further research is warranted to identify an optimal function of time that might be included as a covariate in phylodynamic survival models to control for sources of diversification slowdown.

Epidemic dynamics can be described by partitioning a population of size $N$ into three compartments: the susceptible hosts $S$, the infected hosts $I$, and the recovered hosts $R$. The infection rate $\tau$ governs the transitions from $S$ to $I$ and the recovery rate $\nu$ governs the transitions from $I$ to $R$ (we avoid the standard notation $\beta$ and $\gamma$ for infection and recovery rates to prevent confusion with Cox model parameters).
The SIR model describes the transition rates between compartments as a set of differential equations with respect to time $t$,

$$\frac{dS}{dt} = -\frac{\tau S I}{N}, \qquad \frac{dI}{dt} = \frac{\tau S I}{N} - \nu I, \qquad \frac{dR}{dt} = \nu I.$$

The transition rates of the SIR model, namely the infection rate $\tau$ and the recovery rate $\nu$, define the basic reproduction number of the epidemic, $R_0 = \tau / \nu$. From a phylodynamic standpoint, if the population dynamics of a pathogen is described as a birth-death model with divergence rate $\lambda$ and extinction rate $\mu$, then the reproduction number is $\lambda / \mu$ or, alternatively, $(\lambda - \mu)/\mu + 1$ (51).

We simulated the epidemiological impact of each individual intervention in SIR models with $R_0 = 3$ and $\nu = 2$ weeks based on previous estimates (25, 26), yielding a baseline infection rate $\tau = \nu R_0 = 6$. In each model, the effective infection rate changed from $\tau$ to $\tau \cdot \exp(\beta)$ on the implementation date of an intervention with log-hazard ratio $\beta$. To determine realistic implementation delays, the starting time of the simulation was set at the date of the first local divergence event in each country and the implementation date was set to the observed median delay across countries (see Fig. 3A). All models started with 100 infected individuals at $t = 0$, a value assumed to reflect the number of unobserved cases at the date of the first divergence event, based on the temporality between the divergence events and the reported cases (Fig. S3) and on a previous estimate from the U.S. suggesting that the total number of cases might be two orders of magnitude larger than the reported count (52). Evaluation of the SIR models used the R package deSolve. All data and software code used to generate the results are available at github.com/rasigadelab/covid-npi.
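The two computations described in this section, the probability that a combination of interventions brings the reproduction number below 1 and the SIR trajectory after an intervention, can be illustrated with a minimal sketch. It is not the authors' code: they evaluated the SIR models with the R package deSolve, whereas this Python version uses a hand-written fourth-order Runge-Kutta integrator so that it depends only on the standard library. The hazard ratios 0.65 and 0.74 correspond to the 35% and 26% reductions reported for home containment and education lockdown; the variance-covariance values, the population size, the implementation delay and the abstract time unit are hypothetical and chosen only for illustration.

```python
import math


def combine(coefs, cov):
    """Mean and variance of a sum of correlated normal coefficients.

    The variance is the sum of all elements of the variance-covariance
    matrix, so covariances between coefficients are included.
    """
    return sum(coefs), sum(sum(row) for row in cov)


def prob_r_below_one(r0, mean, var):
    """P(r0 * exp(beta) < 1) for beta ~ Normal(mean, var).

    Equivalent to Phi(0) for a normal with mean log(r0) + mean, variance var.
    """
    z = -(math.log(r0) + mean) / math.sqrt(var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sir_derivatives(state, tau, nu, n):
    """Right-hand side of the SIR equations for state = (S, I, R)."""
    s, i, _ = state
    new_infections = tau * s * i / n
    return (-new_infections, new_infections - nu * i, nu * i)


def rk4_step(state, dt, tau, nu, n):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = sir_derivatives(state, tau, nu, n)
    k2 = sir_derivatives(shifted(k1, dt / 2), tau, nu, n)
    k3 = sir_derivatives(shifted(k2, dt / 2), tau, nu, n)
    k4 = sir_derivatives(shifted(k3, dt), tau, nu, n)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(n, i0, tau, nu, beta, t_switch, t_end, dt=0.01):
    """Integrate the SIR model; the infection rate changes from tau to
    tau * exp(beta) at time t_switch. Returns the final state and peak I."""
    state = (n - i0, float(i0), 0.0)
    peak = state[1]
    for step in range(int(round(t_end / dt))):
        rate = tau if step * dt < t_switch else tau * math.exp(beta)
        state = rk4_step(state, dt, rate, nu, n)
        peak = max(peak, state[1])
    return state, peak


if __name__ == "__main__":
    # Log-hazard ratios from the reported 35% and 26% reductions.
    coefs = [math.log(0.65), math.log(0.74)]
    cov = [[0.01, 0.002], [0.002, 0.01]]  # hypothetical values
    beta, var = combine(coefs, cov)
    print("combined hazard ratio:", math.exp(beta))
    print("P(R < 1) at R0 = 3:", prob_r_below_one(3.0, beta, var))

    # tau = 6 and nu = 2 give tau / nu = 3, as in the text;
    # N = 1e6 and the switch time 1.0 are hypothetical.
    _, peak = simulate(1e6, 100, 6.0, 2.0, beta, t_switch=1.0, t_end=10.0)
    print("peak number of infected hosts:", peak)
```

Keeping the coefficient combination separate from the simulation mirrors the text: the same combined coefficient feeds both the probability calculation and the change in infection rate.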
medRxiv preprint doi: https://doi.org/10.1101/2020.08.24.20180927; this version posted August 26, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Acknowledgements: We thank Philip Supply, François Vandenesch, Jean-Sébastien Casalegno, Vanessa Escuret, and Christophe Ramière for fruitful discussions and reviews of our work. We thank the GISAID, Nextstrain and OxCGRT teams for making their high-quality datasets available to the community. A list of authors and laboratories contributing SARS-CoV-2 genome sequences is shown in Data S3.

Funding: JPR received support from the FINOVI Foundation (grant R18037CC).

Competing interests: BL is currently active in groups advising the French government, for which BL is not receiving payment.

Data and material availability: Both data and analysis code are available online at https://github.com/rasigadelab/covid-npi.

Tables S1-S2; External Databases S1-S3; References (37-53)

References:

A new coronavirus associated with human respiratory disease in China
A pneumonia outbreak associated with a new coronavirus of probable bat origin
A Novel Coronavirus from Patients with Pneumonia in China
The effectiveness of eight nonpharmaceutical interventions against COVID-19
Impact of school closures for COVID-19 on the US health-care workforce and net mortality: a modelling study. The Lancet Public Health
No evidence of secondary transmission of COVID-19 from children attending school in Ireland
Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions
Impact of non-pharmaceutical interventions on documented cases of COVID-19. medRxiv, in press
Estimating the effects of nonpharmaceutical interventions on COVID-19 in Europe
Viral phylodynamics
Assessing temporal variations in diversification rates from phylogenies: estimation and hypothesis testing
Likelihood Methods for Detecting Temporal Shifts in Diversification Rates
The effective reproduction number R of an epidemic can be interpreted as the average number of new infections directly caused by a single infected patient. The effective reproduction number equals the basic reproduction number R0 in a fully susceptible population when no mitigation strategy is active
Speciation in North American black basses
Inferring Speciation Rates from
Between-divergence intervals are only approximately equal to serial intervals (up to variations in the incubation period) when there is a single infectee and the phylogenetic branch does not contain unobserved divergence events. When there are several infectees, between-divergence intervals, which begin with each divergence event, are shorter than serial intervals, which begin with the infector's illness
Materials and methods are available as supplementary materials at the Science Website
Time-dependent covariates in the Cox proportional-hazards regression model
Birth-death skyline plot reveals temporal changes of epidemic spread in HIV and hepatitis C virus (HCV)
Nextstrain: real-time tracking of pathogen evolution
GISAID: Global initiative on sharing all influenza data - from vision to reality
Coronavirus Government Response Tracker. Oxford COVID-19 Government Response Tracker, Blavatnik School of Government. Data use policy: Creative Commons Attribution CC BY standard
Serial interval of SARS-CoV-2 was shortened over time by nonpharmaceutical interventions
Factors Associated With Prolonged Viral RNA Shedding in Patients with Coronavirus Disease 2019
Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study. The Lancet Infectious Diseases
Changes in contact patterns shape the dynamics of the COVID-19 outbreak in China
Estimating the burden of SARS-CoV-2 in France
Centre for the Mathematical Modelling of Infectious Diseases COVID-19 working group, Effects of nonpharmaceutical interventions on COVID-19 cases, deaths, and demand for hospital services in the UK: a modelling study
Age-dependent effects in the transmission and control of COVID-19 epidemics
Serologic responses to SARS-CoV-2 infection among hospital staff with mild disease in eastern France. medRxiv, in press
Transmission potential and severity of COVID-19 in South Korea
How Superspreading Events Drive Most COVID-19 Spread
Forecasting unprecedented ecological fluctuations
Chopping the tail: how preventing superspreading can help to maintain COVID-19 control. medRxiv, in press
COVID-19) in the EU/EEA and the UK - tenth update
Statistical Analysis of Diversification with Species Traits
Measurably evolving pathogens in the genomic era
IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era
TreeTime: Maximum-likelihood phylodynamic analysis
Orientation: so, what does Nextstrain do? Tutorial: Using Nextstrain for SARS-CoV-2
Modeling Survival Data: Extending the Cox Model
The Maximum Likelihood Approach to Reconstructing Ancestral Character States of Discrete Characters on
The reconstructed evolutionary process
Tree of Life Reveals Clock-Like Speciation and Diversification
Prolonging the past counteracts the pull of the present: protracted speciation can explain observed slowdowns in diversification
The impact of taxon sampling on phylogenetic inference: a review of two decades of controversy
The epidemic behavior of the hepatitis C virus