key: cord-0949061-9o2tk4x5 authors: Romero-Severson, Ethan Obie; Hengartner, Nick; Meadors, Grant; Ke, Ruian title: Decline in global COVID-19 transmission date: 2020-04-23 journal: nan DOI: 10.1101/2020.04.18.20070771 sha: 05103eeb2e5e0f34a84c112e9ab64d3c6751ab7e doc_id: 949061 cord_uid: 9o2tk4x5 We analyzed COVID-19 data through April 16, 2020 using a partially observed Markov process. Our method uses a hybrid deterministic and stochastic formalism that allows for time variable transmission rates and detection probabilities. The model was fit using iterated particle filtering to case count and death count time series from 51 countries. We found evidence for a declining transmission rate in 42 of the 51 examined countries. Of those 42 countries 34 have significant evidence for subcritical transmission rates, although the decline in new cases are relatively slow compared to the initial growth rates. This suggests that global scale social distancing efforts to slow the spread of COVID-19 are effective although they need to be strengthened in many regions and maintained in others to avoid further resurgence of COVID-19. The slow decline also suggests alternative strategies to control the virus are needed before social distancing efforts are partially relaxed. Since its initial outbreak in Wuhan China in late 2019 and early 2020 (1), COVID-19 caused repeated rapid outbreaks across the global from February through April 2020. The extremely rapid spread of COVID-19 in China (2) does not appear to be an anomaly: the disease has shown a short doubling time (2.4-3.6 days) outside of China as well (3) . As of April 17, 2020, the virus caused 2,181,508 reported infections, and 147,337 deaths globally (4) . In response, most affected countries/regions have implemented strong social distancing efforts, such as school closures, working-fromhome, shelter-in-place orders. As a result, the spread of COVID-19 slows down substantially in some countries (5) , and thus the epidemic curves are flattened. As social distancing induces high costs to both society and individuals, plans to relax social distancing are discussed. However, it is not clear if the transmission in these areas are subcritical and how long the transmission is reduced to a level that are manageable for health care systems. In this report, we developed a deterministic-stochastic hybrid model and fitted the model to case incidence and death incidence time series data from 51 countries. Following the approach suggested by King et al.(6) , we use a (partially) stochastic model and base our estimates on incidence rather than cumulative incidence data . Using both case count and death count of data allowed us to disentangle changes in surveillance intensity from changes in transmission (3) . We found evidence for large decreases in the country-level transmission rates, in several of the worst-affected countries. Importantly, using data up to April 16, 2020, we specifically tested whether the transmission is subcritical (i.e. the reproductive number R is below 1). Most countries showed large decreases in transmission rates over time, and some have transmission rates below the epidemic threshold, i.e. R = 1. On the other hand, many countries still appear to be showing rapid exponential growth. Given its highly contagious nature, COVID-19 can spread rapidly when strong social distancing measures are lifted, even partially (3) . Alternative strategies that can effective control the virus are needed when social distancing measuares 2 Materials and Methods Case count and death data were downloaded from the Johns Hopkins GitHub repository (https://github.com/CSSEGISandData/COVID-19) on April 16 2020. Any country in the data that had more than 2000 cumulative cases by April 16 was included in the modeled data. China and South Korea were excluded from the final analysis because the model was not capable of capturing the complex dynamics in those countries. To minimize the effect of repatriated cases we started each time series on the first day when the cumulative number of cases was greater than or equal to 100. All data processing and model fitting were otherwise done on the incidence scale. To address obvious bulk-reporting issues in the data (e.g. sudden zeros in the data followed by very large numbers), we smoothed the data using Tukey's 3-median method. Because several countries had days with a single death surrounded by days with no deaths, which the smoothing method would set to zero, days with a single death were not replaced with smoothed values. The original data and the smoothed data used for estimation are shown in Supplemental figure S1. We model the spread of COVID-19 as a partially observed Markov process with real-valued states S (susceptible), E (exposed), I (infected), and R (recovered) to model the latent population dynamics, and integer-valued states C 0 (to be counted), Y 1 (counted cases), D 0:3 (dying), and Y 2 (counted deaths) to model sampling into the data. The model and all of its parameters have time units of days. The latent population model is governed by the following ordinary differential equations where χ(t) is the time-variable transmission rate and N = S + E + I + R is assumed to be fixed over the run of the model. λ EI is the rate at which exposed persons become infectious and λ IR is the rate at which cases recover (i.e. are either no longer infectious or die). At every time interval, we sample persons moving from the E to I states into a stochastic arm of the model that is used to calculate the likelihood of the data. To relate the latent population model to data, we randomly sample individuals from the unobserved population into a stochastic process that models the random movement from infection to being either counted as an observed case or counted as an observed death. The number of persons sampled into the observation arm of the model over a time interval dt is given by and ω are the probabilities that an infected person will be counted or die respectively, ρ c (t) and ω c are the probabilities of being not sampled or not dying respectively, and g(u) is a stochastic function that maps from real-value u to integer g(u); it takes value u with probability u mod 1 and value u with probability 1 − (u mod 1). In plain English, the model tracks the random fate of each newly infected person as they move from the exposed to infected state with respect to eventually being observed as a case and/or a death in data. At each time step of length dt, the change in state space is 2 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 23, 2020. . https://doi.org/10.1101/2020.04.18.20070771 doi: medRxiv preprint given by dt . X i indicates the i th element of Multinomial random variate defined above. The rate λ Y 1 determines the rate at which persons who will be counted are counted (i.e. lower values of λ Y 1 mean a longer delay in cases showing up in the data). The values of Y 1:2 are set to zero at the beginning of every day such that they accumulate the simulated number of cases and deaths that occur each day. Both the transmission rate and detection probability are allowed to vary with time in the following way where t f is the final time in the given dataset. Note that, to prevent over-fitting, the final values the transmission rate and detection probability are forced to be constant over the last 10 days of the data. We assume that the data are Negative Binomial-distributed, conditional on the simulated number of cases and deaths that occur in a given time interval, i.e. the number of cases in the i th observation period has density NegBin(Y 1 , 1 ) and the number of deaths in the i th observation period has density NegBin(Y 2 , 2 ). The Negative Binomial is parameterized such that the first argument is the expectation and the second is an inverse overdispersion parameter that controls the variance of the data about the expectation; as i becomes large, the data model approaches a Poisson with parameter Y i . Both 1 and 2 were estimated from the data. The model was fit to the data using an iterated particle-filtering method, implemented in the pomp R library (7) . To optimize the likelihood of the data, we used 2000 particles in 100 iterations for each country. The reported likelihoods were measured using 6000 particles at the optimized Maximum Likelihood Estimates (MLEs). The Monte Carlo error in the likelihood measurement was less than 0.1 log units. Because we are using an optimization rather than sampling method we fit a sequence of models to test if the data support time-variable detection and/or transmission rates. The first model assumes both the transmission rate and detection probability are constant (i.e. χ 0 = χ f and ρ 0 = ρ f ); the second model allows for time variable detection probability (i.e. χ 0 = χ f ); the third model allows for time variable transmission rates ρ 0 = ρ f ; and the fourth model estimates all the free parameters. We determine the best model by a sequence of likelihood ratio tests. If model 2 and/or model 3 are supported at α = 0.05, then the model with the higher likelihood is preferred. If there is support for model 4 over the preferred model, then model 4 is considered best. This testing cascade avoids overfitting by considering both possible explanations in isolation and only including both if the data support it. Finally, for data sets that support a declining transmission rate, we run an additional model where the final transmission rate is set such that the reproduction number is 1. Note that the model structure assumes the final transmission rate is constant over the last 10 days of the data. We use the likelihood from that model conduct a likelihood ratio test to distinguish between countries that have nearly subcritical transmission rates to ones that have clear evidence of a shrinking epidemic. The initial state at time zero is computed by assuming there were I(0) infected persons 21 days before the first reported death (by definition time one). The number of initial number of susceptibilities was assumed to be the predicted 2020 population size of the given country (8) . The model is then simulated forward for 21 days, which is taken at the initial state of the model at time zero. Parameters have constraints ; all other parameters are constrained to be positive. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 23, 2020. . https://doi.org/10.1101/2020.04.18.20070771 doi: medRxiv preprint We fit our model to data collected from 51 countries. Model fits are shown in Figure 1 , and parameter values are given in Table 5 . The model can capture the data well, with a few exceptions. Iran, Spain, and Italy all display a tight temporal correlation between changes in the number of cases and changes in the numbers of deaths (i.e. a downturn in cases co-occurring with a downturn in deaths), which the model cannot reproduce exactly as it occurs. This may due to the assumption in the model that the delay in case detection from infection is constant over time, and the inability of the model to find a single time delay in case detection that fits both the case and death time series in Italy, Spain, and Iran both at the beginning and end of the data suggests heterogeneity in the detection rate over time of either cases or deaths in these countries. Modeling efforts that focus on these regions should consider including temporal heterogeneity in the detection rate. We found evidence for decreasing transmission rates in 42 of the 51 included countries. Figure 2 shows the starting and final exponential growth rates (growth rates were computed from model parameters according to (9) ). Note that the starting exponential growth rate is the growth rate estimated from the initial data points we collected in each country, and it is not necessarily the rate of outbreak growth at the beginning of the outbreak. In countries with a decline, the average reduction in the transmission rate was 84% from the starting value. Of the included countries, 32 had shrinking epidemics (i.e. infected persons are generating less than 1 new infection on average). To further test whether there exists significant evidence that the transmission is sub-critical, we modified our model to force the transmission to be at the threshold, i.e. the growth rate is 0, during the last 10 days of simulation. Comparing the two models, we found that 8 countries have maximum likelihoods that suggest R < 1 (i.e. negative growth) could not be meaningfully distinguished from R = 1. Countries with significantly subcritical dynamics on April 16 2020 were Algeria, Australia, Austria, Belgium, Czechia, Ecuador, France, Germany, Greece, Iran, Israel, Italy, Luxembourg, Malaysia, Netherlands, Norway, Poland, Portugal, Romania, Spain, Switzerland, Thailand, United Kingdom, and the United States. One caveat in the interpretation of the results is that the potentially long delays (the model allows up to a 21 day average delay) in reporting lead to a high level of ambiguity in the data concerning the transmission rate in the recent past. The model estimated the time delay in the time from infection to detection (Supplemental Figure S4 ); the average delay was 14 days. We further looked for correlations in the initial detection probability and initial growth rate (Supplemental Figure S3 ), but the correlation appeared to be insignificant (removing two outliers, Qatar and South Africa, led to a non-significant negative correlation (p=0.32). Our model included an explicit susceptible population in each country, allowed us to estimate the fraction of the population that had been infected by April 16 2020 (Supplemental Figure S5 ). On average 1.5% of the included countries infected population was infected on April 16 2020. Our optimization-based inference method does not allow us to easily compute confidence intervals on the parameter point estimates. Our previous work using a deterministic modeling approach suggests the range of variation could be an order of magnitude given uncertainties in parameter values, such as the infection case fatality ratio (3). Our model found evidence for reductions in transmission rates of COVID-19 in 42 of 51 examined countries. Encouragingly, of those countries, we found statistical evidence that the transmission of COVID-19 is decreasing in 24 countries, i.e. the effective reproductive number is less than 1, using data up to April, 16, 2020. This suggests that, despite the highly heterogeneous populations represented by these countries, the growth of COVID-19 outbreak can be reverted. Although our model cannot attribute exact causes to the global decline in transmissions rates, most countries implemented sustained, population-level social distancing efforts over a period of weeks to months. These efforts are highly likely to play a major role in reducing the transmission of COVID-19 (5). We estimated that in countries with decreasing transmission, the rate of decrease is in general less than 0.1/day. Based on data from 8 European countries, the US and China, we previously estimated that in the absence of intervention efforts, the epidemic can grow at rates between 0.19-0.29/day (2) . This means that the outbreak can grow rapidly and avert public heath efforts made if social distancing measures are completely relaxed. For example, if the rate of decrease under strong public health interventions is 0.1/day and the growth rate in the absence of public heath interventions is 0.2/day, then, the number of cases averted in two weeks of intervention will be regained in only one week. Social distancing measures have their own social costs. Our results suggests alternative strategies to control the virus are needed in place when social distancing efforts are relaxed. Due to the uncertainties in the impact of each specific control measures, changes to policies should be made slowly because the signal of changing transmission can take weeks to fully propagate into current data streams as a result of the long lag between infection to case confirmation (as we estimated to be approximately 2 weeks). . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 23, 2020. Overall, our results suggest that COVID-19 is controllable in diverse settings using a full range of strong and comprehensive non-pharmaceutical measures, and that future deaths from the disease are avoidable. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 23, 2020. Initial number infected fit 6 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 23, 2020. The values r 0 and r f are the initial and final exponential growth rates implied by the other model parameters. The growth rate r and transmission rate χ are derived according to (9) . The term λ −1 Y 1 is the mean time in days from infection to detection of a new case in days. χ 0 , χ f , r 0 , r f , have day −1 units; I 0 is a count; ρ 0 and ρ f are probabilities; and λ −1 Y 1 has day units. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 23, 2020. The envelope shows the 95% CIs for the data. The x-axis is measured in sequential days from the first datum. 8 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 23, 2020. 9 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 23, 2020. . https://doi.org/10.1101/2020.04.18.20070771 doi: medRxiv preprint 6 Supplemental The differential equations for infected I and counted C cases in time t, with infection growth rate α and counting rate λ > 0, where both I and C are dimensionless numbers, but α and λ have units of (time) −1 , state that, Equations 1 and 2 are coupled first-order equations. Cases where λ varies in time are of interest. As such cases are not necessarily integrable, a start is to first solve the constant λ case. Small perturbations can subsequently be probed. The equations can be rearranged into an uncoupled, second-order homogeneous ordinary differential equation, assuming constant α, which can be solved analytically for constant λ. The solution is fixed by requiring that I 0 ≡ I(t = 0) and C 0 ≡ C(t = 0): The full time derivatives of I and C satisfy Equations 1 and 2, 6.1.2 Long-duration limit In the limit t → ∞, these equations simplify to yield a ratio (I/C): as well as, whereĪ ≡ lim t→∞ I andC ≡ lim t→∞ C, = α λ C . . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 23, 2020. . https://doi.org/10.1101/2020.04.18.20070771 doi: medRxiv preprint This limit can also be interpreted as a time lag τ under the assumption of constant growth α, where τ = α −1 log (α/λ). As expected of the growth rate, the limit also yields dĪ/dt = αĪ and dC/dt = αC, the latter equivalent to λĪ. Taking this t → ∞ limit, with λĪ = αC assumed from Equation 9 , discrepancies appear if λ is allowed to vary, where the terms in dλ/dt perturb the solutions to Equations 1 and 2. The perturbation to the differential equations is negligible when dλ/dt α 2 (1 + α −1 λ) and αλ(1 + α −1 λ). If the testing rate λ is smaller than the growth rate α, the latter condition is stronger. When λ α, the condition simplifies to dλ/dt αλ. Under the (αλ) condition, the natural logarithm of Equation 9 can be differentiated, where the αλ condition implies that the right-hand side is much less than α. As λ is positive, the time-derivative of the logarithm ofC is greater than that ofĪ. The countC will increase faster than the number of uncounted, infected casesĪ. This result holds when d logC/dt ≈ d logĪ/dt and both are approximately α. The interpretation is that time-varying λ is an acceptable perturbation if the difference between the time-derivatives of the logarithms ofC andĪ is much smaller than the time-derivatives themselves. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 23, 2020. 12 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 23, 2020. Inital detection probability Initial growth rate Figure S3 : Values of initial detection probability and initial growth rate. Initial growth rate is negatively correlated (-0.35) with initial detection probability (p=0.01) Country names are indicated by ISO Alpha-3 codes. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 23, 2020. Average time to detection Figure S4 : Histogram of the average time from infection to detection. Detection rates were bound such that the mean time to detection was between 5 and 21 days inclusive. Fraction ever infected Figure S5 : Fraction of population that has been infected by April 16 2020. Infected fraction includes exposed, infected, recovered, and dead. 14 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 23, 2020. . https://doi.org/10.1101/2020.04.18.20070771 doi: medRxiv preprint Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia Early Release -High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2 -Volume Fast spread of COVID-19 in Europe and the US and its implications: even modest public health goals require comprehensive intervention An interactive web-based dashboard to track COVID-19 in real time 2020-03-30-COVID19-Report-13.pdf Avoidable errors in the modelling of outbreaks of emerging pathogens, with special reference to Ebola Statistical inference for partially observed Markov processes The World Factbook 2020 Estimating epidemic exponential growth rate and basic reproduction number