key: cord-0115976-82bm1mm5 authors: Petrova, Tatiana; Soshnikov, Dmitri; Grunin, Andrey title: COVID-19 reproduction number estimated from SEIR model: association with people's mobility in 2020 date: 2021-08-27 journal: nan DOI: nan sha: 28303f7e0b72707576187b0ae5aee4a0053337cb doc_id: 115976 cord_uid: 82bm1mm5 This paper is an exploratory study of two epidemiological questions on a worldwide basis. How fast is the disease spreading? Are the restrictions (especially mobility restrictions) for people bring the expected effect? To answer the first question, we propose a tool for estimating the reproduction number of epidemic (the number of secondary infections $R_t$) based on the SEIR model and compare it with an non-model $R_t$ estimation. To measure the $R_t$ of COVID-19 for different countries, real-time data on coronavirus daily cases of infections, recoveries, deaths are retrieved from the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. To assess the effectiveness of mobility restrictions for the COVID-19 pandemic in 2020, the correlations between the $R_t$ and people's mobility (based on the Apple mobility index) are presented. The correlations were considered for 12 countries and for most of them, the correlations are negative. This shows a delay in the implementation of mobility restrictions - the countries imposed them in response to growth of new COVID-19 cases, rather than preventively. Coronavirus disease (COVID- 19) was first identified in late December 2019 in Wuhan (China) and during the first half of 2020 spread all over the word. On March 11th 2020 the coronavirus outbreak was labelled a pandemic by the World Health Organization [1] . By the middle of 2021, COVID-19 has spread to almost all countries, affecting more than 188 million people worldwide and causing more than 4 million deaths [2] . More than half of the world's population has experienced a lockdown with strong mobility restrictions. This was the first large-scale implementation of such measures in history. One of the key points during an epidemic is to design appropriate mathematical models and tools for effective decision-making. To optimize the quarantine steps it is important to estimate parameters characterizing infectious disease transmission using real-time data and track the temporal changes in those values [3] . Another important point is to assess the effectiveness of implemented restrictions. The time-dependent (effective) reproduction number (Rt) is one of the key parameters characterising evolution of an epidemic. This value is a function of time, and represents the expected number of secondary cases arising from a primary case infected at time t [4] . If Rt < 1 the disease will decline and eventually die out. If Rt > 1 the disease will be transmitted between people, and an outbreak is likely to occur. Reducing the reproduction number below 1 is one of the main goals of implementing quarantine steps. There are different methods to define the initial reproduction number R0 and time-dependent reproduction number Rt. The numerical estimations of Rt using different methods vary depending on implementation [4] [5] [6] [7] [8] [9] [10] . Some of the standard methods were applied to assume basic and time-dependent reproduction numbers for COVID-19 outbreak for different countries [11] [12] [13] [14] [15] . In review [16] the differences for estimations of COVID-19 reproduction number for China are shown, and the difference beteween forecasted values via different methods is explained by insufficient data and different estimation techniques. The mobility restrictions were used in countries worldwide to slow the spread of COVID-19. They included closing public transportation, stores, offices; travel restrictions; requirement to stay home, etc. The scale of these response measures was incredibly large and inccurred significant costs. Longitudinal data of epidemic spreading and people's mobility are now available. They can be used for investigating the longterm effects of quarantine steps, especially mobility restrictions. Previous works have shown that social distancing orders have an impact on reduced mobility and case growth in India and the United States [17] [18] [19] [20] [21] . Oh, J., et al. [22] have demonstrated that mobility restrictions appeared to reduce the spread of COVID19 in many countries in the early stages of the pandemic wave, but in the later stages, once other mitigation measures are taken, the impact will weaken. Report [11] concludes that easing social-distancing restrictions should be considered very carefully, as small increases in contact rates are likely to risk resurgence even where COVID-19 is apparently under control. In the report [23] a consistent correlation between reductions in mobility and reductions in transmission intensity of COVID-19 was found for the first months of 2020. The authors developed a minimalist compartmental model to study the impact of mobility restrictions in Italy during the Covid-19 outbreak, based on SIOR model (here (Observed) -are individuals who present symptoms acute enough to be detected from the national healthcare system, Observed individuals switch into the (Removed)). It was shown that, while an early lockdown shifts the contagion in time, beyond a critical value of lockdown strength the epidemic tends to restart after lifting the restrictions. To understand the longitudinal impact of mobility restrictions to pandemic spread, we estimated the correlations between the Rt and people's mobility (based on the Apple mobility index) for 12 countries worldwide in 2020. For estimating Rt in real-time, we developed an open-source tool, requiring only data that are commonly recorded during an outbreak (infections, recoveries, deaths, infection duration), with clear science based methodology and without limitations on epidemic phase. The SEIR-based reproduction number is complemented with the non-model estimation of Rt to compare the calculated results. The tool also allows to compare the dynamics of Rt and peoples mobility, and assess the correlation between those. The Rt calculation in connection with people's mobility allows to forecast the progression of disease outbreaks and understand the effect of restrictions on epidemics spread. To estimate the time-dependent reproduction number, the following procedure based on the SEIR model is used. We begin at each country's 1000th case of infection; then we move a sliding 7 days window over the whole period, and for each step we estimate the time-dependent reproduction number by minimizing square difference between real and predicted (by SEIR model) number of new infected cases. To explore the time-dependent reproduction number in connection with peoples mobility, we calculate the correlation between Rt and peoples mobility, Rt and derivative of peoples mobility. Real-time data on COVID-19 daily cases of infections, recoveries, deaths, as well as each country's population are retrieved from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University [24] . The data were reported by each country's official surveillance system starting January 22nd, 2020. Peoples mobility data is sourced from Apple Mobility Index [25] . It is generated by counting the number of requests made to Apple Maps for directions. Three data streams are used: 'driving', 'walking' and 'transit' mobility for Apple device users. The data is available from January 13th (except May, 11th-12th) and is updated every day. The mobility data for China is temporarily not available. The online data of infected, recovered, deceased cases, each country's population and transposed dates are joined in one table, which gives us each country's dataframe. The beginning is each country's 1000th case of infection, then the numbers of observed cases are smoothed out by computing 3-days rolling average on all the columns. To estimate the number of expected cases, the data on infected shifted by the median incubation period of COVID-16 (5 days [26] ) is used. In addition, we applied the results of Oran D.P., Topol E.J. [27] , according to which 42 to 45 percent of coronavirus cases are asymptomatic. We automatically collect all available online Apple mobility data, obtain the mean values of three data streams ('driving', 'walking' and 'transit') for each country and apply 7-days running average to reduce weekly fluctuations. The SEIR model is a modification of the classical SIR approach to epidemic simulation presented by Kermack-McKendrick [28] for the number of people infected with a contagious illness in a closed population over time. The assumptions of the model are: constant (closed) population size (N); constant rates (e.g., transmission, removal rates -the probability of disease transmission in a single contact multiplied by the average number of contacts per person); no demography (i.e., births and deaths); well-mixed population (where any infected individual has a probability of contacting any susceptible individual that is reasonably well approximated by the average). The SEIR model describes the connection between S (susceptible -number of people who have the potential to be infected), I (infected -number of infected people), R (removed -number of people who are non susceptible to infection, this includes the number of deceased people as well) and E (exposed -number of people who have been infected but does not show symptoms yet: it can be called a latent phase) with the following system of differential equations: dS/dt = -βSI/N, dR/dt = γI. β is known as the effective contact rate. We assume that in a unit time each infected individual will come into contact with βN people. From those people, the proportion of susceptible people is S/N, thus the speed at which new infections occur is considered to be −βSI/N. γ is the removal rate, and the number 1/γ defines the number of days during which a person stays infected. Thus the term γI defines the speed at which infected individuals are moved from being infected to recovered (or deceased). k is the progression rate from exposed (latent) to infected and governs the lag between having undergone an infectious contact and showing symptoms. In the equations, it brings people from the E category to the I category. The rates are supposed to be constant. In the assumption that the population is completely susceptible (S=N), no demography (i.e., births and deaths) and the population is well-mixed, the basic reproduction number R0 ≈ β / γ [29] . Model predictions based on system (1) depend on the parameters (β, γ, k). Optimization of the model with respect to the parameters and fitting it to the real data allows us to find the parameters that correspond to the actual outbreak. It is more consistent to optimize for β, and set the γ and k parameters according to medical studies. According to WHO [30] , the median time from onset to clinical recovery for mild cases is approximately 2 weeks, and it is 3-6 weeks for patients with severe or critical disease. Among patients who have died, the time from symptom onset to outcome ranges from 2 to 8 weeks. Since the time period is wide enough, we considered the mean recovery time of 30 days, and fixed γ as 1/30, respectively. We estimated the number of expected cases as the data on numbers of infected, shifted by the median incubation period of COVID-16 (5 days), so k is fixed as 1/5. To smooth inaccuracies in the statistics at beginning of the outbreak the calculations start from day t0, where t0 is the first day when the number of infected people is above 1000: t0 = min{t|I(t) > 1000}. For each country, the following denotations for the data are used: V(t) -the number of total (accumulated) infected cases at day t, t > t0, E eff β(t), -number of exposed cases at day t, G(t) -number of recovered cases at day t, D(t) -number of fatal cases at day t. According to the results [27] , from 42 to 45 percent of coronavirus cases are asymptomatic, so we use the number of asymptomatic cases as 43 percent and calculate E eff The values corresponding to each β are denoted as Sβ(t), E eff β(t), Iβ(t) and Rβ(t), respectively. Sβ(t), E eff β(t), Iβ(t) and Rβ(t) are the solutions of system (1) with initial values of: Here N is the population of a country being considered. The accumulated number of total infected people computed by the model is given by: I′β(t) = I + β(t+1) -I + β(t). The value β * is defined below (we will denote the number of days in consideration by n=7): 2 . The process of finding argmin (the minimum value along a given axis) is a complex optimization process, and at each step the numerical solution of ODE (1) is needed. Powell's method (Powell's conjugate direction method) is used for this optimization because it does not rely on gradients and is fast enough [31] . Having obtained β * , we calculate R0 as follows: Similar to the basic reproduction number R0 that is a characteristic of a disease, the time-dependent reproduction number Rt is considered. The Rt dynamics during the pandemic takes into account isolation measures and the proportion of nonsusceptible population, and can be used to estimate the effectiveness of quarantine measures. The same approach is followed for estimating Rt in real time: start from the point at which there are 1000 infected people; move a sliding window of width n (we used n=7) over the whole period till present day. At each point, use n consecutive days to estimate β (and, thus, Rt) by minimizing the square difference between the real and predicted number of new infected people. We also calculated an estimation of Rt that is not based on epidemic modelling [32, 33] and validated it using the SEIR-based model. The non-model reproduction number (Rt MF ) is one of the main indicators that the governments relied on for making decisions to restrict or weaken the lockdown. Rt MF is calculated by dividing the sum of the infected cases during the last 4 days by the sum infected cases in the previous 4 days: here Ii is the number of infected cases for the corresponding day i. The correlation between the Rt and people's mobility (Apple Mobility Index) can be investigated with the aid of the Spearman rank correlation coefficient. It is a non-parametric correlation method for measuring the strength and degree of association between two variables. Instead of exploring a linear relationship, in the case of Spearman rank correlation, there are no strict conditions on the association, and the observed data are ranked following a specific sequence. The correlation coefficients indicate whether there is a potential relationship between Rt, Rt's change rate, and people's mobility. The formula below can be used to calculate the Spearman rank correlation coefficient (rs) is as follows: where dj denotes the ranked difference between the j pairs of variables, and n denotes the number of ranks in each of two variables. The correlation coefficient value ranges from −1 to 1. If rs > 0, then there is a similar distribution and ranking of variables. If rs < 0, then the variables are ranked differently. The absolute value of rs represents the degree of correlation. If the value of rs is closer to ±1, the correlation between two variables is stronger. The approach described above allows to find real-time dynamics of reproduction number and easily analyze it in connection with other factors directly affecting the epidemic spread (number of daily infected cases, peoples mobility level, rate of changes in people's mobility level). The results are presented in Figs.1-6 for 12 countries and can be extended to any country in the world reported in [19] and [20] . For each country, the time period from 1000 infected cases till December 31, 2020 was chosen. We considered only the first year of pandemic (2020) because in 2021 most of the countries started active vaccination campaign, so the model for Rt estimation has been changed and also changed the connection between Rt and mobility index. General trends in the COVID-19 pandemic spread connected with Rt dynamics in 2020 for different countries are shown in Figure 1a . and daily number of infected cases normalized by country population for each country (blue histogram, right y-axis), plotted for the period till December 31, 2020. It is shown that SEIR-based Rt is more stable and less sensitive to outliers caused by inaccurate statistics, than Rt MF . On the x-axis the number of days from the first 1000 infected cases in each country till December 31, 2020 is shown, which makes it possible to compare the epidemic spread in 2020 (before the beginning of wide vaccination campaigns). The number of new cases is measured in cases per million of people. Normalized number of new cases is shown on different scales for each graph on purpose to make the shape of the graph more visible. The SEIR-based Rt is more smooth and less sensitive to outliers caused by inaccurate statistics than model-free estimation of Rt (Rt MF ). Rt MF better highlights their occurrence; • The Rt values decreased significantly compared to the initial values (R0) for all countries; • One can see waves of infected cases for all of the countries, but Rt numbers for these waves are much smaller than for the beginning of the epidemic and close to 1. For most of the countries (except Sweden, US and UK) Rt during 2020 dropped below 1, but the spread of coronavirus continued. The most informative period of the epidemic is the beginning of infection spread, and it makes sense to see the first months of the epidemic in more detail. On Figure 1b we show the graphs comparing Rt and Rt MF for the period up to July 1, 2020 across different countries. One of our main goals was to study the connection between different measures to limit people's mobility and the spread of the epidemic, measured by Rt. To estimate people's mobility we use Apple's mobility index.Visual relationships between Rt and the mobility in different countries are presented in Figure 3a . There are a few conclusions that can be derived from observing those figures: • If we consider the long time period (till December 31, 2020), over time there seems to be relatively low correlation between mobility constraints and Rt, similarly to the results found in [22] . • For some countries (India and Brazil), Apple mobility index shows steady growth, and does not seem to depend on any mobility prevention measures introduced by local government. • In many cases, especially considering some local time intervals, there seems to be negative correlation between Rt and mobility index. This can be explained by the fact that preventive measures are introduced as a result of rising Rt, and thus when Rt is still rising, preventive measures are already in place, affecting mobility index. The relationship between Rt and mobility index for the first part of the pandemic is presented in Figure 3b . On this figure, for many countries one can see the initial drop of both Rt and the mobility index, corresponding to the initial lockdown, which is followed by the rise of mobility index (removal of lockdown measures). Our results support the reasoning proposed in [22] , which indicates that protective measures become less effective in the middle of the pandemic, but can have stronger effect in the beginning. From the Figure 3a it follows, that there is a strong negative correlation between Rt and mobility index for some time intervals. This may be caused by the fact that lockdown measures are implemented as a result of rising Rt, and their effect on Rt is not immediate. Our hypothesis, however, is that the rate of increase of Rt starts to drop immediatly as a result of lockdown measures. Similarly, when mobility measures are loosened, the change rate of Rt (dRt/dt) increases. For some countries we can suggest the significant correlation between dRt/dt and mobility index for the very beginning of the pandemic, while for all 2020 for most of the countries the significant correlation is not observed (Figure 4a ). Only for some countries like Israel, Japan and South Korea we can suggest the significant correlation on the whole time period, and those are the countries with the most strictly imposed restrictions (another such country would be China, but the mobility index data is not available). The non-monotonous curve of Rt change rate in combination with monotonous mobility curve may suggest that the reproduction number is influenced also by non-mobility events (for example, massive violations of social distance or masks wearing). Local extrema of the Rt derivative curve shows the suggested dates of such events. To understand the association between observed values of Rt, Rt change rate (dRt/dt) and mobility we showed the Spearman correlation matrix for different countries ( Figure 5 ). The spread of COVID-19 in 2020 showed a negative correlation of Rt for most of the countries (Israel, Spain and Japan have correlation cofficients close to zero). The correlation of Rt change rate with people's mobility is positive for all of the countries and strong for Italy, Germany and Spain. • Introduction of mobility restrictions lead to change of dRt/dt, and vice versa, once the restrictions are removed -it leads to rise of the mobility index and increase of dRt/dt. • Most countries introduce mobility restrictions as a response to epidemic spread, characterized by rising Rt values. There is significant delay between introduction of mobility restrictions and changes in direction of Rt, which leads to anti-correlation. The latter point suggests that we need to account for some delay between the time that mobility restrictions are introduced, and the time they start having an effect on Rt changing its direction. To find this delay, we computed Spearman correlation between Rt and mobility index with some time shift. The graphs showing the relationship between the time shift and Spearman correlation for different countries is shown in Figure 6 . A shift of 5 days means that the mobility index is shifted by 5 days from Rt. Figure 6 . Spearman correlation for Rt and mobility for different delay intervals. Sensitivity analysis for different countries was performed with varying delay intervals (5 or 21 days) between mobility and Rt to account for delay in measures efficiency. • When the time shift is moved to more negative values area, for most countries, an increase in the correlation modulus is observed, which corresponds to a decrease in its absolute value. This may indicate a delay in the introduction of restrictive measures when the epidemic spreads. • When the time shift is moved to more positive values, for most countries, an increase in the correlation modulus is observed, which corresponds to an increase in its absolute value. This may indicate a delay in the reaction of Rt to the introduced restrictions. There are some restrictions regarding the applicated model. For Rt calculation we do not separate internal and external infected cases, assuming that all incident cases after the first time-point arise from local transmission, i.e. it does not account for the possibility that cases (other than those appearing at the first timestep) are imported from other locations or derived from alternative host species. This assumption is used, for example, by [34] . In our case, it is applicable since we try to calculate the potential velocity of epidemic spreading and the way infection appears is not a case of the study. Based on previous studies, we also suspect that for the first year of epidenic spread a recovered person cannot become susceptible again [35] . Our model doesn't contain any distributions so we showed only the baseline of Rt, without any confidence interval. It should also be mentioned that Apple mobility data is only generated by Apple users with location services enabled. In some countries these people may be a minority in the population. If we consider the long time period (till December 31, 2020), over time there seems to be relatively low correlation between mobility constraints and Rt. These findings support the reasoning proposed in [22] , which indicates that protective measures become less effective in the middle of the pandemic, but can have stronger effect in the beginning. The developed open-source tool for estimating reproduction number based on SEIR model in real-time allows to forecast the progression of disease outbreaks at any phase of epidemic and understand the effect of interventions on epidemics spread. This tool also allows to assess the disease transmission potential and can be easily adapted to future outbreaks of different pathogens. The tool calculates Rt based on the fitting of SEIR model predictions to actual values, and thus is much less susceptible to data noise, like the traditional way of Rt estimation. For COVID-19 spread in 2020 we showed that for most of the countries the correlations between the Rt and people's mobility (based on the Apple mobility index) are negative. This shows a delay in the implementation of mobility restrictions -the countries imposed them in response to growth of new COVID-19 cases, rather than preventively. Supplementary Materials: A tool for estimating the time-dependent reproduction number during SARS-COVID-19 pandemic for countries worldwide is available online as Python code from the following Github repository: https://github.com/shwars/COVID19Modelling Author Contributions: Conceptualization, T.P., D..S. and A.G.; methodology, T.P. and A.G.; data analysis, visualization and validation, T.P. and D.S.; writing-original draft preparation, T.P.; writing-review and editing, T.P., D..S. and A.G. The authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript. WHO Director-General's opening remarks at the media briefing on COVID-19 -11 Cunniffe Control fast or control smart? When should invading pathogens be controlled? Improved inference of time-varying reproduction numbers during infectious disease outbreaks How generation intervals shape the relationship between growth rates and reproductive numbers The 1918-19 Influenza Pandemic in Boyacá. Colombia. Emerging infectious diseases A preliminary estimation of the reproduction ratio for new influenza A(H1N1) from the outbreak in Mexico Estimating the reproduction number of the novel influenza A virus (H1N1) in a Southern Hemisphere setting: preliminary estimate in New Zealand The R0 package: a toolbox to estimate reproduction numbers for epidemic outbreaks A New Framework and Software to Estimate Time-Varying Reproduction Numbers During Epidemics Report 26 Reduction in mobility and COVID-19 transmission Sustained reductions in transmission have led to declining COVID19 prevalence in King County, WA. Results as of 10 a Modelling the COVID-19 epidemic and imple-mentation of population-wide interventions in Italy Extended SIR Prediction of the Epidemics Trend of COVID-19 in Italy and Compared With Hunan, China. Frontiers in Medicine Effective Reproductive Number estimation for initial stage of COVID-19 pandemic in Latin American Countries Three months of COVID-19: A systematic review and meta-analysis Association Between Statewide School Closure and COVID-19 Incidence and Mortality in the US Strong Social Distancing Measures In The United States Reduced The COVID-19 Growth Rate Effectiveness of COVID-19 shelter-in-place orders varied by state Impacts of State-Level Policies on Social Distancing in the United States Using Aggregated Mobility Data during the COVID-19 Pandemic Early Impact of India's Nationwide Lockdown on Aggregate Population Mobility and COVID-19 Cases Mobility restrictions were associated with reductions in COVID-19 incidence early in the pandemic: evidence from a real-time evaluation in 34 countries Time, space and social interactions: exit mechanisms for the Covid-19 epidemics COVID-19 Data Repository by the Center for Systems Science and Engineering Does incubation period of COVID-19 vary with age? A study of epidemiologically linked cases in Singapore Prevalence of Asymptomatic SARS-CoV-2 Infection : A Narrative Review Contributions to the mathematical theory of epidemics, iii -further studies of the problem of endemicity Notes On R0 Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19) A View of Algorithms for Optimization Without Derivatives. Mathematics TODAY. 43. 32. Prevention of infectious diseases Russian) Available from A New Framework and Software to Estimate Time-Varying Reproduction Numbers During Epidemics Positive RT-PCR test results in patients recovered from COVID-19 Acknowledgments: This work was supported as part of the research program of MSU Center for Storage and Analysis of Big Data. Data Availability Statement: All data used are publicly available. The authors declare no conflict of interest.