key: cord-0769551-ry6b399h authors: Barrett, Thomas J.; Patterson, Karen C.; James, Timothy M.; Krüger, Peter title: Impact of reduction of susceptibility to SARS-CoV-2 on epidemic dynamics in four early-seeded metropolitan regions date: 2021-06-09 journal: Sci Rep DOI: 10.1038/s41598-021-91247-7 sha: 3dda0f0cc92187022ec1ec3f103cb551a71ab81e doc_id: 769551 cord_uid: ry6b399h As we enter a chronic phase of the SARS-CoV-2 pandemic, with uncontrolled infection rates in many places, relative regional susceptibilities are a critical unknown for policy planning. Tests for SARS-CoV-2 infection or antibodies are indicative but unreliable measures of exposure. Here instead, for four highly-affected countries, we determine population susceptibilities by directly comparing country-wide observed epidemic dynamics data with that of their main metropolitan regions. We find significant susceptibility reductions in the metropolitan regions as a result of earlier seeding, with a relatively longer phase of exponential growth before the introduction of public health interventions. During the post-growth phase, the lower susceptibility of these regions contributed to the decline in cases, independent of intervention effects. Forward projections indicate that non-metropolitan regions will be more affected during recurrent epidemic waves compared with the initially heavier-hit metropolitan regions. Our findings have consequences for disease forecasts and resource utilisation. While onset of the exponential phase varies due to different seeding times, we find that the time constant of exponential growth is very similar over a wide geographical range. Figure 1A illustrates this using data 14 from four highly-affected countries: Spain, Italy, the United Kingdom, and the United States. The exponential growth is sustained as long as S, defined as the ratio of susceptible individuals to the whole population, is close to 100%, and β and γ are kept constant. Slowing spread and ultimately reversal to an exponential decline can be driven by an increase of γ , through interventions such as anti-viral therapies that reduce infectiousness time. Such therapies were not available for COVID-19 during the study period. Consequently, reductions of new COVID-19 infections can only be attributed to reductions in either β or S. Public health interventions aim to reduce β through hygiene, physical distancing, and mobility restriction measures designed to reduce contact frequencies 5, 6, 15 . In the following analysis we argue changes in COVID-19 disease dynamics are due to reductions of β and S, with different relative importance across geographic areas. In particular, a significant reduction of S is observed in the main metropolitan regions of the four countries investigated here. Observed impact of susceptibility reduction on infection dynamics. Variability in the nature and timing of public health interventions and resulting population behaviours are eventually reflected in variations across countries in the slowdown and exponential decline of the daily death counts. Figure 1B illustrates, for the specific example of the United Kingdom and London, the timelines of public awareness (measured through frequencies of relevant internet search queries 16 ), government restriction stringency (measured as aggregate 'Oxford' stringency index 17 ) , and visits to public meeting places (restaurants 18 , grocery stores, pharmacies, and public transit stations 19 ). Qualitatively, all measures correlate with one another, with both voluntary behavioural changes and the stringency of public health interventions resulting in reduced frequency of visits to locations of high infection risk. While it is difficult to quantitatively associate a time-dependent reduction of β(t) with these data, the observed location data in Fig. 1B (also shown for additional investigated areas in Supplementary Fig. S1 ), as well as the representative transit station data in Fig. 2 (lower panels), indicate that the public response between areas within one country is similar in timing and magnitude of deviation from baseline. However, the seeding of the epidemic in each country's main metropolitan region (Italy-Milan/Lombardy; Spain-Madrid; England-London; USA-New York City) consistently occurs earlier than in the rest of the corresponding country, as exhibited in Fig. 2 (upper www.nature.com/scientificreports/ panels). As a consequence, the time period between initial epidemic onset and reduction of β taking place (driven by public behaviour change) differs in a situation in which public behaviour is modified in similar ways and at similar times. We therefore consider pairwise comparisons of main metropolitan regions and rest of countries a suitable tool to investigate the relative impact of reductions in S. Note that for our analysis of Italy we focus on the Lombardy region, whose population is dominated by Milan-the country's largest metropolitan area. In addition, we used mortality data for England as a comparison for London, with the assumption that public health interventions in London best match those of England out of the whole of the UK. Figure 2 (upper panels) shows reported daily death counts 14, 21 for the four investigated countries, broken down into the data from the main metropolitan region and the rest of the country. Each of the corresponding eight data sets shows an initial exponential growth phase, followed by a transition to an exponential decline. Seeding of the epidemic consistently happened earlier in the metropolitan regions than elsewhere in each country, leading to higher population-normalised daily death counts in these areas. Yet, the time constant of the exponential growth is equal within errors for each metropolitan-country pair (except in the United States due to its heterogeneity), suggesting that the contact rate among a fully susceptible population is not critically dependent on factors related to population density 22, 23 . In contrast, the subsequent decline is consistently but to a varying extent faster in the metropolitan regions than in the rest of the country. Together with the notably earlier transition from the exponential growth phase and the observation that public policies within countries were similar, this indicates that the reduction of S played a more important role in the metropolitan regions relative to the rest of the country in all four cases. The extracted exponential growth and decline time constants are given in the Supplementary Table S1 . Quantitatively, the time constants τ m and τ c of the exponential decay phase provide a relative measure of the reduction of susceptibility in each metropolitan region ( S m ) compared with the remainder of the country ( S c ). The observation of an extended period of purely exponential decline, together with a stable interval of public health interventions, shown in Figs. 1B and 2 (lower panels), (even when taking into account a delayed impact of ≈ 20 days of their effect on the daily death counts 20, 24 ), implies that both β(t) and S(t) do not vary much during this phase. In this case, the decay time constants can be used to yield the ratio of fractions of the population that remain susceptible to the virus, provided β = β m = β c remains equal across each country 25 . From the fits presented in Fig. 2 , we find for the ratios S m /S c values of 0.86 ± 0.05 (Madrid/Spain), 0.89 ± 0.04 (Lombardy/ Italy), 0.70 ± 0.06 (London/England), and 0.59 ± 0.05 (New York/United States). These ratios represent upper bounds for S m , since S c is by definition ≤ 1 . Note that if instead β m = β c , then the ratio of susceptibilities will be corrected by a factor β c /β m (see "Methods"). We directly compare the relative timing of the epidemic evolution with public policy and behaviour changes (Fig. 2) . The results corroborate the finding that a reduction of S had already begun to reduce new infection rates in the metropolitan regions when public behaviour changes started to have a significant impact. The difference in the effect of S for all eight geographic areas is shown in Fig. 3 , where the peak number of daily deaths is plotted against the time delay between when visits to public meeting places are reduced and when death rates begin to slow (see "Methods"). The country-level data cluster around a common delay time that is similar to the reported average time between viral exposure and death 20, 24 . This suggests that the main driver for the initial reduction in death (infection) rates is changes in public behaviour. This is not the case for the metropolitan region data, which in turn cluster around a common peak daily death count. This clustering suggests that the larger early spread of the disease was sufficient to reduce infection and death rates through reduction of S in addition to the effects based on behavioural changes. Population-normalised reported daily deaths due to SARS-CoV-2 infection (on a logarithmic scale) over the initial outbreak in early-2020, broken down into most populous metropolitan region (blue) and surrounding country (red). Data is shown together with fits and confidence intervals (see "Methods" for details). The stringency index is grey-scaled in the background, and has been shifted in time by +20 days to approximately indicate when the effect of public restrictions on death rates could begin to become measurable. This shift includes the expected period of 5 days from exposure to symptom onset, plus the subsequent mean period of 15 days from onset to death, reported in the literature 20 . Public transit station use (bottom panels) over the same period is also shown for comparison, with faded lines for real-time data and bold lines after applying the same expected exposure-to-death time-shift described previously. 26, 27 , and solve the differential equations simultaneously for each pair of country and metropolitan regions, with coupling between the two (see "Methods" for details). In this model, the population is assumed to consist of homogeneously interacting individuals, neglecting typically higher level of potentially infectious contacts within sub-communities than between such communities. This assumption can lead to systematic underestimates of fractions of residual population susceptibility. The solutions for each area are used to estimate S and the time dependence of it, as well as to explore the potential future evolution of the infection dynamics. We incorporate public behaviour changes into the system by adjusting the trajectory of β(t) equally for each metropolitan region and country pair, informed by Fig. 2 (lower panels). To evaluate the effect of initial public health measures on S, we only include the death counts until mid-May 2020, to avoid any effects of the relaxation of restrictions on the data. In contrast to the model-independent slope comparison from the previous section, solving the model equations also provides absolute values for the susceptibility and infection fatality rate µ in each area, which are shown together in Table 1 . Using both approaches, we consistently find a comparatively weaker reduction in S outside the early-seeded metropolitan regions. Figure 4 shows the result of the model for the specific example of England and London, demonstrating that public policy interventions helped cap the death counts in both cases, but that a reduction of S is also a significant driver, particularly in London. Without relaxation of initial interventions, the model projects that containable infection levels would have occurred in mid-July and mid-August 2020 in London and the rest of England, respectively (solid lines c in Fig. 4 ). Note that this threshold was chosen to be 10 new daily cases per million population, in accordance with infection levels in countries with well-developed public health infrastructure (e.g. Figure 3 . The peak daily death counts from early-2020 plotted against the delay between public behaviour change and the deviation from exponential growth leading up to the peak. A correction factor from excess death data was applied to the reported rates (open circles) to derive the expected rates (solid circles), which are linked. The four countries exclude the corresponding metropolitan regions. The inset illustrates, for the example of England, a death count calibration by comparing reported COVID-19 deaths to the all-cause excess death data. Table 1 . Results of solving the SIR model with a time-dependent contact rate in each area. Ratios of susceptibilities are shown, along with the model-independent approach of analysing slopes for comparison. The model also provides absolute values of susceptibility S and infection fatality rate µ . The absolute value of S reported is that from mid-May, after stabilising at its new value for the time period studied (see Fig. 4 inset for the example of London and England). www.nature.com/scientificreports/ New Zealand, South Korea, Cuba, and Fiji), with proactive testing and 'track and trace' interventions, resulting in lasting containment approaching elimination of locally-acquired cases. In reality, a relaxation of restrictions (early-June 2020) and easing of travel restrictions were initiated. As a result, β increased, and slowed down the decline in death rates. In a forward projection based on a continuation of these new conditions with no further changes to β , containable infection levels in London and the rest of England could have been reached in mid-August and the end of 2020, respectively (dash-dotted lines d 2 ). As further relaxations are underway, these dates may be further delayed if no compensating changes in public behaviour are introduced to maintain the early-July value of β = 0.135 day −1 . We note that this value is already greater than the observed value of γ (see "Methods"), and thus would have been enough to drive another period of exponential growth in the absence of any reduced susceptibility effect. We also model the hypothetical scenario of a full return to pre-intervention conditions, constituting the second wave (dash-dotted lines d 1 ). In this projection, infection rates begin to rise but the second peak for the metropolitan regions is reduced with respect to the first peak, while in contrast the second peak for countries is of higher magnitude than the first one, due to the fact that the height of the peak scales with S. In Fig. 4 (inset) the resulting evolution of S(t) from the model for London and the rest of England is shown, together with the results of serology antibody tests for London 28 , which have recently been suspected to underestimate the actual (yet potentially temporary) immunity levels due to the lack of sensitivity of the tests to T-cell based immunity 10 . In both the slope-comparison method as well as the method based on the numerical solution of the coupled SIR model, the metropolitan regions exhibit a lower absolute value of susceptibility when compared with the rest of their respective countries. We have found that the dynamics of the SARS-CoV-2 pandemic have been influenced both by policy interventions and by reduced population susceptibility, with a relatively stronger contribution from susceptibility changes in early-seeded metropolitan regions. This difference appears to arise solely from earlier seeding of the disease in these areas. Our data provide a method of upward revision to the reported prevalence estimates derived from laboratory testing. In addition, while infection and thus death rates are a function of S and β , the effect of S reduction is more durable, and β will rise whenever restrictions are relaxed unless protective measures prove to be fully compensatory. If they are not, a second peak may occur. The reduction in S would not necessarily avoid this, but would mitigate peak height. Forward projections based on the observed data also indicate that later-seeded areas are relatively more vulnerable in a second wave, which has implications for the distribution of healthcare resources. It is important to note that the first peaks in the scenario modelled without any interventions are significantly higher, and could lead to hospital admission numbers with a real risk of exceeding available capacities, which would in turn lead to an over-proportionally large negative public health impact with even larger death counts. Our model assumes a homogeneously mixing population. Yet, real-life infection spreading is heterogeneous, with some sub-regions of cities or neighbourhoods more heavily affected than others, as well as demographic or geographical groups mixing more strongly internally than between each other. As a result of these heterogeneities, susceptibility-mediated effects on epidemic dynamics can occur at lower exposure levels than would be required for a homogeneously mixed and exposed population 29 . As a consequence, our absolute estimates www.nature.com/scientificreports/ of remaining susceptibilities in each region may need to be interpreted as effective values that cannot be used to directly infer cumulative fractions of the populations having been infected with the virus. For example, the figures for S reported in Table 1 cannot be read as 19% (45% ) of the population in London (rest of England) have been exposed to SARS-CoV-2. Instead the observed epidemic evolution in the heterogeneous population follows a pattern that would be expected for a hypothetical homogeneous population in which the exposure rates had reached these levels. The ratios of effective exposures between the areas are likely to represent the true exposure ratios. For example, we find that the fraction of the population that has been exposed to SARS-CoV-2 is 2.4 times greater in London than in the rest of England. For comparison, the REACT2 30 study of antibody prevalence finds a very similar ratio of 2.8 between London and England (excluding London) exposure. Similar calculations for Madrid/Spain and Lombardy/Italy yield ratios of 3.4 and 5.3, respectively, which are consistent with the values of 3.3 31 and 5.0 32 produced by each country's corresponding national seroprevalence studies from the relevant time period. While our absolute effective exposure levels are higher than the actual levels, the REACT2 study could underestimate current immunity levels by a factor of two or higher, as T-cell based immunity can exist with negative laboratory antibody response 10 . This factor of underestimated population immunity levels is time-dependent, as waning of antibody levels cannot be assumed to coincide with or be proportional to the timescales of waning immunity. The use of low blood volume finger-prick rather than standard high-volume samples derived from venous puncture, as well as testing bias introduced through inhomogeneous test responsiveness in the selected test population (albeit corrected for through post-stratification), are further limitations of the REACT2 study. The intrinsic underestimate of the remaining susceptible fraction of the population in our approach and the systematic overestimate of antibody serology studies such as the REACT2 study show that best estimates of the true value of S will benefit from a combination of both methods. While both methods are likely to provide more reliable quantitative information on relative values of S between different regions, a combined approach based on both population statistics and sufficiently large-scale serosurveys could be helpful deriving robust absolute values of S. Our modelling also assumes lasting immunity following acute infection, in keeping with normal immunological responses to viral infections 33 . Yet, for the purposes of our projections, immunity need only be preserved for as long as infection rates are uncontrollable. The level used for 'controllable disease' was chosen in accordance with data from countries achieving containment, and assumes a goal of elimination utilising, among other measures, tracking and tracing. It is possible that technological improvements will allow a successful track-and-trace strategy at higher infection levels. Forward-projecting continuations of modest relaxations of restrictions appears to make containment achievable, but observed slow down in the decline of death rates indicates that this will be reached later than would have occurred under more stringent restrictions. Mass vaccination programmes, not yet in existence during the timeframes studied here, can positively influence the infection dynamics. Our approach can be adapted to incorporate these programmes, as vaccinations further reduce S. In the simplest case, the initial value of S is adjusted to the size of the non-vaccinated population. As long as large numbers of vaccinations occur amidst active infection dynamics, the actual rate equations need to be modified. The key parameters of an epidemic can be extracted from epidemiological data (infection, hospitalisation or death rates), as long as populations are sufficiently large for demographic variations to average out, reported data are reliable, and interventions are relatively uniformly implemented. The similar observed exponential growth illustrates that this condition is fulfilled for the four countries of our study (the United States is an exception in some respects due to the heterogeneity of interventions and population behaviour). A key quantity is the time evolution of number of infected individuals. Due to insufficient quantity and quality of testing however, reported daily deaths D(t) are likely a more reliable measure of COVID-19 pandemic dynamics than the number of confirmed cases. The infection fatality rate (IFR) µ , assumed to be a constant with time, is the probability that an infected individual dies from the disease and therefore validates D(t) as a proportional measure of infections, and our main findings do not depend on knowing the exact value of µ. For our investigated areas, D(t) is obtained from Refs. 14,21 . Data for metropolitan regions is directly extracted from these sources, whereas data for the remainder of the country is derived as the national minus metropolitan data. Finally, data are normalised to the corresponding populations in each region 34 . The initial growth of infections I is observed to be exponential in time I(t + �t) = I(t)e �t/τ i , demonstrated in Fig. 1A , when the population is assumed to be fully susceptible, and the mean time constant τ i characterising this initial growth is measured to be τ i = (3.4 ± 0.2) days across the four countries. The serial time T s is defined as the average time between primary and secondary infections, and in addition the average number of secondary cases arising from a typical primary case is known as the basic reproduction number R 0 = β/γ . This means that, on average, after a time period T s has elapsed the fraction of the population infected is scaled up by a factor R 0 , and therefore I(t + T s ) = I(t)e T s /τ i = R 0 I(t) , which leads to R 0 = β/γ = e T s /τ i , providing a link between the observed growth rate and the epidemiological rates. Using the clinically determined serial time T s = (4.0 ± 0.4) days 35,36 , we find R 0 = e T s /τ i = (3.2 ± 0.3) , which is consistent with the literature 6, 37, 38 . It follows that the contact rate is β = (0.43 ± 0.02) d −1 in an unconstrained population, and the recovery rate is γ = (0.130 ± 0.013) d −1 , corresponding to a mean infectious period of (7.7 ± 0.7) days, again consistent with the literature 39, 40 . Turning now to the exponential decay phase, if the fraction of population that is susceptible S is instead different from 1, and has some other constant value, then the contact rate is simply modified accordingly and the final characteristic exponential decay time is given by www.nature.com/scientificreports/ provided β f is also constant (for example during a period with little or no change in public behaviour). This allows us to examine the ratio of susceptible populations between a metropolitan region (m) and the rest of its corresponding country (c) through the relation by measuring the time constants of the exponential declines. If β f is further assumed to take the same absolute value in the two areas, this equation simplifies to providing a direct way to estimate the ratio of relative susceptibilities in each area. Variations in the stringency of country-wide public health interventions among countries have been extensively analysed by the Oxford group 17 . These data are not available at a regional (sub-country) level, and are not a direct measure of actual public behavior. We use cellular device location data 19 to overcome both these limitations, and to evaluate the effect of PHIs. Such data are available for several high infection risk location categories, including retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential. Progressively introduced use of personal protective equipment, in particular face masks, will generally play a role in reducing β but in the time frame under investigation here this is not likely to be important as recommendations and significant population uptake occurred later in the course of the epidemic. We plot the cellular location data to show the relative change compared to the baseline value from the first five weeks of 2020 for each of the eight areas studied. Due to similarities of the relative timings and changes in magnitude of these measures, in principle any of them could be used as proxy for β(t) . Based on the comparably low noise level as well as the consistent and clear functional shape of the location data at public transit stations, we chose these data for our analyses. Transit data, shown in Fig. 2 (lower panels), includes use of subway stations, taxi stands, sea ports, and other travel-related locations. Empirically we find that the transit station location data are well described by a sigmoid function with a zero reference level (by definition the pre-pandemic normal level for early times t ≪ t 0 ). The remaining fitting parameters characterise the magnitude of the response ( TS l ), the time ( t 0 ) when the public behaviour change happened and time duration ( TS w ) over which the change occurred. As a comparable measure of the time when a significant reduction of β is expected to occur, we choose the time t 50% when TS(t) is reduced by 50% from normal use as determined by the fit for each area, with the error determined as the one standard deviation functional prediction interval. As the time evolution of D(t) follows exponential trajectories (growth and decline) during periods of (near) constant parameters β , γ and S, we display D(t) on a logarithmic scale and apply fit functions to D log (t) = log[D(t)] to avoid heteroscedastic bias. On the logarithmic scale, the exponential growth and decline phases are transformed to increasing and decreasing linear slopes, respectively. Therefore, as a suitable model function for fitting we use which represents an interpolation between an increasing linear function with slope D u for t ≪ t 0 , and a decreasing linear function with slope D l for t ≫ t 0 . The temporal width of the transition between them is characterised by D w . This function corresponds to an analytically-integrated sigmoid (which itself is an interpolation between two constant values, representing the constant slopes). We then fit this model to the logarithm of the data in two steps: first, we individually fit the data at early and late times to linear functions using ordinary least-squares regression to obtain D u and D l , respectively. The results of this stage are shown as the dashed lines in Fig. 2 (upper panels) . These values are then used as constrained parameters of the full function Eq. (5) in a second fitting stage, the result of which is indicated by the solid lines in Fig. 2 (upper panels) . We use the full fits to extract both the peak value of the fit to D(t) and the time t 10% when D(t) reaches 10% of the value that would have been reached at the same time if the exponential growth trajectory had continued, indicated by the dotted lines in Fig. 2 (upper panels) . This is a measure of when the dynamics has significantly departed from the exponential growth regime, and begins to transition to a decline. This point is located close to the peak, but is not biased by the slope of the exponential decline in the same way that the peak time is. The error of t 10% is determined again as the one standard deviation functional prediction interval. For all fitting, we restrict data to a time window starting with exponential growth and ending just before the introduction of the first relaxation in PHIs where an uptick in TS(t) is observed (mid-May 2020). We use the relative delay t = t 10% − t 50% as a measure of the influence of public behaviour change on β and the rate of new exposure to the virus. The expectation for a PHI-driven scenario, versus an effect of a reduction in S, is that t is comparable to the typical time from exposure to death, which is on the order of 20 days 20,24 . (1) www.nature.com/scientificreports/ Corrected death counts. For COVID-19 dynamics, the attributed daily death rates are a more reliable metric than the number of confirmed cases, as they are not dependent on testing practices, and should be less prone to under-reporting. However, daily death rates are still subject to variations in reporting methods and in how each region defines what is classified as a COVID-related death 41 . Alternatively, excess deaths can be calculated by examining how many additional deaths (from all causes) have occurred above a baseline value that would be expected for the same time of year had the epidemic not happened 13 . While excess deaths statistics include deaths indirectly related to COVID-19 (for example, 'collateral' deaths due to healthcare systems being overwhelmed, or reduced visits to emergency departments), and the baseline value for expected deaths may not exactly reflect the current situation 42 (for example, in lockdown conditions the number of road traffic accidents will be lower than typical historical values), the number of excess deaths is nevertheless widely agreed to be the most reliable indicator to reflect the state of the epidemic, and alleviates some of the shortcomings associated with the reported daily deaths data 41, 43 . We compare the time evolution of D(t) (aggregated to weekly level) to that of weekly excess deaths during the spring of 2020 relative to the median value of the historical data up to the past five years, where available 44 . Accounting for the delay that it takes in processing and registering the deaths in each case, we apply a constant correction factor to the reported daily deaths. These factors faithfully reflect the registered excess deaths, and by area were 1.9 (Italy), 2.2 (England), 1.4 (Spain), 1.5 (US), and we found that these same factors applied equally well to each of the corresponding metropolitan regions, respectively. For the purpose of fitting analysis, we continue to work with the scaled reported deaths because these data are available at a daily granularity, whereas the excess deaths are only provided weekly. The scaling of the data is shown in Fig. 3 (inset) for the example of England. Substantial efforts have gone into describing and predicting epidemic dynamics of many diseases, especially of COVID-19, through Monte Carlo-type discrete [45] [46] [47] or continuous differential equation models 48, 49 . Typically, these models require many input parameters such as fine-grained demographic and populations' behavioural details. They are often critically dependent on these parameters with corresponding large variations between studies in terms of predicted outcomes and inevitable divergence from actual observations made after the forecasts 50 . In our work we instead aimed to understand the reduction of S and its impact in a largely model-independent way. In order to still derive quantitative trends and estimates, we chose the SIR model as the simplest epidemic model that takes time-dependent susceptibility (immunity) into account. The well-established SIR framework is a deterministic model using a system of coupled ordinary differential equations, and is based on the compartmentalisation of populations into the fraction of individuals who are susceptible, S, infectious, I, and no longer susceptible (removed), R. The R compartment includes individuals who have recovered from disease and survived, along with those who have died. The rate coefficients β and γ describe the flow from the S to the I and from the I to the R compartments, respectively, and the evolution of the system is described by the set of differential equations where the dot indicates a derivative with respect to time. These equations have previously been solved analytically for a time-independent β 51 . Here we do not include any background loss rate from disease-induced mortality, since its effect would be negligible compared with the epidemic rates -particularly in the early phase of the outbreak studied in this paper, when the total number of deaths is relatively small compared with the overall population size. Since public policy interventions and public behaviour changes (whether or not they are influenced by these interventions) affect the value of β , we introduce a time dependence of that parameter into the model β = β(t) . Since no known intervention affects the duration of infectiousness, we treat γ as constant. Following the empirically found functional shape of the public response as representatively measured through use of public transit, we also model β(t) as a sigmoid with The shape of this function implies that there was only one period of relevant change to β from its unrestricted initial value β i to a later final value β f . This is justified as long as public policy relaxations do not take effect on D(t) and as long as compliance with them remains high (constant). We therefore restrict our main analysis to times that are not affected by such changes and only consider D(t) data until the end of May 2020, as no significant corresponding changes in TS(t) are discernible prior to early-May (considering the delay between exposure and death). This simple model is based on several idealised assumptions. It does not account for inhomogeneity of the population or of the mixing dynamics 29 , and so stochastic, regional and demographic effects are averaged out to effective parameters. In particular, it is known that µ is a parameter whose value strongly increases with age, but also with strong dependence on health status 52 . This means that not only the demographic composition of an (6) S = −βSI www.nature.com/scientificreports/ investigated population needs to be considered, but also the dynamics of epidemic spread relative to it (e.g. the timing of seeding the virus in settings or population groups with specific demographic makeup such as schools, high-frequency business travellers or care homes). As a consequence, the infection fatality rates reported in the literature have varied, with some convergence in the range of 0.5-1% 53 . Findings outside this range with potential country to country variations may still occur and be meaningful as effective average figures. Possible changes of µ(t) with time during the course of the epidemic have not been considered here, but are not likely to be important over the period studied in light of the high quality of exponential fits to the D(t) data. Since S is generally time-dependent over the course of the epidemic, we use a standard Runge-Kutta iterative algorithm to calculate numerical solutions of the system of differential equations [6] [7] [8] , and incorporate a time dependence to β(t) using Eq. (9) . A cost function is generated, derived from the mean-squared deviation between the numerical solutions for the epidemic trajectory and the observed data D log (t) , in each geographical area considered. The individual cost functions for the metropolitan region and corresponding country are combined (with equal weighting) to simultaneously respect both sets of data in each pair, and a Nelder-Mead based optimisation method is then used to minimise this total cost function. This procedure provides our best-fit parameters for the epidemic dynamics. Additionally in this process, coupling is introduced between each metropolitan region and its corresponding country by constraining the values of β(t) over time to be equal between the areas in each pair, as informed by the location data (Figs. 1B and 2 lower panels). Constraining the values of β in the metropolitan region and the rest of the country to be equal [β m (t) = β c (t)] is based on the assumption that public health interventions have equal effects on each area. This is supported by the observation that the time constants for initial exponential growth and the public behaviour changes (in timing and magnitude) are very similar for countries and their metropolitan regions following nationally homogeneously-imposed public health interventions, with the exception of the United States as already noted. The initial value of β i is determined by the previously determined (see above) exponential growth phase time constant. Free optimisation parameters then relate to timing of epidemic seeding and the nature of the change in β (namely t int and t w ). To account for demographic differences, we use the distributions of ages from population statistics in each of the eight geographical areas along with the reported age-dependence of the IFR 24,52 to calculate the average apparent IFR for each metropolitan region ( µ m ) and rest of country ( µ c ). Then, rather than constrain our model by exactly these determined IFR values (which may be prone to systematic bias in the clinical data), we instead couple the fitting procedure between metropolitan region and rest of country by prescribing the ratio µ m /µ c in each case. The obtained ratios used as constraints for each fit were 0.664 (London/England), 0.985 (Lombardy/Italy), 0.887 (Madrid/Spain), and 0.854 (New York/US), and so the metropolitan regions have a generally younger population structure (and therefore lower effective infection fatality rate). Finally, the absolute values of µ are then returned as fit parameters (see Supplementary Table S1) , and are consistent with those reported in the literature 53 . An exception is the higher value obtained for Italy of µ = 1.95% , which may be attributed in part to a comparably older population structure, and potentially the increased impact on healthcare capacities. The model can be extended by incorporating further changes in β(t) as public policy interventions and/or compliance and public behaviour change. For the example of England and London we note that a slight increase of β c = β m = 0.134 day −1 is consistent with the observations of D(t) from mid-May until the end of June 2020. The continued match of the data with equal values of β(t) while the exponential decay constants differ between London and the rest of England further strengthens the hypothesis that public policy interventions had an equal effect on β across both areas. In this study we have collated epidemiological and location data from a variety of sources, which are all publicly available. Details of all data sources used, as well as codes written for analysis in the paper, can be found at https:// github. com/ tjb36/ Barre tt2021_ Covid_ Susce ptibi lity_ Impact. Outbreak of pneumonia of unknown etiology in Wuhan, China: The mystery and the miracle Undiagnosed pneumonia -China (HU): Request for information A novel coronavirus from patients with pneumonia in China World Health Organization COVID-19 control in China during mass population movements at New Year Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe Fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the SARS-CoV-2 epidemic Variation in false-negative rate of reverse transcriptase polymerase chain reaction-based SARS-CoV-2 tests by time since exposure Are seroprevalence estimates for severe acute respiratory syndrome coronavirus 2 biased Robust T cell immunity in convalescent individuals with asymptomatic or mild COVID-19 Longitudinal observation and decline of neutralizing antibody responses in the three months following SARS-CoV-2 infection in humans AbC-19 rapid test" for detection of previous SARS-CoV-2 infection in key workers: test accuracy study Interpreting and using mortality data in humanitarian emergencies: a primer for non-epidemiologists. Humanitarian Practice Network Paper COVID-19 Open-Data: Curating a fine-grained, global-scale data repository for SARS-CoV-2 (2020) Public health interventions and epidemic intensity during the 1918 influenza pandemic A global panel database of pandemic policies (Oxford COVID-19 government response tracker) OpenTable at Booking.com. The state of the restaurant industry Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data Have UK cities been hotbeds of the COVID-19 pandemic? Centre For Cities (2020) Urban Density is Not an Enemy in the Coronavirus Fight: Evidence from China Estimates of the severity of coronavirus disease 2019: A model-based analysis Inferring the effective fraction of the population infected with COVID-19 from the behaviour of Lombardy, Madrid and London relative to the remainder of Italy A contribution to the mathematical theory of epidemics The mathematics of infectious diseases Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report, week A mathematical model reveals the influence of population heterogeneity on herd immunity to SARS-CoV-2 SARS-CoV-2 antibody prevalence in England following the first peak of the pandemic Prevalence of SARS-CoV-2 in Spain (ENE-COVID): A nationwide, population-based seroepidemiological study Preliminary Results of the Investigation on SARS-CoV-2 Seroprevalence Duration of humoral immunity to common viral and vaccine antigens Serial interval of COVID-19 among publicly reported confirmed cases Serial interval of novel coronavirus (COVID-19) infections Estimate of the basic reproduction number for COVID-19: A systematic review and meta-analysis Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia Virological assessment of hospitalized patients with COVID-2019 Criteria for releasing COVID-19 patients from isolation: Scientific Brief Evaluating data types: A guide for decision makers using data to understand the extent and spread of COVID-19 Excess mortality estimation during the COVID-19 pandemic: Preliminary data from Portugal A pandemic primer on excess mortality statistics and their comparability across countries Available at: https:// github. com/ Finan cial-Times/ coron avirus-excess-morta lity-data An improved mathematical prediction of the time evolution of the COVID-19 pandemic in Italy Monte Carlo simulations and error analyses Early dynamics of transmission and control of COVID-19: A mathematical modelling study Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts A conceptual model for the coronavirus disease 2019 (COVID-19) outbreak in wuhan, China with individual reaction and governmental action Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy Why is it difficult to accurately predict the COVID-19 epidemic? Exact analytical solutions of the susceptible-infected-recovered (SIR) epidemic model and of the SIR model with equal death and birth rates Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand A systematic review and meta-analysis of published research data on COVID-19 infection fatality rates P.K. and K.P. conceived the study; All authors designed the study; T.B. collated, processed, and fitted the data; T.J. implemented the software for the numerical models and fitted to data; All authors analysed the data for important intellectual conclusions; All authors wrote, reviewed, and revised the submitted manuscript. The authors declare no competing interests. The online version contains supplementary material available at https:// doi. org/ 10. 1038/ s41598-021-91247-7.Correspondence and requests for materials should be addressed to P.K.Reprints and permissions information is available at www.nature.com/reprints.Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.