key: cord-0284486-b1kp5vs1 authors: Delva, W.; Dominic, E.; Beauclair, R.; Ouifki, R.; Kroon, M.; Volmink, H.; Dorkin, E. title: Four-week forecasts of COVID-19 epidemic trajectories in South Africa, Chile, Peru and Brazil: a model evaluation date: 2021-09-12 journal: nan DOI: 10.1101/2021.09.06.21263151 sha: 24161a1db165357b47976f7ef759adb688956c60 doc_id: 284486 cord_uid: b1kp5vs1

Introduction From the beginning of the COVID-19 pandemic, epidemiological models have been used in a number of ways to aid governments and organizations in efficient planning of resources and decision making. These models have elucidated important epidemiological transmission parameters, in addition to making short-term projections.

Methods We constructed a compartmental mathematical model for the transmission, detection and prevention of SARS-CoV-2 infections for regions where Anglo American has mining operations. We fitted the model to publicly available data and used it to make short-term projections. Finally, we evaluated how the model performed by comparing short-term projections to actual confirmed cases, retrospectively.

Findings The average forecast errors for four-week-ahead projections ranged between 1% and 8% in all the countries and regions considered in this study. All but one region had more than 75% of the true values falling within the range of four-week-ahead projections. The quality of the projections improved with time as expected due to increased historical data.

Conclusion Our model produced four-week forecasts with a sufficiently high level of accuracy to guide operational and strategic planning for business continuity and COVID-19 responses in Anglo American mining sites.

Since December 2019, the COVID-19 pandemic has done untold damage to the health of individuals, healthcare systems, and economies across the world. As of 23 August 2021, the WHO estimated over 211 million confirmed cases and 4.4 million deaths [1].
These numbers continue to compound daily as more infectious variants of the SARS-CoV-2 virus spread rapidly through populations. Thus, countries across the world have adopted different measures to mitigate the spread of infection and prevent an overload of health care systems. The non-pharmaceutical interventions (NPIs) that have been proposed and used include travel bans, domestic movement restrictions, social distancing, contact tracing of infected cases, self-isolation of symptomatic people, mandatory use of face masks in public, and shielding of high-risk populations. Vaccines have been rolled out rapidly over the past 8 months, but vaccine coverage remains highly unequal across countries, with poorer countries in the global South lagging far behind [2].

From the beginning of the pandemic, epidemiological models have been used in a number of ways to aid governments and organizations in efficient planning of resources and decision making [3, 4]. They have been instrumental in comparing the relative impacts of different interventions (e.g. social distancing, restrictions on air travel, school closures, use of face masks, etc.) on the trajectory of the epidemic [3, 5-10]. More recently, models have explored a wide range of biological and behavioural [...]

We developed a compartmental model for the transmission and detection of SARS-CoV-2 infections. Figure 1 shows the health states and possible transitions between them. The model assumes that all people except for the seed infections are initially susceptible to SARS-CoV-2 infection. Upon infection, they transition to become "Exposed (E_u)" but not yet infectious and not yet symptomatic. After a few days, they transition to the pre-symptomatic state (I_1u) during which they are still asymptomatic, [...] (stratified by severity), diagnosed and undiagnosed cases as well as the number of hospitalisations. In this paper, we only focus on projected COVID-19 cases.
The model was re-calibrated once every month and the model projections updated. We evaluated the performance of the model in making short-term projections using the commonly used summary metrics described below.

To assess the difference between the model projections and the observed case counts, we computed the mean absolute percentage error (MAPE), a summary measure of forecast accuracy that expresses the forecast error as a percentage. MAPE is defined as:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|$$

where $y_t$ and $\hat{y}_t$ denote the truth and the projected value at time $t$ respectively, and $n$ is the number of projections. The smaller the MAPE, the better.

To measure the sharpness of the model, defined as its ability to generate predictions within a narrow range of possible values, we used the median absolute deviation (MAD), a common statistic for measuring the spread of a set of data. It is a property of the projections only and is defined as:

$$\mathrm{MAD}_t = \operatorname{median}_i \left( \left| \hat{y}_{it} - \operatorname{median}_j (\hat{y}_{jt}) \right| \right)$$

where $\hat{y}_{it}$ denotes the projected value from each calibration at time $t$. A small MAD value means that the model is sharp while a large value means that the model is blurred. We also computed the percentage of true values contained in the interquartile range and the percentage contained within the range of all projections (between the minimum and maximum projections).

The calibrated models produced projections of cumulative cases of COVID-19. For all the regions, we conducted four-week-ahead forecasts. Figure 2 shows the observed and predicted COVID-19 cases together with prediction intervals for different regions for each prediction period. The first forecast period covered September 9 to October 6, 2020. The average forecasts for South Africa and Limpopo were very close to the observed data and the range of projections produced was narrow.
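The evaluation metrics described in the Methods (MAPE, MAD, and the percentage of true values covered by the projections) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the layout of `projections` (one list per calibration run, one value per time point), and the use of the "inclusive" quartile method are all assumptions.

```python
from statistics import median, quantiles


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    errors = [abs(y - yhat) / abs(y) for y, yhat in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


def mad_per_time(projections):
    """Median absolute deviation of the projections at each time point.

    projections[i][t] is the value projected by run i at time t
    (hypothetical layout chosen for this sketch).
    """
    result = []
    for values in zip(*projections):  # all runs at one time point
        centre = median(values)
        result.append(median([abs(v - centre) for v in values]))
    return result


def coverage(y_true, projections, interquartile=True):
    """Percentage of true values inside the IQR (or the min-max range)."""
    hits = 0
    for y, values in zip(y_true, zip(*projections)):
        if interquartile:
            # Quartile convention is an assumption; the paper does not state one.
            q1, _, q3 = quantiles(values, n=4, method="inclusive")
            lower, upper = q1, q3
        else:
            lower, upper = min(values), max(values)
        hits += lower <= y <= upper
    return 100.0 * hits / len(y_true)


# Toy numbers, invented purely for illustration.
runs = [[10, 100], [12, 110], [14, 120], [16, 130], [18, 140]]
truth = [13, 135]
iqr_cover = coverage(truth, runs)                          # share inside the IQR
range_cover = coverage(truth, runs, interquartile=False)   # share inside min-max
```

Computing the MAD per time point, as here, makes it possible to see sharpness change along the forecast horizon, which is how the results discuss it.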
During the second and third forecast periods, covering October 14 to November 10, 2020 and November 4 to December 2, 2020 respectively, the actual numbers of confirmed COVID-19 cases were contained in the prediction intervals in most of the regions. The prediction intervals were generally narrow as well. The model performed poorly in predicting a new wave of infection, as seen in the case of South Africa during the fourth forecast period, which covered December 14, 2020 to January 11, 2021. The model failed to accurately predict the increase in cases during the second wave of infections in December, as seen in the plots for South Africa, Limpopo, Northern Cape and North West. The fifth forecast period covered January 13 to February 9, 2021. During this forecast period, the actual numbers of confirmed COVID-19 cases were contained in the prediction intervals in all of the regions. However, in some regions such as Chile and Limpopo, the mean estimates for the confirmed cases were far from the observed cases.

Figure 3 shows the overall model performance for different regions. In all the regions, the forecast error increased as the forecast horizon increased. Nevertheless, the overall percentage forecast error for the four-week-ahead forecast was below 5% in all the regions besides Limpopo and Northern Cape. At one-week-ahead, most models had between 40% and 60% of the true values contained in the interquartile range. Peru had 100% of the true values contained in the interquartile range for one-week-ahead forecasts while Chile had the smallest percentage of true values falling within the interquartile range (35%). In most of the regions, the percentage of true values contained in the interquartile range decreased with time. All the regions had 60% or more of the true values falling within the range of model projections for the four-week-ahead projections. Brazil's model was the most blurred because it had the largest median absolute deviation values across the forecast horizon.
This model exhibited a rapid decrease in sharpness as the forecast horizon increased. Table 1 in the supplementary information section provides more information on the overall performance of the model.

Next, we analysed the model performance at different calibration time points for all the regions (Figure 4). The percentage forecast error for all the regions and all calibration time points was below 22%. In the first three rounds of forecasts, the forecast error decreased with time, but it then started increasing due to the second wave of infections. The percentage of true values contained in the interquartile range does not seem to have improved in subsequent forecasts, although there was variation between regions. Table 2 in the supplementary information section shows the accuracy metrics for each region and calibration origin.

medRxiv preprint (not certified by peer review), made available under a CC-BY 4.0 International license. The author/funder has granted medRxiv a license to display the preprint in perpetuity.

The COVID-19 pandemic has continued to spread rapidly throughout the world. Several models have been developed to study the transmission dynamics and make short-term projections into the future about expected cases, hospitalisations and deaths [16-19]. We developed a compartmental model and calibrated it to publicly available data. We assumed that these data are accurate and reliable. The main objective here was to evaluate how the model performed when comparing short-term projections to actual confirmed cases, retrospectively. Our model performed very well in producing short-term projections, with reported case counts falling within the range of projections.
We obtained an overall forecast error of below 8% on average for four-week-ahead projections in all the countries and regions. As expected, the performance of the projections declined as the forecast horizon increased. This is because many processes that shape the epidemic continue to evolve as the disease evolves, most crucially processes at the molecular level (viral evolution, giving rise to viral variants with substantially different infectivity levels) as well as socio-political processes at the population level (government decisions to tighten or relax lockdown regulations). Neither of these sets of processes was explicitly captured by the model.

As the epidemic continues to evolve, more data will become available. The model's parameter estimates will become more accurate, and the model-based inference will be less affected by parameter uncertainty. One model-based epidemiological metric that may prove useful for continuous monitoring and evaluation of the COVID-19 response is the ratio of the moving averages of newly infected and newly recovered patients. An increase in this metric would indicate that the epidemic is getting worse, while a decrease would indicate the opposite. Resurgence of the second wave of infections in different regions resulted in poor forecasts. Future improvements to the model involve redefining the effective contact rate (b, FOIadjust). As lockdown regulations are loosened, we expect more contacts and hence a higher likelihood of increased transmission.

Our model had an explicit observation process built into it.
This was done for two reasons. Firstly, it allowed us to calibrate the model to the available empirical data, since those data are based upon laboratory-confirmed infections, which by definition exclude undiagnosed infections. Secondly, it will allow us to model the effect of interventions (e.g. amplified screening initiatives) that might change how many cases get diagnosed (i.e. what we observe in the real world). However, the challenge is that to date, there is very little literature on epidemiological parameters of those who are undiagnosed versus diagnosed. This influenced the uncertainties for relative infectiousness. For example, we assumed that being diagnosed likely makes a person relatively less infectious, because knowledge of infection would cause a case to modify their behaviour, resulting in contact with fewer people through self-isolation. However, it is possible that diagnosed people with severe symptoms could be better at infecting others, since those who remain undiagnosed likely have milder symptoms, which is suggestive of lower viral loads [30]. A related challenge is that we have almost no direct evidence to inform the various detection rates (parameters d_e, d_1, d_2, d_m and d_s) in the model. That in turn makes it virtually impossible to estimate what the impact of intensified contact tracing or screening might be. Relevant improvements in the South African SARS-CoV-2 surveillance system would therefore be immensely helpful in reducing parameter uncertainty and subsequently the uncertainty around model-based estimates and projections.

Our assumptions about the average time spent in each of the different compartments may also require future adjustments. Published estimates of duration spent in hospital or ICU were not specific enough for our purposes. For instance, for ρ and δ_c, we used published estimates of length of stay in ICU.
The estimate on which we based our priors pooled people who may have died in ICU with those who were discharged back to a non-ICU hospital bed once they started to get better. Because of this, we made the assumption that those who do not die spend a longer time in ICU than those who do. This may affect how quickly all hospital and ICU beds fill up and how long they stay occupied, which in turn may affect the total number of deaths resulting from the model.

The results presented here may also be influenced by the assumptions we made regarding disease severity, particularly with regard to the proportion of cases that remain completely asymptomatic throughout the duration of infection. It has been estimated that as many as 70% or more of infections [31, 32] could be asymptomatic. However, these studies do not follow up cases past the full duration of the incubation period in order to see if symptoms develop later. Here, we assumed that between 20% and 40% would be asymptomatic, based upon a systematic review of studies that estimated the proportion of asymptomatic cases when cases were followed up for at least 14 days to determine their final symptom status [33]. If asymptomatic cases are more common than we assumed, then we may expect fewer people to become cases in our model, since asymptomatic cases are assumed to be less infectious than symptomatic cases.

In conclusion, even though limited data and considerable uncertainty around the transmission dynamics posed constraints to the accuracy and precision of our model forecasts, our model produced four-week forecasts with a sufficiently high level of accuracy to guide operational and strategic planning for business continuity and COVID-19 responses in Anglo American mining sites. Several aspects of the model are being revised, in response to emerging knowledge that was not available at the start of the model development.
In particular, waning immunity after naturally acquired SARS-CoV-2 infection, and the increasing coverage of COVID vaccines, yet with imperfect protection against infection, are important new features that are being added to the model. With respect to model calibration, comparison of smoothed time series of daily new cases and deaths rather than cumulative cases and deaths is being implemented.

pasteur-02548181 (2020).
r_mu: This is the rate at which undiagnosed mildly infected individuals recover
r_h: This is the rate at which hospitalised individuals recover, without going to critical care
r_p: This is the rate at which individuals in the post-critical care ward recover
ρ: This is the rate that those in the critical care ward move to the post-critical care ward
θ: This is the rate that those who are hospitalised move to the critical care ward
FOIadjust: This is the factor by which you multiply the force of infection if you want to decrease the rate at a specific point in time

The model is formally represented by a system of differential equations:

with

$$\beta(t) = b\,(a_{1u} I_{1u} + a_{2u} I_{2u} + I_{mu} + a_{su} I_{su} + a_{1d} I_{1d} + a_{2d} I_{2d} + a_{md} I_{md} + a_{sd} I_{sd})$$

and $V$ representing the number of confirmed cases.

The capacity for hospitals and critical care units to admit patients is not unlimited in South Africa, as in most countries. Defining $v$ as the hospital capacity limit, the effective rate of hospitalisation is $\eta H$ while $H < v$. When $H$ starts to exceed $v$, it becomes $\eta e^{x(1 - H/v)}$, which is $\approx 0$ for large values of $x$. Similarly, the effective rate of critical care admission is $\theta C$ while $C < w$. When $C$ starts to exceed $w$, it becomes $\theta e^{x(1 - C/w)}$.
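The weighted infectious pressure β(t) and the capacity rule for admissions can be illustrated with a short sketch. It is not the authors' implementation. It assumes that η and θ act as per-capita rates that are left unchanged below the ceiling and damped by the exponential factor above it. The function names, the dictionary keys and the value x = 50 are hypothetical choices made for this example.

```python
import math


def beta_t(b, weights, counts):
    """b times the infectivity-weighted sum of infectious compartments.

    weights and counts are dicts keyed by compartment label (e.g. "1u", "mu").
    The mildly symptomatic undiagnosed compartment I_mu carries weight 1.
    """
    return b * sum(weights[k] * counts[k] for k in counts)


def capped_rate(base_rate, occupancy, capacity, x=50.0):
    """Admission rate that collapses once occupancy exceeds capacity.

    Below capacity the base rate applies unchanged. At or above capacity it is
    multiplied by exp(x * (1 - occupancy / capacity)), which equals 1 exactly at
    the limit and tends to 0 quickly for large x (x = 50 is an assumed value).
    """
    if occupancy < capacity:
        return base_rate
    return base_rate * math.exp(x * (1.0 - occupancy / capacity))


# Toy values, invented for illustration only.
pressure = beta_t(0.5, {"1u": 0.5, "mu": 1.0}, {"1u": 10, "mu": 4})
eta_free = capped_rate(0.2, occupancy=50, capacity=100)
eta_full = capped_rate(0.2, occupancy=200, capacity=100)
```

Because the damping factor equals 1 when occupancy equals capacity, the rate is continuous at the limit, which avoids an abrupt jump in a numerical ODE solver.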
The basic reproductive number $R_0$ is given by

where

$$
\begin{aligned}
A ={}& (c_{1su} c_{e1u} d_s \xi_d + \psi c_{1sd} Y_s)\, a_{sd} r_{md} r_{2d} Z_m Z_2 \\
&+ \Big[ \big( \xi_d r_{2d} r_{md} c_{e1u} a_{1u} + ((a_{1d} r_{2d} + a_{2d} c_{12d})\, r_{md} + a_{md} c_{1md} r_{2d})\, \psi \big) Z_m \\
&\qquad + c_{1mu} c_{e1u} d_m \xi_d a_{md} r_{2d} Z_2 + (a_{2d} d_2 + a_{2u} r_{2d})\, \xi_d r_{md} c_{12u} c_{e1u} Z_m \Big] Y_s \\
&+ a_{su} c_{1su} c_{e1u} \xi_d r_{md} r_{2d} Z_m Z_2
\end{aligned}
$$

$$Z_m = r_{mu} + d_m, \qquad Y_s = \eta_u + d_s + \delta_{su}$$

$$\xi_u = c_{12u} + c_{1mu} + c_{1su} + d_1, \qquad \xi_d = c_{12d} + c_{1md} + c_{1sd}$$

$$\psi = c_{e1u} d_1 + d_e \xi_u, \qquad W = c_{e1u} + d_e$$

The effective reproductive number $R_e(t)$ is given by

where

$$
\begin{aligned}
A(t) ={}& (c_{1su} c_{e1u} d_s \xi_d + \psi c_{1sd} Y_s(t))\, a_{sd} r_{md} r_{2d} Z_m Z_2 \\
&+ \Big[ \Big[ \big( \xi_d r_{2d} r_{md} c_{e1u} a_{1u} + ((a_{1d} r_{2d} + a_{2d} c_{12d})\, r_{md} + a_{md} c_{1md} r_{2d})\, \psi \big) Z_m \\
&\qquad + c_{1mu} c_{e1u} d_m \xi_d a_{md} r_{2d} Z_2 + (a_{2d} d_2 + a_{2u} r_{2d})\, \xi_d r_{md} c_{12u} c_{e1u} Z_m \Big] Y_s(t) \\
&\qquad + a_{su} c_{1su} c_{e1u} \xi_d r_{md} r_{2d} Z_m Z_2 \Big] X_d(t)
\end{aligned}
$$

$$B(t) = r_{2d} r_{md} Z_m Z_2 \xi_u \xi_d X_d(t)\, W\, Y_s(t)$$

with

$$X_d(t) = \eta_d(H(t)) + \delta_{sd}(H(t)), \qquad Y_s(t) = \eta_u(H(t)) + d_s + \delta_{su}(H(t))$$

World Health Organization.
Weekly Operational Update on COVID-19 (As of
Wrong but Useful – What Covid-19 Epidemiologic Models Can and Cannot Tell Us
How simulation modelling can help reduce the impact of COVID-19
Social distancing strategies for curbing the COVID-19 epidemic
Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period
Effectiveness of isolation, testing, contact tracing and physical distancing on reducing transmission of SARS-CoV-2 in different settings
Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe
Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts
Response strategies for COVID-19 epidemics in African settings: a mathematical modelling study
Vaccine nationalism and the dynamics and control of sars-cov-2
Model-informed COVID-19 vaccine prioritization strategies by age and serostatus
Modeling the impact of racial and ethnic disparities on covid-19 epidemic dynamics
The importance of non-pharmaceutical interventions during the COVID-19 vaccine rollout
COVID-19 vaccines that reduce symptoms but do not block infection need higher coverage and faster rollout to achieve population impact
Predictive Mathematical Models of the COVID-19 Pandemic: Underlying Principles and Value of Projections
Data analysis and modeling of the evolution of COVID-19 in Brazil
Modeling projections for COVID-19 pandemic by combining epidemiological, statistical, and neural network approaches
Change in global transmission rates of COVID-19 through
On the reliability of predictions on Covid-19 dynamics: A systematic and critical review of modelling techniques
Forecasting for COVID-19 has failed
Why is it difficult to accurately predict the COVID-19 epidemic?
Structural identifiability and observability of compartmental models of the COVID-19 pandemic
At a glance: Anglo american
Use of available data to inform the covid-19 outbreak in south africa: A case study
An interactive web-based dashboard to track covid-19 in real time
Inferring coalescence times from dna sequence data
Covid-19 mortality underreporting in brazil: analysis of data from government internet portals

a_1d: Relative infectivity of diagnosed, pre-symptomatic people I_1d
a_1u: Relative infectivity of undiagnosed, pre-symptomatic people I_1u
a_2d: Relative infectivity of diagnosed, asymptomatic people I_2d
a_2u: Relative infectivity of undiagnosed, asymptomatic people I_2u
a_md: Relative infectivity of diagnosed people with mild symptoms I_md
a_sd: Relative infectivity of diagnosed people with severe symptoms I_sd
a_su: Relative infectivity of undiagnosed people with severe symptoms I_su
b: Effective contact rate, which encompasses all of the biological and behavioral considerations that influence contacts between individuals that lead to transmission
c_12u: Inverse of the average stay in the I_1u state for undiagnosed and asymptomatic people
c_1mu: Inverse of the average stay in the I_1u state for undiagnosed and mildly symptomatic people
c_1su: Inverse of the average stay in the I_1u state for undiagnosed and severely symptomatic people
c_ceil: This is the maximum number of people who can be in the critical care facility at a single time step
c_e1u: Inverse of the average stay in the E_u state (undiagnosed and infected but not yet infectious)
d_1: Inverse of the average time for people in the I_1u state (pre-symptomatic) until they get diagnosed with coronavirus infection, while still pre-symptomatic
d_2: Inverse of the average time for people in the I_2u state (infectious but asymptomatic) until they get diagnosed with coronavirus infection, while still infectious and asymptomatic
d_m: Inverse of the average time for people in the I_mu state (mildly symptomatic) until they get diagnosed with coronavirus infection, while still infectious and mildly symptomatic
d_s: Inverse of the average time for people in the I_su state (severely symptomatic) until they get diagnosed with coronavirus infection, while still infectious and severely symptomatic
dd_f: This is the fraction of deaths that happened outside of the hospital where COVID-19 is identified as the cause of death
δ_su: Inverse of average time till death for people with severe symptoms I_su who die without any hospitalisation
δ_c: Inverse of average time till death for people who die while admitted to critical care
δ_h: Inverse of average time till death for hospitalised people who die without any stay in critical care
δ_h adjust: This is the factor by which you multiply the δ_h parameter if you want to decrease the rate at a specific point in time
η_u: Inverse of average time till hospitalisation for undiagnosed people with severe symptoms I_su
h_ceil: The maximum number of people who can be in the hospital at a given time
hd_f: The fraction of all people who go to the hospital with COVID-19 that are actually diagnosed with COVID-19
r_2u: Inverse of the average stay in the I_2u state (asymptomatic, undiagnosed)
r_mu