key: cord-0858263-v2nqi4fq authors: Brown, P. E.; Greenwald, Z. R.; Salinas, L. E.; Martens, G. A.; Newcombe, L.; Rodriguez, P. S.; Gelband, H.; Veillard, J.; Jha, P. title: Mortality from COVID in Colombia and Peru: Analyses of Mortality Data and Statistical Forecasts date: 2020-08-26 journal: nan DOI: 10.1101/2020.08.24.20181016 sha: 0545e44b0009d5597035fe2886cd8b8d316594a1 doc_id: 858263 cord_uid: v2nqi4fq National predictions of the course of COVID mortality can be used to plan for effective healthcare responses as well as to support COVID policymaking. We developed the Global COVID Assessment of Mortality (GCAM), a statistical model with continually improving precision that combines actual mortality counts with Bayesian inference, to predict COVID trends, currently until December 1, 2020. In Colombia, the GCAM analysis found the peak of COVID mortality around August 12 and an expected total of COVID deaths of 24,000-31,000, or 48%-92% over the total through August 21. In Peru, a first mortality peak occurred around May 24, and given the current trajectory, a second peak is predicted around September 6. Peru can expect 29,000-43,000 COVID deaths, representing an increase of 7%-55% over COVID deaths through August 21. GCAM projections are also used to estimate medical surge capacity needs. To gauge the reliability of COVID mortality forecasts, we compared all-cause mortality from January through June 2020 with average all-cause mortality in previous years in Colombia and Peru, and found that the excesses were consistent with GCAM forecast, most notably a doubling of overall mortality from May 25-June 7th of weeks in Peru. The GCAM results predict that as a percentage of all adult deaths in previous years, Colombia can expect about 13% excess from COVID deaths, whereas Peru can expect 34% excess. Comparisons of GCAM analyses of several other countries with Colombia and Peru demonstrate the extreme variability that characterizes COVID mortality around the world, emphasizing the need for country-specific analyses and ongoing monitoring as more mortality data become available. The coronavirus disease (COVID) pandemic has killed over 800,000 people, mostly in highincome countries, and sickened millions more around the world (Basset et al., 2020; Johns Hopkins University & Medicine [JHU], 2020; Worldometers.info, 2020) . Mortality, even with acknowledged undercounts (Katz et al., 2020; Montagano, 2020 ; Office for National Statistics [ONS] , 2020; The Economist, 2020), is the least biased metric available to monitor the trajectory of the pandemic, to evaluate the effectiveness of interventions, and to estimate healthcare needs. Reliable predictions of the COVID burden, using mortality as an indicator, are critical to plan responses to the current and potential future pandemic waves. Here, we determine the excess mortality from COVID in Colombia and Peru by comparing average weekly mortality counts for recent years with 2020 for the weeks corresponding to the first COVID wave. We complement this analysis of recent past trends with forward projections to November 1, 2020 using the latest iteration of the Global COVID Assessment of Mortality (GCAM) , an open-source statistical model to project COVID mortality trends. In Colombia and Peru, where the epidemic has not yet crested, we compare the excess in projected all-cause mortality by week and we compare these two countries with results from Italy, where mortality from the first viral wave has peaked; from Florida state in the United States and Spain, both of which show incomplete resolution of the first wave before the start of a second (which we define as a "rebound" mortality peak), and from Australia, where the two separate peaks from the first viral wave are seen arising from two different sub-national viral waves. For each setting, GCAM uses non-linear regressions to fit an eight-parameter model to the observed daily mortality counts. We used Bayesian inference (Carpenter et al., 2017) to generate several thousand posterior samples of potential epidemic curves for each country. The model builds on observed daily mortality counts for places that have already reached epidemic peaks (such as Hubei, China, which accounted for about 90% of Chinese deaths, and Italy and Spain). For settings where the epidemic is still growing, future death count predictions are necessarily less certain, and GCAM displays a variety of curves that could occur based on Bayesian prior distributions, which are quantifications of expert knowledge of the underlying process and comparisons with other coronaviruses. A semi-automated GCAM website (www.cghr.org/covid) provides regular updates. GCAM is open, transparent, and uses a reasonably simple method that employs publicly-reported mortality data to make plausible projections. The method is designed to improve as mortality data are added daily. We compared excess mortality during the first 26 weeks of 2020 to average mortality rates in previous years for Colombia and Peru. We defined excess mortality as death from all causes minus average expected mortality, following methods reported by Aron et al. (2020) . We calculated the average expected number of deaths in Colombia and Peru from historical data for weekly counts of death. In Colombia, five years of final mortality data were available (2015 Government of Colombia, 2020) and in Peru, three years of final mortality data (2017-2019) were used. The primary method of comparison is graphical analysis of weekly data for weeks 1 through 26. Additionally, we computed the p-score, representing percentage of excess deaths relative to the expected deaths for each week with complete data in 2020. These analyses will be updated as more recent weeks of mortality data are published. We collected COVID mortality data by age and sex, where reported, from the World Health Organization (WHO), country reports (WHO, 2020a), and from online data collections (Basset et al., 2020; JHU, 2020; Worldometers.info, 2020) . We deemed each setting's epidemic to have begun on the day after the first COVID death was recorded. We forecast deaths until December 1, 2020. For a comparison with current COVID mortality (mainly from acute pneumonia leading to respiratory failure), we examined the WHO mortality database (WHO, 2018) to estimate average baseline pneumonia and influenza deaths during 2015-17. For the US, we used the Centers for Disease Control and Prevention (CDC) mortality database (National Center for Health Statistics, 2020), and for Russia and China, we used the Global Burden of Disease (Global Burden of Disease Collaborative Network, 2018). We obtained age-and sex-specific population denominators from the United Nations Population Division (United Nations, 2019), and for Hubei, China, and for US states, from national vital statistics (National Bureau of Statistics of China, 2018; National Center for Health Statistics, 2020). We calculated age-specific mortality rates at age 20 years or older for pneumonia and influenza (as COVID deaths are rare in children and adolescents), and present totals for both sexes (Garg et al., 2020) . GCAM uses Bayesian non-linear regressions to fit a parametric model to the observed daily mortality counts. The form of the model and the prior distributions (which quantify subjective beliefs about the nature of the epidemic) were specified in late March 2020 and informed by the daily COVID mortality counts in Hubei, China and Italy. The daily intensity of death is assumed to follow a skew-normal density function, which is a typical "bell-shaped" normal distribution with an added term to permit an epidemic to rise quickly and fall slowly, as has already been observed with COVID deaths (Basset et al., 2020; JHU, 2020; Worldometers.info, 2020) and with two earlier coronavirus epidemics of Severe Acute Respiratory Syndrome (SARS; WHO, 2003) and Middle East Respiratory Syndrome (MERS; WHO, 2019). The daily death counts were modelled with a negative binomial distribution, an extension of the more commonly used Poisson distribution but allowing for extra variation via an overdispersion parameter. The eight parameters estimated for each region cover: total expected deaths, adjusting for age, compared to the peak Italian deaths; peak day; epidemic duration; skewness; over-dispersion; and deaths outside the main epidemic. A few settings, most notably Iran and Florida have death counts that have rebounded following a decline from a first peak, and a second skew-normal curve was added to these epidemic trajectories, bringing the total number of parameters to 14 for these regions. We used Bayesian inference (Carpenter et al., 2017) to combine prior distributions and observed data, and generated several thousand posterior samples of potential epidemic curves for each country. Bayesian prior distributions are quantifications of expert knowledge of the underlying process. For settings where the epidemic is still accelerating, death counts are less predictive, since a variety of curves can fit any particular series of increasing counts. In these cases, GCAM relies on Bayesian prior distributions. For each country, we present the actual mortality daily totals, projected daily totals, the date of peak mortality, and duration of mortality in excess of baseline years' pneumonia and influenza totals. We present the most likely pathway and a series of "credible regions" at various probability levels. Credible regions for daily incidence are "central regions" with a specified probability of containing the entire epidemic curve (Myllymäki et al., 2017) , whereas the credible regions for cumulative mortality are made up of conventional pointwise intervals. The Appendix provides full details, including the formula for the model, specifics of the prior distributions for each parameter, and simulated epidemics for a population with Italy's demographic structure. As validation, we fitted the model to training data of deaths on or before April 22 and compared the projections with data observed up to May 27 for 11 countries or US states (data not shown). The source code is available at www.cghr.org/covid. The excess deaths observed in Colombia and Peru are presented in Figures 1A and 1B , respectively. By sex, deaths were higher among males (Figure 2A and 2B) than females ( Figure 2C and 2D) in both countries. In Colombia, the greatest excess mortality occurred during June 15-21 (p-score = 24.8% excess mortality; excess deaths = 1126) and June 22-28 (30.5% excess mortality; excess deaths = 137) (Appendix table 1). These were the most recent data available . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . https://doi.org/10.1101/2020.08.24.20181016 doi: medRxiv preprint as of August 15 th , and indicate that mortality was still rising as of late June. During the initial period of national quarantine (May 24-April 12), all-cause death counts were below that seen in previous years (Appendix Table 1 and Appendix Figure 1 , March 16-April 12). In Peru, the greatest excess mortality occurred during the weeks of May 25-31 (p-score = 200% excess mortality; excess deaths = 4109) and June 1-7 (p-score = 193% excess mortality; excess deaths = 4115) (Appendix Table 2 , Appendix Figure 2 ). Figure 3 shows the GCAM mortality forecasts for daily COVID deaths until December 1 for Colombia, Peru and for comparison countries with more mature epidemics. The dark line is the median model projection and the red-shaded area represents credible intervals, ranging from 50% (darkest) to 98% (lightest). The actual peak dates are shown for the first or rebound peaks (Australia shows two peaks mostly from different viral peaks in different states). For each graph, the horizontal line represents the uncertainty in the peak date. Colombia's mortality peaked on August 12 (range August 7-19). Peru's mortality data show some evidence of a rebound peak, with the first larger peak on May 24 (range May 22-26) and a less certain rebound peak on September 6 (range Aug 18-Sept 30). As comparison, Spain shows a larger first mortality peak on March 29 (range March 28-30) and a smaller, more uncertain peak on August 19 (range August 14-30). Florida shows a larger rebound peak. Italy demonstrates a single peak in late March and Australia shows two distinct peaks. The first represents mostly deaths outside Victoria state and the second peak are deaths within Victoria. Figure 3 and Table 1 show the range of expected cumulative deaths in the comparison countries. The total number of deaths in Colombia is expected to be 24,000-31,000 (median = 27,000). The lowest value would represent an approximately 48% increase in COVID deaths over totals as of August 21, and the maximum would represent a 92% increase (Table 1) . Peru can expect 29,000 to 42,000 deaths (median 33,000) which represents an increase from 7% to 55% above current COVID deaths. We used the average annual deaths in previous years from pneumonia and influenza in both sexes above age 20 years as a crude indicator of historical hospital and ventilator capacity. We compare these totals against the median COVID deaths, and this ratio provides an estimate of the "surge" capacity needed to deal with the excess COVID deaths. This shows that both Colombia and Peru have reasonably high ratios (3.1 and 2.6, respectively) where surge capacity is needed, above the value for Spain (2.4), for example. For all countries, we estimate an average of 124 days (range: 115-148) until December 1 during which COVID deaths will exceed the daily death total of pneumonia in earlier years (Table 2) . For Colombia, this is 155 days (range: 144-171) and for Peru, 188 days (range: 165-224). As a proportion of all annual mortality, the median estimate of excess COVID mortality is 12.9% in Colombia and a very high 34% in Peru. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . https://doi.org/10.1101/2020.08.24.20181016 doi: medRxiv preprint We compared the GCAM model's performance to other mortality models. The sixth column in Table 1 -"precision"-reports the ratios of upper to lower credible intervals, which is our measure of uncertainty. The precision for Colombia and Peru are reasonably robust at 1.3 and 1.4, respectively. GCAM results showed generally consistent results for the selected countries over the six model runs (Appendix Figure 3 ). By contrast, results from the Institute for Health Metrics and Evaluation (IHME) were far more uneven, showing lower estimates than GCAM on April 22 and 29, and higher estimates afterwards. Similarly, the IHME median model results indicate that Brazil may have 125,000 deaths by Aug 4 th and GCAM estimates an upper limit of 70,000. Moreover, the IHME model does not show consistent improvements in precision as would be expected with more death data for these countries and for others (Appendix Figure 4 ). Appendix Figure 5 compares GCAM forecasts up to May 27 to IHME forecasts and those produced by the Los Alamos National Laboratory (LANL) COVID-19 Team (2020) , training the models on data up to 21 April. Both GCAM and the Los Alamos projections contain the observed mortality data within their credible intervals, but IHME does so less consistently. The credible intervals for the LANL are notably wide. Colombia and Peru will have a substantial burden of excess deaths due to COVID compared to the deaths in earlier years. Peru's COVID mortality has already had one peak in late May, and there is the possibility of another peak in early September. Colombia's COVID mortality peaked in mid-August. In Colombia and Peru, the final total of the first viral wave of COVID deaths will continue to rise by at least 48% in Colombia, and at least 7% in Peru, reaching 24,000 and 29,000 respectively. In the worst case, but still plausible, scenario, totals will reach 31,000 in Colombia and 42,000 in Peru. These totals are broadly consistent with the actual excess mortality from all causes observed in Colombia (30% excess for the week of June 22-28) and in Peru (200% excess for week of May 25-31). Hence, as a proportion of overall adult mortality, the annual excess for the median projection for Colombia is about 13% and for Peru it is 34%. In Colombia, the excess all-cause mortality until July 1, 2020 was most notable among males, and at ages 60 or higher (Government of Colombia, 2020). By contrast, in North America and Europe, most of the excess all-cause and COVID mortality has been at ages 70 or higher. These totals are aggregates, and as recently observed other countries, micro-epidemics with different trajectories are likely to occur, as has been observed in sub-regions of Italy and Spain (Basset et al., 2020; JHU, 2020; Worldometers.info, 2020) . Indeed, the two mortality peaks seen in Australia largely represent different epidemics outside and inside the state of Victoria. An acute excess of hospitalizations and deaths, clustered over short time periods, can swamp any health system. Italy faced more pressures in one month on ventilators and critical-care bed supplies than the usual annual burden (Jha et al., 2020) . Estimates for the US suggest that above age 55 years, for every COVID death, ten people will require hospitalization (IHME, . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . https://doi.org/10.1101/2020.08.24.20181016 doi: medRxiv preprint 2020). For Colombia and Peru, GCAM provides additional details of when these deaths are likely to peak and the period over which excess deaths will occur, averaging about 163 days in Colombia and 192 days in Peru, but with a wide plausible range. Such information can be used to aid planning and to help avert overwhelming health systems. Improving the trajectory for COVID mortality in Colombia and Peru depends on the effective and prolonged implementation of non-pharmaceutical interventions such as social distancing and masking. Appendix figures 1 and 2 highlight the COVID excess mortality burden following the implementation of quarantine and suggest that adherence to these measures was suboptimal. Considering the context of COVID transmission and mortality risk within upper middle-income countries such as Peru and Colombia, relative to high-income countries, is also important for understanding the effects of intervention strategies . For example, while populations tend to be younger in middle-income countries than in high-income countries, elderly populations may face higher COVID attack rates due to larger household sizes with closer intergenerational contact (i.e., elderly individuals in contact with a wider range of age groups), and in particular when nursing homes or long-term care facilities become foci of outbreaks . Initial infectious disease models estimated that COVID deaths in the US would reach 1.0-2.2 million (roughly 40%-80% above the usual annual total of 2.8 million deaths) if no preventive action were taken . These same projections suggested 510,000 COVID deaths in the UK, roughly doubling the current annual total of all deaths. To date, the US and UK COVID death totals have been substantially lower, suggesting that these first models were inaccurate (overestimating death rates early on in the epidemic). For similar reasons, these models applied to Colombia and Peru are also unreliable. Notably, a key limitation across countries is the absence of population-representative serological surveys on random samples of populations (Mallapaty, 2020) to help establish the true infection-fatality rate. This underscores the need for improved mortality surveillance systems in Colombia and Peru, with the capacity to report mortality data with minimal time lags and by age and place of death (such as long-term care homes). Colombia and Peru would also benefit by launching COVID antibody sero-surveys in the near future to provide nationally representative estimates of the total number of people with a history of infection in each setting. The availability of serological data can substantially enhance the accuracy of modelling, beyond relying on reported cases alone, and can therefore better inform and monitor intervention strategies to minimize harm caused by COVID (Metcalf et al., 2020) . All models, including GCAM, will produce some forecasts that turn out to be wrong. This is due, in part, to undercounts and delayed reporting of COVID deaths, which are now known to be common and variable (Katz et al., 2020; Montagano, 2020; ONS, 2020; The Economist, 2020) . For example, the undercount in UK COVID deaths might have been about 40% for some weeks . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . https://doi.org/10.1101/2020.08.24.20181016 doi: medRxiv preprint (ONS, 2020). In particular, undercounts in nursing home deaths have emerged as a key issue (Condon et al., 2020) . Systematic examination of undercounts across settings is needed to establish if the undercounts have accelerated during the epidemic, which would mean that we have underestimated the speed of the upward trajectory, or if they are more random, in which case our results would be less affected. Future iterations of GCAM will correct for undercounts in all places where they are documented. Definitions of COVID mortality vary across countries. The WHO has created a new International Classification of Diseases, 10th edition (ICD-10) code, U07.2 (WHO, 2020b), but this has been applied inconsistently across countries (Dagblad van het Noorden, 2020). Differences in reporting across countries do not bias the results for any one country, however, as mortality forecasts rely on actual counts within each country. In settings with few deaths over many days, the model becomes highly sensitive to the assumed prior distributions. As with all models, ours emphasizes the need to obtain additional days of observation and rerun the analyses as data improve, and as additional research results inform characteristics of COVID transmission dynamics and the effect of interventions such as physical distancing, focused lockdowns, and the use of masks (Lipsitch et al., 2020) . In a time of crisis, open data sharing is particularly important, hence the full input details and code of GCAM are freely available (www.cghr.org/covid) and we welcome critiques and suggestions on how to refine our estimates. We provide a simple metric of precision that can be used to gauge improvements over time. Future estimates will adopt these methods to examine Asian, African, and Latin American countries, many with weaker health systems and mortality surveillance capacity (Jha, 2014) . . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . https://doi.org/10.1101/2020.08.24.20181016 doi: medRxiv preprint Conceived the idea and developed the study design: PB, PJ. Data analysis: PB, ZG, LES, LN. Literature review: ZG, LN, PJ. PJ and PB wrote the initial draft, and all authors were involved in commenting on subsequent revisions. PB is the guarantor. 1 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . . It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . Precision ( Table 2 . COVID mortality estimates, total mortality, percent of total deaths, and comparison to past pneumonia deaths in selected countries. Notes: Reported and excess deaths are from the New York Times (Katz et al., 2020) based on media reports and data from national statistical agencies, except for Italy, where they are based on discussions with the Italian Statistical Agency. . It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. (1) Expected weekly deaths estimated using national historic average weekly count of deaths (2015-2018); 2020 death data are preliminary (not yet validated) (2) Excess deaths are the weekly total of deaths from all causes, minus expected deaths (3) COVID-19 deaths include confirmed and suspected deaths (4) P-score represents the percentage of excess deaths relative to expected deaths per week . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . We assumed the mean number of daily deaths for each setting to follow the shape of a skewnormal density function, which resembles a typical "bell-shaped" normal distribution with added skewness to permit an epidemic to rise fast but fall slowly (as has already been observed with COVID-19 deaths, see Basset et al., 2020; Johns Hopkins University, 2020; Worldometers.info, 2020) . Each country's epidemic consists of the sum of two skew normal curves. In some cases (i.e. Australia) the two curves correspond to two separate waves of COVID-19 mortality, whereas in other settings (such as Colombia) the two curves combine to create an epidemic with a single peak. The observed deaths in region on day follows a negative binomial distribution with mean and overdispersion parameter , or more specifically: , , ; , , The are age and sex standardized death counts, obtained from each country's population data and age-specific mortality risk calculated from Italian data on 29 March 2020 (EpiCentro, 2020). The model parameters which determine the shape of the epidemic curve are: and , the expected total deaths for each skew-normal curve relative to the Italian rates on 29 March; , the locations of the peaks of the epidemic; , the durations of the epidemic (the skew-normal scale parameters); , the skewness parameters specifying the steepness of the growth phase of the epidemic relative to the decline phase; and , a 'spark' term allowing for occasional deaths outside the range of the main epidemic. We used the Markov-Chain Monte Carlo algorithm within the Stan software (Carpenter et al., 2017) . From each posterior sample of the model parameters, we computed the intensity and simulated potential future death counts from the negative binomial distribution. We calculated predictions envelopes, which contain the entire true epidemic curve from the samples, with the GET package in R (Myllymäki et al., 2017) . For each country, we present the actual mortality daily totals, projected daily and totals, the date of peak mortality and derive credible intervals (CI), at various probability levels. We calculated age-specific mortality rates at age 20 years or more for influenza and pneumonia, and present totals for both sexes, as COVID-19 deaths are only rarely reported in children and adolescents. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . We chose Bayesian prior distributions which generate simulated epidemic curves consistent with data from Italy and Hubei, China (Basset et. al, 2020; Johns Hopkins University, 2020; Worldometers.info, 2020) , and with SARS and MERS outbreaks (World Health Organization, 2003 , 2019 . Appendix Table 3 shows the prior distributions for each of the model parameters, the densities for several parameters are displayed in Appendix Figure 3 . Figures 3d and 3e show the course of an epidemic for a population of 1 million people with the age distribution of Italy, and the distribution of the final number of total cases appears in 3f. The prior median and 95% prior interval are shown above the horizontal axis of Figure 3f . The duration parameter has a 95% prior interval spanning 10 to 35 days, most of a country's deaths would be expected to occur within a span of four times this parameter. The intensity parameter has a prior which expects the mortality rate in a country with Italy's age structure to be between 50 and 2,000 per million. The number of deaths per day ranges between 10 and 200 per million at its peak. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 26, 2020. . https://doi.org/10.1101/2020.08.24.20181016 doi: medRxiv preprint As a validation exercise, the model was trained on deaths up to a specified date and predicted the number of deaths expected by 4 August 2020. Six sets of training data were used, with the earliest training period ending on 22 April and the latest training period ending on 26 May. The results were compared to similar forecasts made by IHME (Institute for Health Metrics and Evaluation, 2020) in Appendix Figure 4 . As the period of the training data lengthens the uncertainty in the projections decreases, something which is seen most clearly in the predictions for France. Further, while the point estimates change to a certain extent, the prediction intervals are generally overlapping. While the predictions made from the small amount of data available on 22 April are imprecise, the prediction intervals from this date could have been relied upon to encapsulate the eventual predictions made with additional data. The notable exception is Brazil, where predictions on 4 May and earlier are too optimistic. Appendix Figure 5 compares GCAM forecasts up to May 27 to IHME forecasts and those produced by the Los Alamos National Labs (LANL, 2020), training the models on data up to 21 April. The prediction intervals are 95% pointwise envelopes for IHME and Los Alamos, and an 80% global envelope for the expected number of cases for GCAM. Both GCAM and the Los Alamos projections contain the observed mortality data within their credible intervals, but IHME does so less consistently. The credible intervals for the LANL model are notably wide. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted August 26, 2020. . https://doi.org/10.1101/2020.08.24.20181016 doi: medRxiv preprint Forecasted deaths and 95 percent prediction interval for deaths up to August 4 from the GCAM model and from IHME. Six sets of training data were used, with the earliest set of training data consisting of deaths up to April 22 and the latest training data including deaths up to May 26. The "X" represents actual death totals as of May 25. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted August 26, 2020. Three models are reported: the GCAM model from this paper; the forecasts produced by IHME; and the forecasts from the Los Alamos National Laboratory. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted August 26, 2020 . . https://doi.org/10.1101 A pandemic primer on excess mortality statistics and their comparability across countries Stan: A probabilistic programming language 11,000 deaths: Ravaged nursing homes plead for more testing CBS: oversterfte van 1200 in laatste week maart. Dagblad van het Noorden Impact of nonpharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand Hospitalization rates and characteristics of patients hospitalized with laboratory-confirmed coronavirus disease 2019 -COVID-NET, 14 States of Colombia, Ministry of Health and Social Protection. (2020). Colombia mortality COVID-19 projections Reliable direct measurement of causes of death in low-and middle-income countries Reliable quantification of COVID-19 mortality worldwide Mortality analyses -Johns Hopkins Coronavirus Resource Center U.S. coronavirus death toll is far higher than reported, C.D.C. data suggests. The New York Times Defining the epidemiology of Covid-19 -Studies needed COVID-19 confirmed and forecasted case data Antibody tests suggest that coronavirus infections vastly exceed official counts Mathematical models to guide pandemic response COVID-19: excess mortality figures in Italy. Towards Data Science Global envelope tests for spatial processes National data: Annual by province Deaths registered weekly in England and Wales, provisional: week ending 10 Retrieved from https://covid-19.bsvgateway.org/ The Economist The impact of COVID-19 and strategies for mitigation and suppression in low-and middle-income countries Probable cases of SARS by date of onset, worldwide We thank Richard Wen for assistance with the website for data. The funders had no role in the study design, conduct, and reporting. Not needed as all data are publicly available. No individual patient data were used. We declare no competing interests Notes: (1) Expected weekly deaths estimated using national historic average weekly count of deaths (2017-2019) (2) Excess deaths are the weekly total of deaths from all causes, minus expected deaths (3) COVID-19 deaths include all cases reported by Peru Ministry of Health (MINSA) (4) P-score represents the percentage of excess deaths relative to expected deaths for each week with complete data in 2020.