key: cord-0279810-up4mb9nk authors: Van Gordon, M. M.; Mwananyanda, L.; Gill, C. J.; McCarthy, K. A. title: Regional comparisons of COVID reporting rates, burden, and mortality age-structure using auxiliary data sources date: 2021-08-21 journal: nan DOI: 10.1101/2021.08.18.21262248 sha: ed1fe25bf11637345a38c843e00ead7c1d746960 doc_id: 279810 cord_uid: up4mb9nk We correct common assumptions about COVID burden and disease characteristics in high-income (HIC) versus low- and middle-income (LMIC) countries by augmenting widely-used surveillance data with auxiliary data sources. We constructed an empirically-based model of serological detection rates to quantify COVID reporting rates in national and sub-national locations. From those reporting rates, we estimated relative COVID burden, finding results that contrast with estimates based on case counts and modeling. To investigate COVID mortality by age in an LMIC context, we utilized a unique morgue study of COVID in Lusaka alongside the population attributable fraction method to account for HIV comorbidity. We calculated the comorbidity-corrected age-adjusted mortality curve in Lusaka and found it significantly skewed toward younger age groups as compared to HICs. This unexpected result recommends against the unexamined use of HIC-derived parameterizations of COVID characteristics in LMIC settings, and challenges the hypothesis of an age-structure protective factor for COVID burden in Africa. Indeed, we found overall COVID burden to be higher in Lusaka than in HICs. Concurrent with high COVID burden, many LMICs have high prevalence of other public health issues such as HIV, which compete for limited health investment resources. Given differences in age-structure, comorbidities, and healthcare delivery costs, we provide a case study comparing the cost efficacy of investment in COVID versus HIV and found that even in a high HIV prevalence setting, investment in COVID remains cost-effective. 
As a whole, these analyses have broad implications for interpretations of COVID burden, modeling applications, and policy decision-making.

Accounting for differences in surveillance and incorporating auxiliary data sources can help fill these data gaps and inform our understanding of COVID across contexts.

For example, official statistics on regional COVID burden are based on reported case counts (4), despite evidence of substantial case underreporting, particularly in LMIC contexts (5, 6). Such differences in reporting rates can significantly alter estimates of relative disease burden across regions (7). COVID reporting rates are difficult to determine, but incorporating serological data can inform reporting rate estimates. While serology is challenging to work with, it offers some of the best information in data-sparse settings if the limitations of serological data are accounted for (8, 9).

Surveillance and reporting influence our understanding of COVID dynamics in other ways as well. For example, while there is strong evidence that COVID parameters such as infection fatality rate (IFR) vary even within HICs, data challenges in LMICs mean that HIC estimates are often used in LMIC settings (10). This practice has major implications for estimates of COVID burden and risk factors, and subsequently for policies and public health practices targeting COVID. The common understanding of IFRs and the age-structure of COVID mortality has led to hypotheses that Africa's young population distributions have a protective effect against COVID (11), yet questions remain about impacts of comorbidity distributions, differences in disease characteristics across settings, and the role of COVID interventions in the context of other public health concerns.

We address these questions and assumptions through auxiliary data sources.
Using serological modeling, we calculate reporting rates for different national and sub-national locations in HICs and LMICs. We then use that data in a reporting rate model to adjust national burden estimates accounting for differences in surveillance. Taking a closer look at a local

The analyses presented here demonstrate the power of auxiliary COVID data sources to fill information gaps, particularly for LMICs. Our results reveal differences in COVID surveillance and disease dynamics between HICs and LMICs that challenge common perceptions and assumptions about COVID in these respective contexts. We show the divergence of COVID reporting rates between HICs and LMICs and the effects on relative estimated burden. Contradicting common modeling practices, our analysis demonstrates that the age-structure of COVID mortality cannot be accurately generalized from HICs to LMICs. We find higher COVID burden in LMIC contexts than HICs, particularly in younger age groups, and show that investment in COVID is cost-effective even in light of other public health concerns.

In this article, we first present our results on reporting rates across locations and relative regional COVID burden informed by serology. Next, we use auxiliary data from Zambia to examine mortality age-structure in an LMIC-specific location,

the relationship between reporting rate and testing rate, for which we have continuous time series data. We then used the testing rate time series to model dynamic reporting rates and unify the dates of estimated infections.

Figure 2 shows a ranking of COVID burden across locations on November 12th, 2020, the most recent date with continuous available testing data for all locations. The top plot shows the ranking based on reported cases; the bottom plot shows the ranking of estimated infections calculated using reporting rates.
In the reported case burden ranking, the EURO region is high relative to other regions; however, in the estimated infections ranking, EURO countries have low burden relative to other regions.

We note that increased estimated cases relative to reported cases occur mainly, but not exclusively, in lower resource country settings. As with the analysis presented in Figure 1, these data support the concern that underreporting, perhaps as a consequence of resource limitations, may lead to a significant under-counting of cases in certain parts of the world.

The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. This version posted August 21, 2021.

of YLL per capita in the United States is dominated by the COVID mortality reduction axis. In Zambia, HIV as the number one cause of death might be expected to dominate the gradient of YLL per capita, but COVID-attributed YLL per capita is nearly 70% that of HIV. This skews the gradient direction of total combined YLL per capita to a roughly diagonal orientation, from lower left to upper right.

The overlay of the cost/percent mortality reduction isotropic lines and the YLL per capita heat map indicates the cost efficacy of investment in HIV vs. investment in COVID. In both the United States and Zambia on a constant per capita budget, maximizing investment in COVID mortality reduction relative to HIV investment minimizes YLL per capita, the desired outcome. Note that there are many components to decision-making about public health investment, and we do not claim that our model results establish exact cost for mortality reduction or that COVID investment is definitively indicated.
Rather, this analysis serves to unseat any a priori assumptions that COVID burden is insignificant in settings with other high-burden public health issues. COVID investment should instead be considered as a possible avenue for cost-effective reduction in total disease burden.

Using surveillance data auxiliary to reported COVID cases and deaths, we demonstrate that common assumptions about regional COVID burden must be reconsidered. We calculate reporting rates across locations and show the impact of regional differences on perceived COVID burden. Further, contrary to impressions derived from case counts, we establish higher burden of COVID in the African context as compared to HICs, particularly in younger age groups. This challenges predominant assumptions about the age structure of mortality rates and the protective effects of younger populations in Africa (3, 8, 13).

Combining seroprevalence data with seroconversion and reversion modeling, we calculate reporting rates across WHO regions and present estimated infections across locations. Our analysis shows high reporting rates in EURO and AMRO relative to other regions, a data-based result consistent with anecdotal understanding (1, 7). Heterogeneity of reporting rates at the sub-national level adds to the common understanding of geographic heterogeneity of COVID prevalence (14, 15). By identifying a relationship between reporting rate and testing rate, we unify the date of COVID prevalence estimates and rank countries according to relative estimated prevalence. The relatively low burden in EURO countries contradicts geographic burden distributions based on reported cases as well as modeling estimates (13, 16).
Randomly sampled morgue-based COVID testing data provides the opportunity to evaluate mortality dynamics in an African context, without the challenges associated with reporting systems. The age structure of mortality in the African setting is significantly different from HIC settings, with unexpectedly higher burden in younger age groups. In addition, overall mortality burden in the African setting outstrips that in HICs. This poses serious risks for LMICs where age distributions are skewed younger, directly contradicting the age-protection hypothesis. We use the mortality data alongside a simple cost model to show that even in a context with substantial other public health concerns such as HIV, COVID may be a cost-effective investment for disease burden reduction.

Figure caption (partial): Lower right: Age-adjusted mortality for Lusaka and HICs by age bin. Lower left: Exponential rate of increase of age-adjusted mortality with age across locations. All mortality data is from early August, 2020. COVID mortality in Lusaka is adjusted to exclude HIV-attributable comorbid deaths. Shading and error bars represent 95% confidence intervals.

Our study is subject to a number of limitations, particularly as we grapple with understanding COVID dynamics in LMIC contexts. LMICs are subject to limited data availability and substantial uncertainty, which we address in part by making use of sub-national data sources including serology and COVID testing from morgue sampling. Challenges when working with serology include inconsistencies in testing protocols and sampling frameworks alongside the impacts of seroconversion and reversion on results. To address these hurdles, we focus on serostudies that do not target particular populations, and adjust estimates for seroconversion and reversion.

Necessitated by data and uncertainty limitations, some of the models we present rely on broad approximations.
Modeling reporting rate as a function of testing rate, for example, is an approximation made to include countries where more detailed auxiliary data are not available. We do not attempt to estimate the magnitude of COVID burden in different locations, only their relative ranking. Finally, cost modeling is presented as a ballpark framework to evaluate COVID in the context of other public health concerns, rather than a comprehensive costing model. We use HIV as an example to compare with COVID, recognizing that there are other sources of burden and other approaches to public health investment than single disease-focused strategies. We do not attempt to model the complexities of mortality reduction dynamics, rather seeking to demonstrate that COVID should be considered as part of a public health investment portfolio.

We tested two models for calculating this case adjustment, each with different data requirements. The more data-intensive model requires continuous time

Because the empirical models from the literature include only about four months of data, we used the last month of the combined empirical model to extrapolate out to a year using a log-linear regression. The full Ppos model is shown in Figure S2.

The calculation for a point estimate of reporting rate using this model of seropositivity is then as follows, where T_k is the date the serostudy k was conducted, R is reporting rate, c is daily reported cases, and S is seropositivity rate:

While continuous daily case reporting data largely exists for HICs, LMICs do not necessarily report data at a daily frequency, particularly for sub-national locations. In order to be able to include more LMIC serology studies, we also developed a less data-intensive model for the case count adjustment to approximate seroconversion and reversion dynamics.
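For the more data-intensive, probability-based model, the point estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the reporting rate is the expected number of reported cases still seropositive at T_k divided by the estimated number of seropositive people (S times population size). The `population` argument, the function name, and the layout of `p_pos` (indexed by days elapsed before the serostudy) are hypothetical; the Ppos curve itself must be supplied by the user and is not reproduced here.

```python
def reporting_rate_weighted(daily_cases, p_pos, seropositivity, population):
    """Point estimate of reporting rate R at the serostudy date T_k.

    daily_cases    -- reported cases c(t) per day, oldest first, ending at T_k
    p_pos          -- p_pos[d] is the assumed probability that a case reported
                      d days before T_k tests seropositive at T_k
    seropositivity -- S, as a fraction between 0 and 1
    population     -- population size of the study location (an assumption;
                      the recovered text does not define it)
    """
    if len(daily_cases) != len(p_pos):
        raise ValueError("daily_cases and p_pos must have the same length")
    n = len(daily_cases)
    # Entry i of daily_cases lies (n - 1 - i) days before the serostudy date.
    expected_seropositive_reported = sum(
        cases * p_pos[n - 1 - i] for i, cases in enumerate(daily_cases)
    )
    estimated_seropositive_people = seropositivity * population
    return expected_seropositive_reported / estimated_seropositive_people
```

With toy inputs (not data from the study), `reporting_rate_weighted([10, 20, 30], [0.0, 0.5, 1.0], 0.1, 1000)` weights the oldest day fully and the most recent day not at all, giving 20 expected seropositive reported cases against 100 estimated seropositive people.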
In the less data-intensive model, only two data points for cumulative cases are required: cumulative cases at 21 days and 60 days before the serostudy. These time delays were selected heuristically to account for seroconversion and reversion, respectively. Only cases reported within these time bounds are then used for the reporting estimation. Where C is cumulative cases:

We compared results from the two different models for adjusting cases and estimating reporting rate. We found the less data-intensive model to be a good approximation of the more complete probability-based model; see Figure S3. We used the second, less data-intensive model for Figure 1 to include sub-national locations, and the probability-based model for Figure 2.

Dynamic reporting rate modeling. With reporting rate estimates based on serostudies each conducted at a different time T_k, it remains to unify dates of estimated infection rates for comparison across locations. To allow reporting rates to vary over time, we constructed a hybrid reporting rate model based on the reporting rate at the time of a serostudy and the log-log relationship between testing rate and reporting rate in the serostudy locations.

As the serostudies we used are relatively early in the pandemic, we approximated reporting rate up until the time of a serostudy as the reporting rate at the time of the serostudy. This offsets under-estimation of reporting rates early in the pandemic, when testing was largely limited to symptomatic individuals and testing rates were low. For dates after the serostudy, we allowed the reporting rate to vary with testing rate. The parameters of the relationship between these two variables were established by a log-log regression, illustrated in Figure
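The two-point case adjustment and the hybrid dynamic reporting rate described above can be sketched as follows. This is an illustrative reading rather than the authors' code. It assumes that the two-point estimate divides the cases reported between 60 and 21 days before the serostudy by S times population size, and that after the serostudy the reporting rate scales with testing rate through the fitted log-log slope, anchored at the serostudy value. The function names, the `population` argument, the anchoring form, and the cap at 1.0 are all hypothetical choices.

```python
import math


def reporting_rate_two_point(cum_cases_21d, cum_cases_60d, seropositivity, population):
    """Reporting rate from cumulative cases C at 21 and 60 days before the serostudy."""
    cases_in_window = cum_cases_21d - cum_cases_60d
    return cases_in_window / (seropositivity * population)


def fit_loglog(testing_rates, reporting_rates):
    """Ordinary least squares fit of log(reporting rate) on log(testing rate).

    Returns (slope, intercept) estimated across serostudy locations.
    """
    xs = [math.log(t) for t in testing_rates]
    ys = [math.log(r) for r in reporting_rates]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dynamic_reporting_rate(testing_rate_series, serostudy_index, rate_at_serostudy, slope):
    """Hybrid reporting rate over time.

    Up to and including the serostudy date the rate is held constant. Afterwards
    it scales with testing rate using the log-log slope (assumed anchoring) and
    is capped at 1.0 (an added assumption).
    """
    tau_k = testing_rate_series[serostudy_index]
    rates = []
    for i, tau in enumerate(testing_rate_series):
        if i <= serostudy_index:
            rates.append(rate_at_serostudy)
        else:
            rates.append(min(1.0, rate_at_serostudy * (tau / tau_k) ** slope))
    return rates
```

Under these assumptions, estimated infections on a common date would follow by dividing reported cases by the reporting rate for that date, which is what allows locations with serostudies from different dates to be ranked together.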
medRxiv preprint (which was not certified by peer review), this version posted August 21, 2021. doi: https://doi.org/10.1101/2021.08.18.21262248

SARS-CoV-2 diagnostic testing in Africa: Needs and challenges
SARS-CoV-2 seroprevalence worldwide: A systematic review and meta-analysis
Variation in SARS-CoV-2 outbreaks across sub-Saharan Africa
Epidemiologic surveillance for controlling Covid-19 pandemic: Types, challenges and implications
Covid-19 deaths in Africa: Prospective systematic postmortem surveillance study
COVID-19: Are Africa's diagnostic challenges blunting response effectiveness?
The puzzle of the COVID-19 pandemic in Africa
Challenges in interpreting SARS-CoV-2 serological results in African countries
Assessing the age specificity of infection fatality rates for atic review, meta-analysis, and public policy implications
COVID-19 pandemic: The African paradox
WHO - Global Health Observatory, Global health estimates: Leading causes of death
MRC Centre for Global Infectious Disease Analysis
Prevalence of SARS-CoV-2 in six districts in Zambia in July, 2020: A cross-sectional cluster sample survey
WC Government, Western Cape Covid-19 Dashboard | Covid-19 Response
Coronavirus Pandemic (COVID-19), (Our World in Data)
Prevalence of SARS-CoV-2 infection in India: Findings from the national
Sero-prevalence of anti-SARS-CoV-2 Antibodies in Addis Ababa
High SARS-CoV-2 IgG/IGM seroprevalence in asymptomatic Congolese in Brazzaville, the Republic of Congo
Seroprevalence of COVID-19 in Niger State
Seroprevalence of anti-SARS-CoV-2 IgG antibodies in Kenyan blood donors
COVID-19 Pandemic Situation Update as at 1/05/2020 - Kenya
Kenya Ministry of Health, COVID-19 Outbreak in Kenya
SITREP COVID-19 N°111 du 16 Novembre 2020
Nigeria Centre for Disease Control, Confirmed COVID19 Cases - Nigeria
Demographic Statistics Bulletin
Provinces at a Glance: Community Survey
Feasibility of using a World Health Organization-standard methodology for Sample Vital Registration with Verbal Autopsy (SAVVY) to report leading causes of death Results of a pilot in four provinces
Ministry of Health, Zambia, Zambia Population-based HIV Impact Assessment (ZAMPHIA) 2016: Final Report, (Lusaka, Ministry of Health)