key: cord-0631565-tox1bfxd authors: Cheng, Cheng; Zhou, Helen; Weiss, Jeremy C.; Lipton, Zachary C. title: Unpacking the Drop in COVID-19 Case Fatality Rates: A Study of National and Florida Line-Level Data date: 2020-12-09 journal: nan DOI: nan sha: ee6a6d255862cdb782ca5ddef2c9cbf158725d72 doc_id: 631565 cord_uid: tox1bfxd Since the COVID-19 pandemic first reached the United States, the case fatality rate has fallen precipitously. Several possible explanations have been floated, including greater detection of mild cases due to expanded testing, shifts in age distribution among the infected, lags between confirmed cases and reported deaths, improvements in treatment, mutations in the virus, and decreased viral load as a result of mask-wearing. Using both Florida line-level data and recently released (but incomplete) national line level data from April 1, 2020 to November 1, 2020 on cases, hospitalizations, and deaths--each stratified by age--we unpack the drop in case fatality rate (CFR). Under the hypothesis that improvements in treatment efficacy should correspond to decreases in hospitalization fatality rate (HFR), we find that improvements in the national data do not always match the story told by Florida data. In the national data, treatment improvements between the first wave and the second wave appear substantial, but modest when compared to the drop in aggregate CFR. By contrast, possibly due to constrained resources in a much larger second peak, Florida data suggests comparatively little difference between the first and second wave, with HFR slightly increasing in every age group. However, by November 1st, both Florida and national data suggest significant decreases in age-stratified HFR since April 1st. By accounting for several confounding factors, our analysis shows how age-stratified HFR can provide a more realistic picture of treatment improvements than CFR. 
One key limitation of our analysis is that the national line-level data remains incomplete and plagued by artifacts. Our analysis highlights the crucial role that this data can play but also the pressing need for public, complete, and high-quality age-stratified line-level data on cases, hospitalizations, and deaths for all states.
Figure 1: From left to right: confirmed cases, deaths, and case fatality rate, calculated using 7-day trailing averages based on national reporting data available via USAFacts [USAFacts, 2020] and pulled from the Carnegie Mellon Delphi project's COVIDcast API [Project, 2020]. Data outside the April 1st to November 1st time range considered in this study is grayed out.
Over the past year, the coronavirus disease (COVID-19) pandemic has continually evolved, with disease outbreaks expanding and contracting (and expanding yet again); lockdown measures tightening and loosening; testing capacity increasing (and occasionally decreasing, due to production shortages [Wu, 2020]); and treatment protocols evolving. With lives and livelihoods in the balance, public officials, clinicians, and business leaders have tried to maintain a grasp of the rapidly unfolding situation, looking to the publicly reported aggregate data to ask the key questions that could guide policy decisions: Have new treatment protocols improved outcomes over time? Is COVID-19 fatality decreasing overall? How does the infection rate today compare to that at previous dates when alternative lockdown protocols were in place? To what extent are the rising confirmed case numbers seen during the second peak attributable to increased infections versus higher detection rates due to increased testing capacity? Depending on the answers to these questions, local officials might consider changing lockdown measures, hospitals may need to allocate more resources, and business leaders might decide to adjust corporate policies and services.
Consider the two most widely reported statistics, confirmed cases and confirmed deaths. At first glance, the two appear to tell radically different stories about the trajectory of the pandemic over time. Confirmed cases appear to have first peaked in early April with a 7-day average of nearly 32,000 daily cases, followed by a much larger second wave with 7-day averages peaking at nearly 67,000 daily cases (Figure 1, left panel). However, reported deaths appear to tell a contradictory story concerning the relative severity of the two waves (Figure 1, middle panel). Overall the case fatality rate, calculated by dividing 7-day averaged COVID-19 confirmed deaths by confirmed cases on each day, appears to have fallen dramatically between the first wave and second wave, from nearly 7.9% at the height of the first wave in mid-April to the 1%-2.5% range in mid-July, where it has been comparatively stable for the last two months (Figure 1, right panel).
In this paper, we set out to answer the question that immediately follows: what explains the movement (and apparent overall decline) in case fatality rate over the course of the COVID-19 pandemic? There are several plausible explanations, each with significant policy implications for stakeholders. So far, the public discourse appears to center on the following hypotheses: (1) The age distribution among infected patients has shifted, altering the fatality rate significantly due to the comparatively higher risk among geriatric populations [Thompson, 2020, Whet, 2020, Horwitz et al., 2020]. (2) Testing capacity has gone up significantly, with case fatality rate driven down primarily due to the rising number of tests performed [Fan et al., 2020, Madrigal and Moser, 2020]. (3) Apparent shifts in case fatality rate are artifacts due to the delay between detection and fatality [Thompson, 2020, Madrigal, 2020].
(4) Treatment has improved as doctors grow more experienced and new therapeutics become available, lowering the fatality rate over time [Levy, 2020, Horwitz et al., 2020, Beigel et al., 2020, Group, 2020, Self et al., 2020]. (5) The disease itself is mutating, leading to changes in the actual infection fatality rate over time [Pachetti et al., 2020, Fan et al., 2020]. (6) Social distancing precautions have reduced the viral load that individuals are exposed to, resulting in less severe infections [El Zein, Pachetti et al., 2020, Piubelli et al., 2020].
Note that when the age distribution shifts to a younger demographic (H1), the dropping aggregate case fatality rate can be misleading due to differences in fatality between different age groups. Additionally, the next two phenomena, testing ramp-up (H2) and delays between detection and fatality (H3), can cause the behavior of case fatality rate to diverge substantially from the behavior of the true infection fatality rate. Thus, if policy-makers use the aggregate case fatality rate to represent the severity of COVID-19, this could result in misinformed policy decisions. On the other hand, the last three phenomena (improved treatments, disease mutation, and changing viral load) impact both the case fatality rate and the actual infection fatality rate, and could be reasonable grounds for policy considerations such as re-opening. In this paper, we demonstrate how, given accurate, sufficiently granular data, the first three phenomena ("artifacts") can be accounted for when attempting to quantify improvements in treatment (H4). We note, however, that without additional data, H5 and H6 cannot be decisively separated out from H4. In particular, we argue that complete and accurate age-stratified hospitalization data are pivotal for distinguishing true improvements from artifacts. Hospitalizations should be less influenced by testing capacity than confirmed cases, and less influenced by treatment efficacy than deaths.
Additionally, testing among the inpatient population has been relatively thorough (compared to the general population) throughout the course of the pandemic. While there may have been changes in admitting criteria at the very worst moments [Phua et al., 2020, Cohen et al., 2020], for example, when New York hospital demand exceeded capacity in late March, for the most part the criteria for inpatient hospitalization have been relatively consistent across time periods. Regrettably, however, reliable line-level age-stratified hospitalization data is not yet publicly available for most states [Ornstein, 2020, Murray, 2020], leaving central questions unresolved.
We center our analysis on (1) Florida line-level data released by the Florida Department of Health (FDOH) and (2) national line-level data released by CDC. Both FDOH and CDC have demonstrated an uncommon openness by sharing line-level data, including date of detection, demographics including age and gender, and indicators of eventual hospitalization and death. The line-level nature of the data allows us to perform a cohort-based analysis, generating descriptive statistics comparing case confirmations, hospitalizations, and deaths among cohorts of patients defined by the date on which their infection was detected. By contrast, reported case fatality rates are typically not cohort-based: the patients whose deaths are reported in the numerator are not in general the same patients whose confirmed infections show up in the denominator. Because case confirmation tends to precede reported deaths, these signals tend to be misaligned and are subject to fluctuation, even if the actual case fatality rate were fixed (so long as incidence does change). Line-level data enable us to circumvent this problem. Moreover, the availability of age and gender data allows us both to track demographic shifts over time and to perform age-stratified analyses of fatality rates.
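To make the misalignment argument concrete, the following is a minimal simulation sketch, not part of our analysis pipeline. It assumes a constant true cohort fatality rate, a single fixed detection-to-death delay, and a made-up incidence curve; the names `TRUE_CFR`, `LAG_DAYS`, `naive_cfr`, and `cohort_cfr` are illustrative only.

```python
# Illustrative simulation; every number here is made up, none comes from FDOH or CDC data.
TRUE_CFR = 0.05   # assumed constant fraction of each day's cases that eventually die
LAG_DAYS = 14     # assumed fixed delay between case confirmation and reported death

# Daily confirmed cases: flat at 1000 for 30 days, then doubling every 10 days.
cases = [1000.0] * 30 + [1000.0 * 2 ** (d / 10) for d in range(1, 31)]

# Deaths reported on day t come from the cohort confirmed on day t - LAG_DAYS.
deaths = [TRUE_CFR * cases[t - LAG_DAYS] if t >= LAG_DAYS else 0.0
          for t in range(len(cases))]

# Eventual deaths attributed back to the day on which each case was confirmed.
eventual_deaths = [TRUE_CFR * c for c in cases]


def naive_cfr(t):
    """Same-day ratio: deaths reported on day t over cases confirmed on day t."""
    return deaths[t] / cases[t]


def cohort_cfr(t):
    """Cohort ratio: eventual deaths among day-t cases over day-t cases."""
    return eventual_deaths[t] / cases[t]


flat_day, growth_day = 29, 59  # last day of flat incidence, last day of growth
print(f"naive  CFR: {naive_cfr(flat_day):.3f} -> {naive_cfr(growth_day):.3f}")
print(f"cohort CFR: {cohort_cfr(flat_day):.3f} -> {cohort_cfr(growth_day):.3f}")
```

Under these assumptions the same-day ratio falls from 5% to below 2% during the growth phase even though nothing about fatality has changed, while the cohort-based ratio stays at 5%.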
This analysis yields several important observations: (i) testing increased between the first and second waves, but does not explain away these waves; (ii) since age distributions shifted substantially between the first and second waves, age must be accounted for in order to separate out the effects of treatment from age shift; (iii) age-stratified hospitalization fatality rates improved substantially between the first and second wave in the national data (with HFR decreasing by as little as 24% in the 80+ age group and as much as 50% in the 30-39 age group), but by contrast were relatively unchanged between the first and second wave in Florida (with a slight increase in HFR by as little as 2% in the 80+ age group and as much as 12% in the 60-69 age group); (iv) nationally, the relative drop in HFR appears to be negatively associated with age (improvements in fatality rate have disproportionately benefited the young); (v) by November 1st, both Florida and national data suggest significant decreases in HFR since April 1st, of at least 42% in Florida and at least 45% nationally in every age group; and (vi) comprehensive age-stratified hospitalization data is of central importance to providing situational awareness during the COVID-19 pandemic, and its lack of availability among public sources for most states (and the extreme incompleteness of national data) constitutes a major obstacle to tracking and planning efforts.
Throughout the course of the pandemic, several treatments have been proposed, with randomized controlled trials designed to test their effectiveness. Dexamethasone, a corticosteroid commonly prescribed for other indications, resulted in lower 28-day mortality among patients hospitalized with COVID-19 and receiving respiratory support [Group, 2020].
Among adults hospitalized with COVID-19 who had evidence of lower respiratory tract infection, the broad-spectrum antiviral remdesivir was associated with shortened time to recovery [Beigel et al., 2020, Madsen et al., 2020]. Clinical trials for hydroxychloroquine [Self et al., 2020, Horby et al., 2020] and convalescent plasma [Agarwal et al., 2020] found no positive results in prevention of further disease progression or mortality. In November (outside our study time period), the monoclonal antibody treatment bamlanivimab and the combination therapy casirivimab and imdevimab were granted emergency use authorization [U.S. Food and Drug Administration, 2020a,b]. Unlike dexamethasone and remdesivir, these therapies are not recommended for hospitalized patients [Dyer, 2020], but instead have been shown to have the greatest benefit in unadmitted COVID-19 patients likely to progress to severe COVID-19 (for bamlanivimab) [Chen et al., 2020], and in patients who have not yet mounted their own immune response or who have high viral load (for casirivimab and imdevimab) [Regeneron, 2020]. While these clinical trials have evaluated the effects of specific treatments in their identified target populations, we are interested in the broader impacts of treatment improvements over time as they have been used in practice at a larger scale. One way to get a holistic sense of improvements over time is by examining fatality rates. In a study of 53 countries, all but ten were found to have lower case fatality rates in the second wave compared to the first [Fan et al., 2020]. In a study conducted among patients in England admitted to critical care between March 1st and May 30th, it was found that after adjusting for age, sex, ethnicity, comorbidities, and geographic region, mortality risk in mid-April and May was markedly lower compared to earlier in the pandemic. Among hospitalized COVID-19 patients in a single health system in New York City, Horwitz et al.
[2020] demonstrate that after adjusting for age, sex, ethnicity, and several clinical factors, mortality between March 1st and June 20th decreased, but not as much as observed before adjusting for these factors. We note that while these studies provide thorough estimates of mortality for their respective regions and for their specific time periods, we analyze data that purportedly captures all of Florida and most of the United States, and employ a method which allows us to estimate the trend between any pair of dates without re-fitting. Our data does not contain sufficient information about viral mutation patterns or about social distancing resulting in reduced viral loads (H5 and H6), but a limited body of work has investigated their impact. Among seven different countries, Pachetti et al. [2020] quantify the drop in mortality, and find a correlation between declining CFR and employing strict lockdown policies as well as widespread PCR testing. At a hospital system in northern Italy, Piubelli et al. [2020] found that among patients diagnosed with COVID-19 in their emergency room, the proportion of patients requiring intensive care decreased over time, along with decreasing median values of viral load. Our analysis does not attempt to separate out the effects of viral mutations (H5) and changing viral loads (H6), but we note that these are all factors that can affect change in the true infection fatality rate, and therefore can be reflected in our estimates as well.
The two main data sources that we use are the Florida COVID-19 Case Line data [Florida Department of Health, 2020] released by the Florida Department of Health (FDOH), and the national COVID-19 Case Surveillance Data [CDC Case Surveillance Task Force, 2020] released by CDC.
Cohort Selection. All cases from Florida are COVID-19 cases confirmed with a PCR-positive lab result. In order to conduct comparable analyses, we also filter the national data to cases that are confirmed with a positive PCR result.
We filter for only the cases identified between March 26th, 2020 and November 1st, 2020 to ensure that at the time of our analysis, each patient has had at least 30 days to have their hospitalization or death recorded from their initial case confirmation date (Florida) or CDC report date (national). Since the CDC data is released on a monthly basis, this is the widest time range of data available at the time of our analysis. For the national CDC data, three states (NJ, IL, and CT) appear to have all of their cases reported on one or two dates (Appendix Figure 10). Therefore, we removed them from our analysis.
Pre-processing. Each case is labeled with whether the patient was eventually hospitalized or deceased. These labels take on four categories: "yes", "no", "unknown" and "missing" (Table 1). In both the Florida and national data, unknown corresponds to checking an "unknown" box, whereas missing corresponds to leaving the question empty. Note that 42.5% of the Florida cases have unknown hospitalization data, and 46.3% of the national cases have unknown or missing hospitalization data. For our analyses, we coded the "unknown" and "missing" categories as "no".
Demographics. The demographics of the Florida and national cohorts can be found in Table 2.
To supplement our analysis, we use two secondary data sources: (1) COVID-19 testing data from the COVID Tracking Project [Meyer and Madrigal, 2020]; and (2) COVID-19 confirmed cases and deaths from USAFacts [USAFacts, 2020], pulled from the Carnegie Mellon Delphi project's COVIDcast API [Project, 2020]. In contrast to our line-level data, these two data sources provide incidence data (e.g. new deaths that day). We visualize this data across a larger time period between March 9th and December 1st, but gray out the period outside of our study's time range (between April 1st and November 1st).
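The label recoding described in the pre-processing step can be sketched as follows. This is a minimal illustration on hypothetical records; the field names `hosp` and `died` are assumptions and do not correspond to the actual FDOH or CDC column names.

```python
from collections import Counter

# Hypothetical line-level records; field names are illustrative, not the FDOH/CDC schema.
records = [
    {"hosp": "yes", "died": "yes"},
    {"hosp": "yes", "died": "no"},
    {"hosp": "unknown", "died": "no"},
    {"hosp": None, "died": "missing"},   # an empty field is treated as "missing"
    {"hosp": "no", "died": "unknown"},
]


def to_binary(label):
    """Code "yes" as True and everything else ("no", "unknown", "missing", empty) as False."""
    return isinstance(label, str) and label.strip().lower() == "yes"


def label_counts(rows, field):
    """Tabulate the raw label categories before recoding, counting empty values as "missing"."""
    return Counter((r[field] or "missing").strip().lower() for r in rows)


# Keep the raw tabulation so the share of unknown/missing labels can still be reported.
raw_hosp = label_counts(records, "hosp")
coded = [{"hosp": to_binary(r["hosp"]), "died": to_binary(r["died"])} for r in records]
```

Tabulating the raw categories before recoding keeps the extent of unknown and missing labels visible, which matters because coding them as "no" can only lower the recorded hospitalization and death counts.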
To view the progression of the pandemic over time with reduced noise, for each date, we compute the 7-day lagged average for COVID-19 cases, hospitalizations, deaths, and tests for Florida and the nation. From this point in the paper on, whenever we directly use COVID-19 cases, hospitalizations, deaths and tests or calculate CFR, HFR, and positive test rates based on them, unless otherwise stated, we are referring to the smoothed signal. For the Florida FDOH and national CDC data, we collect data extending back to March 25th in order to conduct our analyses on the April 1st to November 1st time range. For the two secondary data sources, we collect data extending back to March 9th in order to visualize the data starting from March 15th.
We argue that three main phenomena fuel a dramatic "artificial" decrease in CFR: shifting age distributions (H1), increased testing capacity (H2), and delays between detection and fatality (H3). To establish the phenomenon of increased testing capacity, we visualize 7-day lagged averages of testing capacity and the proportion of tests returning positive in Florida and the United States using data from The COVID Tracking Project [Meyer and Madrigal, 2020]. To avoid artifacts from increased testing (H2), when considering treatment improvements, we examine changes in hospitalization fatality rates rather than case fatality rates. To establish and account for shifting age distributions (H1), we examine cases, hospitalizations, and deaths stratified by age groups: 0-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, and 80+. Naturally, age stratification reduces the amount of data that goes into each estimate, so when computing estimates of HFR we omit results that are based on fewer than two deaths. Finally, to account for delays between detection and fatality (H3), we take advantage of line-level data for each individual.
For each date, we extract the cohort of individuals confirmed positive on that date, as well as whether those individuals eventually died or were hospitalized. To provide enough time for patients' eventual hospitalizations or deaths to be recorded, we filter out rows with positive specimen dates within 30 days of the last time the data was updated. Since CDC data was last updated on December 4th, we only use data from November 1st or earlier. Florida data is updated daily, but we use the same time range as the CDC data in order to make the plots comparable. Taking the above three adjustments into account, our primary quantity of interest for treatment improvements is the age-stratified HFR. For the rest of the paper, we define CFR and HFR at day $t$ as follows:
$$\mathrm{CFR}_t = \frac{\text{cases confirmed (or reported) at day } t \text{ that eventually die}}{\text{cases confirmed (or reported) at day } t}$$
$$\mathrm{HFR}_t = \frac{\text{cases confirmed (or reported) at day } t \text{ that eventually get hospitalized and die}}{\text{cases confirmed (or reported) at day } t \text{ that eventually get hospitalized}}$$
Here, "eventually" means that the hospitalization or death was recorded by the date of data collection (i.e. December 4th), which gives each patient at least 30 days after their case confirmation/report date to have the events recorded.
Thus far, news and academic sources have highlighted three main "true improvements": improvements in treatment (H4), disease mutations (H5), and reduced viral loads due to social distancing (H6). We seek to quantify treatment improvements (H4) by computing the decrease in hospitalization fatality rate. Although practice is constantly evolving, major improvements in treatment in our study time range such as dexamethasone [Group, 2020] and remdesivir [Beigel et al., 2020] have primarily targeted hospitalized patients, and we expect that improvements due to those treatments should be reflected in the HFR.
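The cohort-based definitions above can be sketched in a few lines. The code below is an illustration under stated assumptions rather than our actual pipeline: the row layout `(confirm_date, age, hospitalized, died)` is hypothetical, and the 7-day smoothing is implemented by pooling counts over a trailing window, which gives the same ratio as dividing one 7-day average by another.

```python
from collections import defaultdict
from datetime import date, timedelta

MIN_DEATHS = 2  # HFR estimates based on fewer than two deaths are omitted, as in the text


def age_group(age):
    """Map an age in years to the decade bins 0-9, 10-19, ..., 70-79, 80+."""
    if age >= 80:
        return "80+"
    lo = (age // 10) * 10
    return f"{lo}-{lo + 9}"


def cohort_rates(rows, day, window=7):
    """CFR and HFR per age group for cases confirmed in the `window` days ending on `day`.

    Each row is (confirm_date, age, hospitalized, died) with boolean outcome flags;
    this tuple layout is a hypothetical stand-in for the line-level records.
    """
    start = day - timedelta(days=window - 1)
    counts = defaultdict(lambda: {"cases": 0, "deaths": 0, "hosp": 0, "hosp_deaths": 0})
    for confirmed, age, hospitalized, died in rows:
        if not (start <= confirmed <= day):
            continue
        c = counts[age_group(age)]
        c["cases"] += 1
        c["deaths"] += int(died)
        c["hosp"] += int(hospitalized)
        c["hosp_deaths"] += int(hospitalized and died)
    rates = {}
    for group, c in counts.items():
        cfr = c["deaths"] / c["cases"]
        # Report None when the HFR would rest on fewer than MIN_DEATHS deaths.
        hfr = c["hosp_deaths"] / c["hosp"] if c["hosp_deaths"] >= MIN_DEATHS else None
        rates[group] = {"CFR": cfr, "HFR": hfr}
    return rates


# Made-up example rows (not real patients).
rows = [
    (date(2020, 4, 10), 85, True, True),
    (date(2020, 4, 12), 82, True, True),
    (date(2020, 4, 14), 90, True, False),
    (date(2020, 4, 15), 81, False, False),
    (date(2020, 4, 15), 35, True, True),
    (date(2020, 4, 1), 88, True, True),   # outside the 7-day window ending April 15th
]
rates = cohort_rates(rows, date(2020, 4, 15))
```

Because numerator and denominator are drawn from the same cohort of confirmation dates, the delay between detection and death does not distort the ratio, provided enough follow-up time has elapsed.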
In order to quantify the change in HFR with uncertainty, we use a block-bootstrapping technique with post-blackening [Davison and Hinkley, 1997]. This involves fitting cubic smoothing splines to each age group's 7-day averaged HFRs, computing the residuals, block-bootstrap resampling 1000 replicates of the residuals, and adding the residuals back onto the fitted splines in order to create 1000 replicates sampled from the original data distribution. By re-fitting cubic smoothing splines to each of these datasets, we can estimate each day's HFR with 95% confidence intervals. See Appendix D for details of the procedure applied to our data.
In both the FDOH and the CDC datasets, one can discern two waves of COVID-19 cases, the first occurring mid-April and the second occurring mid-July (Figure 3). Cases appear to be on the rise leading up to November; however, they have not yet reached a peak within the study time range. Overall, we find strong evidence for peaks in COVID-19 infections in mid-April and mid-July, with Florida undergoing a more severe second peak than its first peak (Appendix Figure 5a), whereas nationally in aggregate the second peak appears less severe than the first peak (Appendix Figure 6).
Increased Testing. While testing increased by 696% in Florida and by 435% in the country between the first and second waves (Figure 2, left and middle panel), our data shows that the increase in testing cannot fully account for the second peak. Despite increased testing inflating the raw number of cases, we still observe two peaks in positive test rates in April and July (Figure 2, right panel). In Florida, the second peak is larger than the first, whereas nationally the second peak is smaller than the first. Leading up to November, positive test rates have been rising both in Florida and nationally.
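The block bootstrap with post-blackening described in the methods can be sketched as below. This is a simplified illustration, not the procedure of Appendix D: a centered moving average stands in for the cubic smoothing spline so that the example needs only the standard library, and the block length of 7 days, the percentile intervals, and all function names are assumptions.

```python
import random


def smooth(y, half_width=3):
    """Centered moving average; a simple stand-in for the cubic smoothing spline in the text."""
    n = len(y)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out


def resample_blocks(residuals, block_len, rng):
    """Concatenate randomly chosen contiguous residual blocks (requires len >= block_len)."""
    n = len(residuals)
    out = []
    while len(out) < n:
        start = rng.randrange(n - block_len + 1)
        out.extend(residuals[start:start + block_len])
    return out[:n]


def bootstrap_trend(y, n_rep=1000, block_len=7, alpha=0.05, seed=0):
    """Return (trend, lower, upper): the fitted trend and pointwise percentile bands."""
    rng = random.Random(seed)
    trend = smooth(y)
    residuals = [obs - fit for obs, fit in zip(y, trend)]
    refits = []
    for _ in range(n_rep):
        noise = resample_blocks(residuals, block_len, rng)
        # Post-blackening: add the resampled residuals back onto the fitted trend.
        replicate = [fit + e for fit, e in zip(trend, noise)]
        refits.append(smooth(replicate))
    lower, upper = [], []
    for t in range(len(y)):
        values = sorted(r[t] for r in refits)
        lower.append(values[int((alpha / 2) * (n_rep - 1))])
        upper.append(values[int((1 - alpha / 2) * (n_rep - 1))])
    return trend, lower, upper


# Synthetic, made-up HFR series: a slow decline plus noise.
_rng = random.Random(1)
hfr_series = [0.30 - 0.001 * t + _rng.gauss(0, 0.01) for t in range(100)]
trend, lower, upper = bootstrap_trend(hfr_series, n_rep=200)
```

Resampling residuals in contiguous blocks, rather than one at a time, preserves short-range correlation in the noise, which is why the technique suits daily series that have already been averaged over 7 days.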
Cases. Across all age strata, Florida's second peak is much more severe than the first peak (Figure 3a, left panel), in contrast to the differences between the two peaks in the national data (Figure 3b, left panel). In aggregate, Florida cases have risen by 943% (Appendix Figure 5a, left panel), whereas national cases have increased by 97% (Appendix Figure 3b, left panel). Note that the national aggregate includes data from populous states hard-hit in the first wave (e.g. New York), so it is no surprise that the national picture is markedly different from that of Florida. Towards November, there is a rise in cases both in Florida and nationally.
Hospitalizations and Deaths. Overall, hospitalizations and deaths corroborate the story told by positive test rate. In Florida, hospitalizations and deaths again indicate a more severe second peak than first peak (Figure 3a, center and right panels), though the contrast in peak size is not as dramatic as in the plot of cases. In the national data, the second peak is actually smaller than the first peak (Figure 3b, center and right panels). Both of these discrepancies between trends seen in cases and trends seen in hospitalizations and deaths are likely attributable to increases in testing (Figure 2).
Age. Between the two peaks, the age distribution of cases shifted substantially, with the median age in Florida changing from 52 to 40, and the median age group in national data falling from 50-59 to 30-39. Since the second peak, the age distributions of cases, hospitalizations, and deaths have continued to fluctuate.
Gender. While age appears to vary substantially over time, the gender ratios in each age group's cases, hospitalizations, and deaths appear relatively flat over time (Appendix Figure 7). Thus, in our plots we choose not to stratify by gender, both because the shift in the gender distribution over time is reasonably small and, practically, to retain enough support in each group.
Consistent with findings that increased age is associated with higher mortality rates [Mahase, 2020], we observe that as the age of the group increases, the corresponding HFR increases (Tables 3 and 4). Measuring treatment improvements by HFR drop (computed as $(\mathrm{HFR}_{t_1} - \mathrm{HFR}_{t_2}) / \mathrm{HFR}_{t_1}$ for an earlier date $t_1$ and a later date $t_2$), we also observe that in the national data, treatment improvements between the peaks become smaller with increased age. Between the two peaks (April 15th to July 15th, Table 3), the national age-stratified HFR estimates from block bootstrapping decreased by as little as 24% in the 80+ age group, and as much as 50% in the 30-39 age group. On the other hand, in Florida the age-stratified HFR actually increased in each age group, by as little as 2% in the 80+ age group and as much as 12% in the 60-69 age group. Note that the HFR changes between peak dates in Florida are an example of Simpson's paradox, where in each age group the HFR increased, but the aggregate HFR actually decreased by 2.6%. Compared to peak-to-peak changes, changes across the entire time range (April 1st to November 1st, Table 4) show a more dramatic decrease. In Florida, the HFR drops by as little as 42% in the 80+ age range, and as much as 53% in the 70-79 age range. Nationally, the HFR drops by as little as 45% in the 80+ age group, and as much as 73% in the 40-49 age group. As the age group gets older, we again see an increase in age-stratified HFR and smaller treatment improvements as measured by HFR drop. While we focus on the two peaks and the endpoints of the study time range, we also include plots of HFR estimates with uncertainty for all dates between April 1st and November 1st in Appendix D. Consistent with our point estimates, the overall HFR in Florida appears relatively flat until August, when the HFR decreases sharply across all age groups. In the national data, there appears to be an almost monotonic decline in HFR across all age groups for the entire time range, with the decrease tapering out in August.
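To illustrate how the relative HFR drop is computed and how Simpson's paradox can arise from a shift in the age mix of hospitalized patients, consider the following sketch. The counts are made up for illustration and are not taken from Tables 3 or 4; the two-group split and the function names are likewise assumptions.

```python
def hfr_drop(hfr_early, hfr_late):
    """Relative drop (HFR_early - HFR_late) / HFR_early; negative values mean an increase."""
    return (hfr_early - hfr_late) / hfr_early


def stratum_hfr(strata, group):
    """HFR within one group from {group: (hospitalizations, deaths)}."""
    hosp, deaths = strata[group]
    return deaths / hosp


def aggregate_hfr(strata):
    """Pooled HFR across all groups from {group: (hospitalizations, deaths)}."""
    hosp = sum(h for h, _ in strata.values())
    deaths = sum(d for _, d in strata.values())
    return deaths / hosp


# Made-up counts: hospitalizations shift toward the younger, lower-risk group.
first_peak = {"young": (100, 5), "old": (400, 160)}     # HFR 5% and 40%
second_peak = {"young": (600, 36), "old": (400, 168)}   # HFR 6% and 42%

for group in first_peak:
    drop = hfr_drop(stratum_hfr(first_peak, group), stratum_hfr(second_peak, group))
    print(f"{group}: relative HFR drop = {drop:+.1%}")

overall = hfr_drop(aggregate_hfr(first_peak), aggregate_hfr(second_peak))
print(f"aggregate: relative HFR drop = {overall:+.1%}")
```

In this toy example the HFR rises within both groups, yet the pooled HFR falls from 33% to about 20%, purely because a larger share of hospitalizations comes from the lower-risk group. This is the same mechanism that makes the aggregate figure unreliable as a measure of treatment improvement.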
When stratifying by gender in addition to age, the conclusions surrounding drops in HFR are similar to those when just stratifying by age (see Appendix E).
In this paper, we unpack the apparent improvement in fatality rates to home in on improvements that could reasonably be attributed to advances in treatment. We account for shifting age distributions (H1) by age-stratifying, increased testing capacity (H2) by focusing on the hospitalized, and the delay between detection and fatality (H3) by conducting a cohort-based analysis. We demonstrate that increased testing does not explain away the first and second peaks, because there are corresponding peaks in hospitalizations, deaths, and test positivity rates (Figures 2 and 3). We visualize the shifting age distributions in cases, hospitalizations, and deaths over time (Figure 4), and we quantify the decrease in age-stratified HFRs between the two peaks (April 15th and July 15th), and between April 1st and November 1st. Putting all of these analyses together, we arrive at the following narrative: At the beginning of April, testing was relatively sparse (Figure 2). Cases, hospitalizations, and deaths were rising, and reached peak levels on April 15th (Figure 3). Roughly one in every ten tests was coming back positive in Florida, and one in every five tests was coming back positive nationally. In Florida, the aggregate HFR was approximately 24%, with age-stratified HFRs ranging from 9.2% for the 50-59 age group to 47% for the 80+ age group (Table 3). Across the country, the aggregate HFR was approximately 30%, while the age-stratified HFRs ranged from 5.5% for the 30-39 age group to 57% for the 80+ age group (Table 3). In each age group, the national HFR was higher than the Florida HFR, which could be due to overwhelmed hospital systems in states which were particularly hard hit during the first wave (e.g. New York).
In fact, 48% of national CDC cases between April 1st and April 15th were recorded in New York alone (Appendix Figure 11). Over the next three months, the proportion of younger individuals with COVID-19 grew steadily (Figure 4). Testing continued to ramp up across the nation, and spiked in Florida as it approached a much heavier second peak around July 15 (Figure 2). Note, however, that the corresponding positive test rates were also at an all-time high. Florida experienced record hospitalizations and deaths, and the age-stratified HFR was every bit as high as in the first wave, perhaps even higher (Table 3). While Bill Gates had publicly argued that due to improvements in treatment attributable to dexamethasone and remdesivir, "We've had a factor-of-two improvement in hospital outcomes already," [Levy, 2020] this did not yet appear to be the case in Florida. (Alternatively, it is also possible that treatment improvements might have been perfectly counterbalanced by the challenges of peak demand on the hospital system.) Nationally, on the other hand, cases in New York had diminished (Appendix Figure 11) and were starting to surge in other states, forming a smaller second peak (as measured by hospitalizations and deaths). Between the first peak and the second peak, the national HFR had dropped by 39% in aggregate, while the drop for age-stratified HFRs ranged between 50% in the 30-39 age group and 24% in the 80+ age group. The different stories told here by Florida and the national aggregate data underscore the importance of state-level rather than national analysis. Finally, by November 1st, age-stratified HFRs in both Florida and the national aggregate data appear to have dropped significantly, likely indicating treatment improvements (though possibly confounded by disease mutations (H5) and reduced viral loads (H6)). We observe that in the national data, an increase in age corresponds to a smaller relative drop in HFR.
Thus, insofar as postulated improvements in treatment might guide changes in policy, the comparatively small benefits in the elderly population should be taken into account. (Note, however, that the younger groups have small HFRs to begin with, so the opposite trend may appear when considering absolute improvements rather than relative improvements.) Since April 1st, the age-stratified HFR in Florida decreased by as much as 53% in the 70-79 age group and as little as 42% in the 80+ age group. In the national data, the drop in age-stratified HFR was as much as 73% in the 40-49 age group and as little as 45% in the 80+ age group. In reference to the case fatality ratio [McDonald, 2020], on July 27, President Donald Trump stated in a press briefing that "Due to the medical advances we've already achieved and our increased knowledge in how to treat the virus, the mortality rate for patients over the age of 18 is 85 percent lower than it was in April" [Trump, 2020]. Note, however, that none of our findings in Florida or nationally are as large as the 85% touted by President Trump (and re-running the analysis on patients 18+ does not change this). This emphasizes how the aggregate CFR can be misleading if age distribution shift, increased testing, and delays between detection and fatality are unaccounted for. As far as we are aware, our analysis is the first to explicitly account for age distribution shift, increased testing, and delays between detection and fatality. We recommend that policy makers account for at least these three factors, and show how in the absence of these considerations it is easy to be misled. More broadly, we advocate for more reliable, age-stratified hospitalization data from each state. This would paint a much clearer picture when assessing the state of treatment improvements, and better inform both hospitals and policy-makers.
We aim to quantify treatment improvements (H4) in Florida and the United States by estimating changes in the age-stratified HFR.
However, changes in the age-stratified HFR could also be influenced by disease mutation (H5) and changing viral loads (H6). To distinguish their effects in future work, we would need additional data quantifying these factors. Furthermore, while we listed the six main hypotheses we found in our literature search for explanations of the decrease in case fatality rate, it is possible that alternative explanations may arise in the future, and those may need to be accounted for as well. Additionally, we assume that treatment improvements will be reflected in the HFR because, over our study's time range, major treatment improvements (e.g. dexamethasone and remdesivir) targeted hospitalized patients. For future studies, we note that the monoclonal antibody therapies bamlanivimab and the combination of casirivimab and imdevimab were recently authorized for emergency use in November [U.S. Food and Drug Administration, 2020a,b], and unlike dexamethasone and remdesivir, they are not intended for hospitalized patients and are instead recommended for patients likely to progress to severe COVID-19 and/or hospitalization. Thus, their effects may not be directly reflected in the HFR. There are also several limitations which arise from data quality. In both the Florida FDOH data and national CDC data, missingness for hospitalization and death is high (Table 1), potentially introducing bias in the estimates for HFR if the data are not missing at random. To make matters worse for the national CDC data, state-level plots of cases seem to indicate that each state may have different patterns of reporting their data to the CDC. First, the reported CDC cases appear to be incomplete for several states. In the subset of CDC data reported from Florida, the counts only account for 69% of the cases provided by the Florida Department of Health, and even after 7-day smoothing, the cases appear to be reported sporadically (Appendix Figure 5).
Cross-referencing with the COVIDcast API, we find that in the subset of CDC data from Texas, only 4.7% of the cases and only 0.009% of the deaths are accounted for [Project, 2020]. In fact, in the subset of the CDC data from Texas, only 11 hospitalizations were recorded across the entire studied time range. In addition to missing cases, hospitalizations, and deaths, the CDC dataset has 63.5% missingness for the positive specimen received date and 49.0% missingness for the symptom onset date. While using the positive specimen received date would make the dates more comparable to those in Florida, we observe that more than half of the states have greater than 90% missingness in the positive specimen date field (in fact, in Florida it is 99.9% missing). Due to these high levels of missingness not at random, we chose to use the CDC report date, which is reported for all rows. While the CDC report date does not have missing values, the daily cases based on CDC report date have spikes on certain days, which might be indicative of the reporting agency submitting several cases to the CDC on the same day rather than reporting daily. While we excluded data from the states with the most extreme artifacts (Appendix Figure 10), there are still spikes in the remaining states (Appendix Figure 9). Thus, in our national analysis, we rely on the hope that in aggregate, the signal will outweigh the noise. Despite the data limitations, the CDC data appears to be the best available source of line-level cases needed for cohort-based analysis across the United States. We note that in the Florida FDOH data, on the other hand, we use the positive test confirmed date, which is never missing in this data, making the Florida estimates more reliable than those from the national data.

Age-stratified trend approximated by smoothing splines. (a) Estimate of trend in age-stratified Florida HFRs, fit using cubic smoothing splines.
Age-stratified trend approximated by smoothing splines. (a) Estimate of trend in age-stratified national HFRs, fit using cubic smoothing splines.

Convalescent plasma in the management of moderate covid-19 in adults in india: open label phase ii multicentre randomised controlled trial
Remdesivir for the treatment of covid-19 - preliminary report
CDC Case Surveillance Task Force. Covid-19 case surveillance data
Sars-cov-2 neutralizing antibody ly-cov555 in outpatients with covid-19
Coronavirus disease 2019 (COVID-19): Outpatient evaluation and management in adults
Bootstrap Methods and their Application. Cambridge Series in Statistical and Probabilistic Mathematics
Covid-19: Eli lilly pauses antibody trial for safety reasons
Declining trend in the initial sars-cov-2 viral load over time: Observations from detroit, michigan
Decreased case fatality rate of covid-19 in the second wave: A study in 53 countries or regions. Transboundary and emerging diseases
Florida Department of Health. Florida case line data
Dexamethasone in hospitalized patients with COVID-19 - preliminary report
Effect of hydroxychloroquine in hospitalized patients with covid-19: Preliminary results from a multi-centre, randomized
Trends in covid-19 risk-adjusted mortality rates in a single health system. medRxiv
Bill Gates on Covid: Most US Tests Are 'Completely Garbage'
A second coronavirus death surge is coming
How many americans are about to die?
Remdesivir for the treatment of covid-19 - final report
Covid-19: death rate is 0.66% and increases with age, study estimates
Trump misleads on reasons for falling covid-19 fatality rate
Why can't we see all of the government's virus data?
How Many People in the U.S. Are Hospitalized With COVID-19? Who Knows?
Impact of lockdown on covid-19 case fatality rate and viral mutations spread in 7 countries in europe and north america
Intensive care management of coronavirus disease 2019 (COVID-19): challenges and recommendations. The Lancet Respiratory Medicine
Overall decrease in sars-cov-2 viral load and reduction in clinical burden: the experience of a hospital in northern italy
Regeneron's casirivimab and imdevimab antibody cocktail for covid-19 is first combination therapy to receive fda emergency use authorization
Effect of hydroxychloroquine on clinical status at 14 days in hospitalized patients with covid-19: A randomized clinical trial
Covid-19 cases are rising, so why are deaths flatlining?
Remarks by president trump in press briefing on covid-19
Coronavirus (covid-19) update: Fda authorizes monoclonal antibody for treatment of covid-19
Coronavirus (covid-19) update: Fda authorizes monoclonal antibodies for treatment of covid-19
Coronavirus outbreak stats & data
Why Changing COVID-19 Demographics in the US Make Death Trends Harder to Understand
'It's Like Groundhog Day': Coronavirus Testing Labs Again Lack Key Supplies

We thank Professors Roni Rosenfeld, Cosma Shalizi, and Marc Lipsitch for their detailed feedback throughout this analysis and the drafting of this manuscript.

To estimate the trend in HFR with uncertainty, we follow the steps below:
1. For each age group, fit a smoothing spline (3rd order) to the 7-day lagged average HFR. This provides an estimate of the trend. Now we would like confidence intervals around this trend.
2. Take residuals from fitting the cubic spline. Block-bootstrap sample 1000 replicates with 7-day block sizes. This gives us a dataset of residuals the same size as the original dataset.
3. For each of the 1000 replicates, add the sample residuals onto the estimated trend in step 1. This is called "post-blackening" [Davison and Hinkley, 1997], and it gives us 1000 new datasets drawn from the same distribution as the original time series assuming uncorrelated blocks of residuals.
4. On each of the 1000 new datasets, re-estimate the trend using the smoothing spline.
At every point in time, we can use the estimated trends from these 1000 replicates in order to get confidence intervals.

Overall, the cubic smoothing splines appear to fit the 7-day lagged average HFRs relatively well (Figures 12a and 13a). In Florida, the estimates of HFR trend and uncertainty appear to show a relatively flat trend for most of the time range, with a recent decline in HFR in the last two months (Figure 12b). Nationally, the estimates of HFR trend and uncertainty appear to show a more consistent decline in HFR over the entire time range (Figure 13b).
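
The four-step procedure above (spline fit, block bootstrap of residuals, post-blackening, re-fit) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the reported figures. The function names, the use of SciPy's `UnivariateSpline`, the `smoothing` parameter value, the moving-block (overlapping) resampling scheme, and the pointwise percentile intervals are all assumptions; the text specifies only a cubic smoothing spline, 1000 replicates, and 7-day blocks.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def fit_trend(t, y, smoothing):
    """Fit a cubic (k=3) smoothing spline to (t, y) and evaluate it at t."""
    return UnivariateSpline(t, y, k=3, s=smoothing)(t)


def block_bootstrap_residuals(residuals, block_size, rng):
    """Resample residuals in contiguous blocks, truncated to the original length.

    Assumption: moving-block variant, with block start points drawn
    uniformly with replacement.
    """
    n = len(residuals)
    n_blocks = int(np.ceil(n / block_size))
    starts = rng.integers(0, n - block_size + 1, size=n_blocks)
    blocks = [residuals[s:s + block_size] for s in starts]
    return np.concatenate(blocks)[:n]


def spline_trend_ci(t, hfr, smoothing, n_boot=1000, block_size=7,
                    alpha=0.05, seed=0):
    """Return the spline trend and pointwise bootstrap confidence bounds."""
    rng = np.random.default_rng(seed)
    trend = fit_trend(t, hfr, smoothing)      # step 1: estimate the trend
    residuals = hfr - trend                   # step 2: residuals of the fit
    boot_trends = np.empty((n_boot, len(t)))
    for b in range(n_boot):
        resampled = block_bootstrap_residuals(residuals, block_size, rng)
        replicate = trend + resampled         # step 3: post-blackening
        boot_trends[b] = fit_trend(t, replicate, smoothing)  # step 4: re-fit
    # Assumption: pointwise percentile interval across the replicate trends.
    lower = np.quantile(boot_trends, alpha / 2, axis=0)
    upper = np.quantile(boot_trends, 1 - alpha / 2, axis=0)
    return trend, lower, upper
```

Calling `spline_trend_ci(t, hfr, smoothing=...)` once per age group, with `t` the day index and `hfr` the 7-day lagged average HFR for that group, would yield one trend curve and one confidence band per group. The choice of `smoothing` controls how closely the spline follows the data and would need to be tuned or chosen by a criterion not described here.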