key: cord-0735725-xctq1cju
authors: Pei, S.; Yamana, T. K.; Kandula, S.; Galanti, M.; Shaman, J.
title: Overall burden and characteristics of COVID-19 in the United States during 2020
date: 2021-02-17
journal: nan
DOI: 10.1101/2021.02.15.21251777
sha: 6c4d6384d922acc28c647b181e0b4560f0160a06
doc_id: 735725
cord_uid: xctq1cju

The COVID-19 pandemic disrupted health systems and economies throughout the world during 2020 and was particularly devastating for the United States. Many of the epidemiological features that produced observed rates of morbidity and mortality have not been thoroughly assessed. Here we use a data-driven model-inference approach to simulate the pandemic at county-scale in the United States during 2020 and estimate critical, time-varying epidemiological properties underpinning the dynamics of the virus. The pandemic in the US during 2020 was characterized by an overall ascertainment rate of 21.6% (95% credible interval (CI): 18.9-25.5%). Population susceptibility at year end was 68.8% (63.4-75.3%), indicating roughly one third of the US population had been infected. Community infectious rates, the percentage of people harboring a contagious infection, rose to roughly 0.8% (0.6-1.0%) before the end of the year, and were as high as 2.4% in some major metropolitan areas. In contrast, the infection fatality rate fell to 0.3% by year end; however, community control of transmission, estimated from trends of the time-varying reproduction number, Rt, slackened during successive pandemic waves. In the coming months, as vaccines are distributed and administered and new more transmissible virus variants emerge and spread, greater use of non-pharmaceutical interventions will be needed.

progression in the future, it is vital that the epidemiological features that have supported these outbreaks be quantified and analyzed in both space and time.
Here we use a county-resolved metapopulation model to simulate the transmission of SARS-CoV-2 within and between the 3142 counties of the US. The model depicts both documented and undocumented infections and is coupled with an iterative Bayesian inference algorithm-the ensemble adjustment Kalman filter-which assimilates observations of daily cases in each county and population movement between counties 11, 12 (see Supplementary Information) . The Bayesian inference supports a fitting of the model to case observations and estimation of unobserved state variables (e.g. population susceptibility within a county) and system parameters (e.g. the ascertainment rate in each county). The model fitting captures the 3 waves of the outbreak as manifest at national scales (Fig 1a) , as well as in major metropolitan areas throughout the country (Fig S1) . To further validate the simulations, we compared model estimates of cumulative infections to findings from US Centers for Disease Control and Prevention (CDC) seroprevalence surveys conducted at site and state levels 13 . The seroprevalence data, which provide an out-of-sample corroboration of the model fitting, were adjusted for the waning of antibody levels following adaptive immune response 14, 15 (see Supplementary Information, Figs S2-S3). Model estimates of cumulative infected percentages are well aligned with adjusted seroprevalence estimates from the CDC 10-site survey across sites and through time (Pearson r=0.97, mean absolute error (MAE)=1.34%) (Fig 1b) and are similarly well matched to adjusted estimates at the state level ( Fig S4) . A critical feature of SARS-CoV-2 is its ability to infect and transmit largely from individuals never diagnosed with the virus 5 . The model structure and fitting enable estimation of the ascertainment rate, the percentage of infections confirmed diagnostically, at county scales. 
The national population-weighted ascertainment rate averaged for all of 2020 was 21.6% (95% credible interval (CI): 18.9-25.5%). This national ascertainment rate increased from 11.2% (8.3-15.7%) during March 2020 to 24.3% (18.4-32.1%) during December 2020 (Fig 1c). The increase through time is a likely by-product of increasing testing capacity, a relaxation of initial restrictions on test usage, and increasing recognition, concern and care-seeking among the public. We additionally focus on 5 metropolitan areas within the US. Small differences in the ascertainment rate manifest across these areas; in particular, ascertainment rates for Phoenix and Miami are higher than the national average for much of the year, whereas New York City, Chicago, and Los Angeles are consistently below the national average. At the national level, three pandemic waves are clearly evident during spring, summer and fall/winter (Fig 1a); however, among the 5 focus metropolitan areas the structure differs with New York and Chicago experiencing strong spring and fall/winter waves but little activity during summer, Los Angeles and Phoenix undergoing summer and fall/winter waves, and Miami experiencing all 3 waves (Fig S1). Los Angeles County, the largest county in the US with a population of more than 10 million people, was particularly hard hit during the fall/winter. The differences in virus activity produced different cumulative infection numbers through time (Fig 2a). Population susceptibility at the end of the year was 68.8% (63.4-75.3%) for the US, and among the focal metropolitan areas ranged from 47.9% (39.6-54.9%) in Los Angeles to 71.4% (65.4-74.2%) in Phoenix. Though there is variability among counties, a substantial portion of the US population (68.8%) had not been infected by the end of 2020; however, pockets of lower population susceptibility, which are evident in the southwest and southeast on August 1st (Fig 2b), expanded considerably by December 1st (Fig 2c).
In particular, areas of the upper Midwest and Mississippi valley, including the Dakotas, Minnesota, Wisconsin and Iowa, are estimated to have population susceptibility below 40% as of December 1, 2020. The structure of the outbreak is evident in both incidence and prevalence estimates (Figs 3, S5, S6). Incidence indicates the daily number of newly infectious individuals-both those who will be confirmed COVID-19 cases and those whose infections will remain undocumented. The majority of infections each month are undocumented (Fig 3a) , as indicated by the low ascertainment rates (Fig 1c) . For all of 2020, an estimated 78.4% of infections in the US were undocumented. Estimates of daily prevalence provide a measure of the community infectious rate, the fraction of the population currently harboring a contagious infection. National SARS-CoV-2 prevalence increased to 0.78% (0.61 -1.00%) by December 31, indicating that roughly 1 in 128 persons was contagious (a similar percentage, 0.84% (0.53 -1.28%), was estimated to be latently infected, i.e. infected but not yet contagious) (Fig 3b) . Among the 5 focal metropolitan areas, prevalence varied considerably: in mid-November, Chicago reached a prevalence of 1.50% (1.28 -1.77%); whereas prevalence in Miami rose to 1.22% (1.00 -1.46%) during July. Los Angeles was even more burdened at the end of 2020 with a prevalence of 2.40% (2.00 -2.89%) as of 31 December (Fig S6) . The model fitting enables estimation of the case fatality rate (CFR) and the infection fatality rate (IFR). Both rates were highest at the national level at the beginning of the spring wave: the CFR was 8.83% (6.75 -11.40%) and the IFR was up to 0.96% (0.62 -1.53%) in March and April (Fig 3c) . Over the course of the year, with earlier diagnosis and treatment, improved patient care, and, in the case of CFR, increased reporting of mild infections, the CFR and IFR dropped to 1.16% (0.95-1.48%) and 0.28% (0.22 -0.35%) by December, respectively. 
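As a quick arithmetic check on the figures above, the undocumented share is the complement of the ascertainment rate, and a prevalence can be restated as "1 in N" persons. The sketch below also multiplies the December CFR by the December ascertainment rate as a rough consistency check against the reported IFR. That product ignores the different timing of cases and deaths, so it is an illustrative approximation and not the estimation procedure used in the study. All function names are hypothetical.

```python
def undocumented_share(ascertainment_rate):
    """Fraction of infections never confirmed, given the ascertainment rate."""
    return 1.0 - ascertainment_rate


def one_in_n(prevalence):
    """Restate a prevalence (as a fraction) as '1 in N' persons."""
    return 1.0 / prevalence


def approx_ifr(cfr, ascertainment_rate):
    """Rough IFR implied by a CFR when only a fraction of infections is confirmed.

    Assumption: deaths occur almost entirely among confirmed cases and timing
    differences are ignored.
    """
    return cfr * ascertainment_rate


# Values quoted in the text: 21.6% ascertainment, 0.78% prevalence,
# December CFR 1.16% and December ascertainment 24.3%.
print(f"undocumented share: {undocumented_share(0.216):.1%}")
print(f"1 in {one_in_n(0.0078):.0f} persons contagious")
print(f"approximate December IFR: {approx_ifr(0.0116, 0.243):.2%}")
```

With the quoted inputs this gives 78.4% undocumented, roughly 1 in 128 persons contagious, and an approximate December IFR of 0.28%, in line with the reported estimates.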
Both rates varied by location and over time; for instance, intermediate drops of CFR and IFR began for Los Angeles, Phoenix and Miami prior to the summer wave in association with a decrease in the average age of documented infection ( Fig S7) . Overall, these findings delineate the mortality risk associated with infection broadly. The national IFR during the latter half of 2020 hovers around 0.30%, well above estimates for both seasonal influenza (<0.02%) 16 and the 2009 influenza pandemic (0.0076%) 17 . Given the high numbers of cases and deaths, and the successive pandemic waves, a central question is whether local populations within the US have responded to the growth of infections in their communities with improved control through non-pharmaceutical interventions (NPIs). Such control, effected through mask usage, social distancing, indoor ventilation, surface cleaning and restrictions on mass gatherings and other indoor activities, is reflected in the modulation of the time-varying reproduction number, Rt. A decrease of Rt over time suggests a community is improving control of the virus by regulation or public adoption of control measures. For each of the three waves during 2020, we identified counties that experienced 2 or more consecutive weeks of increasing (or decreasing) reported cases and also reported 15 daily cases per 100,000 persons at least once during the period. We then examined trends of Rt for each of these 3 periods. During the earliest period in the spring, Rt decreased when counties experienced case growth and case decline; however, the decline in Rt is more precipitous when counties experienced growth (Fig 4a) . During the second, summer period, the decline of Rt in counties with case growth starts at a lower level and is more muted, and counties experiencing declining case numbers instead have an increase of Rt. 
During the final fall/winter period, the patterns are similar to the summer period with Rt decreasing in counties experiencing case growth but increasing in counties experiencing case decline. The same trends are present when averaging the counties to regional scales (Fig 4b,c), as well as when using different case per capita thresholds (Figs. S8-S9). Overall, this analysis suggests a lessening ability or willingness to control SARS-CoV-2 in the US as the year progressed. Part of this year-long trend, particularly during the fall/winter wave, is likely due to seasonal effects that moderate Rt and are beyond direct human control. In particular, evidence suggests that SARS-CoV-2 is more transmissible when humidity levels are low 18 , as is the case during winter in temperate regions, and that people are bound to spend more time indoors during winter when temperatures are low. Both effects likely increased opportunities for transmission during the third wave and for the most part cannot be effectively counteracted. However, policies and behaviors that limit opportunities for virus transmission, specifically the use of masks, social distancing, ventilation, and restrictions on mass gatherings and indoor dining, are subject to regulation and individual choice. Individual control behaviors may have slackened towards the end of the year, and policies allowing indoor dining and other commercial activities were more common late in the year 19 . The relative contributions of seasonal versus behavioral effects on changing Rt trends across successive pandemic waves cannot be disentangled in this analysis. The US experienced the highest numbers of COVID-19 cases and deaths in the world during 2020. Our findings provide quantification of the time-evolving epidemiological characteristics associated with successive pandemic waves in the US, as well as conditions at the end of the year and prospects for 2021.
Critically, despite more than 19.6 million reported cases at year's end, more than 68% of the population remained susceptible to viral infection. Several factors will considerably alter population susceptibility in the coming months. Firstly, ongoing transmission will infect naïve hosts and continue to deplete the susceptible pool. Secondly, as more vaccine is distributed and administered, more individuals will be protected against symptomatic infection and the IFR will decrease. Lastly, our model does not represent re-infection, either through waning immunity or immune escape; however, re-infection has been documented 20,21 , evidence of waning antibody levels exists 22,23 , and new variants of concern have emerged 24,25 and will likely continue to do so. All these processes will affect population susceptibility over time and help determine when society enters a post-pandemic phase, the pattern of endemicity the virus ultimately assumes, and its long-term public health burden 26 . Detection of the virus in the US improved during 2020 with the ascertainment rate rising from about 11% in March to nearly 25% in December. Still, the majority of infections remain undocumented, consistent with other estimates 27 . While many of these infections likely present with mild or no symptoms, they remain contagious and support undetected transmission of SARS-CoV-2 5 , making control of the virus very challenging. Both CFR and IFR declined during 2020. IFR decreased from around 1% in March to about 0.3% in December. Earlier case detection and improved clinical care 28,29 likely contributed to the decline of both the CFR and IFR during 2020; however, both CFR and IFR are also highly age-dependent with older individuals at substantially greater risk of hospitalization and death 27,30 , so changes to the age distribution of infections over time may have also affected these rates.
Our findings reveal how conditions associated with transmission, case numbers, susceptibility, mortality and control evolved during 2020. Considerable differences in the progression of the pandemic and its epidemiological features manifest in both space and time. In addition, local control efforts appear to have strengthened or slackened in response to increasing or decreasing cases within a locality. This variable responsiveness underscores the need for continued public health messaging emphasizing the maintenance of NPIs while vaccines are distributed and administered.

The copyright holder for this preprint (which was not certified by peer review), this version posted February 17, 2021, is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
We use a metapopulation SEIR model to simulate the transmission of COVID-19 in the US at county level. In this model, we explicitly simulate the transmission of documented and undocumented infections, for which separate transmission rates are defined. Further, we assume there is no re-infection of SARS-CoV-2, and that vaccine deployment prior to 31 December 2020 nominally affects population susceptibility. The model incorporates two types of human mobility across 3,142 US counties: regular daily commuting and diffusive random movement. Information on inter-county commuting is available from the US census survey 1 . During the daytime, commuters travel to counties where they work and mix with the population there; after work, they return home and mix with individuals in their home, residential county. Apart from regular commuting, a fraction of the population in each county, assumed to be proportional to the number of inter-county commuters, travels for purposes other than work. As the population present in each county is different during daytime and nighttime, we model the transmission dynamics of COVID-19 separately for these two time periods. Specifically, we formulate the transmission as a discrete Markov process during both day and night times. The transmission dynamics are depicted by separate sets of equations for daytime transmission (Eqs. 1-5) and nighttime transmission (Eqs. 6-10).

Here, $S_{ij}$, $E_{ij}$, $I^r_{ij}$, $I^u_{ij}$ and $N_{ij}$ are the susceptible, exposed, documented (reported) infected, undocumented infected and total populations in the subpopulation commuting from county $j$ to county $i$ ($i \leftarrow j$); $\beta_i$ is the transmission rate of reported infections in county $i$; $\mu$ is the relative transmissibility of undocumented infections; $Z$ is the average latency period (representing the period from infection to contagiousness); $D$ is the average duration of contagiousness; $\alpha_i$ is the fraction of infections documented in county $i$; $\theta$ is a multiplicative factor adjusting random movement; $\bar{N}_{ij} = (N_{ij} + N_{ji})/2$ is the average number of commuters between counties $i$ and $j$; $dt_1 = 1/3$ day and $dt_2 = 2/3$ day are the durations of daytime and nighttime transmission; and $N_i^D$ and $N_i^N$ are the daytime and nighttime populations of county $i$. We assume the $I^r_{ij}$ population is immobile and does not participate in human movement. To reflect the spatiotemporal variation of disease transmission rates and reporting, we allowed the transmission rates and ascertainment rates to vary across counties and change over time.

Daily work commuting: During the daytime, the population in location $i$, $N_i^D$, is the sum of individuals who both live and work in location $i$, documented infected individuals who would otherwise commute to other locations $k$ ($k \neq i$), and individuals who work in location $i$ from other locations $k$ ($k \neq i$) but are not documented infections. Within the subpopulation $N_{ij}$, new infections derive from two processes: contact with documented and undocumented infections in location $i$. For each susceptible individual in $S_{ij}(t)$, the chance of contact with documented infections is $\sum_k I^r_{ki}(t)/N_i^D(t)$, where $\sum_k I^r_{ki}(t)$ is the total number of documented infections who would commute to all locations but have to stay in location $i$, and the chance of contact with undocumented infections is $\sum_k I^u_{ik}(t)/N_i^D(t)$, where $\sum_k I^u_{ik}(t)$ is the total number of undocumented infections in location $i$. Those contacts lead to new infections during a period of $dt_1$ day.
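Written out, the two contact processes just described suggest the following daytime infection term for the susceptible subpopulation. This is a sketch assembled from the surrounding definitions, not a reproduction of the source equations. It assumes that $S_{ij}$ denotes susceptible individuals who live in county $j$ and work in county $i$, that $I^r$ and $I^u$ are documented and undocumented infections, that $N_i^D$ is the daytime population of county $i$, that $\beta_i$ is the transmission rate, that $\mu$ is the relative transmissibility of undocumented infections, and that $dt_1$ is the daytime duration.

```latex
\[
\text{new daytime infections in } S_{ij}
  = \frac{\beta_i \, S_{ij}(t) \sum_k I^r_{ki}(t)}{N_i^D(t)} \, dt_1
  + \frac{\mu \, \beta_i \, S_{ij}(t) \sum_k I^u_{ik}(t)}{N_i^D(t)} \, dt_1
\]
```

The first sum runs over documented infections who live in county $i$ and therefore stay there; the second runs over undocumented infections who are present in county $i$ during the day.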
Note this term captures the mixing of populations from different locations due to work commuting, and represents intra-county transmission during the daytime in location $i$.

Random movement: Apart from work commuting, during the daytime, $\theta\,dt_1\,\bar{N}_{ik}$ persons, drawn uniformly from the population present in location $k$ ($k \neq i$) (except for documented infections), move to location $i$ and are randomly redistributed into the subpopulation there. Such population exchange exists for all pairs of locations. For example, for the susceptible population, we first compute the number of susceptible individuals entering into subpopulation $S_{ij}(t)$. In other locations $k$ ($k \neq i$), the probability a random visitor is susceptible is $\sum_l S_{kl}(t)$ divided by the total mobile population (i.e., total population minus documented infected population) in location $k$. Therefore, the total susceptible population entering location $i$ is the sum, over all other locations $k$, of the number of random visitors from $k$ multiplied by this probability. Those individuals are redistributed into subpopulations present in location $i$, where the fraction of people in subpopulation $N_{ij}$ is $(N_{ij} - I^r_{ij}(t))/N_i^D(t)$. Finally, the number of susceptible individuals entering $S_{ij}(t)$ is the product of this total and this fraction. We then compute the number of susceptible individuals leaving $S_{ij}(t)$. The total number of individuals leaving location $i$ is $\theta\,dt_1 \sum_{k \neq i} \bar{N}_{ki}$; the fraction of susceptible people from $S_{ij}$ is $S_{ij}(t)$ divided by the total mobile population in location $i$. As a result, the number of susceptible persons leaving $S_{ij}(t)$ is the product of these two quantities. Population exchange for other compartments can be computed similarly. Note there is no random movement in Eq. (3) as we assume documented infections, $I^r_{ij}$, are immobile. We can write Eqs. (6-10) for nighttime transmission similarly.

Prior to March 1, 2020, we used the commuting data from the US census survey to prescribe the inter-county movement in the transmission model.
After March 1, the census survey data are no longer representative due to changes in mobility behavior following implementation of nonpharmaceutical interventions. We therefore used estimates of the reduction of inter-county visitors to points of interest (POI) (e.g., restaurants and stores) from SafeGraph 2 to account for the change of inter-county movement on a county-by-county basis. For instance, if the number of inter-county visitors in a county was reduced by 10% on a given day relative to the baseline on March 1, the number of commuters and random visitors to this county would be reduced by 10% accordingly.

The transmission model generates daily confirmed cases for each county. To account for reporting delays, we mapped simulated documented infections to confirmed cases using a separate observational delay model. In this delay model, we account for the time interval between a person transitioning from latent to contagious (i.e., $E \to I^r$) and confirmation of that individual infection. To estimate this delay period, $T_r$, we examined a U.S. line list data record consisting of 13.4 million confirmed cases until December 31, 2020 3 . We used a gamma distribution to fit the time-to-event distribution of the interval (in days) from symptom onset to case confirmation for each month in 2020. We modeled $T_r$ by adding another 2.5 days to the mean periods of the obtained gamma distributions, as symptom onset is estimated to lag the onset of contagiousness 4 . The gamma distributions of $T_r$ from April to December are provided in Table S1.

We calibrated the transmission model against county-level incidence data reported from 21 February 2020 to 31 December 2020, available at Johns Hopkins University coronavirus resource center 5 .
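The delay construction described above (a gamma distribution fitted to onset-to-confirmation intervals, with 2.5 days added to its mean) can be sketched as follows. The fitting method used in the study is not stated in this text, so the sketch swaps in simple moment matching as a stand-in; the function names, the constant name and the sample intervals are hypothetical.

```python
from statistics import mean, pvariance

# Days by which symptom onset is taken to lag the onset of contagiousness (from the text).
CONTAGIOUS_TO_ONSET_DAYS = 2.5


def fit_gamma_moments(intervals):
    """Moment-matched gamma (shape, scale) for onset-to-confirmation intervals in days.

    Assumption: moment matching is used here for brevity; it is not claimed
    to be the fitting procedure of the study.
    """
    m = mean(intervals)
    v = pvariance(intervals)
    if m <= 0 or v <= 0:
        raise ValueError("intervals need a positive mean and variance")
    return m * m / v, v / m


def mean_reporting_delay(intervals):
    """Mean delay from onset of contagiousness to confirmation, in days."""
    shape, scale = fit_gamma_moments(intervals)
    return shape * scale + CONTAGIOUS_TO_ONSET_DAYS


# Made-up onset-to-confirmation intervals for one month.
example_intervals = [2.0, 4.0, 4.0, 6.0]
print(fit_gamma_moments(example_intervals))
print(mean_reporting_delay(example_intervals))
```

In practice one such fit would be made per month, giving a month-specific delay as in Table S1.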
Model parameters were estimated using a sequential data assimilation method, the ensemble adjustment Kalman filter (EAKF) 6 , which is applicable to high-dimensional metapopulation models and has been successfully used to infer epidemiological parameters for a range of infectious diseases 7-11 . To represent the state-space distribution (including both parameters and variables), the EAKF maintains an ensemble of system state vectors acting as samples from the distribution. In particular, the EAKF assumes a Gaussian distribution of both the prior and likelihood and adjusts the prior distribution to a posterior using Bayes' rule: posterior ∝ prior × likelihood. For the observed variables (i.e., daily incidence), ensemble members are updated deterministically such that the higher moments of the prior distribution are preserved in the posterior. Unobserved variables and parameters are updated based on their covariability with the observed variable, which can be computed directly from the ensemble. Further details on the EAKF scheme can be found in Anderson 6 .

We derived the estimate of model parameters by coupling the EAKF algorithm with the disease transmission model. To further reduce the number of unknown parameters in this high-dimensional transmission model, we fixed disease-related parameters ($Z$, $D$, and $\mu$) and the mobility factor ($\theta$) as estimated using case data through 13 March 2020 12 . Specifically, these parameters were randomly drawn from the posterior distributions: $Z$ = 3.59 (95% CI: 3.28-3.99), $D$ = 3.56 (3.21-3.83), $\mu$ = 0.64 (0.56-0.70), and $\theta$ = 0.15 (0.12-0.17). We performed EAKF inference each day using case data in 3,142 US counties to estimate the ascertainment rate $\alpha_i$ and transmission rate $\beta_i$.
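To make the update rule concrete, here is a minimal sketch of one EAKF step for a single scalar observation, following the description above: the observed ensemble is shifted and shrunk deterministically toward the Gaussian posterior, and an unobserved quantity is adjusted through its ensemble covariance with the observed variable. The function name, argument names and toy numbers are hypothetical, and the sketch omits features a production filter would normally include (such as inflation or localization).

```python
from statistics import mean, variance


def eakf_update(obs_prior, state_prior, obs_value, obs_error_var):
    """One EAKF step for a scalar observation.

    obs_prior: ensemble of the observed variable (e.g. simulated incidence)
    state_prior: ensemble of one unobserved variable or parameter
    Returns the posterior ensembles (obs_post, state_post).
    """
    prior_mean = mean(obs_prior)
    prior_var = variance(obs_prior)
    if prior_var == 0:
        # A collapsed ensemble carries no spread to adjust.
        return list(obs_prior), list(state_prior)

    # Gaussian prior x Gaussian likelihood gives a Gaussian posterior.
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_error_var)
    post_mean = post_var * (prior_mean / prior_var + obs_value / obs_error_var)

    # Deterministic adjustment: shift the mean, shrink deviations.
    shrink = (post_var / prior_var) ** 0.5
    obs_post = [post_mean + shrink * (y - prior_mean) for y in obs_prior]

    # Regress the unobserved quantity on the observed increments.
    state_mean = mean(state_prior)
    covariance = sum(
        (x - state_mean) * (y - prior_mean) for x, y in zip(state_prior, obs_prior)
    ) / (len(obs_prior) - 1)
    gain = covariance / prior_var
    state_post = [
        x + gain * (y_post - y) for x, y, y_post in zip(state_prior, obs_prior, obs_post)
    ]
    return obs_post, state_post


obs_post, state_post = eakf_update(
    obs_prior=[8.0, 10.0, 12.0],
    state_prior=[0.1, 0.2, 0.3],
    obs_value=14.0,
    obs_error_var=4.0,
)
print(obs_post, state_post)
```

Because every member is moved by a linear map, the shape of the prior ensemble around its mean is retained while its mean and variance match the Gaussian posterior.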
To account for the reporting delay of confirmed cases, at each daily model update, we integrated the model forward 9 days using the prior model state and used incidence 9 days ahead (i.e., roughly the modes of gamma distributions for delays) to constrain current model variables and parameters. To initialize the model-inference system, we seeded exposed individuals ($E$) and undocumented infections ($I^u$) in counties with at least five confirmed cases. Specifically, we randomly drew $E$ and $I^u$ from uniform distributions [0, 12$C$] and [0, 10$C$] 9 days before the first date with more than five reported cases, $T_0$, where $C$ is the total number of reported cases between day $T_0$ and $T_0 + 4$. This setting provides a broad seeding range for US counties. The prior ascertainment rates were drawn from a distribution with a median value $\alpha$ = 0.080 (0.069-0.093), estimated using case data prior to 13 March 2020. The prior transmission rates were scaled on the basis of the local population density: $\beta_i = \beta_0 \times \log_{10}(\rho_i)/\mathrm{median}(\log_{10}\rho)$, where $\rho_i$ is the population density in county $i$, $\mathrm{median}(\log_{10}\rho)$ is the median value of log-transformed population density among all counties, and $\beta_0$ = 0.95 (0.84-1.06) is the baseline transmission rate estimated before 13 March 2020. For counties that reported fewer than 20 cumulative cases as of 15 March 2020, we reduced the prior transmission rate by half to reflect the impact of nonpharmaceutical interventions implemented after the announcement of national emergency. We performed the inference using 100 ensemble members.

In this study, we report the transmission dynamics in and around five major US cities: New York, Chicago, Los Angeles, Phoenix, and Miami. Characteristics of SARS-CoV-2 transmission (e.g., ascertainment rate, CFR, and IFR) in these metropolitan areas were aggregated from county-level estimates.
Note that in some cases the metropolitan areas as defined here differ from the formal metropolitan statistical areas delineated by the United States Office of Management and Budget. The counties we include in the metropolitan areas for this analysis are:

We validated the estimated cumulative infections against the proportion of SARS-CoV-2 seropositive individuals from two large-scale serological surveys in the US: 1) surveys in 10 sites until August 2020 13,14 , and 2) state-level surveys from August 2020 to November 2020 15,16 . We primarily focused on the 10-site survey as samples were collected in more consistent and specific locations. The serological surveys give an indication of the proportion of individuals previously infected by SARS-CoV-2 and in principle should be similar to the proportion of cumulative infections estimated by the model. However, the seroprevalence rates likely underestimate the fraction of a population previously infected with SARS-CoV-2 due to antibody waning. Specifically, antibody titers in recovered individuals decline over time, and seroreversion in the months following adaptive immune response is common 17 . Note, such decline in antibody titers does not necessarily preclude protection from repeat infections mediated by other components of the adaptive immune response.

We adjusted the reported seroprevalence by first correcting for errors in serological testing. The assay used to detect antibodies against SARS-CoV-2 reports 96% (95% CI: 90.0-98.9%) sensitivity and 99.3% (98.3-99.9%) specificity 13 . The seroprevalence adjusted for testing sensitivity and specificity is (reported seroprevalence + specificity - 1)/(sensitivity + specificity - 1) 18 . We then adapted the method of Buss et al. to quantify antibody waning (seroreversion) and to estimate an adjusted seroprevalence that is inclusive of individuals who have seroreverted 19 .
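The correction for test error described above has a simple closed form, sketched below with the sensitivity (96%) and specificity (99.3%) quoted in the text. The function name and the 5% example input are hypothetical.

```python
def adjust_for_test_error(reported, sensitivity, specificity):
    """Correct a reported seroprevalence for imperfect sensitivity and specificity.

    Implements (reported + specificity - 1) / (sensitivity + specificity - 1).
    All arguments are fractions between 0 and 1.
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("the test must perform better than chance")
    return (reported + specificity - 1.0) / denominator


# A hypothetical reported seroprevalence of 5%.
print(adjust_for_test_error(0.05, sensitivity=0.96, specificity=0.993))
```

With a highly specific assay the correction is small at moderate seroprevalence, but it matters more when the reported value is close to the false-positive rate.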
This method has been used to estimate the percentage of the population infected with SARS-CoV-2 in Manaus, Brazil during a largely unmitigated outbreak 19 . In order to estimate the antibody waning rate, we used the seroprevalence data for New York City from the 10-site study. As New York City was the epicenter in the US during the early phase of the COVID-19 pandemic, the effect of seroreversion is expected to be more apparent there than for other locations. We assume the probability of seroreversion for an infected patient decays exponentially with time, and define the monthly attenuation as a value in [0, 1], which sets the probability of a recovered patient seroreverting after a given number of months.

To estimate the monthly attenuation, we first sampled the seroprevalence at each sampling time $t_i$ ($i$ = 1, …, 6) from a uniform distribution bounded by the observed lower and upper bounds of seroprevalence adjusted for testing characteristics. We then searched a range of linearly spaced candidate values from 0.001 to 0.999 and computed the number of new recoveries in each sampling period for each candidate using the sampled seroprevalence data from New York City. As New York City reported few cases in July, we estimated the monthly attenuation as the value that minimizes the number of new recoveries in the last two sampling periods (i.e., periods 5 and 6, July 7 to 11 and July 27 to 30) under the constraint that new recoveries are nonnegative in all sampling periods. To account for uncertainty in observed seroprevalence, we independently drew 100 samples of seroprevalence, and selected the value that generated the lowest new recoveries in July among all samples. This process was repeated 1,000 times to obtain the distribution of monthly attenuation: 0.90 (95% CI: 0.86-0.95). The adjusted seroprevalence in New York City is the cumulative number of recoveries per capita computed using the estimated monthly attenuation.
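The search described above can be sketched as follows. The exact seroreversion formula is not recoverable from this text, so the sketch assumes that a recovered person remains seropositive with probability `retention ** months`, with `retention` standing in for the monthly attenuation; it also skips the uniform resampling of seroprevalence bounds. All names and the example data are hypothetical.

```python
TOLERANCE = 1e-12  # allows for floating-point noise when checking nonnegativity


def new_recoveries(seroprevalence, months, retention):
    """Back out new recoveries per sampling period from observed seroprevalence.

    Assumption: each earlier cohort of recoveries is still seropositive with
    probability retention ** (months elapsed).
    """
    recoveries = []
    for i, observed in enumerate(seroprevalence):
        still_positive = sum(
            r * retention ** (months[i] - months[j]) for j, r in enumerate(recoveries)
        )
        recoveries.append(observed - still_positive)
    return recoveries


def estimate_retention(seroprevalence, months):
    """Grid search over 0.001..0.999 minimizing recoveries in the last two periods."""
    best = None
    for k in range(1, 1000):
        retention = k / 1000
        recoveries = new_recoveries(seroprevalence, months, retention)
        if min(recoveries) < -TOLERANCE:
            continue  # violates the nonnegativity constraint
        late = recoveries[-2] + recoveries[-1]
        if best is None or late < best[0]:
            best = (late, retention)
    return None if best is None else best[1]


def adjusted_seroprevalence(seroprevalence, months, retention):
    """Cumulative recoveries per capita, or None if the constraint is violated."""
    recoveries = new_recoveries(seroprevalence, months, retention)
    if min(recoveries) < -TOLERANCE:
        return None
    return sum(recoveries)


# Hypothetical surveys at months 0, 1 and 2 with no new infections after month 0.
print(estimate_retention([0.2, 0.18, 0.162], [0, 1, 2]))
print(adjusted_seroprevalence([0.10, 0.12], [0, 1], retention=0.9))
```

Returning `None` when the constraint fails mirrors the fallback described for other locations, where a different parameter value has to be chosen.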
For other locations, we used the same parameter to correct the prevalence if it generates nonnegative new recoveries in all sampling periods. Otherwise, we chose the parameter closest to the New York City value that produces nonnegative new recoveries 18 . The adjusted and reported seroprevalence for the 10 sites are shown in Fig. S2. To match the estimated cumulative infection and seroprevalence in one location, we assumed seroconversion takes an average of 10 days (e.g., from infection acquisition to generation of detectable antibody). A 14-day seroconversion period was also tested and results remained similar.

We applied the same method to adjust state-level seroprevalence reported from August to November 2020 (Fig. S3). However, as seroprevalence data were not available prior to August, the adjustment cannot account for seroreversion before August 2020. As a result, the adjusted seroprevalence in August is an underestimate. In addition, because the sample size is small relative to the population of each state and the samples may not be representative of the general US population, anomalies may appear in the reported seroprevalence. For instance, the survey in North Dakota estimated 7.3% seroprevalence between 29 July and 12 August; however, it dropped to 0.6% between 12 August and 25 August. To exclude any severely biased seroprevalence data, we assumed the monthly attenuation rate to be no higher than 15%. Observations that indicated faster antibody waning (e.g., North Dakota) were excluded from the analysis. In total, seroprevalence data in 17 states were used (Fig. S3a). Despite these limitations, our inferred cumulative infected percentages are well matched to adjusted seroprevalence at state level (Fig. S4a, Pearson r = 0.76, mean absolute error (MAE) = 3.61%). Note that the adjusted seroprevalence data in August (blue dots) are generally lower than model estimates, as seroreversion prior to August was not considered in the adjustment.
For later surveys (yellow dots), this systematic bias is less severe. We further performed the same analysis using a maximum monthly attenuation rate of 25% (Fig. S3b and Fig. S4b, Pearson r = 0.72, mean absolute error (MAE) = 4.57%).

The estimated monthly infections (both documented and undocumented) in the US and five metropolitan areas are reported in Fig. S5. The prevalence of contagious infections in the community is estimated as the percentage of active infectious cases (reported plus undocumented) among the general population. We provide the daily confirmed cases and estimated prevalence of contagious infections in the US and five metropolitan areas in Fig. S6.

The number of new infections on day $t$ is estimated as $\sum_{\tau} [\mathrm{new}_r(\tau) + \mathrm{new}_u(\tau)]/15$, where the sum runs over a 15-day window and $\mathrm{new}_r(\tau)$ and $\mathrm{new}_u(\tau)$ are the posterior newly infected reported and undocumented patients on day $\tau$. We use the average of new infections during a 15-day time window to smooth the variation of reporting within a week. A proportion of total infections were tested and reported as confirmed cases after a lag between infection acquisition and laboratory confirmation. The mean reporting delay varies by day (see Table S1). We estimated the number of confirmed cases who were infected on day $t$ by summing the posterior confirmed cases over the same window shifted forward by the mean reporting delay, which accounts for the lag between infection and confirmation. Based on observations of confirmed COVID-19-related deaths in New York City 20, the mean time from confirmation to death is 9 days. We therefore estimated the number of deaths among patients infected on day $t$ by summing deaths over the window shifted forward by a further 9 days, and computed the IFR as the ratio of these deaths to the estimated total infections on day $t$ (the CFR uses the estimated confirmed cases as the denominator). As death data for individuals infected in late December were not available when we performed the study, we limited the analysis of CFR and IFR to the period before 1 December 2020.
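The windowed CFR and IFR calculation can be illustrated with a short Python sketch. It is not the authors' implementation: the function names are hypothetical, the 15-day window is assumed to be centered on the infection day, and the reporting delay is passed in as a single integer rather than looked up by date.

```python
import numpy as np

def window_sum(series, start, width):
    """Sum `width` consecutive daily values starting at index `start`."""
    if start < 0 or start + width > len(series):
        raise ValueError("window falls outside the time series")
    return float(np.sum(series[start:start + width]))

def fatality_rates(new_r, new_u, confirmed, deaths, t, report_delay,
                   death_delay=9, width=15):
    """Return (CFR, IFR) for people infected around day index `t`."""
    new_r, new_u = np.asarray(new_r, float), np.asarray(new_u, float)
    start = t - width // 2                 # assumed: window centered on day t
    # Total infections: reported plus undocumented new infections
    infections = window_sum(new_r + new_u, start, width)
    # Confirmed cases appear `report_delay` days after infection
    cases = window_sum(np.asarray(confirmed, float), start + report_delay, width)
    # Deaths occur a further `death_delay` days after confirmation
    died = window_sum(np.asarray(deaths, float),
                      start + report_delay + death_delay, width)
    # The 1/15 averaging factor cancels in both ratios, so plain sums suffice
    return died / cases, died / infections
```

With constant daily inputs the window shifts have no effect, which makes a convenient sanity check: 3 deaths per day against 200 confirmed and 1,000 total infections per day gives a CFR of 1.5% and an IFR of 0.3%.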
As the mortality rate varies widely across age groups, the overall IFR at a given time depends on the age structure of infections. The fraction of confirmed infections for each age group in HHS region 4 (Alabama, Florida, Georgia, Kentucky, Mississippi, North Carolina, South Carolina, and Tennessee) reported by the CDC 21 is shown in Fig. S7.

We use the time-varying reproduction number, $R_t = \beta D[\alpha + (1 - \alpha)\mu]$, to quantify the local transmission rate of COVID-19. Based on the national confirmed cases in the US, we define the spring, summer and fall/winter waves as the following periods: 21 January 2020 to 31 May 2020, 1 June 2020 to 15 September 2020, and 16 September 2020 to 31 December 2020. For each wave, we selected the time intervals in each county with increasing and decreasing local infections. Specifically, if the weekly confirmed cases (per 100,000 people) in a county increase (or decrease) for at least two consecutive weeks, this period is defined as an interval with increasing (or decreasing) local transmission for the county. To remove counties with low activity, we discarded those that never reported more than 15 daily cases per 100,000 population during a wave. For counties with increasing local infections, we included 352, 492 and 594 counties in the analysis for the spring, summer, and fall/winter waves, respectively. For counties with decreasing local infections, 114, 356 and 288 counties were included for the spring, summer, and fall/winter waves, respectively. Analyses using other threshold values (25 and 40) yield similar results (Figs. S8-S9). We computed the weekly reproduction number $R_t$ in counties with increasing (or decreasing) infections and fitted the estimated $R_t$ to EPI week using a linear function for each wave (EPI week is the epidemiological week used by the US CDC).
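The interval selection and linear fit can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, "increase" is read as a strict week-over-week rise, the fit is an ordinary least-squares line, and the reproduction number is assumed to take the form $\beta D[\alpha + (1 - \alpha)\mu]$ with assumed symbol names (transmission rate, infectious period, ascertainment rate, relative transmissibility of undocumented infections).

```python
import numpy as np

def reproduction_number(beta, D, alpha, mu):
    """Assumed form of R_t: beta * D * [alpha + (1 - alpha) * mu]."""
    return beta * D * (alpha + (1 - alpha) * mu)

def monotone_runs(weekly_cases, min_weeks=2, increasing=True):
    """Find (first_week, last_week) index pairs where weekly cases rise
    (or fall) for at least `min_weeks` consecutive week-over-week changes."""
    diffs = np.diff(np.asarray(weekly_cases, float))
    flags = diffs > 0 if increasing else diffs < 0
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i                      # a run of changes begins at week i
        elif not flag and start is not None:
            if i - start >= min_weeks:
                runs.append((start, i))    # the run covers weeks start..i
            start = None
    if start is not None and len(flags) - start >= min_weeks:
        runs.append((start, len(flags)))   # run extends to the final week
    return runs

def weekly_rt_slope(epi_weeks, rt):
    """Slope of a least-squares line of R_t against EPI week."""
    slope, _intercept = np.polyfit(epi_weeks, rt, 1)
    return slope
```

For example, `monotone_runs([1, 2, 3, 2, 1, 0, 5])` returns `[(0, 2)]` for increasing intervals, and the same series with `increasing=False` returns `[(2, 5)]`. A negative slope from `weekly_rt_slope` on an increasing interval would indicate that transmission was being reduced while cases were still rising.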
medRxiv preprint doi: https://doi.org/10.1101/2021.02.15.21251777; this version posted February 17, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.

Fig. S3. The reported (black) and adjusted (blue) seroprevalence for the state-level serological survey. Dots and whiskers show the median and 95% CIs. (a) and (b) show the results obtained using a maximum monthly attenuation rate of 15% and 25%, respectively.

Fig. S4. Comparison between the inferred percentage of cumulative infections and seroprevalence at the state level adjusted for antibody waning. Whiskers show 95% CIs and color indicates the sample collection date for each location. Seroprevalence data adjusted using a maximum monthly attenuation rate of 15% (a) and 25% (b) are included in the analysis.

Fig. S5. Estimated monthly total infections (blue bars, whiskers show 95% CIs) and confirmed cases (orange bars) in the US and five metropolitan areas.
Early transmission dynamics in Wuhan, China of novel coronavirus-infected pneumonia
Timing the SARS-CoV-2 index case in Hubei province
Preexisting and de novo humoral immunity to SARS-CoV-2 in humans
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2)
Presymptomatic SARS-CoV-2 infections and transmission in a skilled nursing facility
Temporal dynamics in viral shedding and transmissibility of COVID-19
World Health Organization, WHO Coronavirus Disease (COVID-19) Dashboard
Pfizer-BioNTech COVID-19 Vaccine
Food and Drug Administration, Moderna COVID-19 Vaccine
Differential effects of intervention timing on COVID-19 spread in the United States
Projection of COVID-19 cases and deaths in the US as individual states re-open
Three-quarters attack rate of SARS-CoV-2 in the Brazilian Amazon during a largely unmitigated epidemic
Estimating the cumulative incidence of SARS-CoV-2 infection and the infection fatality ratio in light of waning antibodies
Epidemiological characteristics of 2009 (H1N1) pandemic influenza based on paired sera from a longitudinal community cohort study
Role of air temperature and humidity in the transmission of SARS-CoV-2 in the United States
COVID Analysis and Mapping of Policies
Coronavirus disease 2019 (COVID-19) re-infection by a phylogenetically distinct severe acute respiratory syndrome coronavirus 2 strain confirmed by whole genome sequencing
Genomic evidence for reinfection with SARS-CoV-2: a case study
Decline in SARS-CoV-2 antibodies after mild infection among frontline health care personnel in a multistate hospital network - 12 states
Waning antibody responses in asymptomatic and symptomatic SARS-CoV-2 infection
First detection of SARS-CoV-2 spike protein N501 mutation in Italy in
Preliminary genomic characterization of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations
County to county commuting data. The United States Census Bureau
Business Listings, & Foot-Traffic Data
COVID-19 Case Surveillance Public Use Data | Data
Temporal dynamics in viral shedding and transmissibility of COVID-19
An interactive web-based dashboard to track COVID-19 in real time
An Ensemble Adjustment Kalman Filter for Data Assimilation
Forecasting seasonal outbreaks of influenza
Inference of seasonal and pandemic influenza transmission dynamics
Forecasting the spatial transmission of influenza in the United States
Retrospective Parameter Estimation and Forecast of Respiratory Syncytial Virus in the United States
Ensemble forecast of human West Nile virus cases and mosquito infection rates
Initial Simulation of SARS-CoV2 Spread and Intervention Effects in the Continental
Seroprevalence of Antibodies to SARS-CoV-2 in 10 Sites in the United States
Estimated SARS-CoV-2 Seroprevalence in the US as of
Decline in SARS-CoV-2 antibodies after mild infection among frontline health care personnel in a multistate hospital network - 12 states
Estimating prevalence from the results of a screening test
Three-quarters attack rate of SARS-CoV-2 in the Brazilian Amazon during a largely unmitigated epidemic
Estimating the infection-fatality risk of SARS-CoV-2 in New York City during the spring 2020 pandemic wave: a model-based analysis. The Lancet Infectious Diseases
Key Updates for Week 1

Acknowledgements: Mobility data was provided by SafeGraph, a data company that aggregates anonymized location data from numerous applications in order to provide insights about physical places, via the Placekey Community. To enhance privacy, SafeGraph excludes census block group information if fewer than five devices visited an establishment in a month from a given census block group.
We also thank Columbia University Mailman School of Public Health for high-performance computing resources. This study was supported by funding from the National Science Foundation (DMS-2027369) and a gift from the Morris-Singer Foundation.

Author Contributions: SP and JS conceived the study; SP, TKY, SK, and MG performed the analysis; SP and JS drafted the manuscript; all authors revised and reviewed the manuscript.

Data Availability: https://github.com/SenPei-CU/COVID_US_2020

For the linear fits of $R_t$ to EPI week, the response to local transmission is quantified by the slope of the linear fit, which is interpreted as the weekly change of $R_t$.