key: cord-0740452-3ugekfco authors: Yu, X. title: Risk interactions of coronavirus infection across age groups after the peak of COVID-19 epidemic date: 2020-05-19 journal: nan DOI: 10.1101/2020.05.17.20105049 sha: 9396faaea421ef15083f7a03850431930ee7a84b doc_id: 740452 cord_uid: 3ugekfco
Background: The COVID-19 epidemic has imposed a significant disease burden worldwide, particularly on the elderly population. This study aims to explore how risks of infection interact across age groups and to compare risk patterns between South Korea and the state of Florida in the US. Methods: Daily new COVID-19 cases were scraped from online sources. A multivariate vector autoregressive model for time series count data was used to examine the risk interactions across age groups. Case counts from previous days were included as predictors to dynamically examine the change of risk patterns. Results: In both South Korea and Florida, the risk of coronavirus infection among elderly people was significantly affected by other age groups. An increase of virus infection among people aged 20-39 could double the risk of infection among elderly people. Meanwhile, an increase in virus infection among elderly people also significantly increased risks of infection among other age groups. The risks of infection among younger people were relatively unaffected by those of other age groups. Conclusions: Protecting elderly people from coronavirus infection could not only reduce the risk of infection among themselves but also ameliorate the risks of virus infection among other age groups. Such interventions should be effective and sustained over the long term.
Coronavirus disease 2019 (COVID-19) is caused by infection with a novel severe acute respiratory syndrome-associated coronavirus (SARS-CoV2) [1]. Since December 2019, over 4 million people have been infected with SARS-CoV2 and over 291,000 people have died (https://coronavirus.jhu.edu/map.html, accessed on May 12, 2020).
Of them, elderly people and people with underlying chronic conditions suffered the heaviest disease burden [2-4]. For example, about 80% of deaths were among people aged 65 or above (https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html), and 43.4% of hospitalized patients were aged 65 or above [5]. In the state of Florida, US, people aged 65 or above accounted for 54% of hospitalizations, and the mortality rate was 11% among those infected with the virus [6]. The reasons for the disproportionate burden among elderly people were unclear [7]. Elderly people generally have weaker immune systems than younger people due to aging, and they are also more likely to have multiple chronic conditions [8, 9]. Thus, elderly people may have severe symptoms if infected with coronavirus [10, 11]. On the other hand, elderly people may have been exposed to myriad infections over their lifetime, which may provide immunity against new virus infections. But it was unknown whether elderly people might have any effective immunity against SARS-CoV2.
Meanwhile, the COVID-19 pandemic has been waning in many countries since May 1, 2020, and society is gradually returning to normalcy [12, 13]. Many public health experts have warned of a potential rebound of new cases [14]. This is reflected in an epidemic curve with a long tail and occasional spikes, as demonstrated by the epidemic process in South Korea (https://www.kcdc.info/covid-19/) [15]. In addition, a recent study predicted a lasting post-
were stratified by age groups (0-19, 20-39, 40-59, and 60 or above). To be consistent between the South Korea and Florida data, those aged 60 or above were referred to as elderly people. We developed a vector autoregressive (VAR) model to examine the associations of the infection risks across age groups simultaneously [19].
Specifically, we assumed daily new case counts ($y_{j,t}$) followed a generalized Poisson distribution to account for overdispersion of case counts (i.e., observed variance larger than expected variance) [20]. The model also included case counts from previous days (lags) across age groups as predictors to form a dynamic model (see Appendix for details). Therefore, the current risk of infection in each age group was predicted not only by previous case counts in its own group but also by previous counts from other age groups. Here j = 1,…,J indexed age groups, t = 1,…,T indexed days, and k = 1,…,K indexed time lags. We reported results from five-lag models; three-lag models were also fit, and results from both were consistent. The scale parameter ξ in the generalized Poisson distribution controls the magnitude of dispersion: ξ = 0 corresponds to a standard Poisson (mean = variance), ξ < 0 indicates underdispersion (mean > variance), and 0 < ξ < 1 indicates overdispersion (mean < variance). The term $b_{j,t}$ could be viewed as a random effect accounting for the correlation of daily counts between age groups, and was assumed to follow a multivariate normal distribution.
All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 19, 2020. . https://doi.org/10.1101/2020.05.17.20105049 doi: medRxiv preprint
The above model framework was similar to the common log-linear relative risk models in epidemiological studies, which assume multiplicative associations between predictors and outcomes [21]. The coefficients β could be interpreted as natural logarithms of risk ratios per one-unit change in the natural logarithm of case counts. We fit the above models with the Bayesian software Stan through the RStan interface (http://mc-stan.org) [22].
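As a rough illustration of how the scale parameter ξ described above governs dispersion, the following sketch evaluates a generalized Poisson log-PMF by direct summation. It assumes Consul's parameterization with rate θ = μ(1 − ξ), under which the variance-to-mean ratio is 1/(1 − ξ)²; the function names `gen_poisson_logpmf` and `gen_poisson_moments` are hypothetical, and this is not the code used in the study.

```python
import math


def gen_poisson_logpmf(y, mu, xi):
    """Log-PMF of a generalized Poisson variable with mean mu and scale xi.

    Assumption: Consul's form with theta = mu * (1 - xi) and 0 <= xi < 1.
    """
    theta = mu * (1.0 - xi)
    return (math.log(theta)
            + (y - 1) * math.log(theta + xi * y)
            - theta - xi * y
            - math.lgamma(y + 1))


def gen_poisson_moments(mu, xi, y_max=1000):
    """Total probability, mean and variance by summing the PMF over 0..y_max."""
    probs = [math.exp(gen_poisson_logpmf(y, mu, xi)) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var


# Compare a standard Poisson (xi = 0) with an overdispersed case (xi = 0.5).
for xi in (0.0, 0.5):
    total, mean, var = gen_poisson_moments(mu=20.0, xi=xi)
    print(f"xi={xi}: total={total:.4f}, mean={mean:.3f}, var/mean={var / mean:.3f}")
```

Under the assumed parameterization, the variance-to-mean ratio should come out near 1 for ξ = 0 and near 4 for ξ = 0.5, matching the qualitative statement that 0 < ξ < 1 indicates overdispersion.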
To keep the model simple, we assumed weakly informative Student's t priors for all α and β parameters, and an LKJ prior with modal density around the diagonal for correlations between case series (see Appendix). Hamiltonian Monte Carlo was used to obtain posterior distributions of parameters. Diagnostic plots showed that all chains mixed satisfactorily and had converged. In addition, negative binomial models were also fit, and results were similar to those reported here except for wider confidence intervals (appendix Tables). The data and replication code were available online (github address after blind review).
This study was based on publicly available data. There was no direct involvement of human subjects. Therefore, it was exempt from Institutional Review Board approval. No informed consent was needed. All authors declared no conflict of interest in conducting this study.
Although it is difficult to discern patterns between lines, all age groups generally followed similar trends over time in both regions. There was a single spike among elderly people in South Korea around March 18.
Table 1 were less likely affected by other age groups. The risk patterns were slightly different in Florida (Table 2). First, the current risk of infection among elderly people was affected by their own previous infections and also by infections five days earlier among people aged 40-59, with a risk ratio of 3. Second, risks of infection among all other age groups were significantly affected by infections among elderly people, with risk ratios ranging from 3 to 4. Third, risks of infection among those aged 40-59 appeared similar to those of elderly people, while risks of infection among those aged 20-39 or 0-19 were largely unaffected by infections among other groups (except for infections among elderly people). Finally, except for those aged 0-19, there was significant overdispersion in daily new case counts across age groups, with observed variances about 12-14 times the means, suggesting large variations in daily case counts in Florida.
This was the first study to quantify risk interactions of SARS-CoV2 infection across age groups based on vector autoregressive models and to compare the risk patterns between South Korea and the state of Florida in the US. We found that in both South Korea and Florida, the risk of infection among elderly people was significantly affected by other age groups. An increase in virus infection among elderly people also significantly increased risks of infection among other age groups. Risks of infection among younger people were relatively unaffected by those of other age groups. Our results were consistent with the current COVID-19 epidemic process, in which elderly people bore a disproportionate disease burden and suffered the highest mortality [4, 23]. Although virus transmission might differ between age groups [7, 24], our results highlighted the importance of implementing and enforcing effective interventions across the whole society [25-27], and the highest priority of protecting elderly people [24]. Furthermore, we showed that an increase in coronavirus infection among elderly people could increase risks of infection among other age groups, suggesting that protecting elderly people and reducing their risk of infection had a spillover effect on the whole society.
This was consistent with our previous simulation study, in which reducing contacts among elderly people could reduce virus infections and hospitalizations in the whole society [28].
Respiratory infectious diseases often spread through personal contacts [29]. Previous studies showed that contacts were more frequent in younger age groups than in older age groups, and that interactions across age groups were less frequent than those within each age group [30]. During the period of stay-at-home rules in the US, contacts between age groups were reduced significantly. Thus, as shown in our study, there were lags of 2-5 days in the risk interactions across age groups, especially between old and young people. On the other hand, infections among elderly people may still be affected by, and also affect, the risks of infection among other age groups. Passive community interactions such as grocery shopping might play an important role in sustaining the epidemic.
There were some limitations in this study. The most important one was that we relied on reported cases. The data from South Korea were more likely to be complete due to extensive contact tracing and mass testing. On the other hand, in the Florida data, cases with mild or no symptoms were more likely to be missed, as suggested by the smaller percentage of cases among younger age groups in Florida compared with South Korea. We chose our study period for Florida from April 1 to May 12 to mitigate this bias, because testing became more available nationally in the US after April 1. Furthermore, other factors such as gender, socio-economic status and neighborhood environment might also affect the risk of infection. In addition, although we interpreted the results in action terms, they carry no causal meaning.
For example, younger people tended to have milder or no symptoms (i.e., subclinical cases) if infected with the virus [31, 32]. Thus, it was possible that an increased number of detected cases among young people implied an increase in subclinical cases in the community, who might unknowingly infect other people, including elderly people. Subclinical
[20]. Unlike many other studies that used mechanistic epidemic models, which are useful for describing the epidemic process [33], our statistical models extended traditional relative risk models to time series of count data. The principle of our methods was similar to that of the IHME [34] and UT-Austin models [35], all of which relied on time series analysis of count data. However, we did not attempt to predict future cases. Rather, we focused on untangling risk interactions of infection across age groups, which was more important and relevant for disease prevention.
Finally, during the process of re-opening the economy and society, the number of new cases may rebound, and a second wave of the epidemic is possible. A contentious issue was whether and how to protect high-risk populations such as elderly people. Therefore, we limited our study period to the post-peak phase of the epidemic to answer this pressing question. Our study strongly supported that high-risk populations like elderly people should still take serious precautions during the post-epidemic period.
In summary, protecting elderly people from coronavirus infection could not only reduce the risk of infection among themselves but also ameliorate risks of virus infection among other age groups. Therefore, elderly people should keep practicing social distancing and maintaining effective personal protection until the epidemic is completely over.
Note: * and # for p < 0.05
The standard Poisson distribution describes the distribution of y events occurring at a constant rate of λ. The probability mass function (PMF) is:
In the standard Poisson distribution, the expected variance equals the mean. If the observed variance is larger than the expected variance (i.e., the mean), then overdispersion exists. This often occurs when outcomes are correlated, such as daily new case counts during a disease outbreak. The generalized Poisson distribution introduces an additional scale parameter ξ [36], as quoted in Hilbe JM. 2014 [21]. The PMF is:
Therefore, ξ = 0 gives φ = 1, corresponding to a standard Poisson (mean = variance); 0 < ξ < 1 gives φ > 1, modeling overdispersion (mean < variance); and ξ < 0 gives φ < 1, modeling underdispersion (mean > variance). Reparametrizing the PMF of the generalized Poisson distribution with μ and ξ [20]:
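For reference, the standard and generalized Poisson PMFs discussed above can be written as follows in Consul's parameterization. This is a sketch under stated assumptions: the symbol θ for the rate parameter and the exact form of the dispersion factor φ are not given in the text and are chosen here so that the statements about ξ and φ hold (φ = 1 when ξ = 0, φ > 1 when 0 < ξ < 1, φ < 1 when ξ < 0).

```latex
% Standard Poisson
P(Y = y \mid \lambda) = \frac{\lambda^{y} e^{-\lambda}}{y!},
\qquad y = 0, 1, 2, \ldots

% Generalized Poisson (assumed Consul form; \theta is an assumed symbol)
P(Y = y \mid \theta, \xi) =
  \frac{\theta \, (\theta + \xi y)^{y-1} \, e^{-\theta - \xi y}}{y!},
\qquad
E(Y) = \mu = \frac{\theta}{1 - \xi},
\qquad
\mathrm{Var}(Y) = \frac{\theta}{(1 - \xi)^{3}} = \phi \, \mu,
\qquad
\phi = \frac{1}{(1 - \xi)^{2}}

% Reparametrized with \mu and \xi, substituting \theta = \mu (1 - \xi)
P(Y = y \mid \mu, \xi) =
  \frac{\mu (1 - \xi) \left[ \mu (1 - \xi) + \xi y \right]^{y-1}
        e^{-\mu (1 - \xi) - \xi y}}{y!}
```

Setting ξ = 0 in the last expression recovers the standard Poisson PMF with λ = μ, which is a quick consistency check on the reparametrization.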
On the other hand, the negative binomial distribution describes the distribution of the number of successes observed before a predefined number r of failures in a sequence of independent Bernoulli trials with success probability p:
Thus, the overdispersion of Y is controlled by the shape parameter r. Reparametrizing the PMF with μ and r,
Note that the negative binomial distribution can be viewed as a Gamma-Poisson mixture distribution in which Y ~ Poisson(λ) and λ ~ Gamma(r, λ/r). That is, the negative binomial distribution (r, p) arises as the marginal distribution of Y when the rate λ of Poisson(λ) is given the conjugate prior Gamma(r, λ/r) and integrated out,
where the Gamma scale is defined as p/(1 − p). Rewriting Gamma(r, λ/r) as Gamma(r, p/(1 − p)) and using Γ functions to represent factorials:
Under this framework, the negative binomial distribution is appealing as a natural extension of the Poisson distribution that allows for overdispersion controlled by the shape parameter r. However, although the negative binomial distribution is often used to model new case counts during disease outbreaks, it models only overdispersion and assumes a quadratic relationship between variance and mean, while the generalized Poisson model is more flexible and assumes a simpler first-order association between variance and mean. Therefore, we chose to report results from generalized Poisson models. Results from negative binomial models were included in the appendix.
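The contrast between the quadratic and first-order variance-mean relationships can be checked numerically. The sketch below uses the standard mean-shape form of the negative binomial PMF, with p = μ/(μ + r), and compares its variance μ + μ²/r with the generalized Poisson variance μ/(1 − ξ)², the latter assuming Consul's parameterization. All function names are hypothetical, and the values of r and ξ are arbitrary illustrations rather than estimates from the study.

```python
import math


def neg_binomial_logpmf(y, mu, r):
    """Log-PMF of a negative binomial variable with mean mu and shape r."""
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + y * math.log(mu / (mu + r))
            + r * math.log(r / (mu + r)))


def summed_variance(logpmf, y_max=5000):
    """Variance obtained by summing a log-PMF over 0..y_max."""
    probs = [math.exp(logpmf(y)) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    return sum((y - mean) ** 2 * p for y, p in enumerate(probs))


def nb_variance(mu, r):
    """Negative binomial: variance is quadratic in the mean."""
    return mu + mu ** 2 / r


def gp_variance(mu, xi):
    """Generalized Poisson (assumed Consul form): variance is linear in the mean."""
    return mu / (1.0 - xi) ** 2


# Illustrative values only: r = 5 and xi = 0.5 are arbitrary choices.
for mu in (5.0, 50.0):
    numeric = summed_variance(lambda y: neg_binomial_logpmf(y, mu, r=5.0))
    print(f"mu={mu}: NB numeric={numeric:.3f}, NB formula={nb_variance(mu, 5.0):.3f}, "
          f"GP formula={gp_variance(mu, 0.5):.3f}")
```

As the mean grows tenfold, the generalized Poisson variance grows tenfold as well, whereas the negative binomial variance grows faster because of the μ²/r term.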
In addition, it is also of note that there are extensions of negative binomial models in which the association between mean and variance can be estimated from data, leading to a more flexible model and also permitting the exploration of determinants of overdispersion [21].
In this study, we proposed the following hierarchical vector autoregressive (VAR) model for count data:
where j = 1,…,J represents age groups, t = 1,…,T represents days, and k = 1,…,K represents the number of lags. The PMF of Y can be either a standard Poisson (λ), generalized Poisson (μ, ξ), or negative binomial (μ, r) distribution. The above VAR model included new case counts from previous days (lags) across age groups as predictors [19], thus examining associations of the infection risks across age groups simultaneously. That is, the current risk of infection in each age group was predicted not only by previous case counts in its own group but also by previous counts from other age groups. The correlation of daily counts between age groups was modeled through $b_{j,t}$, which can be viewed as a random effect and was assumed to follow a multivariate normal distribution. The exponential link between dynamic predictors and μ is equivalent to common relative risk models in epidemiological studies, i.e., log-linear models for count data. Under this multiplicative-scale framework, the βs are interpreted as log relative risks (i.e., exp(β) is the relative risk) given a one-unit increase in a predictor.
During model fitting, we assumed weakly informative priors for all parameters:
The LKJ prior is a prior designed for correlation matrices. LKJ(2) places its modal density at the identity matrix (i.e., on the diagonal), mildly favoring weak correlations.
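One way to write out a model consistent with the description above is sketched here. Several elements are assumptions made for illustration and are not stated in the text: the index i for the source age group, the lag-specific coefficients β_{j,i,k}, a group-specific scale ξ_j, the decomposition of the random-effect covariance Σ into scales σ and a correlation matrix Ω, and the generic Student's t hyperparameters ν and s. How zero counts are handled inside the logarithm is likewise not specified.

```latex
\begin{aligned}
y_{j,t} &\sim \mathrm{GP}(\mu_{j,t}, \xi_j), \\
\log \mu_{j,t} &= \alpha_j
  + \sum_{k=1}^{K} \sum_{i=1}^{J} \beta_{j,i,k} \, \log y_{i,t-k}
  + b_{j,t}, \\
\mathbf{b}_t = (b_{1,t}, \ldots, b_{J,t})^{\top}
  &\sim \mathrm{MVN}(\mathbf{0}, \Sigma),
  \qquad \Sigma = \mathrm{diag}(\sigma) \, \Omega \, \mathrm{diag}(\sigma), \\
\alpha_j, \ \beta_{j,i,k} &\sim t_{\nu}(0, s),
  \qquad \Omega \sim \mathrm{LKJ}(2).
\end{aligned}
```

Under this form, exp(β_{j,i,k}) is the risk ratio in group j associated with a one-unit increase in log y_{i,t−k}, that is, with an e-fold (about 2.7-fold) increase in the lag-k count of group i. A coefficient of log 2 ≈ 0.69 therefore corresponds to a doubling of risk.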
The models were fit with the Bayesian software Stan through the RStan interface [22]. A customized Stan function was constructed for fitting the generalized Poisson model. We employed Hamiltonian Monte Carlo with 5 Markov chains, each with 50,000 iterations plus 2,000 warmup iterations, to obtain posterior distributions of parameters. Diagnostic plots from the shinystan package showed that all chains mixed well and had converged. The replication data and code, including models with daily case counts following standard Poisson, generalized Poisson or negative binomial distributions, were available online (github address after blind review).
46 -3.50) 40 -59 0.78 (0.23 -2.64)
A Novel Coronavirus from Patients with Pneumonia in China Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72314 Cases From the Chinese Center for Disease Control and Prevention Clinical Characteristics of Coronavirus Disease 2019 in China Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the Hospitalization Rates and Characteristics of Patients Hospitalized with Laboratory-Confirmed Coronavirus Disease 2019 -COVID-NET, 14 States Did elderly people living in small towns or rural areas suffer heavier disease burden during the COVID-19 epidemic? medRxiv Implications of the Age Profile of the Novel Coronavirus PNAS Multiple chronic conditions among US adults: a 2012 update Physical and Functional Limitations in US Older Cancer Survivors Transmissibility of 2019-nCoV 2020 COVID-19) -United States Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand 2020 How will country-based mitigation measures influence the course of the COVID-19 epidemic?
The COVID-19 pandemic in the USA: what might we expect? Lancet Transmission potential and severity of COVID-19 in South Korea Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period Defining the Epidemiology of Covid-19 -Studies Needed Impact of mitigating interventions and temperature on the instantaneous reproduction number in the COVID-19 epidemic among 30 US metropolitan areas A bayesian Poisson vector autoregressive model Testing approaches for overdispersion in poisson regression versus the generalized poisson model Modeling Count Data Stan: A probabilistic programming language Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia Age-dependent effects in the transmission and control of COVID-19 epidemics. medRxiv Association of Public Health Interventions With the Epidemiology of the COVID-19 Outbreak in Wuhan Estimated effectiveness of symptom and risk screening to prevent the spread of COVID-19 Impact assessment of non-pharmaceutical interventions against coronavirus disease 2019 and influenza in Hong Kong: an observational study Modeling Return of the Epidemic: Impact of Population Structure, Asymptomatic Infection, Case Importation and Personal Contacts. medRxiv Changes in contact patterns shape the dynamics of the COVID-19 outbreak in China Social contacts and mixing patterns relevant to the spread of infectious diseases Asymptomatic and Human-to-Human Transmission of SARS-CoV-2 in a 2-Family Cluster Asymptomatic and Presymptomatic Infectors: Hidden Sources of COVID-19 Disease Early dynamics of transmission and control of COVID-19: a mathematical modelling study COVID-19 impact on hospital bed-days, ICU-days, ventilatordays and deaths by US state in the next 4 months. IHME COVID-19 health service utilization forecasting team Projections for first-wave COVID-19 deaths across the US using social-distancing measures derived from mobile phones. 
medRxiv generalized poisson distribution: properties and applications