key: cord-0816069-9nwof017 authors: Balasubramani, G. K.; Nowalk, Mary Patricia; Eng, Heather; Zimmerman, Richard K. title: Estimating the burden of adult hospitalized RSV infection using local and state data - methodology date: 2022-03-10 journal: Human vaccines & immunotherapeutics DOI: 10.1080/21645515.2021.1958610 sha: d9db483127716293078332f6f74c532d4d2760ec doc_id: 816069 cord_uid: 9nwof017 Respiratory syncytial virus (RSV) is becoming increasingly recognized as a serious threat to vulnerable population subgroups. This study describes the statistical analysis plan for a retrospective cohort study of adults hospitalized for acute respiratory infection (ARI) to estimate the population burden of RSV especially for groups such as the elderly, pregnant women and solid organ transplant patients. Disease burden estimates are essential for setting vaccine policy, e.g., should RSV vaccine become available, burden estimates may inform recommendations to prioritize certain high-risk groups. The study population is residents of Allegheny County, Pennsylvania ≥18 years of age who were hospitalized in Pennsylvania during the period September 1, 2015–August 31, 2018. Data sources will include U.S. Census, Pennsylvania Health Care Cost Containment Council (PHC4) and the electronic medical record for the health system to which the hospitals belong. The algorithm involves: 1) ARI-associated hospitalizations in PHC4 data; 2) adjustment for ARI hospitalizations among county residents but admitted to hospitals outside the county; and 3) RSV detections from respiratory viral panels. Key sensitivity analyses will adjust for undertesting for viruses in the fall and spring quarters. The results will be population-based estimates, stratified by age and risk groups. Adjusting hospitalization data using a multiplier method is a simple means to estimate the impact of RSV in a given area. 
This algorithm can be applied to other health systems and localities to estimate RSV and other respiratory pathogen burden in adults, to estimate burden following introduction of RSV vaccine and to make cost-effectiveness estimates. Respiratory syncytial virus (RSV) is a highly contagious respiratory virus that can result in bronchiolitis, otitis media, upper respiratory tract infections, and pneumonia. 1 The virus was first isolated in young children over 60 years ago and much is known about its epidemiology and burden among the very young. Some decades later, documentation of the impact of RSV on morbidity and mortality of adults, especially older adults, began. Advanced age and presence of high-risk medical conditions, especially cardiopulmonary disease, are known risk factors for severe RSV outcomes. 2 RSV is estimated to cause 12% of acute respiratory illness (ARI) visits 3 and 7% of influenza-like illness (ILI)-ARI in the U.S. in adults over age 50 years. 4 An estimated 3-7% of older adults and 4-10% of high-risk adults contract RSV infections each year in the U.S., 5 numbers which rise with increasing age. 3 Moreover, detections of RSV in hospitalized patients increased steadily between 1997 and 2012, especially among those ≥60 years of age. 6 CDC estimates that there are 177,000 adult RSV-associated hospitalizations in the U.S. annually. RSV has been estimated to account for 11% of hospitalizations for pneumonia and chronic obstructive pulmonary disease exacerbations among elderly and high-risk adults during the RSV season. 5 Hospitalized adults with RSV typically stay 3-6 days and frequently require mechanical ventilation and intensive care admission. 
3 The majority of RSV-associated deaths occur in adults >65 years (estimated at 14,000/year); 7 RSV mortality also increases with increasing age, 6 and particularly, among those who are compromised by chronic respiratory and cardiovascular diseases, such as COPD, those with transplants and other immunocompromising conditions, 8 and adults requiring chronic immunosuppressive treatments for rheumatological conditions and solid tumors. 9 To date, there is no RSV vaccine available for use in either children or adults, although there are many in development. 10 Except for use of monoclonal antibodies in premature infants, there is also no method of attenuating its severity through antiviral or other medication. Accurate estimates of RSV burden are essential for healthcare planning, resource allocation and vaccine policy. RSV burden studies have primarily focused on children and, while similar studies of adults are becoming more common, there are still relatively few from the U.S. 11 Of those included in reviews and meta-analyses, 4,12,13 only a subset includes younger adults or those with specific high-risk conditions. Surveillance-based studies with laboratory confirmation of RSV infection to calculate RSV burden can be resource intensive. Alternatively, statistical modeling strategies and multiple-regression time-series to assess the burden of disease have the advantages of being able to control for influenza, which presents with similar symptoms and co-circulates with RSV, and add a secular polynomial component of time to estimate the burden of RSV infection in adults. 14-17 A simple approach that will provide more generalizable, more accurate, and more precise estimates is possible if population-wide data are available. Herein, we describe the statistical analysis plan that will be used to produce population-based estimates of RSV burden using data from a large health system supplemented by statewide hospitalization data. 
This method was developed to facilitate burden estimates in situations where individual data are not available. This proposed multiplier method has the advantages of being simple, straightforward, able to account for adjustment factors, and can be used to estimate burden for an array of risk groups. Furthermore, should a RSV vaccine become available, this method may be used to compare RSV burden following introduction of the vaccine. The University of Pittsburgh IRB has determined that the calculation of burden estimates is not human research, therefore approval is not necessary. The methods described herein will be used for a retrospective aggregate cohort study to evaluate the epidemiology and burden of RSV infection in adults (≥18 years of age) over three seasons in Allegheny County, Pennsylvania. The methods allow estimates to be calculated overall and for subtypes of RSV infection and population subgroups. The cohort will be defined as adult (≥18 years old) residents of Allegheny County Pennsylvania (PA) who were hospitalized in PA between September 1, 2015 and August 31, 2018. All data will be requested and reported across a series of cohort subgroups for which we will request either total counts or average values. Each hospital admission for a given individual will be included. We will obtain retrospective data from three sources: 1) U.S. Census; 2) Pennsylvania Healthcare Cost Containment Council (PHC4); and 3) University of Pittsburgh Clinical Translational Science Institute (CTSI)'s Health Record Research Request (R3) system that draws data from the health system's electronic medical record (EMR). U.S. Census estimates for Allegheny County, PA as of July 1, 2017 will be used to obtain the number of adult county residents as the denominator for overall burden estimate, where the numerator will be the adjusted number of RSV cases from county residents of the surveillance area. 
Residency will be established through the individual's home zip code, using those codes listed online for Allegheny County. Statewide hospitalization data on adult Allegheny County residents from PHC4 will be used. A hospitalization is defined generally as an encounter for which admission orders are written. For this study, a hospital admission is defined specifically by criteria of the Centers for Disease Control and Prevention (CDC) National Healthcare Safety Network (NHSN; see Appendix Table A1). Admissions to specialty hospitals such as psychiatric or rehabilitation institutions will be excluded from the analysis. PHC4 will provide data in aggregate for 3-month periods. The 3-month historical segments were selected to best reflect the active RSV season of September through May. The first segment will be September-November 2015, followed by successive segments from December-February, March-May, and June-August through August 2018. These aggregated data contain variables that will allow subgroup analyses, such as age, residency, high-risk conditions, etc. Admitting diagnoses and respiratory viral panel (RVP) findings on any adult Allegheny County resident who was hospitalized in the health system will be obtained through R3. Findings from repeat RVPs during a single admission will be collapsed into a single variable coded as a positive finding of RSV on any RVP performed (RSV = yes/no). A sample size calculation was performed to ensure that the selected health system and county datasets were sufficiently large to provide adequate power to achieve the desired outcome. We used a two-sided exact proportion test with a significance level set at α = 0.05, RSV positivity rate ranging from 0.06 to 0.09, and RVP positive sample size n = 500 to achieve adequate power. 18-20 Table 1 shows the power for various values of the proportion of RSV cases under the alternative hypothesis and for different population sizes using the normal approximation method. 
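The normal-approximation power calculation described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' software or exact procedure: the helper name `power_one_sample_proportion` is hypothetical, and the formula is the standard textbook one for a two-sided one-sample z-test of a proportion. The example inputs (null proportion 0.05, alternative proportion 0.07, 1,500 tested patients, α = 0.05) mirror the scenario discussed in this section.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sample_proportion(p0, p1, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test for a proportion.

    Hypothetical helper based on the usual normal approximation:
    the test rejects when the observed proportion falls outside
    p0 +/- z_crit * se_null, and power is the probability of that
    event when the true proportion is p1.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # e.g. about 1.96 for alpha = 0.05
    se_null = sqrt(p0 * (1 - p0) / n)            # standard error under H0
    se_alt = sqrt(p1 * (1 - p1) / n)             # standard error under H1
    diff = p1 - p0
    upper_tail = std_normal.cdf((diff - z_crit * se_null) / se_alt)
    lower_tail = std_normal.cdf((-diff - z_crit * se_null) / se_alt)
    return upper_tail + lower_tail


# Scenario from the text: 1,500 patients with an RVP, null 0.05, alternative 0.07.
power = power_one_sample_proportion(p0=0.05, p1=0.07, n=1500)
print(f"Approximate power: {power:.2f}")
```

With these inputs the sketch gives a power of roughly 0.91, in line with the 90% power stated for a sample of 1,500. Exact-test software may give slightly different values.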
Assuming a population size (i.e., the number of patients who had an RVP) of 1500 and a 7% RSV positivity rate, the study would be adequately powered with 105 RSV cases. A sample size of 1500 achieves 90% power to detect a difference of 0.02 using a two-sided Z-test with a significance level of 0.05. These results assume that the population proportion of RSV cases under the null hypothesis is 0.05. Statistical tests and confidence intervals will be two-sided. Estimates will be presented with 95% confidence intervals, not testing the significance of the estimates. RSV hospitalization burden = RSV hospitalized cases per 100,000 adult residents. The calculation of burden has five steps. Table 2 lists the variables used in the equations and their definitions. Step 1: Obtain from PHC4 the number of annual acute respiratory illness (ARI) hospitalizations for Allegheny County residents in Allegheny County hospitals (ARI ACYear ). Step 2: Create an adjustment for out-of-county hospitalizations in the state using PHC4 data by calculating the proportion of ARI hospitalizations of Allegheny County residents in Allegheny County hospitals, compared to all Pennsylvania hospitals for a given time period, in this case, one year. The outcome is used in the adjustment variable in Equation (2). Calculate adjusted ARI ACYear : In settings where this variable is directly available, the adjustment simplifies to ARI PAYear . Step 3: Calculate the proportion of respiratory viral panel (RVP) tests from R3 for health system hospitals in Allegheny County that are positive for RSV. Repeat tests within a timeframe such as 2 weeks need to be removed so as not to inappropriately estimate viral burden. Step 4: Estimate the crude number of RSV hospitalizations in Allegheny County by multiplying the number of ARI hospitalizations by the proportion of RSV positive RVP tests from R3 for health system hospitals in Allegheny County. 
Step 5: Calculate the RSV burden in Allegheny County during the year by dividing the adjusted RSV burden by the adult population of Allegheny County and multiplying by 100,000. The U.S. Census estimate for Allegheny County was 1,222,344 for 2017, of whom 974,362 (80%) were adults aged ≥18 years. ARI hospitalizations include pneumonia and similar respiratory diseases. RSV and other respiratory viruses can also cause exacerbations of asthma, chronic obstructive pulmonary disease and heart failure; these are termed "ARI-related hospitalizations." Because the fraction associated with RSV may differ between ARI hospitalizations and ARI-related hospitalizations and because the overall incidence of ARI hospitalizations and ARI-related hospitalizations is likely to differ, data should be stratified by ARI and ARI-related before being entered into Equations (1)-(5). These individual results should be combined to estimate the true RSV burden. For simplicity, in this example, ARI hospitalizations and ARI-related hospitalizations were not separated. The same general approach can be used in 3-month increments to make quarterly burden determinations, using the same equations but substituting quarterly data from R3 and PHC4. Variance and 95% confidence intervals (CIs) were calculated by the following formulas: $\mathrm{Var}(aRSV_{ACYear}) = aRSV \ldots$ and μ equals the mean of the random variable X. Under certain conditions and with assumptions about the mean and variance values, the approximation is $\mathrm{Var}(1/X) \approx 1 \times 10^{-6}$. In general, the mean and variance of inverse normal distributions do not exist based on the law of total expectations. 21 Equations (4) and (5) give the burden estimates for Allegheny County that can be used to estimate burden for each of the age groups and other stratifications. A subgroup or special population of interest can be defined by ICD criteria and data from PHC4 and R3 can be obtained for this special population. 
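The five-step multiplier calculation can be sketched as follows. This is a minimal illustration, not the authors' code: the function and parameter names are hypothetical, and it assumes that the out-of-county adjustment (Step 2) is applied before multiplying by the RSV-positive proportion (Step 4). The input counts in the example are made up for illustration; only the adult population of 974,362 comes from the text.

```python
def rsv_burden_per_100k(ari_in_county, prop_in_county,
                        rvp_positive, rvp_total, adult_population):
    """Multiplier-method RSV hospitalization burden per 100,000 adults.

    ari_in_county    -- ARI hospitalizations of county residents in county hospitals (Step 1)
    prop_in_county   -- share of residents' ARI hospitalizations occurring in county
                        hospitals, relative to all state hospitals (Step 2)
    rvp_positive     -- RVP tests positive for RSV, repeats removed (Step 3)
    rvp_total        -- all RVP tests, repeats removed (Step 3)
    adult_population -- adult residents of the county (denominator, Step 5)
    """
    if not 0 < prop_in_county <= 1:
        raise ValueError("prop_in_county must be in (0, 1]")
    if rvp_total <= 0 or adult_population <= 0:
        raise ValueError("rvp_total and adult_population must be positive")

    adjusted_ari = ari_in_county / prop_in_county    # Step 2: add out-of-county admissions
    pr_rsv = rvp_positive / rvp_total                # Step 3: proportion RSV positive
    rsv_hospitalizations = adjusted_ari * pr_rsv     # Step 4: estimated RSV hospitalizations
    return rsv_hospitalizations / adult_population * 100_000  # Step 5


# Illustrative (made-up) inputs: 20,000 in-county ARI hospitalizations,
# 80% of residents' ARI hospitalizations in county hospitals,
# 180 RSV-positive results among 1,500 RVPs (12%).
burden = rsv_burden_per_100k(20_000, 0.80, 180, 1_500, 974_362)
print(f"{burden:.1f} RSV hospitalizations per 100,000 adults")
```

With these illustrative inputs the adjusted ARI count is 25,000, the estimated number of RSV hospitalizations is 3,000, and the burden is about 308 per 100,000 adults. The same function can be called with quarterly or subgroup-specific counts, provided the denominator matches the subgroup.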
For instance, immunocompromised persons may be preferentially tested by RVP and RSV cases might be higher in this population. To calculate the number of ARI hospitalizations in this population, use Equations (1) and (2). Using the proportion of RSV for this population from R3 for Equation (4), the number of RSV cases in immunocompromised persons can be calculated. To determine RSV burden in this group (Equation (5)), the number of immunocompromised Allegheny County residents would need to be estimated, using a data source such as the National Health Interview Survey. SA-Step 1: Create an adjustment to estimate effects of undertesting outside of the winter respiratory season, which is when most RVP testing occurs. Compute the UPMC Allegheny County RVP testing fraction for each quarter (Q), shown in Equation (6). If PrRSV does not vary across seasons, then sensitivity analyses (SA) are unnecessary. If the proportion of RSV varies (we propose by ≥5%) by season, then SA is needed. SA-Step 4: Adjust fall and spring quarter numbers of RVPs for testing fraction. If we assume that RVP testing in the fall and spring is weighted more heavily to those with immunosuppressive conditions than in the winter, then we can adjust for this situation. If RSV occurred in summer, then it could be added as well but this is not the case in our locale. Then addition across the 3 seasons of RSV yields: In a similar manner, the number of RSV cases can be adjusted for fall and for spring to create a total across the quarters: Finally, an adjusted proportion of RSV can be estimated: The above equations were used to create simulated results for Allegheny County using U.S. Census population data for Allegheny County and a range of values for PrRSV and proportion of state ARI hospitalizations in the county shown in Tables 3 and 4. 
For example, when we assume that there are 75,000 ARI hospitalizations across the Commonwealth and 25% are in Allegheny County hospitals, and we assume that RSV cases represent 12% of all RVP tests, we calculate the RSV hospitalization burden for Allegheny County per 100,000 adult population would be 308/100,000 adult population. We have developed a simple, adaptable method for estimating RSV burden that can be generalized to other diseases and other locales, provided that adequate viral testing has been done. Equations (1)-(5) can be used to calculate RSV burden for an entire geographical region or for a specific hospital or hospital system within that region. This proposed method can also be used to calculate the burden estimates for any respiratory infection on which data are collected at the hospital or health system and state levels. Alternatively, it can be adapted for use in international settings where local and regional or provincial data are accessible. It can also be used for high-risk sub-populations, provided that the appropriate data are available. RSV burden estimates may be quite different in the season or two following the current coronavirus pandemic, in which RSV infections were radically reduced, 22 thereby offering further insight into its epidemiology. There is no generalized method currently in use to estimate disease burden across an array of data structures. A recent review of studies to estimate RSV burden across the globe concluded that the significant heterogeneity of methodologies was reflected in widely differing RSV burden estimates. Differences included the methods for case ascertainment; quality of and protocols for laboratory testing; reliance on influenza surveillance to estimate RSV burden and a relatively low number of studies of adults, especially older adults. 4 Our method has the advantage of using population data that are not constrained by the weaknesses of surveillance samples, 23, 24 such as lack of representativeness. 
Several burden estimation methods have been developed that attempt to adjust surveillance data for under-detection of the burden estimate for seasonal influenza in the Netherlands, pandemic A/H1N1 influenza and novel influenza A/H3N2 in the United States, and influenza A/H7N9 in China. 25-28 The methods developed for those studies ranged from simple multipliers to more complex mathematical and statistical models, depending on setting and data availability. Our method does not require such adjustments because it depends on RSV-specific hospitalization data. Our method is subject to some limitations. It assumes that viruses causing hospital admission are the same for health system and non-health system hospitals in the county. Given that the health system has 60% of the market share in the county and includes both community and subspecialty hospitals, this is not unreasonable but the viral burden in other hospitals is an extrapolation. Given the higher burden of some viruses in immunocompromised and transplant patients, care is needed to make sure that both community hospitals and subspecialty hospitals are included so as not to bias estimates one way or another. As mentioned in the methods, the mean and variance of the inverse of the random variables do not exist. Through Taylor series expansion, we obtain approximations of these values that limit the width of the confidence bounds of the estimate. Study of the behavior of the density function of the normal random variable is beyond the scope of this manuscript. If the magnitude of ARI data is underreported in PHC4, then we may overestimate RSV burden. Given that Allegheny County is an hour from the state border and that strong hospital systems exist within the county, the likelihood that substantive numbers of out-of-state hospitalizations would be missed is low, except for those persons who split the year as residents of two different states. 
Viral detections may not always represent symptomatic infection but could represent asymptomatic infections or perhaps colonization; this topic is beyond the scope of the current paper to address and is an area for further research. Similarly, co-detections of multiple viruses may not represent symptomatic infection from all of those viruses but codetections in adults are uncommon (5%-10%). 29, 30 Bacterial co-infections have been reported to account for 12% of RSV ARIs among hospitalized patients, 31 and 9.3% 32 to 19.7% 33 of RSV-associated pneumonias among hospitalized patients. These severe outcomes would need to be factored into any analysis of severity and consequential economic burden. The association between grouped ICD codes in PHC4 and individual ICD codes from the EMR that are associated with RVP tests is unknown and cannot be adjusted for in this analysis. If the association between data sources were high (close to 1), actual RSV burden would be similar to calculated estimates; whereas, if the association were low, actual RSV burden would be higher than calculated estimates. To reduce the complexity, we made estimates using the number of cases and RSV hospitalizations by quarter. There may be variations across seasons and age-specific subgroups, thus our expected burden estimates may not fully reflect the level of uncertainty. Burden may be underestimated or overestimated if careful consideration of the correction multipliers is not made. The multiplier components should be recalculated for each season because the detection probabilities may vary by season. The strength of this method is that it is not specific to the US healthcare system and can be applied in a variety of settings in which the number of ARI hospitalizations and the RSV positives within the boundaries of the area are available. 
The proposed method is a relatively simple way of adjusting and generalizing data to estimate RSV disease burden and may be used in other population-based settings and for other respiratory diseases. When RSV vaccines become available, accurate and timely estimates of RSV burden in various population subgroups will be important factors to consider for RSV vaccination recommendations. The Pennsylvania Health Care Cost Containment Council (PHC4) is an independent state agency that provided the aggregated data for this study. The opinions expressed in this paper are those of the authors and do not necessarily represent those of the Commonwealth of Pennsylvania. GKB contributed to the study design and was responsible for statistical analysis, drafting and editing the manuscript. MPN contributed to the study design, obtained grant funding, revised the manuscript and is the lead investigator. HE contributed to the study design, acquired data for this study from PHC4 and UPMC health plan and revised the manuscript. RZ contributed to the study design and revised the manuscript. All authors read and approved the final version. 
1. Respiratory syncytial virus and parainfluenza virus vaccines
2. Clinical features, severity, and incidence of RSV illness during 12 consecutive seasons in a community cohort of adults ≥60 years old
3. The epidemiology of medically attended respiratory syncytial virus in older adults in the United States: a systematic review
4. The burden of respiratory syncytial virus in adults: a systematic review and meta-analysis
5. Respiratory syncytial virus infection in elderly and high-risk adults
6. Hospitalizations for respiratory syncytial virus among adults in the United States
7. Respiratory syncytial virus and other respiratory viral infections in older adults with moderate to severe influenza-like illness
8. Morbidity and mortality among patients with respiratory syncytial virus infection: a 2-year retrospective review
9. Burden of severe RSV disease among immunocompromised children and adults: a 10 year retrospective study
10. Respiratory syncytial virus: current treatment strategies and vaccine approaches
11. Comparative incidence and burden of respiratory viruses associated with hospitalization in adults in New York City. Influenza Other Respir Viruses
12. Global disease burden estimates of respiratory syncytial virus-associated acute respiratory infection in older adults in 2015: a systematic review and meta-analysis
13. Respiratory syncytial virus associated hospitalizations among adults with chronic medical conditions
14. Modelling estimates of the burden of Respiratory Syncytial virus infection in adults and the elderly in the United Kingdom
15. Assessing the burden of influenza and other respiratory infections in England and Wales
16. Mortality associated with influenza and respiratory syncytial virus in the United States
17. Impact of pneumococcal conjugate vaccination of infants on pneumonia and influenza hospitalization and mortality in all age groups in the United States
18. Sample size calculations in clinical research
19. Statistical methods for rates and proportions
20. Sample size determination and power
21. Generalized inverse normal distributions
22. Increased Interseasonal Respiratory Syncytial Virus (RSV) Activity in Parts of the Southern United States
23. Respiratory syncytial virus- and human metapneumovirus-associated emergency department and hospital burden in adults. Influenza Other Respir Viruses
24. Rates of hospitalizations for respiratory syncytial virus, human metapneumovirus, and influenza virus in older adults
25. An evidence synthesis approach to estimating the incidence of seasonal influenza in the Netherlands
26. The severity of pandemic H1N1 influenza in the United States
27. Estimates of the Number of Human Infections With Influenza A(H3N2) Variant Virus
28. Human infection with avian influenza A H7N9 virus: an assessment of clinical severity
29. Viral infections in outpatients with medically attended acute respiratory illness during the 2012-2013 influenza season
30. Influenza and other respiratory virus infections in outpatients with medically attended acute respiratory infection during the 2011-12 influenza season. Influenza Other Respir Viruses
31. Respiratory syncytial virus infection: its propensity for bacterial coinfection and related mortality in elderly adults. Open Forum Infect Dis
32. Microorganisms associated with respiratory syncytial virus pneumonia in the adult population
33. Elucidation of Bacterial Pneumonia-Causing Pathogens in Patients with Respiratory Viral Infection. Tuberc Respir Dis (Seoul)

Appendix Table A1. List of ARI-specific and ARI-related (i.e., COPD, asthma, CHF) ICD-9/10 codes adapted from CDC's HAIVEN study.