key: cord-0928752-17tcfgse authors: Rogers, Monica title: A case report: Challenges in COVID‐19 modeling at a public health department date: 2020-10-22 journal: Proc Assoc Inf Sci Technol DOI: 10.1002/pra2.426 sha: ec4616f4b1eef1173ac77a75530ff98c468cb108 doc_id: 928752 cord_uid: 17tcfgse
In response to the COVID‐19 public health emergency, the Tulsa Health Department created local models. This was an iterative process focused on predicting all infections (including asymptomatic and mild cases that would not meet testing criteria) and deaths for the Tulsa area. SEIR‐type models were used. Developing infectious disease models is challenging because of data validity issues and complex, interrelated assumptions, and the COVID‐19 crisis exacerbated these challenges. Directly related to these data challenges were the challenges of communicating without spreading misinformation and of being clear about the model limitations.
The Tulsa Health Department (THD) is a local health department in the state of Oklahoma, with public health jurisdiction throughout Tulsa County and the City of Tulsa (Tulsa Health Department, 2020). The data team at THD created an SEIR-type model for Tulsa County, with the primary focus of predicting all infections (including asymptomatic and mild cases) and related deaths for the Tulsa area. SIR (susceptible, infectious, removed) models are the most commonly used framework for modeling infectious disease, and the SEIR model is a subsequent refinement that adds an exposed compartment (Wearing, Rohani, & Keeling, 2005). Whether R is read as removed or recovered, the person is no longer included in the model as part of the susceptible population. One of the many assumptions of the model is that once a person has been infected with COVID-19, they are not susceptible to reinfection for the duration of the model.
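The compartment structure described above can be illustrated with a minimal sketch. This is not the THD model: the function name `simulate_seir`, the population of 100,000, the 5-day incubation period, and the 7-day infectious period are all hypothetical values chosen for illustration, and the forward-Euler time stepping is simply the easiest scheme to read. The sketch encodes the no-reinfection assumption by never moving anyone out of the removed compartment.

```python
def simulate_seir(population, r0, incubation_days, infectious_days,
                  initial_infectious=1.0, days=365, steps_per_day=10):
    """Forward-Euler simulation of a basic SEIR model.

    Returns a list of (day, S, E, I, R) tuples, one per simulated day.
    All parameter values passed in are illustrative assumptions.
    """
    dt = 1.0 / steps_per_day
    sigma = 1.0 / incubation_days   # rate of moving from exposed to infectious
    gamma = 1.0 / infectious_days   # rate of moving from infectious to removed
    beta = r0 * gamma               # transmission rate implied by R0

    s = population - initial_infectious
    e = 0.0
    i = initial_infectious
    r = 0.0
    history = [(0, s, e, i, r)]

    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_removed = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            # Removed individuals never return to S (no reinfection).
            r += new_removed
        history.append((day, s, e, i, r))
    return history


# Hypothetical inputs, for illustration only.
history = simulate_seir(population=100_000, r0=2.79,
                        incubation_days=5.0, infectious_days=7.0)
peak_day, _, _, peak_infectious, _ = max(history, key=lambda row: row[3])
print(f"Peak infectious count of about {peak_infectious:,.0f} on day {peak_day}")
```

Because every person leaving one compartment enters the next, the four compartments always sum to the starting population, which is a quick sanity check on any implementation of this kind.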
While there were many "important unknowns" (Anderson, Heesterbeek, Klinkenberg, & Hollingsworth, 2020) for COVID-19, THD used the model, which included eight variables, to estimate potential newly infected individuals, cumulative infections for the time period, hospitalizations, and fatalities. The largest assumptions were the R0 values. R0 is the basic reproductive number and is an indicator of contagiousness or transmissibility (Delamater, Street, Leslie, Yang, & Jacobsen, 2019). COVID-19 is a "new coronavirus that [had] not been previously identified" (CDC, 2020); it thus lacked historical precedent and had no associated data. Additionally, a wide range of R0 values was used in other models, with no direction on how to interpret or apply the values. One study identified 12 other studies from January 1 through February 7 that estimated R0, with estimates ranging from 1.4 to 6.49, a mean of 3.28, and a median of 2.79 (Liu, Gayle, Wilder-Smith, & Rocklöv, 2020).
DOI: 10.1002/pra2.426. 83rd Annual Meeting of the Association for Information Science & Technology, October 25-29, 2020. Author(s) retain copyright, but ASIS&T receives an exclusive publication license.
Furthermore, there are no validated R0 values for different levels of social distancing orders or compliance. A PubMed search was conducted for "R0," "social distancing," and "COVID," along with the "similar articles" related to the original results. The search found mathematical models of reduced transmission under different levels of closures and social distancing, but these were overwhelmingly models for China and Singapore. As a further challenge, Oklahoma has not issued a state-wide shelter-in-place order, and Tulsa County includes, in whole or in part, 13 municipalities (Tulsa County, 2020). Each city was left to issue its own orders (some cities issued none), and cities implemented and lifted restrictions at different times.
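To give a sense of how much the choice of R0 matters, the sketch below applies the standard final-size relation for SIR-type models, z = 1 - exp(-R0 * z), to the R0 values reported by Liu et al. (2020). It is an illustration under strong assumptions, not part of the THD model: it assumes a fully susceptible, homogeneously mixing population with no interventions, and the function name `final_attack_rate` is hypothetical.

```python
import math


def final_attack_rate(r0, tol=1e-12, max_iter=10_000):
    """Solve z = 1 - exp(-r0 * z) for the fraction ever infected.

    Uses fixed-point iteration starting from z = 1. Assumes an
    unmitigated epidemic in a fully susceptible, well-mixed population.
    """
    if r0 <= 1.0:
        return 0.0  # no large outbreak is expected when R0 <= 1
    z = 1.0
    for _ in range(max_iter):
        z_next = 1.0 - math.exp(-r0 * z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


# R0 summary values quoted from Liu et al. (2020) in the text.
R0_VALUES = {"lowest": 1.4, "median": 2.79, "mean": 3.28, "highest": 6.49}

for label, r0 in R0_VALUES.items():
    share = final_attack_rate(r0)
    print(f"{label:>7} R0 = {r0:<4}: {share:.1%} of the population ever infected")
```

Under these assumptions, the lowest and highest published estimates imply very different cumulative infection totals, which is one way to see why an unvalidated R0 dominates the uncertainty in any downstream estimate.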
In totality, this means the model used unvalidated R0 values that could not be applied evenly across the locality. The limitations of the R0 values in the modeling cannot be overstated. Another assumption was a 2% hospitalization rate, coupled with a variable length of hospital stay. The original length-of-stay data were based on Centers for Medicare & Medicaid Services data on the average length of hospital stay for Medicare recipients with pneumonia and complicated pneumonia (Williams, Gousen, & DeFrances, 2018). Again, because of the novel nature of COVID-19, hospitalization rates and duration of stay were approximations, and the limitations were significant.
Modeling came with numerous challenges. One was that "the difficulty in sifting fact from inaccurate information [was] aggravated by the speed of unfolding events, how much is still to be researched and understood by scientists and clinicians about COVID-19" (Garrett, 2020). As Huang and colleagues note, in a statement that also applies to THD, "as is true for all data-driven approaches, [our] result inevitably depends on the quality of the data used, and some of the early data of the epidemic are not as good as the later data" (Huang, Qiao, & Tung, 2020). Additionally, one of the biggest challenges was communicating those limitations, particularly to the media and the general public. As Garrett succinctly states, "the current global COVID-19 epidemic features mechanisms of delivery of scientific information that are frankly unprecedented, adding to pressure for proper interpretation by the media and public" (Garrett, 2020). Being clear about the limitations was extremely important in making the best possible effort not to spread disinformation or misinformation to the public.
One Michigan State University study found that "approximately 28% of American adults currently qualify as scientifically literate" (2007), and over one third of US adults struggle with low health literacy (Gazmararian, Curran, Parker, Bernhardt, & DeBuono, 2005). Communicating the limitations of data modeling and epidemiological processes, and how they apply, requires the public to understand scientific concepts and health concepts and to be public health literate. Public health data modeling is an important tool to support decision makers, but the limitations of the data were significant, as were the related challenges of communicating the data to stakeholders and the public. To respond effectively to future public health crises, continued research is needed to support public health data modeling, including modeling strategies for limited, unreliable, or emerging data, and related communication strategies.
Anderson, Heesterbeek, Klinkenberg, & Hollingsworth (2020). How will country-based mitigation measures influence the course of the COVID-19 epidemic? Lancet.
CDC (2020). Coronavirus Disease 2019 (COVID-19) Frequently Asked Questions.
Delamater, Street, Leslie, Yang, & Jacobsen (2019). Complexity of the Basic Reproduction Number (R0).
Garrett (2020). COVID-19: the medium is the message.
Gazmararian, Curran, Parker, Bernhardt, & DeBuono (2005). Public health literacy in America: an ethical imperative.
Huang, Qiao, & Tung (2020). A data-driven model for predicting the course of COVID-19 epidemic with applications for China.
Liu, Gayle, Wilder-Smith, & Rocklöv (2020). The reproductive number of COVID-19 is higher compared to SARS coronavirus.
Scientific Literacy: How Do Americans Stack Up (2007).
Tulsa County (2020). Area Towns and Cities - Tulsa County, Oklahoma.
Tulsa Health Department (2020). Mission and Values.
Wearing, Rohani, & Keeling (2005). Appropriate models for the management of infectious diseases.
Williams, Gousen, & DeFrances (2018). National Hospital Care Survey demonstration projects: Pneumonia inpatient hospitalizations and emergency department visits. National Health Statistics Reports 116.