key: cord-0763160-9r59d5r4 authors: Muller, Kaitlyn; Muller, Peter title: Mathematical modelling of the spread of COVID-19 on a university campus date: 2021-08-14 journal: Infect Dis Model DOI: 10.1016/j.idm.2021.08.004 sha: 7a74d7bebbe9fb599159bc49e4f321837873861d doc_id: 763160 cord_uid: 9r59d5r4 In this paper we present a deterministic transmission dynamic compartmental model for the spread of the novel coronavirus on a college campus for the purpose of analyzing strategies to mitigate an outbreak. The goal of this project is to determine and compare the utility of certain containment strategies including gateway testing, surveillance testing, and contact tracing as well as individual level control measures such as mask wearing and social distancing. We modify a standard SEIR-type model to reflect what is currently known about COVID-19. We also modify the model to reflect the population present on a college campus, separating it into students and faculty. This is done in order to capture the expected different contact rates between groups as well as the expected difference in outcomes based on age known for COVID-19. We aim to provide insight into which strategies are most effective, rather than predict exact numbers of infections. We analyze effectiveness by looking at relative changes in the total number of cases as well as the effect a measure has on the estimated basic reproductive number. We find that the total number of infections is most sensitive to parameters relating to student behaviors. We also find that contact tracing can be an effective control strategy when surveillance testing is unavailable. Lastly, we validate the model using data from Villanova University's online COVID-19 Dashboard from Fall 2020 and find good agreement between model and data when superspreader events are incorporated in the model as shocks to the number of infected individuals approximately two weeks after each superspreader event. 
The majority of us have felt the impact of COVID-19 on our daily lives and many understand how important it is for public health and our own individual safety to take part in social distancing, mask wearing, and other strategies for mitigating the spread of the virus. College campuses, in particular, provide an ideal breeding ground for coronavirus because of group living situations and extensive population mixing within class settings and other social events. In addition, the younger population present on campuses tends to have more social contacts as well as an increased likelihood of asymptomatic infection (Cashore et al., 2020b; Poletti et al., 2020). This all leads to an increased chance of community spread. Campuses across the country were forced to develop public health strategies in order to attempt to host students on campus in the past year. There has been a lot of work on modelling the spread of COVID-19 in a campus setting (Paltiel et al., 2020; Losina et al., 2020; Cashore et al., 2020b; Lopman et al., 2020; Gressman & Peck, 2020; Bahl et al., 2020) aimed at providing insights to university administrators on how best to prepare for on-campus semesters. These results, in particular, highlight the need for expansive (and expensive) surveillance testing programs and other strategies.
In this work we develop and investigate a model of the spread of coronavirus on a college campus. It is our goal to highlight the relative importance of individual- and institution-level mitigation measures. We focus on a medium-sized university that hosts primarily undergraduate students on campus. In addition, we assume that students have returned to campus and take a mix of in-person, online, and hybrid courses as has been the norm across many campuses since Fall 2020 (The Chronicle of Higher Education and Davidson College's College Crisis Initiative, 2020).
We assume students, faculty, and staff are expected to wear masks, use hand sanitizer, and take part in other well-known public health measures while on campus. In-person classes will be assumed to be socially distant.
We use an epidemiological SEIR-type model to describe the spread of the disease with the addition of different population types for students and faculty. In addition we separate asymptomatic/presymptomatic and symptomatic individuals and include classes for quarantined and isolated individuals. This model is not intended for predictive purposes. Instead, we aim to give a quantitative measure of the potential effectiveness of individual measures such as mask wearing and social distancing (measured by individual model parameters) as well as institution-level measures: gateway testing, surveillance testing, and contact tracing.
We find that the scale of on-campus outbreaks is largely determined by student behaviors. Our results indicate that a gateway test in conjunction with effective contact tracing or robust surveillance testing is necessary in order to keep the number of infections at a more reasonable level and make it more feasible to continue to house and teach students on campus. We find that contact tracing, on its own, is a rather robust and powerful mitigation technique when done efficiently. In addition, we find that surveillance testing on its own must be done often (at least weekly) and on the entire population in order to be effective as a sole mitigation measure, as was found in similar studies (Paltiel et al., 2020).
We also investigate the effects of surveillance testing done on a smaller scale. Surveillance testing can be a difficult task for universities without access to medical schools or other lab facilities and is extremely expensive. We aim to determine if a small scale surveillance testing program is useful in maintaining effective control of the spread of the virus.
We find that small scale surveillance strategies are not sufficient on their own but can reduce infection rates when used in conjunction with contact tracing. However, it is imperative to have a robust contact tracing program when a robust surveillance strategy is not a possibility. Finally, we validate this model by fitting to estimated active cases per day data from the Villanova University COVID-19 Dashboard (Villanova University, 2020). We find good agreement between our predicted active cases and the data once we modify our model to include the possibility of superspreader events modelled as shocks to the system approximately two weeks post event.
Similar studies have been conducted to determine which strategies would be effective for re-opening of college campuses. Much of this work was completed prior to the Fall 2020 semester. In (Paltiel et al., 2020) the focus was on determining an effective surveillance testing strategy and also the cost effectiveness of each strategy. Our work differs in that we use a continuous model as opposed to their discrete dynamical system model, include separate classes for students and faculty, and include contact tracing as a distinct strategy. Our results from considering a surveillance testing only scenario are similar. In addition, (Paltiel et al., 2020) incorporates random shocks or influxes of new positive cases which they use to model infections due to contact with the broader community and/or superspreader events. We neglect the effect of the broader community but we also use shocks at certain points in time to study the effect of superspreader events like parties. For example, the data from Villanova University (Villanova University, 2020) shows a sizable increase in cases approximately two weeks after Halloween. Another, more recent, study (Losina et al., 2020) has similar results to (Paltiel et al., 2020).
The work by (Cashore et al., 2020b,a) has been used to make decisions for re-opening strategies at Cornell University. This work differs from our own as it considers a stochastic compartmental model. Other key differences include their assumption that surveillance testing of the entire population is achievable every 5 or 7 days. In addition, they do not make a distinction between the contact patterns of students and faculty. Finally, they assume contact-traced individuals are only identified in the exposed class. This presumes a very robust contact tracing system that identifies positive cases and their contacts in such a timely manner that the contacts have not yet become infectious, i.e. within 2-3 days of exposure. We allow for the possibility that, due to delays in reporting of symptoms and our assumed surveillance testing possibly being less frequent, there may be infected individuals caught by contact tracing in the presymptomatic/asymptomatic class as well. The most interesting result of (Cashore et al., 2020b) is the conclusion that re-opening results in fewer infections than the campus being entirely virtual. We do not make any comparisons to an entirely virtual semester in this paper.
The most similar study to our own is (Lopman et al., 2020), which also uses a continuous-time dynamical system for a model with similar groups delineated within the campus population. They go further to distinguish between on-campus and off-campus students, which is likely necessary since they are focused on campuses of much larger size. The most significant distinction between our work and theirs is that we allow for surveillance testing (screening in (Lopman et al., 2020)) to potentially test less than the entire population. They investigate testing the entire population in intervals varying from weekly to once a semester while we allow for a variable percentage of the population to be tested from daily to biweekly frequencies.
We hope to shed some light on the effectiveness of surveillance testing in this perhaps more financially feasible fashion. We also investigate the effects of student behaviors such as adherence to mask wearing and social distancing policies by modelling the contact rate to depend on number of contacts daily (assumed to be related to social distancing) and probability of transmission (assumed to be related to mask wearing). We investigate scenarios with different levels of adherence to each behavior to determine their effect on total infections over the semester.
We also note there have been agent-based models developed in (Bahl et al., 2020; Gressman & Peck, 2020). These differ significantly from our approach as they focus on individuals and their behaviors within the population instead of considering collective behavior as we do. With their focus on individuals, (Gressman & Peck, 2020) is able to examine the effect of class sizes more easily and concludes that large class sizes would be a main driver of outbreaks on campuses. We note that this brief introduction does not cover the quickly changing landscape of COVID-19 literature, but covers the most relevant literature to modelling the spread on college campuses currently available.
The paper will be organized as follows. In Section 2 we will discuss our methods which will include a description of our model and the assumptions made. We will also describe our method of solution and the various simulations performed. In Section 3 we will discuss the results of our simulations. In particular, we will first discuss a series of hypothetical situations, §3.1, highlighting the difference between three levels of adherence to social distancing and mask wearing policies. In addition we will investigate hypothetical scenarios as we vary levels of surveillance testing and contact tracing.
We will also perform sensitivity analysis, §3.2, to determine which parameters the model is most sensitive to concerning the total number of infections over the course of a semester as well as the estimated basic reproduction number. Finally, in §3.3, we will provide a model fit to data obtained from Villanova University's COVID dashboard data from Fall 2020 (Villanova University, 2020).
As noted before, we develop a SEIR-type epidemiological model modified to incorporate our knowledge of the characteristics of COVID-19 as well as those of a college campus community. We note that by using an SEIR-type model we are implicitly assuming a homogeneously mixed population. For example, it is thought that a population like that of a cruise ship or a heavily populated area like New York City may be well modelled in this way (Bansal et al., 2007). While this is not entirely realistic as people always act as individuals, we believe campus populations are significantly more mixed than a standard population of, say, an average town or small city. We do allow for some heterogeneity, however, by separating each class into faculty and students as we expect students to have much higher contact rates with each other than faculty. We note that there has been some work on more heterogeneously mixed models (Bahl et al., 2020; Gressman & Peck, 2020) that have different takeaways than our work. No results from this study should be used to make predictions about number of infections, deaths, etc.
To build our model, we make the following assumptions.
As1. We assume the campus population may be divided into two groups with different collective behavior: students and faculty. This assumption is based on the fact that we expect students to have significantly more contacts with each other and perhaps others on campus, which increases their chance of catching and transmitting disease.
In addition, the average age of undergraduates is much lower than that of the faculty and thus students will be more likely to be asymptomatic and have a much smaller case fatality rate. We neglect the effect of other groups on campus such as staff and graduate students. There is a greater variability in the amount of contact these groups have with other people on campus which makes modelling more challenging in their case. In addition faculty will likely be the largest group of employees who are interacting with students often and hence are the secondary group we take into account along with the students.
As2. We assume the total population is constant (including deaths) and that there are no student transfers nor faculty hiring during the semester as well as no deaths due to circumstances other than COVID-19. It is therefore assumed that the campus is a closed community during the semester, i.e. no one interacts with any person not a part of the campus community. This is obviously not the case but greatly simplifies the model. We therefore neglect the effect of student and faculty interactions off campus or visitors to the campus. We assume that there are no breaks during the semester and that students are discouraged from traveling during the semester.
[...] ($D_i$). The quarantined class includes anyone identified by a false positive test or those identified via contact tracing that do not in fact have the disease. This group will return to the susceptible class after completing an assumed required 14 day quarantine.
As4. The driver of new infections is assumed to be contact between susceptible and infectious individuals. In addition, we assume different contact rates for students and faculty. Contact rate is broken down into three parameters in our model. The first factor is the average number of contacts per person per day ($c_i$) multiplied by the probability of transmission given contact ($p_c$).
We then include a factor $f_{ji}$ which indicates which proportion of those contacts per day are made between an infected individual of type $j$ with an individual of type $i$. For example $f_{12}$ is the proportion of the contacts an infected student makes per day with faculty and $f_{11}$ is the proportion of contacts infected students make with other students, therefore $f_{12} + f_{11} = 1$. Similarly, $f_{21} + f_{22} = 1$. This enables us to distinguish between the different expected contact patterns of students versus faculty.
As5. We assume that contact rates with asymptomatic/presymptomatic and symptomatic individuals may be different. There is still emerging research in this area but we allow the relative infectiousness for the asymptomatic class to be less than or equal to that of the symptomatic class.
As6. We assume recovered individuals obtain at least a temporary immunity that will last the length of the semester and hence recovered individuals do not become susceptible again. Studies are still ongoing regarding the duration, if any, of temporary immunity (Wu et al., 2021).
As7. We assume the effect of social distancing, mask usage, hand washing, etc. is modelled by a change in the contact rate between susceptible and infected people. We anticipate these measures will reduce both the number of contacts per day and the probability of transmission. We vary the effectiveness randomly as it is not known precisely how much each of these measures affects contact and transmission.
As8. We assume average recovery times for symptomatic and asymptomatic students and faculty to be the same, but given the age disparity between the two groups, we use different death [...] testing we assume that both exposed and asymptomatic/presymptomatic individuals may be isolated as a result. We then assign a certain proportion of those cases that are caught by tracing ($T$) and an average number of days ($\psi$) it takes for the tracing process to isolate an individual.
As10.
In the model with a gateway test and/or surveillance testing, we assume that this test will not catch every case. We prescribe a set sensitivity and specificity of the test used from within a range found in (Bisoffi et al., 2020).
As11. It is assumed that some proportion of symptomatic people are placed into isolation in a set number of days after presenting symptoms. We assume there may be some instances of people whose symptoms are so mild that they do not self report and/or those who hide symptoms or do not abide by isolation rules. Therefore it is possible for symptomatic people to interact with susceptible populations.
As13. This model does not address the effects of underlying medical conditions or the disparity between different racial/ethnic groups. We recognize that some groups are more likely to be impacted by a COVID-19 outbreak, even if the model does not explicitly show this.
Our model, given by Equation (1), is a SEIR-type system modified to include the two population types of students and faculty. There are eight compartments through which each population type could pass: susceptible ($S_i$), exposed ($E_i$), asymptomatic/presymptomatic ($A_i$), symptomatic ($I_i$), quarantined ($Q_i$), isolated ($L_i$), recovered ($R_i$), and deceased ($D_i$). In Equation (1), an index of 1 represents variables and parameters related to the student population and an index of 2 represents faculty. All variable and parameter definitions are listed in Table 1.
[...] for $i = 1, 2$ and $j = i$ above and where [...]
In our model the susceptible class loses members when individuals become infected due to contact with infectious individuals, receive a false positive test result, or are contact traced. This class will also regain quarantined individuals after they complete an assumed required 14 day quarantine.
Once a susceptible person has been exposed to the virus in such a way that they will eventually become infectious they are moved to the exposed class. They then leave the exposed class either due to contact tracing/surveillance testing, which moves them to isolation, or they have become infectious and move to the asymptomatic/presymptomatic class. Asymptomatic/presymptomatic people then can also be isolated due to contact tracing/surveillance testing or, if they are not detected, they will eventually either move to the symptomatic class or the recovered class. Any individual who is caught by contact tracing or surveillance testing who is not actually infected will be moved to the quarantine class until they complete the required 14 day quarantine and then return to the susceptible class. Symptomatic people may be isolated, recover, or die. Isolated individuals may recover or die. Recovered people will remain recovered as we assume (As6.) at least short-term immunity during the semester. The flow of this process is depicted in Figure 1.
We note that beyond the additional classes the main changes from the standard SEIR model are the inclusion of contact tracing and surveillance testing. Our method of modelling contact tracing is somewhat different from those considered in other COVID-19 university models (Paltiel et al., 2020; Cashore et al., 2020b; Gressman & Peck, 2020; Lopman et al., 2020). We assume that some proportion of the exposed and asymptomatic/presymptomatic classes are removed in some average amount of time. We note this proportion depends on the proportion of contacts reached, $T$, as well as the probability a contact is traced while they are in the exposed or asymptomatic classes, $p_e$ or $p_a$ respectively. This reflects our assumption that all interactions occur on campus. This type of contact tracing term has been used before in other generic SEIR-type models, for example in (Feng, 2007).
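Because Equation (1) is not reproduced in this text, the exact tracing terms cannot be quoted. One reading that is consistent with the verbal description, offered here only as an assumed form and not as the authors' equation, treats tracing as a linear removal from each class, with $\psi$ denoting the average number of days the tracing process takes (As9):

```latex
% Assumed form for illustration; not taken from Equation (1).
\begin{align*}
\text{exposed removed by tracing per day} &\approx \frac{T\,p_e}{\psi}\,E_i, \\
\text{asymptomatic/presymptomatic removed by tracing per day} &\approx \frac{T\,p_a}{\psi}\,A_i .
\end{align*}
```

Under this reading, for example, $T = 0.5$, $p_e = 0.5$, and $\psi = 1$ day would remove a quarter of the exposed class per day through tracing alone.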
We also assume some proportion of exposed and asymptomatic classes is removed due to surveillance testing. These terms depend on the sensitivity of the test given as well as how often the testing is completed and the proportion of the total population that is tested in that time frame. This term is very similar to that used in (Paltiel et al., 2020); however, we include the possibility that the proportion of the total population tested is less than 100%.
In addition, the terms modelling those caught by contact tracing who are not actually infected are slightly different from those in other models. We aim to estimate the number of people who have tested positive daily and the number of contacts those individuals had with the susceptible class in order to determine how many susceptible individuals are removed by contact tracing. We note this term has a similar structure to the standard SEIR term modelling contact between those infected with susceptible individuals. It is modified to include only those infected that the university is aware of, ψ+ρ P T S 2 . This term includes false positives from the surveillance testing program as these individuals will also have their contacts traced. For simplicity, we assume that contact-traced individuals do not then have their contacts traced. Our model accounts for individuals sharing $d$ days' worth of contacts. Finally, we account for the fact that not all contacts are caught by the contact tracing program by considering only a proportion, $T$, of an individual's contacts are traced. We also add in an additional factor of 1/5 in the term of those in the susceptible class who are contact traced. Using data from their local health department, (Cashore et al., 2020b) found that about 1/5 of contacts actually met the standard for quarantine as set by the CDC.
We assume everyone who is traced who is not actually ill will move to the quarantined class and is required to fulfill a full 14 day quarantine regardless of a subsequent negative test result. We acknowledge there are varying accepted lengths for quarantine and isolation and we choose to have both be a full 14 days.
We also carefully model the initial conditions for all population classes since gateway testing inevitably impacts these. To investigate hypothetical scenarios in §3.1 and §3.2, we assume that there are no individuals initially in the exposed, recovered, or deceased classes. For the infected classes we assume a proportion, $P_i$, of the infectious population is symptomatic and $1 - P_i$ of the infectious population is asymptomatic with $i = 1, 2$ for students and faculty, respectively. For scenarios with no gateway testing (Sc.1., Sc.2.), the quarantine and isolation classes also begin empty. For scenarios with a gateway test (Sc.3., Sc.4., Sc.5.), some of the infectious population is instead placed in isolation with the amount based on the sensitivity of the test used. Additionally, we allow the possibility that some people will self-isolate based on symptoms. We also account for the fact that some susceptible individuals will receive false positive tests based on the test specificity and add this number of false positives into the initial quarantine class. For data fitting in §3.3, we use the data to obtain the initial condition for the isolated and quarantined classes and estimate the remaining initial conditions.
Equation (1) depends on a significant number of parameters, most of which are not known exactly. There are some that have been estimated from data but many of these estimates include uncertainty and some are not known at this time due to limited publicly available data from college campuses. We attempt to make reasonable estimates based on the literature whenever possible which usually gives a range of values.
We model any parameters given with a range of values as uniform random variables over those ranges to reflect our uncertainty. Future work may involve better estimation of parameter-specific probability distributions. We will now describe the ranges or estimates used in our simulations of hypothetical outbreaks in §3.1. The possible ranges of parameter values and their corresponding default values used in this paper are listed in Table 2.
The contact rate, which has units of days$^{-1}$, depends on the number of contacts per infected case per day ($c_i$) as well as the probability of transmission per contact ($p_c$). We assume that students and faculty have a different average number of contacts. For students we assume 7-17 contacts per day based on the discussion in (Cashore et al., 2020a). We assume faculty have fewer daily contacts than students, 1-4 in this paper, based on the fact that it was found in (Cashore et al., 2020b,a) that the majority of faculty close contacts will be off campus, which we do not model. In §3.1, we will break the $c_1$ range into three subranges to account for various levels of adherence to social distancing among students.
The probability for transmission due to a close contact, $p_c$, was found to be 2.4% in (Jing et al., 2020) and 3.7% in (Luo et al., 2020). These values were estimated from contact tracing data in China at the beginning of the pandemic and do not take into account mask wearing and social distancing. It has been estimated that masks may reduce transmission by as much as 80% (Liang et al., 2020). We therefore assume that the low end of our range is 20% of the average of the two estimates from literature and consider the range 0.6-3.7%. In §3.1, we will break this range up into three subranges to account for various levels of adherence to mask wearing from both faculty and students.
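The arithmetic behind the $p_c$ range, and the idea of drawing uncertain parameters uniformly from their ranges, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function and dictionary names are hypothetical, and only the three ranges quoted in the text above are included.

```python
import random

# Per-contact transmission estimates quoted in the text (no masks, no distancing).
P_C_JING = 0.024       # Jing et al., 2020
P_C_LUO = 0.037        # Luo et al., 2020
MASK_REDUCTION = 0.80  # masks may cut transmission by as much as 80% (Liang et al., 2020)


def p_c_range():
    """Return (low, high) for p_c as described in the text.

    low  = 20% of the average of the two literature estimates
    high = the larger literature estimate
    """
    mean_estimate = (P_C_JING + P_C_LUO) / 2
    low = (1 - MASK_REDUCTION) * mean_estimate
    return low, P_C_LUO


# Hypothetical container for ranges; keys mirror the paper's symbols.
RANGES = {
    "c_1": (7.0, 17.0),  # student contacts per day
    "c_2": (1.0, 4.0),   # faculty contacts per day
    "p_c": p_c_range(),  # transmission probability per contact
}


def sample_parameters(rng):
    """Draw each parameter independently and uniformly from its range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}


if __name__ == "__main__":
    low, high = p_c_range()
    print(f"p_c range: {100 * low:.2f}% to {100 * high:.2f}%")
    print(sample_parameters(random.Random(0)))
```

The lower bound evaluates to 0.2 × 3.05% = 0.61%, which the text rounds to 0.6%.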
To further model the differences in contact between students and faculty we also introduce estimates for the proportion of total contacts an infected student/faculty has with susceptible students/faculty. At this time, these values are educated guesses due to a lack of data. We assume the proportion of contacts an infected student makes with other students ($f_{11}$) is 70% and therefore the proportion of contacts an infected student makes with faculty ($f_{12}$) is 30%. We assume faculty primarily interact with students on campus, therefore $f_{21}$ is set at 70% and $f_{22}$ is 30%.
One final factor for the contact rate is the possible reduction in transmission due to contact with an asymptomatic individual compared to that with a symptomatic one, represented by $\tau$. The estimates of this difference in transmissibility vary but most agree that asymptomatic individuals are less likely to spread the virus (Centers for Disease Control and Prevention, 2020; Luo et al., 2020). In (He et al., 2020) it was found that there was no statistically significant difference in the estimates for asymptomatic versus symptomatic transmission. We therefore consider $\tau \in [0.5, 1]$.
The number of days between surveillance test results ($\rho$) is a measure of how long it takes on average for the tests to be performed and results returned so that positive cases are then isolated or quarantined. This value is university-protocol-dependent, so we consider frequencies of testing from daily to biweekly to account for various protocols. In addition, the proportion of the population that is tested ($P_T$) is also policy-dependent. Since most surveillance testing protocols randomly sample from the total campus population, we assume that if $P_T$ of the total population is tested, then the same proportion $P_T$ of each of the susceptible, exposed, asymptomatic, and symptomatic classes is also tested. We allow $P_T$ to vary from 5-100% to account for various protocols.
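To make the roles of $\rho$ and $P_T$ concrete, the sketch below computes per-capita daily removal rates under one plausible reading of the surveillance description: a fraction $P_T$ of each class is tested every $\rho$ days, infected individuals are detected with probability equal to the test sensitivity, and susceptible individuals are falsely flagged with probability one minus the specificity. This functional form is an assumption for illustration, since Equation (1) is not reproduced here, and the function name is hypothetical.

```python
def surveillance_rates(sensitivity, specificity, p_tested, rho_days):
    """Assumed per-capita daily rates produced by surveillance testing.

    sensitivity, specificity : test characteristics in [0, 1]
    p_tested                 : proportion of the population tested per round (P_T)
    rho_days                 : days between surveillance test results (rho)

    Returns (true_positive_rate, false_positive_rate):
      true_positive_rate  applies to infected classes (moved to isolation)
      false_positive_rate applies to susceptibles (moved to quarantine)
    """
    if rho_days <= 0:
        raise ValueError("rho_days must be positive")
    tested_per_day = p_tested / rho_days  # fraction of each class tested daily
    true_positive_rate = sensitivity * tested_per_day
    false_positive_rate = (1.0 - specificity) * tested_per_day
    return true_positive_rate, false_positive_rate


if __name__ == "__main__":
    # Hypothetical protocol: 25% of campus tested weekly.
    tp, fp = surveillance_rates(sensitivity=0.8, specificity=0.98,
                                p_tested=0.25, rho_days=7)
    print(f"infected removed per day: {tp:.4f}; susceptibles quarantined per day: {fp:.5f}")
```

In this form, halving $\rho$ and doubling $P_T$ have the same effect on the rates, which is one way to compare a small, frequent testing program against a large, infrequent one.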
Finally, we take the ranges for the specificity ($S_p$) and sensitivity ($S_e$) of tests to be 95.02-100% and 62.4-94.12%, respectively, based on a study of available tests used in an ER setting (Bisoffi et al., 2020).
Parameters related to contact tracing are also campus-protocol-dependent. However, (Spencer et al., 2021) looked at contact tracing programs in more than 100 health departments and found that between 32% and 79% of close contacts (identified by the individual) were reached within the first 24 hours. We use these percentages for the proportion of contacts reached ($T$) in §3.1. We also assume in most cases that the average amount of time between identifying a positive case and reaching their contacts is $\psi = 1$ day. The probability that a case caught by contact tracing is in the exposed ($p_e$) or asymptomatic ($p_a$) class is not estimated in the literature to our knowledge and therefore we let them range from 0 to 1. The parameter $d$ represents the number of days of contact the tracers will consider. The CDC recommends tracing beginning two days prior to symptoms (Centers for Disease Control and Prevention, 2021b). We assume that most people limit contact post symptoms and therefore use $d = 2$ in all simulations. We also assume, as it is the policy at most institutions, that traced contacts stay in quarantine for $\mu = 14$ days.
The time from exposure to symptom onset has been estimated to be around 5-6 days with one estimate as high as 7 (Lauer et al., 2020; Centers for Disease Control and Prevention, 2020; Tindale et al., 2020). Our model treats this value as the sum of $\sigma$, the number of days between exposure and joining the presymptomatic/asymptomatic class where individuals are deemed infectious, and $\phi$, the number of days between becoming infectious and developing symptoms. In (Tindale et al., 2020) it was estimated that an infectious person infected another approximately 2-3 days before symptom onset.
Therefore we set $\phi$ to be between 2 and 3 days and $\sigma$ to be between 2 and 4 days.
There are three separate recovery times listed in our parameter set: $\eta_0$ for those who never develop symptoms, $\eta_1$ for symptomatic students and $\eta_2$ for symptomatic faculty. We set all three to be 14 days (Voinsky et al., 2020) as there is not much said in the literature on the difference between recovery times for different age groups and levels of symptoms. For the sake of our model, we are mainly concerned with the period of infectiousness when these individuals may cause new infections due to contact with susceptible individuals. For future generalizability, we keep these as three separate parameters in case more data becomes available on the differences between asymptomatic and symptomatic periods of infectiousness.
The next parameter to consider is $\xi$, which represents the amount of time on average before a symptomatic person self-reports (or self-isolates). This value is challenging to estimate; however, (Centers for Disease Control and Prevention, 2020) estimates the median number of days from symptom onset to getting a test among positive patients as 1-6 days with a base value of 3 days. We assume that obtaining a test is equivalent to self-reporting and therefore being isolated and use this range for $\xi$. It was estimated in (Reese et al., 2020) that those aged 18-49 with symptoms self report 15-65% of the time (with an average of 34%), which we use as our range for $r$.
Our model has a few parameters related to death. First, $\zeta_1$ and $\zeta_2$ represent the average number of days between symptom onset and death for students and faculty, respectively. The current best estimates by (Centers for Disease Control and Prevention, 2020) are roughly the same for all age groups so we set both to 16 days. Again, we keep these parameters separate in case more information is made available in the future.
Second, we require a parameter for the probability of death for each group ($d_i$). We use (Centers for Disease Control and Prevention, 2021a) to estimate $d_i$. To do this, we require an estimate of the age distribution of students and faculty. For students, since we model a primarily undergraduate institution, we assume the majority fall into the age range 18-29 years and estimate the probability of death as the number of deaths due to COVID-19 divided by the total number of infections, which gives $d_1 = 0.000359$. We note that since we are unable to estimate precisely how many of these reported infections are asymptomatic, this estimate is most likely not accurate, but given how small this probability is we do not expect it to have a substantial effect on results. We also note that this neglects nontraditional students, since we assume these students make up a small percentage of the overall student body and are also less likely to live on campus. Therefore they will not affect the disease spread as much as other students.

For faculty the age range varies much more. From (McChesney & Bichsel, 2020) we estimate that 13% of faculty are 65 and older. We use the CDC data for ages 65-74 for this group, since there is no further breakdown of this group in (McChesney & Bichsel, 2020). To estimate the proportion of [...] (Centers for Disease Control and Prevention, 2021a), we estimate the probability of death in each of these groups as we did for the student population. We then compute a weighted average of the probabilities of death for these age groups to estimate $d_2 = 0.0105$.

Next we consider the parameters that represent the proportion of infections that are asymptomatic versus symptomatic. It is believed that younger populations are more likely to be asymptomatic. We therefore define two parameters for the proportion of symptomatic infections in students ($P_1$) and faculty ($P_2$).
These estimates vary widely; for example, the CDC (Centers for Disease Control and Prevention, 2020) gives a range of 10-70% over all age groups. In (Byambasuren et al., 2020) they found studies with estimates for the proportion of asymptomatic infections ranging from 4-40%. In (Mizumoto et al., 2020) it was found that 17.9% of infected passengers on the Diamond Princess cruise ship were asymptomatic. In addition, (Poletti et al., 2020) finds much lower estimates of symptomatic infections, for example finding that 18.09% of those aged 0-19 years display symptoms. We therefore define our $P_1$ range for students as 20-60%, where the low end comes from an average of the 0-19 and 20-39 age group estimates in (Poletti et al., 2020) and the high end comes from (Centers for Disease Control and Prevention, 2020; Byambasuren et al., 2020). For faculty, $P_2$ ranges from 30-80% symptomatic infections, where the low end comes from the 30-49 age group estimate in (Poletti et al., 2020) and the high end from (Byambasuren et al., 2020; Mizumoto et al., 2020).

Lastly we discuss parameters related to the initial conditions listed in Table 3. In (Cashore et al., 2020b) they estimate that the proportion of the student population that is infected at the beginning of the semester ranges from 0.5-4%. This estimate was formed before any gateway testing had been completed. Since (Cashore et al., 2020b), an e-mail from the Office of the President at Villanova University reported that 0.41% of gateway tests from Villanova students and faculty returning to campus were positive (Personal Communication, 2020). We therefore modify the range to 0.25-4% to allow for lower prevalence. We assume all who test positive begin the semester in isolation. Within this group there will be some false positives, who begin in the quarantine class and can return to the susceptible class after 14 days.
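To make the bookkeeping of the gateway test concrete, the following sketch splits a returning population into true positives (isolated), false positives (quarantined), and missed infections. It is a simplified illustration, not the paper's initial-condition calculation: it assumes everyone is tested exactly once, and the population size, prevalence, sensitivity, and specificity in the usage lines are illustrative values chosen inside the ranges quoted above.

```python
def gateway_test_outcomes(population, prevalence, sensitivity, specificity):
    """Expected outcome counts if the whole population is tested once on arrival."""
    infected = population * prevalence
    uninfected = population - infected
    return {
        # infected and correctly detected: start the semester in isolation
        "isolated_true_positives": infected * sensitivity,
        # infected but missed by the test: remain in the campus population
        "missed_infections": infected * (1.0 - sensitivity),
        # not infected but flagged: start in quarantine, later return to susceptible
        "quarantined_false_positives": uninfected * (1.0 - specificity),
    }


# Illustrative inputs only (hypothetical campus of 10,000 people).
outcomes = gateway_test_outcomes(
    population=10_000, prevalence=0.02, sensitivity=0.78, specificity=0.975)
for name, count in outcomes.items():
    print(f"{name}: {count:.0f}")
```

With these illustrative inputs the false positives (about 245) outnumber the true positives (about 156), which is why even a small loss of specificity matters when prevalence is low.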
We also assume there are some who may not test positive or who self-isolate due to symptoms. We assume the proportion of students who self-isolate is equal to $r$ discussed above. We also assume faculty may be more likely to self-isolate and increase this range to 50-75% for the initial conditions only, since classes have not started yet. Those who are infected but not isolated are divided into the asymptomatic and symptomatic classes using the proportions $P_1$ and $P_2$ defined above for students and faculty, respectively. For simplicity we assume that at the start of the semester there is no one in the recovered class, though there are likely some in the population who had the disease and still possess some form of immunity.

We will validate our model using data from Villanova University's COVID Dashboard for the fall 2020 semester (Villanova University, 2020). We note that the dashboard data only provides new positive tests per day and only includes positive cases of which the university was made aware. We assume that each positive case remains active (i.e. still appears as a case in the data) and isolated for 14 days in order to estimate the number of isolated individuals at any given time. This estimated number of active (or isolated) cases per day is shown in Figure 2. We fit our model to this data using the MATLAB function fminsearchbnd from (D'Errico, 2012), fitting the sum of $L_1(t)$ and $L_2(t)$ to our estimated data. We use the default values for the parameters we are more confident in from the literature and estimate the remaining parameters from the data. The parameters estimated are $p_c, c_1, c_2, \xi, P_1, P_2, r, T, p_e, p_a$.

It is clear from the Villanova data and from other state and national trends that there are spikes in cases that will not appear by simply modelling SEIR-type dynamics. We assume that there are certain days/events on campus that act as superspreader events causing these spikes.
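The conversion from daily new positive tests to estimated active cases described above amounts to a 14-day rolling sum. A minimal sketch follows, assuming a case reported on a given day counts as active on that day and the 13 days after it; the function name and the sample counts are hypothetical, not dashboard data.

```python
def active_cases(new_positives, isolation_days=14):
    """Estimate active (isolated) cases per day from daily new positive tests."""
    active = []
    running = 0
    for day, count in enumerate(new_positives):
        running += count
        if day >= isolation_days:
            # cases reported isolation_days ago leave isolation today
            running -= new_positives[day - isolation_days]
        active.append(running)
    return active


# Hypothetical daily counts: a single burst of 5 cases on day 0, then nothing.
burst = active_cases([5] + [0] * 14)
# Hypothetical steady trickle of one new case per day for 20 days.
steady = active_cases([1] * 20)
```

In the burst example the five cases stay active through day 13 and drop out on day 14; in the steady example the active count rises to 14 and then stays there.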
For the Villanova data we hypothesize there are spikes 2 weeks after the following events: the first weekend on campus (prior to classes starting), Labor Day weekend, and Halloween. We model these spikes as a sudden increase, or shock, in the number of infections approximately 14 days after these expected events. In simulations we simply stop and restart the simulation at these dates, adding an expected number of new members to the asymptomatic and symptomatic classes (proportioned using $P_1$) in the new initial conditions and removing these people from the susceptible classes. We assume that shocks affect only students. Thus, the shock for superspreader event $k$ occurring at time $t_k$ is modelled as

$$\Delta S_1 = -(\text{jump size}_k), \qquad \Delta A_1 = (1 - P_1)(\text{jump size}_k), \qquad \Delta I_1 = P_1(\text{jump size}_k), \qquad \text{for } t = t_k + 14. \tag{2}$$

In the numerical work, these shocks re-initialize Equation (1) at $t = t_k + 14$. We note that we also estimate the size of these shocks ($\text{jump size}_k$) in the data fitting procedure in §3.3, while the dates on which they happen, $t_k$, are fixed.

We use MATLAB's built-in ode45 solver to solve our system of differential equations and provide results for the number of people in each population on each day of a 98-day semester. We present three different sets of results in order to portray the possible scenarios predicted by our model as well as to provide analysis and validation of our model. We first discuss a series of hypothetical epidemics. Here we consider various ranges of the model's parameters to see the breadth of possible outcomes. We include these results to reflect the uncertainty in parameter estimates as well as to demonstrate the different results depending on institutional policies and individual choices. To further understand the effects of certain policies and behaviors, we perform sensitivity analysis on several parameters related to mitigation measures.
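The stop-and-restart treatment of superspreader shocks described above can be sketched as follows. This is only an illustration under stated assumptions: the four-compartment S-A-I-R right-hand side is a toy stand-in for the paper's Equation (1), a hand-written fixed-step Runge-Kutta integrator replaces MATLAB's ode45, and all function names and numerical values are hypothetical.

```python
def derivatives(state, p1, beta, gamma):
    """Toy S-A-I-R right-hand side; a stand-in, not the paper's Equation (1)."""
    s, a, i, r = state
    new_infections = beta * s * (a + i) / (s + a + i + r)
    return [
        -new_infections,
        (1.0 - p1) * new_infections - gamma * a,  # asymptomatic class
        p1 * new_infections - gamma * i,          # symptomatic class
        gamma * (a + i),
    ]


def rk4_step(state, dt, *args):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(k, h):
        return [x + h * dx for x, dx in zip(state, k)]
    k1 = derivatives(state, *args)
    k2 = derivatives(shifted(k1, dt / 2), *args)
    k3 = derivatives(shifted(k2, dt / 2), *args)
    k4 = derivatives(shifted(k3, dt), *args)
    return [x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)]


def integrate(state, days, p1, beta, gamma, steps_per_day=10):
    """Advance the state by a whole number of days."""
    for _ in range(days * steps_per_day):
        state = rk4_step(state, 1.0 / steps_per_day, p1, beta, gamma)
    return state


def apply_shock(state, jump_size, p1):
    """Move jump_size people from S into A and I, split by p1."""
    s, a, i, r = state
    jump = min(jump_size, s)  # cannot infect more people than are susceptible
    return [s - jump, a + (1.0 - p1) * jump, i + p1 * jump, r]


def simulate_with_shocks(initial, total_days, shocks, p1, beta, gamma):
    """Integrate in segments, re-initializing at each (day, jump_size) shock."""
    state, today = list(initial), 0
    for shock_day, jump_size in sorted(shocks):
        state = integrate(state, shock_day - today, p1, beta, gamma)
        state = apply_shock(state, jump_size, p1)
        today = shock_day
    return integrate(state, total_days - today, p1, beta, gamma)


# Hypothetical events on days 0 and 17, with shocks applied 14 days later.
final_state = simulate_with_shocks(
    initial=[6000.0, 10.0, 10.0, 0.0], total_days=98,
    shocks=[(0 + 14, 40.0), (17 + 14, 60.0)], p1=0.4, beta=0.12, gamma=1 / 14)
```

Because a shock only moves people between compartments, the total population is unchanged across each restart, which is a convenient sanity check on any implementation of this procedure.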
Finally, we fit our model to estimated data from Villanova University's fall 2020 semester COVID dashboard (Villanova University, 2020) to validate our model and provide better estimates of university-specific parameters.

We begin by considering the vast range of possible outcomes due to our uncertainty in some parameter values as well as the variability in containment strategies used across the country on different campuses. We measure the success of different strategies by the change in the percent of the total population infected over the course of the semester. We will mainly discuss scenarios [...]

First we consider the effects of individual practices such as social distancing and mask wearing. We therefore set parameters related to university measures, as well as those related to the initial conditions, to the default values in Tables 2 and 3 to ensure we are seeing the effects of the individual-level practices. With regard to contact tracing and surveillance testing we set $T = 0.555$, $\psi = 1$, $P_T = 0.25$, and $\rho = 7$. For the gateway test and initial conditions we assume the initial percent of the population that is infected is 2.1%, the proportions of students/faculty that self-isolate at the start are 0.4/0.624, respectively, the sensitivity of the gateway test is 78.26%, and the proportions of students/faculty that are symptomatic are 0.4/0.55, respectively. We randomly sample $c_1$ and $p_c$ from the ranges given in Table 2 [...]

Figure 3 shows the daily percent of students that are actively infected (i.e. the sum of the asymptomatic and symptomatic student classes) for different levels of adherence to social distancing (SD) and mask wearing (MW) policies, from pessimistic SD/MW to optimistic SD/MW from left to right. The mean curve for each level of adherence is displayed as a thick black line. In all plots we assume gateway testing, surveillance testing, and contact tracing have been performed with the parameters specified above.
We note that as we increase adherence the peak level of infections falls from approximately 30% to 10% to 0.35%. In particular, we note that in the optimistic scenario the peak level of infections actually occurs at the beginning of the semester; hence there is no outbreak. Figure 4 shows the mean curves for different combinations of mask wearing and social distancing adherence. We see again that the only case in which there is a peak value larger than the initial percent infected is when we assume pessimistic adherence in both social distancing and mask wearing. In addition, we see monotonically decreasing infection levels whenever mask wearing is optimistic or nominal, except in the case when social distancing is pessimistic. In Tables 4 to 6 [...] conditions as specified above.

We randomly sample $T$, $P_T$, and $\rho$ from the ranges given in Table 2 for 300,000 simulations. We then define nominal contact tracing to be $T \in [0.32, 0.555)$ and robust contact tracing to be $T \in [0.555, 0.79]$, with $\psi = 1$ in both cases. For surveillance testing, we define [...]

Table 7 shows the mean total percent of the student and faculty populations that are infected over the course of the semester for the various scenarios. We first note the utility of gateway testing, which, when added to contact tracing (moving from Scenario Sc.2 to Sc.3), reduces the percent infected by approximately half. In addition, the total percent of students infected in Scenario Sc.3 is 4.27% and 3.24% for nominal and robust contact tracing programs, respectively. For Scenario Sc.4 to reach a similar level of infection we require surveillance testing to be performed at robust levels (75-100% of the population at least weekly). Lastly, Figure 6 shows the mean curves for a single mitigation measure (along with a gateway test) on the left and dual mitigation measures on the right.
On the right we note that in no case do peak values reach initial infection levels again during the semester, indicating successful strategies. However, on the left we see that nominal surveillance testing alone does not control the outbreak as the other single-measure cases do. We therefore conclude that surveillance testing must be robust if used on its own as a mitigation measure. We also note that if a contact tracing strategy is already in place, adding surveillance testing lowers the percent infected by about 1-2% for different strategy levels, as seen in Table 7. We therefore find that if a robust surveillance testing strategy is not available, then it serves an institution best to develop a robust and efficient contact tracing program. Each family of curves represents the variability due to unknown/random parameters in the model.

In this section, we investigate the effect of perturbing parameters related to disease mitigation measures at both the university and individual level. These parameters are $c_1, c_2, p_c, \xi, r, T, \psi, \rho$, and $P_T$. In addition to quantifying the effect perturbations of these parameters have on the total number of students and faculty that are infected over the course of a semester, we also quantify the effect these perturbations have on $R_0$. [...] is 1.7478, which is lower than many estimates for $R_0$, but most previous estimates do not take into account mask wearing and social distancing.

To perform the sensitivity analysis, each parameter was initialized at the default value found in Table 2. Then, each mitigation measure parameter was perturbed positively and negatively by a certain percentage while keeping all other parameters fixed. The resulting percent changes in total infected population and $R_0$ were computed.
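The one-at-a-time perturbation procedure just described can be sketched generically. In the example below, `toy_outcome` is a hypothetical stand-in for a full model run (it is not the paper's model or its $R_0$ formula), and the default values are placeholders rather than the entries of Table 2.

```python
def percent_change_sensitivity(model, defaults, names, perturbation=0.01):
    """Perturb each named parameter by +/- perturbation, holding the rest fixed.

    Returns {name: (percent change for +perturbation, percent change for -perturbation)}.
    """
    baseline = model(defaults)
    result = {}
    for name in names:
        changes = []
        for sign in (+1, -1):
            perturbed = dict(defaults)  # copy so other parameters stay at default
            perturbed[name] = defaults[name] * (1 + sign * perturbation)
            changes.append(100.0 * (model(perturbed) - baseline) / baseline)
        result[name] = tuple(changes)
    return result


def toy_outcome(params):
    """Hypothetical outcome: grows with contacts and transmission, shrinks with tracing."""
    return params["c_1"] * params["p_c"] * (1.0 - 0.5 * params["T"])


# Placeholder defaults, chosen only so the example runs.
defaults = {"c_1": 10.0, "p_c": 0.05, "T": 0.555}
sensitivity = percent_change_sensitivity(toy_outcome, defaults, ["c_1", "p_c", "T"])
for name, (up, down) in sensitivity.items():
    print(f"{name}: +1% -> {up:+.3f}%, -1% -> {down:+.3f}%")
```

In this toy outcome a 1% change in `c_1` or `p_c` produces a 1% change in the result, while the same relative change in `T` produces a smaller change of opposite sign; the real model outcomes would of course come from full simulations.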
Figure 7 shows the percent change in the total number of students and faculty infected over a semester caused by a ±1% perturbation in each mitigation measure parameter. Similarly, Figure 8 shows the percent change in $R_0$ caused by these perturbations. As can be seen in these figures, the total number of students and faculty infected and $R_0$ are most sensitive to the $c_1$ and $p_c$ parameters, which relate to the number of daily contacts students have and the probability of transmission per contact, respectively. Interestingly, a ±1% change in any of the parameters mostly caused no more than a ±1% change in the outcomes, except for the effect of $c_1$ and $p_c$ on the total student population infected. This highlights that maintaining a safe in-person semester is most affected by student behaviors. We note that the model is much less sensitive to the rate and speed at which individuals self-isolate upon becoming symptomatic. This is likely due to the assumption that contact tracing and surveillance testing are both taking place, which lessens the burden of responsibility on individuals. Looking at Figures 7 and 8, we see that, after student behaviors, outcomes and $R_0$ values were next most sensitive to changes in $T$ and $\psi$, which are related to the level of contact tracing performed on campus. This further confirms that contact tracing is a powerful measure that can provide excellent control when implemented robustly and efficiently. We stress, however, that these results should be viewed through the lens of our assumption that all contacts occur on campus. It is likely that campus contact tracing programs will be affected by contacts made outside of campus, which would be harder to trace.

As mentioned in Section 2, we fit our model to the estimated data from the Villanova University COVID Dashboard for the Fall 2020 semester (Villanova University, 2020). We note as before that we see from the data three distinct peaks.
It is clear from our hypothetical results that our model, on its own, will not capture this type of behavior. We hypothesize that the independent peaks are due to days/periods of increased social activity on campus. Specifically, we assume there is an influx of new cases due to the return to campus (which would not be reflected by Equation (1)), Labor Day weekend (the only time students had a day off from classes during that semester), and Halloween. Halloween is always a time of increased social activity and fell towards the end of the semester, so we hypothesize students were growing weary of social distancing protocols. In order to [...] (for which we apply Equation (2) at $t = t_{\text{Halloween}} + 10$). It is expected that there will be some variation in how quickly spikes occur after known superspreader events.

We estimate the following parameters: $p_c, c_1, c_2, \xi, P_1, P_2, r, T, p_e, p_a, f_{11}$, and $f_{22}$, as these parameters depend on the specific population on campus. In addition we estimate the sizes of the three jumps ($\text{jump size}_k$) as well as the initial conditions for the exposed, asymptomatic, symptomatic, and recovered classes. We assume from dashboard data that the initial quarantined, isolated, and deceased classes are all zero. We set the remaining parameters to their default values as listed in Table 2. These fits are shown in Figure 9. We also show cases when, in addition, we estimate $\tau$, $S_e$, and $S_p$ in Figure 10. $\tau$ in particular is not precisely known; in addition, our knowledge of $S_e$ and $S_p$ is limited due to multiple tests being used on campus as well as natural variations due to collection and processing error. Note that all parameters to be fitted have their initial value chosen from a uniform random distribution over the ranges set in Table 2.

Overall, in both cases there is generally good agreement between the model and the data.
In particular we are able to capture the third peak well when we assume the jump associated with Halloween occurs 10 days after instead of 14. This leads to a substantial decrease in error as seen in [...] substantial. We see that $p_c$ is found to be significantly higher when we do not use the default values for $\tau$, $S_e$, and $S_p$. Our estimated contact rates fall into the ranges in Table 2 and in (Cashore et al., 2020a). In particular they fall in our nominal contact rate range. There is even more variability in our estimates for $P_1$ and $P_2$, though in all cases $P_1$ is lower than $P_2$, as expected since students are on average much younger than faculty and hence more likely to have an asymptomatic infection. We also note that $p_e$ is always larger than $p_a$, which indicates that contact tracing was more likely to find contacts in the exposed state than in the presymptomatic/asymptomatic state, which is expected and desired. This variation in estimates is not surprising given the novelty of the virus and the level of uncertainty in parameter values at this time. We expect that as more information is gathered these parameters may not need to be estimated and the fit to data will likely change.

We conclude our discussion by briefly mentioning the limitations of our model. As with any model, not every aspect of a phenomenon is captured and the results should be viewed through the lens of this knowledge. We first note that this model assumes homogeneous mixing, likely overestimating transmission rates, as individuals all have different behavior and contact patterns. We see in (Gressman & Peck, 2020) much lower infection rates than in compartmental models. However, (Bansal et al., 2007) notes that the error due to the assumption of homogeneous mixing is smaller when considering heavily mixed populations like cities. We assume college campuses fall into this category.
We also note that we have not incorporated the effects of staff and graduate students on campus. We ignored these groups on the assumption that they have less contact with students, who are the primary drivers of infection on campus. On certain campuses, however, graduate students in particular have a lot of contact with undergraduate students and should therefore be considered in that case. Staff likely have most of their contacts off campus. Depending on the level of transmission in the community at large, these contacts may or may not have an impact on the spread on campus.

We have neglected to include the effect of contacts off campus. In most other models, for example (Lopman et al., 2020), the effect of off-campus interactions is modelled as a constant rate of new infections per day occurring on campus. Whether or not this has a large impact on overall infections on campus depends on the level of community spread as well as the amount of time students spend off campus. In this model, we assume students are asked to limit their time off campus; however, it is likely that many students do spend time off campus, in particular on weekends or other days off. This is likely the assumption that limits our model results the most, and we intend to introduce these effects into the model in future study.

It is well known that SEIR-type models do not predict the multiple surges or seasonality seen in many infectious disease epidemics. In this model we have addressed this by including shocks to the number of infected individuals after expected superspreader events. Some dates, like spring break, are known at the beginning of the semester to be likely occasions for superspreader activity. However, it is impossible to predict all possible superspreader events prior to the start of the school year. Therefore, predictions using these types of models will likely miss the effects of these types of events entirely.
It is also possible that there is additional time variation in parameters like contact rates, dependent perhaps on season, that has not been captured by this model. More needs to be known about this specific virus before we can accurately model time dependence in parameter values. Lastly, most parameter values in this work are assumed to be drawn from a uniform distribution defined by a range of possible values found in the literature. It is possible to better estimate probability density functions for these parameters, which would likely lead to less variation in our set of hypothetical epidemics. We leave this as future work as well.

In this work we have developed an SEIR-type model for the spread of COVID-19 on a medium-sized university campus with a primarily residential undergraduate student population. We have found that student behaviors, in particular mask wearing and social distancing, largely determine the ultimate size of an on-campus outbreak. In contrast, faculty largely do not contribute to the spread of the virus. We find that for universities that are unable to develop a robust surveillance testing program, contact tracing can provide excellent control of the spread. Smaller-scale surveillance testing can provide further reduction in total infections, but that reduction is not substantial. We find that our model compares well with collected data from Villanova University's fall 2020 COVID-19 dashboard (Villanova University, 2020) when we further include superspreader events in the model. This does point to the difficulty of truly predicting total infections when it is unknown what events will lead to such large sudden influxes of new infections, but there are common events, such as fall/spring breaks, which may be included when using this type of model for predictive purposes.

The authors declare no conflicts of interest regarding this paper.
CRediT Author contributions

Since at the DFE there is no one in a disease state, we set $E_i, A_i, I_i, L_i = 0$ for $i = 1, 2$. After plugging these values into Equation (1) we are left with the following equations [...] along with the $Q_i$ equations, which are the negatives of (A.1), and the constraints that $S_1 + Q_1 + R_1 + D_1 = N_1$ and $S_2 + Q_2 + R_2 + D_2 = N_2$. We assume that the DFE of interest is when $R_i = 0$ and $D_i = 0$ for $i = 1, 2$. Therefore, to find the DFE we seek to solve this system of equations for $S_i$ and $Q_i$ for $i = 1, 2$. However, when we look at the non-zero entries of the Jacobian we find that the only dependence on the population values is in the form of the ratio of $S_1$ to $S_2$. Therefore we need only solve for this quantity.

In the case when there are no measures taken by the university, i.e. when $T = 0$ and $P_T = 0$, the equations above simplify to [...] and therefore $Q_i = 0$ and $S_i = N_i$. In the case when there is no contact tracing ($T = 0$) but surveillance testing is conducted, we reduce to the following equations [...] and therefore $Q_i = \frac{(1 - S_p) P_T \mu}{\rho} S_i$. Plugging this into the constraints $S_1 + Q_1 = N_1$ and $S_2 + Q_2 = N_2$, it may be shown that $S_1/S_2 = N_1/N_2$ as in the previous case. In the case when $P_T = 0$, i.e. no surveillance testing but contact tracing is conducted, we obtain the same equations as in the no-measures case. Finally, in the case of both contact tracing and surveillance testing we have the following system of linear equations to solve ($S_1 \neq 0$, $S_2 \neq 0$): [...] along with the constraint equations. If we define the coefficient of $S_1$ to be $A$ and of $S_2$ to be $B$ in the first equation of (A.4), and the coefficient of $S_1$ to be $C$ and of $S_2$ to be $D$ in the second equation of (A.4), we obtain the following formula for $S_1/S_2$ at the equilibrium point using Maple: [...] We may then plug the values for this ratio into the Jacobian for each case of measures taken.
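As a rough illustration of the next-generation-matrix calculation used in this appendix, the sketch below computes $R_0$ as the spectral radius of $F V^{-1}$ under the usual convention that the Jacobian of the infected subsystem is written as $F - V$. The matrices in the usage lines belong to a textbook two-compartment SEIR example, not to the model of this paper, and the helpers are hypothetical, written in plain Python so the example has no dependencies. Power iteration is a swapped-in technique that assumes a single dominant eigenvalue.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def spectral_radius(matrix, iterations=500):
    """Dominant eigenvalue magnitude by power iteration (max-norm scaling)."""
    n = len(matrix)
    vec = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = max(abs(x) for x in nxt)
        if norm == 0.0:
            return 0.0
        vec = [x / norm for x in nxt]
        estimate = norm
    return estimate


def basic_reproduction_number(F, V):
    """R0 as the spectral radius of the next-generation matrix F V^{-1}."""
    return spectral_radius(matmul(F, invert(V)))


# Textbook SEIR example with infected compartments (E, I); values are illustrative.
beta, sigma, gamma = 0.5, 0.2, 0.25
F = [[0.0, beta], [0.0, 0.0]]       # new infections enter E at rate beta * I
V = [[sigma, 0.0], [-sigma, gamma]]  # E -> I at rate sigma, I -> R at rate gamma
r0 = basic_reproduction_number(F, V)
```

For this textbook example the result agrees with the closed form $\beta/\gamma$, which gives a quick check that the convention for $F$ and $V$ has been applied consistently.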
This method requires one to calculate the Jacobian of the sub-system of differential equations containing only equations that pertain to infected compartments, i.e. including $E_i, A_i, I_i, L_i$, and then evaluate the Jacobian at a disease-free equilibrium (DFE). To calculate $R_0$ we then separate the Jacobian into the matrix $F$, which contains terms related to new infections, and the matrix $V$, which contains terms related to moving between compartments after initial exposure. We write out the non-zero entries of $F$ and $V$ below assuming both measures are taken. The first two rows of $F$ are given in equation (A.6), since the third through eighth rows are all zeros. Likewise, the non-zero entries of $V$ are given by (A.7). If at least one measure is removed then the ratio in (A.5) simplifies to $N_1/N_2$. In the case of one or no mitigation measures, we simply set $T = 0$ and/or $P_T = 0$ to remove the effect of contact tracing and surveillance testing, respectively.

The authors would like to thank Dr. Angela DiBenedetto and the rest of the COVID-19 Science Advisory Committee at Villanova University for their valuable discussions on the spread of COVID-19 [...] Elise Pasles and Dr. Jesse Frey for their input on the statistical aspects of sampling a population for surveillance testing. We would also like to thank the reviewers and editor for their comments. This work received funding from Villanova University's Falvey Memorial Library [...]

References

Modeling COVID-19 spread in small colleges
When individual behaviour matters: homogeneous and network models in epidemiology
Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study. The Lancet
Bisoffi [...]
Estimating the extent of asymptomatic COVID-19 and its potential for community transmission: systematic review and meta-analysis
Addendum: COVID-19 mathematical modeling for Cornell's fall semester
COVID-19 mathematical modeling for Cornell's fall semester
COVID-19 pandemic planning scenarios
Demographic trends of COVID-19 cases and deaths in the US reported to CDC
Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England
The construction of next-generation matrices for compartmental epidemic models
Final and peak epidemic sizes for SEIR models with quarantine and isolation
Simulating COVID-19 in a university environment
The relative transmissibility of asymptomatic COVID-19 infections among close contacts
Household secondary attack rate of COVID-19 and associated determinants in Guangzhou, China: a retrospective cohort study. The Lancet Infectious Diseases
The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application
Efficacy of face mask in preventing respiratory virus transmission: A systematic review and meta-analysis. Travel Medicine and Infectious Disease
A model of COVID-19 transmission and control on university campuses
College campuses and COVID-19 mitigation: clinical and economic value
Contact settings and risk for transmission in 3410 close contacts of patients with COVID-19 in Guangzhou, China: a prospective cohort study
The aging of tenure-track faculty in higher education: Implications for succession and diversity. ERIC, ED603016
Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship
Assessment of SARS-CoV-2 screening strategies to permit the safe reopening of college campuses in the United States
Faculty distribution by age
Estimated incidence of COVID-19 illness and hospitalization-United States
COVID-19 case investigation and contact tracing efforts from health departments-United States
The Chronicle of Higher Education and Davidson College's College Crisis Initiative. Here's our list of colleges' reopening models
Evidence for transmission of COVID-19 prior to symptom onset
Fall 2020 COVID-19 dashboard
Effects of age and sex on recovery from COVID-19: Analysis of 5769 Israeli patients